pg流复制_异步


实验环境:

主备 主机名 IP地址 OS版本 PG版本
pg93 192.168.150.130 CentOS65 12.3
pg93s 192.168.150.132 CentOS65 12.3
  • 异步流复制

  • 参数配置:

在两个节点上安装好PG软件,并在主节点完成initdb
在postgresql.conf中配置并重启生效,检查主节点参数配置:

postgres@postgres:select name,setting,sourcefile from pg_settings where name in ('wal_level','archive_mode','archive_command','max_wal_senders','wal_keep_segments','hot_standby');
       name        |                            setting                             |       sourcefile        
-------------------+----------------------------------------------------------------+-------------------------
 archive_command   | test ! -f /archive/%f.tar.gz && tar -zcf /archive/%f.tar.gz %p | /pgdata/postgresql.conf
 archive_mode      | on                                                             | /pgdata/postgresql.conf
 hot_standby       | on                                                             | /pgdata/postgresql.conf
 max_wal_senders   | 10                                                             | /pgdata/postgresql.conf
 wal_keep_segments | 500                                                            | /pgdata/postgresql.conf
 wal_level         | replica                                                        | /pgdata/postgresql.conf
(6 rows)

归档命令视自己需要来修改配置,不需要一样
在pg_hba.conf添加策略:
host replication repuser 192.168.150.130/32 md5
host replication repuser 192.168.150.132/32 md5
reload一下生效
[postgres@pg93 pgdata]$ pg_ctl reload -D /pgdata/
server signaled
 
创建专用的流复制账号:
postgres@postgres:create user repuser REPLICATION LOGIN CONNECTION LIMIT 10 ENCRYPTED PASSWORD 'leadx12345';
CREATE ROLE
 

  • 数据拷贝(手动版)

将主库置为备份模式后,将主库数据文件打包传输到备库,然后结束主库的备份模式:

postgres@postgres:select pg_start_backup('leadx_stream');
 pg_start_backup 
-----------------
 1/8A000060
(1 row)
exit
[postgres@pg93 /]$ tar -czvf /home/postgres/pgdata.tar.gz pgdata --exclude=pgdata/pg_wal
.......
[postgres@pg93 ~]$ ls -lh /home/postgres/pgdata.tar.gz
-rw-rw-r-- 1 postgres postgres 295M Dec 20 04:45 /home/postgres/pgdata.tar.gz

postgres@postgres:select pg_stop_backup();
NOTICE:  all required WAL segments have been archived
 pg_stop_backup 
----------------
 1/8A000170
(1 row)
[postgres@pg93 ~]$ scp pgdata.tar.gz root@pg93s:/home/postgres/
root@pg93s's password: 
pgdata.tar.gz                                           100%  295MB  26.8MB/s   00:11 
  • 备配置

备节点解压,手动创建pg_wal目录,并添加备库参数到postgres.conf中(pg12开始,recovery.conf已经取消,所有参数并入posgresql.conf中,并新增两个文件recovery.signal和standby.signal来表示恢复模式还是备库模式,<两个文件都有时为standby>,standby_mode参数不再支持,trigger_file参数修改为prompt_trigger_file)

tar -xvf /home/postgres/pgdata.tar.gz 
--提示文件in the future是因为两个节点时间不一致,备节点时间晚了

--postgresql.conf添加/修改配置:
primary_conninfo = 'host=192.168.150.130 port=6000 user=repuser'
recovery_target_timeline = 'latest'
--pg12之前还需要将standby_mode设置为on,12之后该参数不再支持,通过standby.signal来标识
[postgres@pg93s pgdata]$ touch /pgdata/standby.signal

--配置.pgpass存储密码
[postgres@pg93s pgdata]$ vi ~/.pgpass
192.168.150.130:6000:replication:repuser:leadx12345
192.168.150.130:6000:replication:repuser:leadx12345
[postgres@pg93s pgdata]$ chmod 600 ~/.pgpass
启动备库:

pg_ctl start -D /pgdata

  • 问题1:报找不到pg_tblspc下的文件,原因是原库有表空间,指向非PGDATA,需要把对应的文件也拷过来(进备份模式);
  • 问题2:
    2020-10-06 13:32:11.246 CST [5142] LOG: invalid IP mask "md5": Name or service not known
    2020-10-06 13:32:11.246 CST [5142] CONTEXT: line 91 of configuration file "/pgdata/pg_hba.conf"
    原因:单个IP地址需要写成192.168.150.130/32,不能只写IP
  • 问题3:
    备报错:2020-10-06 13:40:42.564 CST [5181] FATAL: could not connect to the primary server: could not initiate GSSAPI security context: Unspecified GSS failure. Minor code may provide more information
    could not initiate GSSAPI security context: Credentials cache file '/tmp/krb5cc_501' not found
    FATAL: no pg_hba.conf entry for replication connection from host "192.168.150.132", user "repuser", SSL off
    主报错:2020-12-20 05:33:48.281 CST [2480] FATAL: no pg_hba.conf entry for replication connection from host "192.168.150.132", user "repuser", SSL off
    原因:修改pg_hba.conf后需要reload才生效
  • 问题4:2020-10-06 13:44:38.424 CST [5209] FATAL: could not connect to the primary server: fe_sendauth: no password supplied
    WARNING: password file "/home/postgres/.pgpass" has group or world access; permissions should be u=rw (0600) or less
    原因:.pgpass需要在/home/postgres目录下并且权限为600,另外primary_conninfo中配置的host需要和.pgpass中一样格式,不能一个是IP一个是主机名,不会自动转换

备启动成功后,主备查看发送和接收进程:

主:
[postgres@pg93 log]$ ps -fu postgres|grep walsender
postgres   2528   1992  0 05:38 ?        00:00:00 postgres: walsender repuser 192.168.150.132(50533) streaming 1/8D000148
备:
[postgres@pg93s pgdata]$ ps -fu postgres |grep wal
postgres   5250   5247  0 13:47 ?        00:00:01 postgres: walreceiver   streaming 1/8D000148

同时可以在视图上查询到同步状态:

--主节点执行
postgres@postgres:select usename,application_name,client_addr,sync_state from pg_stat_replication;
 usename | application_name |   client_addr   | sync_state 
---------+------------------+-----------------+------------
 repuser | walreceiver      | 192.168.150.132 | async
(1 row)
  • 数据同步测试
--主创建表并插入数据,备可以正常查看到
postgres@findb:create table reptest0126 as select * from pg_class ;
SELECT 451
postgres@findb:select count(*) from reptest0126;
 count 
-------
   451
(1 row)
[postgres@pg93s pgdata]$ psql findb
psql (12.3)
Type "help" for help.

findb=# select count(*) from reptest0126;
 count 
-------
   451
(1 row)
--一个事务同时执行多个DML,可正常同步
postgres@findb:begin;
BEGIN
postgres@findb:delete from reptest0126 where relname='tbl_json';
DELETE 1
postgres@findb:update reptest0126 set relname='tbl_mvcc66' where relname='tbl_mvcc';
UPDATE 1
postgres@findb:insert into reptest0126 select * from pg_class limit 10;
INSERT 0 10
postgres@findb:select count(*) from reptest0126;
 count 
-------
   460
(1 row)

postgres@findb:select relname from reptest0126 where relname in ('tbl_mvcc66','tbl_mvcc');
  relname   
------------
 tbl_mvcc66
 tbl_mvcc
(2 rows)

postgres@findb:commit;
COMMIT
findb=# select count(*) from reptest0126;
 count 
-------
   460
(1 row)

findb=# select relname from reptest0126 where relname in ('tbl_mvcc66','tbl_mvcc');
  relname   
------------
 tbl_mvcc66
 tbl_mvcc
(2 rows)
--如果备库的hot_standby参数为off,则备不支持只读查询(修改后重启生效)
[postgres@pg93s pgdata]$ psql
psql: error: could not connect to server: FATAL:  the database system is starting up

  • pg_basebackup初始化方式
--清理原环境
[postgres@pg93s ~]$ cd /pgdata/
[postgres@pg93s pgdata]$ cp pg_hba.conf postgresql*.conf standby.signal /home/postgres/backup/
[postgres@pg93s pgdata]$ ll /home/postgres/backup/
total 40
-rw------- 1 postgres postgres  4482 Oct  6 17:59 pg_hba.conf
-rw------- 1 postgres postgres   265 Oct  6 17:59 postgresql.auto.conf
-rw------- 1 postgres postgres 26703 Oct  6 17:59 postgresql.conf
-rw-rw-r-- 1 postgres postgres     0 Oct  6 17:59 standby.signal
[postgres@pg93s pgdata]$ pwd
/pgdata
[postgres@pg93s pgdata]$ rm -rf *
[postgres@pg93s pgdata]$
--basebackup
[postgres@pg93s pgdata]$ pg_basebackup -D /pgdata -Fp -Xs -v -P -h pg93 -p 6000 -U repuser
Password: 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 1/8E000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_3167"
2283300/2283300 kB (100%), 2/2 tablespaces                                         
pg_basebackup: write-ahead log end point: 1/8E000138
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

--修改参数
和上面一样修改,这里直接把备份的文件放回来即可
[postgres@pg93s pgdata]$ cp /home/postgres/backup/* .
[postgres@pg93s pgdata]$

--启动备,不出意外已正常同步
pg_ctl stop -D /pgdata/

pg_basebackup会先执行一个checkpoint检查点,然后执行一个在线热备并将WAL传送到备端:
参数:
-D 指定备份过来的数据存储的路径(需要不存在或存在但为空,pg_tblspc指向的表空间路径也不能存在文件,并且都需要有读写权限)
-F 指定备份出来的文件的格式,p表示plain,和原文件一样,存放布局也一样;t表示tar,会将数据库打包为tar文件,系统文件是base.tar,其它的是oid.tar,日志为pg_wal.tar
-X 指定备份过程中产生的日志怎么传送到备份端,f表示fetch,表示备份完成后再将日志一次性传到备份端,s表示stream,备份过程中会启动一个WAL sender来实时将增量日志传输到备份端(另外还有一个基准日志传输进程),生产环境通常建议使用stream方式
-v 表示verbose
-P 表示progress,用于显示数据文件和表空间的传输进度(百分比)
-z 表示gzip,用于将备份文件压缩成tar.gz
-Z 表示压缩等级,取值0-9

用例:
将数据库压缩备份后放在/backup下

[postgres@pg93s ~]$ pg_basebackup -D /backup/ -Ft -Xs -P -z -h 192.168.150.130 -p 6000 -U repuser
2283301/2283301 kB (100%), 2/2 tablespaces
[postgres@pg93s backup]$ ls -lh
total 296M
-rw------- 1 postgres postgres 980K Oct  6 18:20 16406.tar.gz
-rw------- 1 postgres postgres 295M Oct  6 18:21 base.tar.gz
-rw------- 1 postgres postgres  17K Oct  6 18:21 pg_wal.tar.gz

--备份过程中查看主库进程
--备前
postgres@postgres:select pid,usename,application_name,sync_state from pg_stat_replication;
 pid  | usename | application_name | sync_state 
------+---------+------------------+------------
 3696 | repuser | walreceiver      | async
(1 row)
--备中
postgres@postgres:select pid,usename,application_name,sync_state from pg_stat_replication;
 pid  | usename | application_name | sync_state 
------+---------+------------------+------------
 3696 | repuser | walreceiver      | async
 4056 | repuser | pg_basebackup    | async
 4061 | repuser | pg_basebackup    | async
(3 rows)