pg流复制_异步
实验环境:
主备 | 主机名 | IP地址 | OS版本 | PG版本 |
---|---|---|---|---|
主 | pg93 | 192.168.150.130 | CentOS65 | 12.3 |
备 | pg93s | 192.168.150.132 | CentOS65 | 12.3 |
-
异步流复制
-
参数配置:
在两个节点上安装好PG软件,并在主节点完成initdb
在postgresql.conf中配置并重启生效,检查主节点参数配置:
postgres@postgres:select name,setting,sourcefile from pg_settings where name in ('wal_level','archive_mode','archive_command','max_wal_senders','wal_keep_segments','hot_standby');
name | setting | sourcefile
-------------------+----------------------------------------------------------------+-------------------------
archive_command | test ! -f /archive/%f.tar.gz && tar -zcf /archive/%f.tar.gz %p | /pgdata/postgresql.conf
archive_mode | on | /pgdata/postgresql.conf
hot_standby | on | /pgdata/postgresql.conf
max_wal_senders | 10 | /pgdata/postgresql.conf
wal_keep_segments | 500 | /pgdata/postgresql.conf
wal_level | replica | /pgdata/postgresql.conf
(6 rows)
归档命令视自己需要来修改配置,不需要一样
在pg_hba.conf添加策略:
host replication repuser 192.168.150.130/32 md5
host replication repuser 192.168.150.132/32 md5
reload一下生效
[postgres@pg93 pgdata]$ pg_ctl reload -D /pgdata/
server signaled
创建专用的流复制账号:
postgres@postgres:create user repuser REPLICATION LOGIN CONNECTION LIMIT 10 ENCRYPTED PASSWORD 'leadx12345';
CREATE ROLE
-
数据拷贝(手动版)
将主库置为备份模式后,将主库数据文件打包传输到备库,然后结束主库的备份模式:
postgres@postgres:select pg_start_backup('leadx_stream');
pg_start_backup
-----------------
1/8A000060
(1 row)
exit
[postgres@pg93 /]$ tar -czvf /home/postgres/pgdata.tar.gz pgdata --exclude=pgdata/pg_wal
.......
[postgres@pg93 ~]$ ls -lh /home/postgres/pgdata.tar.gz
-rw-rw-r-- 1 postgres postgres 295M Dec 20 04:45 /home/postgres/pgdata.tar.gz
postgres@postgres:select pg_stop_backup();
NOTICE: all required WAL segments have been archived
pg_stop_backup
----------------
1/8A000170
(1 row)
[postgres@pg93 ~]$ scp pgdata.tar.gz root@pg93s:/home/postgres/
root@pg93s's password:
pgdata.tar.gz 100% 295MB 26.8MB/s 00:11
-
备配置
备节点解压,手动创建pg_wal目录,并添加备库参数到postgres.conf中(pg12开始,recovery.conf已经取消,所有参数并入posgresql.conf中,并新增两个文件recovery.signal和standby.signal来表示恢复模式还是备库模式,<两个文件都有时为standby>,standby_mode参数不再支持,trigger_file参数修改为prompt_trigger_file)
tar -xvf /home/postgres/pgdata.tar.gz
--提示文件in the future是因为两个节点时间不一致,备节点时间晚了
--postgresql.conf添加/修改配置:
primary_conninfo = 'host=192.168.150.130 port=6000 user=repuser'
recovery_target_timeline = 'latest'
--pg12之前还需要将standby_mode设置为on,12之后该参数不再支持,通过standby.signal来标识
[postgres@pg93s pgdata]$ touch /pgdata/standby.signal
--配置.pgpass存储密码
[postgres@pg93s pgdata]$ vi ~/.pgpass
192.168.150.130:6000:replication:repuser:leadx12345
192.168.150.130:6000:replication:repuser:leadx12345
[postgres@pg93s pgdata]$ chmod 600 ~/.pgpass
启动备库:
pg_ctl start -D /pgdata
- 问题1:报找不到pg_tblspc下的文件,原因是原库有表空间,指向非PGDATA,需要把对应的文件也拷过来(进备份模式);
- 问题2:
2020-10-06 13:32:11.246 CST [5142] LOG: invalid IP mask "md5": Name or service not known
2020-10-06 13:32:11.246 CST [5142] CONTEXT: line 91 of configuration file "/pgdata/pg_hba.conf"
原因:单个IP地址需要写成192.168.150.130/32,不能只写IP - 问题3:
备报错:2020-10-06 13:40:42.564 CST [5181] FATAL: could not connect to the primary server: could not initiate GSSAPI security context: Unspecified GSS failure. Minor code may provide more information
could not initiate GSSAPI security context: Credentials cache file '/tmp/krb5cc_501' not found
FATAL: no pg_hba.conf entry for replication connection from host "192.168.150.132", user "repuser", SSL off
主报错:2020-12-20 05:33:48.281 CST [2480] FATAL: no pg_hba.conf entry for replication connection from host "192.168.150.132", user "repuser", SSL off
原因:修改pg_hba.conf后需要reload才生效 - 问题4:2020-10-06 13:44:38.424 CST [5209] FATAL: could not connect to the primary server: fe_sendauth: no password supplied
WARNING: password file "/home/postgres/.pgpass" has group or world access; permissions should be u=rw (0600) or less
原因:.pgpass需要在/home/postgres目录下并且权限为600,另外primary_conninfo中配置的host需要和.pgpass中一样格式,不能一个是IP一个是主机名,不会自动转换
备启动成功后,主备查看发送和接收进程:
主:
[postgres@pg93 log]$ ps -fu postgres|grep walsender
postgres 2528 1992 0 05:38 ? 00:00:00 postgres: walsender repuser 192.168.150.132(50533) streaming 1/8D000148
备:
[postgres@pg93s pgdata]$ ps -fu postgres |grep wal
postgres 5250 5247 0 13:47 ? 00:00:01 postgres: walreceiver streaming 1/8D000148
同时可以在视图上查询到同步状态:
--主节点执行
postgres@postgres:select usename,application_name,client_addr,sync_state from pg_stat_replication;
usename | application_name | client_addr | sync_state
---------+------------------+-----------------+------------
repuser | walreceiver | 192.168.150.132 | async
(1 row)
-
数据同步测试
--主创建表并插入数据,备可以正常查看到
postgres@findb:create table reptest0126 as select * from pg_class ;
SELECT 451
postgres@findb:select count(*) from reptest0126;
count
-------
451
(1 row)
[postgres@pg93s pgdata]$ psql findb
psql (12.3)
Type "help" for help.
findb=# select count(*) from reptest0126;
count
-------
451
(1 row)
--一个事务同时执行多个DML,可正常同步
postgres@findb:begin;
BEGIN
postgres@findb:delete from reptest0126 where relname='tbl_json';
DELETE 1
postgres@findb:update reptest0126 set relname='tbl_mvcc66' where relname='tbl_mvcc';
UPDATE 1
postgres@findb:insert into reptest0126 select * from pg_class limit 10;
INSERT 0 10
postgres@findb:select count(*) from reptest0126;
count
-------
460
(1 row)
postgres@findb:select relname from reptest0126 where relname in ('tbl_mvcc66','tbl_mvcc');
relname
------------
tbl_mvcc66
tbl_mvcc
(2 rows)
postgres@findb:commit;
COMMIT
findb=# select count(*) from reptest0126;
count
-------
460
(1 row)
findb=# select relname from reptest0126 where relname in ('tbl_mvcc66','tbl_mvcc');
relname
------------
tbl_mvcc66
tbl_mvcc
(2 rows)
--如果备库的hot_standby参数为off,则备不支持只读查询(修改后重启生效)
[postgres@pg93s pgdata]$ psql
psql: error: could not connect to server: FATAL: the database system is starting up
-
pg_basebackup初始化方式
--清理原环境
[postgres@pg93s ~]$ cd /pgdata/
[postgres@pg93s pgdata]$ cp pg_hba.conf postgresql*.conf standby.signal /home/postgres/backup/
[postgres@pg93s pgdata]$ ll /home/postgres/backup/
total 40
-rw------- 1 postgres postgres 4482 Oct 6 17:59 pg_hba.conf
-rw------- 1 postgres postgres 265 Oct 6 17:59 postgresql.auto.conf
-rw------- 1 postgres postgres 26703 Oct 6 17:59 postgresql.conf
-rw-rw-r-- 1 postgres postgres 0 Oct 6 17:59 standby.signal
[postgres@pg93s pgdata]$ pwd
/pgdata
[postgres@pg93s pgdata]$ rm -rf *
[postgres@pg93s pgdata]$
--basebackup
[postgres@pg93s pgdata]$ pg_basebackup -D /pgdata -Fp -Xs -v -P -h pg93 -p 6000 -U repuser
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 1/8E000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_3167"
2283300/2283300 kB (100%), 2/2 tablespaces
pg_basebackup: write-ahead log end point: 1/8E000138
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
--修改参数
和上面一样修改,这里直接把备份的文件放回来即可
[postgres@pg93s pgdata]$ cp /home/postgres/backup/* .
[postgres@pg93s pgdata]$
--启动备,不出意外已正常同步
pg_ctl stop -D /pgdata/
pg_basebackup会先执行一个checkpoint检查点,然后执行一个在线热备并将WAL传送到备端:
参数:
-D 指定备份过来的数据存储的路径(需要不存在或存在但为空,pg_tblspc指向的表空间路径也不能存在文件,并且都需要有读写权限)
-F 指定备份出来的文件的格式,p表示plain,和原文件一样,存放布局也一样;t表示tar,会将数据库打包为tar文件,系统文件是base.tar,其它的是oid.tar,日志为pg_wal.tar
-X 指定备份过程中产生的日志怎么传送到备份端,f表示fetch,表示备份完成后再将日志一次性传到备份端,s表示stream,备份过程中会启动一个WAL sender来实时将增量日志传输到备份端(另外还有一个基准日志传输进程),生产环境通常建议使用stream方式
-v 表示verbose
-P 表示progress,用于显示数据文件和表空间的传输进度(百分比)
-z 表示gzip,用于将备份文件压缩成tar.gz
-Z 表示压缩等级,取值0-9
用例:
将数据库压缩备份后放在/backup下
[postgres@pg93s ~]$ pg_basebackup -D /backup/ -Ft -Xs -P -z -h 192.168.150.130 -p 6000 -U repuser
2283301/2283301 kB (100%), 2/2 tablespaces
[postgres@pg93s backup]$ ls -lh
total 296M
-rw------- 1 postgres postgres 980K Oct 6 18:20 16406.tar.gz
-rw------- 1 postgres postgres 295M Oct 6 18:21 base.tar.gz
-rw------- 1 postgres postgres 17K Oct 6 18:21 pg_wal.tar.gz
--备份过程中查看主库进程
--备前
postgres@postgres:select pid,usename,application_name,sync_state from pg_stat_replication;
pid | usename | application_name | sync_state
------+---------+------------------+------------
3696 | repuser | walreceiver | async
(1 row)
--备中
postgres@postgres:select pid,usename,application_name,sync_state from pg_stat_replication;
pid | usename | application_name | sync_state
------+---------+------------------+------------
3696 | repuser | walreceiver | async
4056 | repuser | pg_basebackup | async
4061 | repuser | pg_basebackup | async
(3 rows)