MGR node in ERROR state caused by a network failure
MySQL version: 8.0.18-commercial MySQL Enterprise Server
OS: RHEL 6.3
MGR
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 93801cf6-2dcf-11ea-9190-fa163ee804c9 | mysql08 | 3306 | ERROR | | 8.0.18 |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
1 row in set (0.00 sec)
The error log shows the following:
2020-11-27T10:18:37.243674+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
2020-11-27T10:20:12.108719+08:00 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-11-27T10:20:12.108792+08:00 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
Roughly translated: I can no longer reach the other members of the group, I'm done for, I've put myself into read-only mode, so I'll just sit here in ERROR...
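To confirm the read-only state that message MY-011712 reports, you can query the server's read-only flags. This is a minimal sketch using the standard `read_only` and `super_read_only` system variables; the original post does not show these values, so treat the expected result as an assumption to verify on your own node.

```sql
-- Check whether the expelled member has switched itself to read-only mode.
-- After MY-011712 both flags are expected to be 1, but verify rather than assume.
SELECT @@global.read_only       AS read_only,
       @@global.super_read_only AS super_read_only;
```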
Normally, restarting Group Replication on the node is enough to fix it:
mysql> STOP GROUP_REPLICATION;
Query OK, 0 rows affected (1.00 sec)
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (4.22 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 93801cf6-2dcf-11ea-9190-fa163ee804c9 | mysql08 | 3306 | RECOVERING | PRIMARY | 8.0.18 |
| group_replication_applier | e277e47a-e9a6-11e9-ae34-fa163e137c0e | mysql02 | 3306 | ONLINE | PRIMARY | 8.0.18 |
| group_replication_applier | e4ea91a6-e9a6-11e9-b7c0-fa163eb4f8e8 | mysql04 | 3306 | ONLINE | PRIMARY | 8.0.18 |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
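While the member sits in RECOVERING, the state of the distributed recovery channel can be watched from Performance Schema. The query below is a sketch, not part of the original session; it assumes the default recovery channel name `group_replication_recovery`.

```sql
-- Inspect the distributed recovery channel on the joining member.
-- SERVICE_STATE shows whether the channel is connected to a donor;
-- LAST_ERROR_* columns show the most recent connection error, if any.
SELECT CHANNEL_NAME,
       SERVICE_STATE,
       LAST_ERROR_NUMBER,
       LAST_ERROR_MESSAGE
  FROM performance_schema.replication_connection_status
 WHERE CHANNEL_NAME = 'group_replication_recovery';
```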
A few minutes later:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 93801cf6-2dcf-11ea-9190-fa163ee804c9 | mysql08 | 3306 | ONLINE | PRIMARY | 8.0.18 |
| group_replication_applier | e277e47a-e9a6-11e9-ae34-fa163e137c0e | mysql02 | 3306 | ONLINE | PRIMARY | 8.0.18 |
| group_replication_applier | e4ea91a6-e9a6-11e9-b7c0-fa163eb4f8e8 | mysql04 | 3306 | ONLINE | PRIMARY | 8.0.18 |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
Awesome... it just recovered, like that.
However, MGR seems to have a retry limit, group_replication_recovery_retry_count. From the MySQL manual:
Command-Line Format | --group-replication-recovery-retry-count=# |
---|---|
System Variable | group_replication_recovery_retry_count |
Scope | Global |
Dynamic | Yes |
SET_VAR Hint Applies | No |
Type | Integer |
Default Value | 10 |
Minimum Value | 0 |
Maximum Value | 31536000 |
The value of this system variable can be changed while Group Replication is running, but the change only takes effect after you stop and restart Group Replication on the group member.
group_replication_recovery_retry_count specifies the number of times that the member that is joining tries to connect to the available donors for distributed recovery before giving up.
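Hmm... if even the default of 10 retries was not enough, this network must be really bad. Let's raise it and see. (Going by the description above, though, this counter covers connection attempts to donors during distributed recovery, so it may not be what decides the expulsion itself.)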
mysql> set global group_replication_recovery_retry_count= 100;
Query OK, 0 rows affected (0.00 sec)
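As the manual excerpt above says, the new value only takes effect after Group Replication is stopped and restarted on this member. A possible follow-up, sketched here and not taken from the original session: verify the value, optionally persist it across server restarts with `SET PERSIST` (the value 100 simply repeats the one used above), then restart Group Replication when convenient.

```sql
-- Confirm the value that was just set.
SHOW GLOBAL VARIABLES LIKE 'group_replication_recovery_retry_count';

-- Optional: keep the setting across mysqld restarts (written to mysqld-auto.cnf).
SET PERSIST group_replication_recovery_retry_count = 100;

-- The change only applies after Group Replication is restarted on this member.
-- Note: the member leaves the group briefly, so pick a suitable moment.
STOP GROUP_REPLICATION;
START GROUP_REPLICATION;
```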
Also,
group_replication_recovery_reconnect_interval seems to be the interval between retries, 60s by default. Leaving it alone for now...
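To see the current recovery settings side by side, the variables can be listed in one go. The last two variables below are not mentioned in the original post; they are included as an assumption that they may be relevant here, since the log says the member was expelled. `group_replication_autorejoin_tries` (available since 8.0.16) and `group_replication_member_expel_timeout` govern automatic rejoin and expulsion timing. Check the values on your own server rather than relying on remembered defaults.

```sql
-- Recovery-related settings discussed above.
SHOW GLOBAL VARIABLES LIKE 'group_replication_recovery_retry_count';
SHOW GLOBAL VARIABLES LIKE 'group_replication_recovery_reconnect_interval';

-- Not from the original post: settings that control expulsion and auto-rejoin.
SHOW GLOBAL VARIABLES LIKE 'group_replication_autorejoin_tries';
SHOW GLOBAL VARIABLES LIKE 'group_replication_member_expel_timeout';
```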