从零搭建MySQL InnoDB Cluster

发布时间:2020-05-13 浏览次数:373

MySQL InnoDB Cluster 是最新GA的MySQL高可用方案,利用MySQL Group Replication和MySQL Shell、MySQL Router可以轻松搭建强壮的高可用方案。

MySQL Shell 是新的mysql 客户端工具支持x protocol和mysql protocol,具备JavaScript和python可编程能力,作为搭建InnoDB Cluster管理工具。

MySQL Router 是访问路由转发中间件,提供应用程序访问的failover能力。

MySQL Group Replication 是最新GA的同步复制方式,具有以下特点:


  • 支持单主和多主模式

  • 基于Paxos算法,实现数据复制的一致性

  • 插件化设计,支持插件检测,新增节点小于集群当前节点主版本号,拒绝加入集群,大于则加入,但无法作为主节点

  • 没有第三方组件依赖

  • 支持全链路SSL通讯

  • 支持IP白名单

  • 不依赖网络多播

如何搭建MySQL InnoDB Cluster



安装mysql server和mysql-shell

rpm -ivh

yum install -y mysql-community-server mysql-shell

启动mysql服务,并修改root密码

[root@10-186-23-95 ~]# /etc/init.d/mysqld start

Initializing MySQL database: [ OK ]

Installing validate password plugin: [ OK ]

Starting mysqld: [ OK ]

[root@10-186-23-95 ~]# mysql -p$(awk '/temporary password/{print $NF}' /var/log/mysqld.log) -e "set password='Actionsky@888'"

如果系统python版本低于2.7,需要升级至2.7,mysqlprovsion命令需要

sh -c 'wget -qO-

http://people.redhat.com/bkabrda/scl_python27.repo >>

/etc/yum.repos.d/scl.repo'

yum install -y python27

scl enable python27 bash

python --version

配置Group Replication模式,dba.configureLocalInstance 会设置必要配置参数并持久化配置

[root@10-186-23-95 ~]# mysqlsh

Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.

mysql-js> dba.configureLocalInstance('root@localhost:3306')

Please provide the password for 'root@localhost:3306':

Detecting the configuration file...

Found configuration file at standard location: /etc/my.cnf

Do you want to modify this file? [Y|n]:

Validating instance...

The configuration has been updated but it is required to restart the server.

{

"config_errors": [

{

"action": "restart",

"current": "OFF",

"option": "enforce_gtid_consistency",

"required": "ON"

},

{

"action": "restart",

"current": "OFF",

"option": "gtid_mode",

"required": "ON"

},

{

"action": "restart",

"current": "0",

"option": "log_bin",

"required": "1"

},

{

"action": "restart",

"current": "0",

"option": "log_slave_updates",

"required": "ON"

},

{

"action": "restart",

"current": "FILE",

"option": "master_info_repository",

"required": "TABLE"

},

{

"action": "restart",

"current": "FILE",

"option": "relay_log_info_repository",

"required": "TABLE"

},

{

"action": "restart",

"current": "OFF",

"option": "transaction_write_set_extraction",

"required": "XXHASH64"

}

],

"errors": [],

"restart_required": true,

"status": "error"

}

mysql-js> \q

重启mysql检查配置是OK

[root@10-186-23-95 ~]# service mysqld restart

Stopping mysqld: [ OK ]

Starting mysqld: [ OK ]

[root@10-186-23-95 ~]# mysqlsh

Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.

mysql-js> dba.checkInstanceConfiguration('root@localhost:3306')

Please provide the password for 'root@localhost:3306':

Validating instance...

The instance 'localhost:3306' is valid for Cluster usage

{

"status": "ok"

}

注意:以上操作在每个数据库节点都需要做一遍。


创建InnoDB Cluster,执行dba.createCluster('mycluster')会包含以下操作

  • 在连接的实例上创建mysql.mysql_innodb_cluster_metadata存储元数据信息

  • 验证配置信息

  • 将此节点注册成seed节点

  • 创建必要的管理账号

  • 启动 Group Replication

mysql-js> \c root@10.186.23.95 #必须连接某个数据节点

Creating a Session to 'root@10.186.23.95'

Enter password:

Classic Session successfully established. No default schema selected.

mysql-js> dba.createCluster('mycluster')

A new InnoDB cluster will be created on instance 'root@10.186.23.95:3306'.


Creating InnoDB cluster 'mycluster' on 'root@10.186.23.95:3306'...

Adding Seed Instance...

Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.

At least 3 instances are needed for the cluster to be able to withstand up to

one server failure.


<Cluster:mycluster>

mysql-js> var cluster=dba.getCluster('mycluster')

mysql-js> cluster.addInstance('root@10.186.23.96:3306') #添加其他节点

A new instance will be added to the InnoDB cluster. Depending on the amount of

data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@10.186.23.96:3306':

Adding instance to the cluster ...

The instance 'root@10.186.23.96:3306' was successfully added to the cluster.

mysql-js> cluster.addInstance('root@10.186.23.94:3306')

A new instance will be added to the InnoDB cluster. Depending on the amount of

data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@10.186.23.94:3306':

Adding instance to the cluster ...

The instance 'root@10.186.23.94:3306' was successfully added to the cluster.

mysql-js> cluster.status()

{

"clusterName": "mycluster",

"defaultReplicaSet": {

"name": "default",

"primary": "10.186.23.95:3306",

"status": "OK",

"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",

"topology": {

"10.186.23.94:3306": {

"address": "10.186.23.94:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "ONLINE"

},

"10.186.23.95:3306": {

"address": "10.186.23.95:3306",

"mode": "R/W",

"readReplicas": {},

"role": "HA",

"status": "ONLINE"

},

"10.186.23.96:3306": {

"address": "10.186.23.96:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "ONLINE"

}

}

}

}

安装配置MySQL Router

[root@10-186-23-97 ~]# yum install -y mysql-router

[root@10-186-23-97 ~]# mysqlrouter --bootstrap root@10.186.23.94:3306 --user=mysqlrouter

Please enter MySQL password for root:

Bootstrapping system MySQL Router instance...

MySQL Router has now been configured for the InnoDB cluster 'mycluster'.

The following connection information can be used to connect to the cluster.

Classic MySQL protocol connections to cluster 'mycluster':

- Read/Write Connections: localhost:6446

- Read/Only Connections: localhost:6447

X protocol connections to cluster 'mycluster':

- Read/Write Connections: localhost:64460

- Read/Only Connections: localhost:64470

[root@10-186-23-97 ~]# mysqlsh --uri root@localhost:6446

Creating a Session to 'root@localhost:6446'

Enter password:

Classic Session successfully established. No default schema selected.

Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.

mysql-js> \sql

Switching to SQL mode... Commands end with ;

mysql-sql> select @@hostname;

+----------------------------+

| @@hostname |

+----------------------------+

10-186-23-94.actionsky.com |

+----------------------------+

1 row in set (0.00 sec)


可能遇到的问题




  • 节点gtid 异常

如果节点在加入集群前,执行了写操作,加入集群时会报错

Please provide the password for 'root@10.186.23.96:3306':

Adding instance to the cluster ...

Cluster.addInstance: WARNING: Not running locally on the server and can not access its error log.

ERROR:

Group Replication join failed.

ERROR: Error joining instance to cluster: '10.186.23.96@3306' - Query failed. 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.. Query: START group_replication (RuntimeError)

节点的错误日志中会出现

2017-05-09T06:49:57.301003Z 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 605da5eb-347d-11e7-b68b-bef8d5ac5be4:1,

cf2fe6ca-3460-11e7-aab3-bef8d5ac5be4:1-7 > Group transactions: 8399a91c-3483-11e7-b68b-bef8d5ac5be4:1-5,

cf2fe6ca-3460-11e7-aab3-bef8d5ac5be4:1-15'

解决办法,登录到此节点执行reset master


  • python版本异常

第一次安装时没有检查python版本,配置实例时出错,但未输出具体错误。

mysql-py> dba.configure_local_instance('root@localhost:3306')

Please provide the password for 'root@localhost:3306':

Detecting the configuration file...

Found configuration file at standard location: /etc/my.cnf

Do you want to modify this file? [Y|n]: Y

Validating instance...

The issues above can be fixed dynamically to get the server ready for InnoDB Cluster.

{

"errors": [],

"restart_required": false,

"status": "error"

}

后来通过查看mysqlsh的日志,发现错误原因

less ~/.mysqlsh/mysqlsh.log

...

2017-05-09 04:24:27: Error: DBA: mysqlprovision exited with error code (1) : ERROR: The __main__ gadget requires Python version 2.7.0 or higher and lower than 4.0.0. The version of Python detected was 2.6.6. You may need to install or redirect the execution of this utility to an environment that includes a compatible Python version.

...


  • 重启节点后需要手动重新加入集群

mysql-js> var cluster = dba.getCluster()

mysql-js> cluster.status()

{

...

"10.186.23.96:3306": {

"address": "10.186.23.96:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "(MISSING)"

}

}

}

}

mysql-js> cluster.rejoinInstance('root@10.186.23.96:3306')

mysql-js> cluster.status()

{

...

"10.186.23.96:3306": {

"address": "10.186.23.96:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "ONLINE"

}

}

}

}


  • 集群所有节点发生重启

当集群的所有节点都offline,直接获取集群信息失败,如何重新恢复集群

mysql-js> var cluster=dba.getCluster('mycluster')

Dba.getCluster: This function is not available through a session to a standalone instance (RuntimeError)

执行rebootClusterFromCompleteOutage命令,可恢复集群

mysql-js> dba.rebootClusterFromCompleteOutage('mycluster')

Reconfiguring the cluster 'mycluster' from complete outage...

The instance '10.186.23.96:3306' was part of the cluster configuration.

Would you like to rejoin it to the cluster? [y|N]: y

The instance '10.186.23.94:3306' was part of the cluster configuration.

Would you like to rejoin it to the cluster? [y|N]: y

The cluster was successfully rebooted.


  • 脑裂场景

当集群中有部分节点出现UNREACHABLE状态,此时集群无法做出决策,,会出现以下局面,此时只剩下一个活跃节点,此节点只能提供查询,无法写入,执行写入操作会hang住。

mysql-js> cluster.status()

{

"clusterName": "mycluster",

"defaultReplicaSet": {

"name": "default",

"primary": "10.186.23.94:3306",

"status": "NO_QUORUM",

"statusText": "Cluster has no quorum as visible from '10.186.23.94:3306' and cannot process write transactions. 2 members are not active",

"topology": {

"10.186.23.94:3306": {

"address": "10.186.23.94:3306",

"mode": "R/W",

"readReplicas": {},

"role": "HA",

"status": "ONLINE"

},

"10.186.23.95:3306": {

"address": "10.186.23.95:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "UNREACHABLE"

},

"10.186.23.96:3306": {

"address": "10.186.23.96:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "(MISSING)"

}

}

}

}

修复这种状态,需要执行forceQuorumUsingPartitionOf指定当前活跃节点(如果是多个则选择primary node),此时活跃节点可以提供读写操作,然后将其他节点加入此集群。

mysql-js> cluster.forceQuorumUsingPartitionOf('root@10.186.23.94:3306')

Restoring replicaset 'default' from loss of quorum, by using the partition composed of [10.186.23.94:3306]

Please provide the password for 'root@10.186.23.94:3306':

Restoring the InnoDB cluster ...

The InnoDB cluster was successfully restored using the partition from the instance 'root@10.186.23.94:3306'.

WARNING: To avoid a split-brain scenario, ensure that all other members of the replicaset are removed or joined back to the group that was restored.

mysql-js> cluster.status()

{

"clusterName": "mycluster",

"defaultReplicaSet": {

"name": "default",

"primary": "10.186.23.94:3306",

"status": "OK_NO_TOLERANCE",

"statusText": "Cluster is NOT tolerant to any failures. 2 members are not active",

"topology": {

"10.186.23.94:3306": {

"address": "10.186.23.94:3306",

"mode": "R/W",

"readReplicas": {},

"role": "HA",

"status": "ONLINE"

},

"10.186.23.95:3306": {

"address": "10.186.23.95:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "(MISSING)"

},

"10.186.23.96:3306": {

"address": "10.186.23.96:3306",

"mode": "R/O",

"readReplicas": {},

"role": "HA",

"status": "(MISSING)"

}

}

}

}

mysql-js> cluster.rejoinInstance('root@10.186.23.95:3306')

mysql-js> cluster.rejoinInstance('root@10.186.23.96:3306')


节点有哪状态

  • ONLINE - 节点状态正常。

  • OFFLINE - 实例在运行,但没有加入任何Cluster。

  • RECOVERING - 实例已加入Cluster,正在同步数据。

  • ERROR - 同步数据发生异常。

  • UNREACHABLE - 与其他节点通讯中断,可能是网络问题,可能是节点crash。

  • MISSING 节点已加入集群,但未启动group replication

集群有哪些状态

  • OK – 所有节点处于online状态,有冗余节点。

  • OK_PARTIAL – 有节点不可用,但仍有冗余节点。

  • OK_NO_TOLERANCE – 有足够的online节点,但没有冗余,例如:两个节点的Cluster,其中一个挂了,集群就不可用了。

  • NO_QUORUM – 有节点处于online状态,但达不到法定节点数,此状态下Cluster无法写入,只能读取。

  • UNKNOWN – 不是online或recovering状态,尝试连接其他实例查看状态。

  • UNAVAILABLE – 组内节点全是offline状态,但实例在运行,可能实例刚重启还没加入Cluster。

最后附一张集群状态图,restoreFromCompleteOutage在GA版本改成rebootClusterFromCompleteOutage。

从零搭建MySQL InnoDB Cluster-爱可生

上一篇: MySQL 客户端连不上(1045 错误)原因全解析

下一篇: tcmalloc解决MySQLd实例引发的cpu过高问题

产品试用 产品试用
400-820-6580 免费电话