References:
http://www.clusterdb.com/mysql/replication-and-auto-failover-made-easy-with-mysql-utilities
Environment:
master: demoenv-trial-1
slaves: demoenv-trial-2 demoenv-trial-3
1. Install Percona Server on all servers:
$ sudo yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ sudo yum install Percona-Server-shared-compat
$ sudo yum install Percona-Server-server-56
$ sudo yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
$ sudo yum install mysql-utilities
2. Configure /etc/my.cnf on all servers:
Note: make sure server-id is different on every server and report-host matches each server's own hostname.
$ sudo vim /etc/my.cnf
[mysqld]
# basic setting
datadir = /opt/mysql/data
tmpdir = /opt/mysql/tmp
socket = /opt/mysql/run/mysqld.sock
port = 3306
pid-file = /opt/mysql/run/mysqld.pid

# innodb setting
default-storage-engine = INNODB
innodb_file_per_table = 1
log-bin = /opt/mysql/binlogs/bin-log-mysqld
log-bin-index = /opt/mysql/binlogs/bin-log-mysqld.index
innodb_data_home_dir = /opt/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /opt/mysql/data
binlog-do-db = testdb

# server id
server-id=1

# gtids setting
binlog-format = ROW
log-slave-updates = true
gtid-mode = on
enforce-gtid-consistency = true
report-host = demoenv-trial-1
report-port = 3306
master-info-repository = TABLE
relay-log-info-repository = TABLE
sync-master-info = 1

# other settings
[mysqld_safe]
log-error = /opt/mysql/log/mysqld.log
pid-file = /opt/mysql/run/mysqld.pid
open-files-limit = 8192

[mysqlhotcopy]
interactive-timeout
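Only server-id and report-host differ between the three servers, so those two lines could be generated instead of edited by hand. The following is just a sketch and not part of the original setup: the function name per_host_settings is made up, and taking the server-id from the numeric suffix of the hostname is an assumption that happens to fit the hostnames used here.

```shell
#!/bin/sh
# Hypothetical helper: print the two my.cnf lines that must differ per host.
# Assumption: hostnames end in "-<number>" (e.g. demoenv-trial-2) and that
# number is used as the server-id.
per_host_settings() {
    host="$1"
    id="${host##*-}"            # text after the last "-"
    case "$id" in
        ''|*[!0-9]*)
            echo "cannot derive server-id from '$host'" >&2
            return 1
            ;;
    esac
    printf 'server-id = %s\nreport-host = %s\n' "$id" "$host"
}

# Example: per_host_settings demoenv-trial-2
```

The output can be pasted over the corresponding lines in /etc/my.cnf on each server; any other unique numbering scheme for server-id would work just as well.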
3. Create the required directories on all servers:
$ sudo mkdir -p /opt/mysql/{data,tmp,run,binlogs,log}
$ sudo chown mysql:mysql /opt/mysql/{data,tmp,run,binlogs,log}
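Before initializing the database it may be worth confirming that every directory referenced in my.cnf actually exists. A minimal sketch follows; the function name missing_mysql_dirs is hypothetical and the directory list simply mirrors the mkdir command above.

```shell
#!/bin/sh
# Hypothetical helper: print any of the required subdirectories that are
# missing under the given base directory; succeed only if none are missing.
missing_mysql_dirs() {
    base="${1:-/opt/mysql}"
    rc=0
    for d in data tmp run binlogs log; do
        if [ ! -d "$base/$d" ]; then
            echo "$base/$d"
            rc=1
        fi
    done
    return "$rc"
}

# Example: missing_mysql_dirs /opt/mysql || echo "create the directories listed above first"
```

This only checks existence; ownership by the mysql user still has to be set with chown as shown in this step.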
4. Initialize the database on all servers:
$ sudo -i
# su - mysql
$ mysql_install_db --user=mysql --datadir=/opt/mysql/data/
$ exit
# exit
$ sudo /etc/init.d/mysql start
5. Create the privileged user root@'%' on all servers, so that mysqlreplicate can be used to configure replication:
$ mysql -uroot
mysql> grant all on *.* to root@'%' identified by 'pass' with grant option;
mysql> quit;
6. Create the replication user on all servers:
$ mysql -uroot
mysql> grant replication slave on *.* to 'rpl'@'%' identified by 'rpl';
mysql> quit;
7. Configure master-slave replication; this can be run from any one of the servers:
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot
mysql> use mysql;
mysql> drop user root@'demoenv-trial-1';
mysql> quit;

[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-1':3306 --slave=root:pass@'demoenv-trial-2':3306 --rpl-user=rpl:rpl
# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-2: ... connected.
# Checking for binary logging on master...
# set up replication...
# ...done.

[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-1':3306 --slave=root:pass@'demoenv-trial-3':3306 --rpl-user=rpl:rpl
# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-3: ... connected.
# Checking for binary logging on master...
# set up replication...
# ...done.
8. Verify that data is replicated correctly
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot
mysql> create database testdb;
mysql> quit;

[dong.guo@demoenv-trial-1 ~]$ mysql -uroot -ppass -h'demoenv-trial-2' -e 'show databases;'
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
| testdb             |
+--------------------+

[dong.guo@demoenv-trial-1 ~]$ mysql -uroot -ppass -h'demoenv-trial-3' -e 'show databases;'
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
| testdb             |
+--------------------+

[dong.guo@demoenv-trial-1 ~]$ mysqlrplshow --master=rpl:rpl@'demoenv-trial-1':3306 --discover-slaves-login=root:pass;
# master on demoenv-trial-1: ... connected.
# Finding slaves for master: demoenv-trial-1:3306

# Replication Topology Graph
demoenv-trial-1:3306 (MASTER)
   |
   +--- demoenv-trial-2:3306 - (SLAVE)
   |
   +--- demoenv-trial-3:3306 - (SLAVE)

[dong.guo@demoenv-trial-1 ~]$ mysqlrplcheck --master=root:pass@'demoenv-trial-1' --slave=root:pass@'demoenv-trial-2'
# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-2: ... connected.
Test Description                              Status
----------------------------------------------------
Checking for binary logging on master         [pass]
Are there binlog exceptions?                  [WARN]

+--------+--------+-----------+
| server | do_db  | ignore_db |
+--------+--------+-----------+
| master | testdb |           |
+--------+--------+-----------+

Replication user exists?                      [pass]
Checking server_id values                     [pass]
Checking server_uuid values                   [pass]
Is slave connected to master?                 [pass]
Check master information file                 [pass]
Checking InnoDB compatibility                 [pass]
Checking storage engines compatibility        [pass]
Checking lower_case_table_names setting       [pass]
Checking slave delay (seconds behind master)  [pass]
# ...done.
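Besides comparing database lists, the state of the two replication threads on a slave can be read from SHOW SLAVE STATUS\G. The sketch below is an addition for illustration only: the function name slave_threads_ok is made up, and it only looks at the Slave_IO_Running and Slave_SQL_Running fields of that output.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `SHOW SLAVE STATUS\G` on stdin and
# succeed only when both replication threads report "Yes".
slave_threads_ok() {
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }     # field names are right-aligned
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example (assumes the root@'%' account from step 5):
#   mysql -uroot -ppass -h demoenv-trial-2 -e 'SHOW SLAVE STATUS\G' | slave_threads_ok \
#       && echo "replication threads running"
```

A check like this complements mysqlrplcheck rather than replacing it, since it says nothing about binlog filters, server IDs or slave delay.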
9. Shut down the master to test auto-failover
[dong.guo@demoenv-trial-1 ~]$ mysqlfailover --master=root:pass@'demoenv-trial-1':3306 --discover-slaves-login=root:pass --rediscover
# Discovering slaves for master at demoenv-trial-1:3306
# Discovering slave at demoenv-trial-2:3306
# Found slave: demoenv-trial-2:3306
# Discovering slave at demoenv-trial-3:3306
# Found slave: demoenv-trial-3:3306
# Checking privileges.
# Discovering slaves for master at demoenv-trial-1:3306

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 06:58:52 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB
bin-log-mysqld.00000  299       testdb

GTID Executed Set
92df196b-3906-11e3-b6b6-000c290d14d7:1

Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-1 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-2 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
The terminal that ran the command is now occupied by the failover console, so open another terminal to shut down the master:
[dong.guo@demoenv-trial-1 ~]$ mysqladmin -uroot -ppass shutdown
Then, back in the terminal running the failover console, you can see:
Failover starting in 'auto' mode...
# Candidate slave demoenv-trial-2:3306 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Disconnecting new master as slave.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
# Discovering slaves for master at demoenv-trial-2:3306

Failover console will restart in 5 seconds.

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 07:01:25 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB
bin-log-mysqld.00000  299       testdb

GTID Executed Set
92df196b-3906-11e3-b6b6-000c290d14d7:1

Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-2 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
The server 'demoenv-trial-2' has become the new master.
10. Create some data to test replication from the new master
[dong.guo@demoenv-trial-2 ~]$ mysql -uroot
mysql> use testdb;
Database changed
mysql> CREATE TABLE `hostgroup` (
    ->   `hostgroup_id` tinyint(4) NOT NULL AUTO_INCREMENT,
    ->   `hostgroup_name` char(20) DEFAULT NULL,
    ->   `hostgroup_next` tinyint(4) NOT NULL,
    ->   `colo_name` char(4) NOT NULL,
    ->   PRIMARY KEY (`hostgroup_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Query OK, 0 rows affected (0.10 sec)
mysql> quit;

[dong.guo@demoenv-trial-2 binlogs]$ mysql -uroot -ppass -h'demoenv-trial-3' testdb -e 'show tables;'
+------------------+
| Tables_in_testdb |
+------------------+
| hostgroup        |
+------------------+
11. Try to bring the old master back
First, start the old master again and attach it to the new master as a slave:
[dong.guo@demoenv-trial-1 ~]$ sudo /etc/init.d/mysql start
[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-2':3306 --slave=root:pass@'demoenv-trial-1':3306
# master on demoenv-trial-2: ... connected.
# slave on demoenv-trial-1: ... connected.
# Checking for binary logging on master...
# setting up replication...
# ...done.
In the terminal running the failover console, you can see:
Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-2 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-1 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
Then switch the old master back into the master role:
[dong.guo@demoenv-trial-1 ~]$ mysqlrpladmin --master=root:pass@'demoenv-trial-2':3306 --new-master=root:pass@'demoenv-trial-1':3306 --demote-master --discover-slaves-login=root:pass switchover
# Discovering slaves for master at demoenv-trial-2:3306
# Discovering slave at demoenv-trial-1:3306
# Found slave: demoenv-trial-1:3306
# Discovering slave at demoenv-trial-3:3306
# Found slave: demoenv-trial-3:3306
# Checking privileges.
# Performing switchover from master at demoenv-trial-2:3306 to slave at demoenv-trial-1:3306.
# Checking candidate slave prerequisites.
# Checking slaves configuration to master.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-1 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-2 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+
# ...done.
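After a failover or switchover it is easy to lose track of which server currently holds the master role. As a rough sketch, the health table printed above can be parsed to answer that question; the function name current_master is hypothetical, and the parsing assumes the exact column order shown in this article (host, port, role, ...).

```shell
#!/bin/sh
# Hypothetical helper: read a replication health table (as printed by
# mysqlrpladmin or mysqlfailover) on stdin and print host:port of the
# row whose role is MASTER. Fails if no such row is found.
current_master() {
    awk -F'|' '
        NF >= 5 {
            host = $2; port = $3; role = $4
            gsub(/[ \t]/, "", host)
            gsub(/[ \t]/, "", port)
            gsub(/[ \t]/, "", role)
            if (role == "MASTER") { print host ":" port; found = 1 }
        }
        END { exit !found }
    '
}

# Example (the health command of mysqlrpladmin prints the same table):
#   mysqlrpladmin --master=root:pass@'demoenv-trial-1':3306 \
#       --discover-slaves-login=root:pass health | current_master
```

Note that the tool still has to be pointed at a server it can reach as master, so this is more of a sanity check than a discovery mechanism.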
In the terminal running the failover console, you can see:
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 07:30:07 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB
bin-log-mysqld.00000  710       testdb

GTID Executed Set
8a58172b-1efd-11e3-8cf1-000c2950fe0c:1 [...]

Replication Health Status
+-----------------+------+--------+-------+-----------+-----------------------------------+
| host            | port | role   | state | gtid_mode | health                            |
+-----------------+------+--------+-------+-----------+-----------------------------------+
| demoenv-trial-2 | 3306 | MASTER | UP    | ON        | OK                                |
| demoenv-trial-1 | 3306 | SLAVE  | UP    | WARN      | Slave is not connected to master. |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | WARN      | Slave is not connected to master. |
+-----------------+------+--------+-------+-----------+-----------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
As you can see, the failover console still treats demoenv-trial-2 as the master and has stopped working correctly after the old master was restored manually.
Press Q, then restart auto-failover:
[dong.guo@demoenv-trial-1 ~]$ mysqlfailover --master=root:pass@'demoenv-trial-1':3306 --discover-slaves-login=root:pass --rediscover
Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-1 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-2 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
It works again.
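12. A brief summary of auto-failover:
It is included in the mysql-utilities package;
It requires MySQL 5.6 or later, because it depends on GTID support;
When the current master dies, it can automatically pick a slave to become the new master;
However, it cannot automatically restore the old master, and it stops working once the old master has been restored manually, so it has to be restarted against the current master.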
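Because the monitor has to be restarted by hand after every manual switchover, it can help to keep the restart command in one place. The following is only a sketch: the function name failover_cmd is made up, and it merely prints the same mysqlfailover invocation used in step 9 for whichever host is passed in. The root:pass login is the demo credential from this article.

```shell
#!/bin/sh
# Hypothetical helper: print the mysqlfailover command for a given master,
# using the same options as step 9. The default login is the demo value
# from this article; replace it in any real setup.
failover_cmd() {
    master="$1"
    login="${2:-root:pass}"
    printf "mysqlfailover --master=%s@'%s':3306 --discover-slaves-login=%s --rediscover\n" \
        "$login" "$master" "$login"
}

# Example: failover_cmd demoenv-trial-1
```

Printing the command instead of running it keeps the helper harmless; the output can be reviewed and then pasted into the terminal that will host the console.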
Reposted from http://heylinux.com/archives/2776.html
