Redis + Keepalived: Implementing Redis High Availability

Author: 默北 | Category: Redis

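At present, Redis has no official HA solution comparable to MySQL Proxy or Oracle RAC.
The author of Redis has a project called Redis Sentinel (http://redis.io/topics/sentinel), which is said to offer three main features: monitoring, notification, and automatic failover. It looks very promising, but unfortunately it is unlikely to be finished in the short term.
The official Redis Cluster solution is also still under development; the 3.0.0 beta release already supports the Redis Cluster feature.
So how to fail over automatically when something breaks remains a problem to be solved.
From searching online, some people suggest using HAProxy or Keepalived. For pure failover rather than load balancing, Keepalived is the lighter option, since it only moves a virtual IP between hosts over VRRP instead of proxying every connection as HAProxy does, so I decided to go with Keepalived.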

Environment:
Master: 10.6.1.143
Slave: 10.6.1.144
Virtual IP Address (VIP): 10.6.1.200

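Design:
When both Master and Slave are running normally, Master serves requests and Slave is on standby;
When Master goes down and Slave is healthy, Slave takes over the service and turns off replication;
When Master comes back, it first syncs data from Slave, then turns off replication and resumes the Master role; meanwhile Slave waits for Master to finish syncing and then returns to the Slave role.
The cycle then repeats.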

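Note that this approach requires persistence (saving data to local disk) to be enabled on both Master and Slave. Otherwise, during the automatic switching between the two, a node without persistence comes back empty and wipes the other node's data when that node syncs from it, resulting in total data loss.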
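Because the whole scheme depends on persistence being on for both nodes, it may be worth checking that before wiring up failover. The following is a minimal sketch, not part of the original setup: the function name `persistence_enabled` is hypothetical, and the `redis-cli` path is assumed to match the one used in the scripts of this article. It treats a node as persistent if AOF is on or at least one RDB `save` rule is configured.

```shell
#!/bin/bash
# Sketch: report whether a Redis node has any persistence enabled.
# The redis-cli path is an assumption taken from the article's scripts.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# persistence_enabled HOST (hypothetical helper)
# Returns 0 if AOF is on or an RDB save rule exists, 1 otherwise.
persistence_enabled() {
  local host="$1" aof save
  # When piped, CONFIG GET prints the parameter name on line 1
  # and its value on line 2, so take the second line.
  aof=$("$REDISCLI" -h "$host" CONFIG GET appendonly | sed -n '2p' | tr -d '\r')
  save=$("$REDISCLI" -h "$host" CONFIG GET save | sed -n '2p' | tr -d '\r')
  [ "$aof" = "yes" ] || [ -n "$save" ]
}
```

One way to use it would be to run `persistence_enabled 10.6.1.143` and `persistence_enabled 10.6.1.144` and refuse to start Keepalived if either returns non-zero.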

The concrete steps are as follows:
Install Keepalived on both Master and Slave:
$ sudo apt-get install keepalived

First, create the following configuration file on Master:

$ sudo vim /etc/keepalived/keepalived.conf
vrrp_script chk_redis {
        script "/etc/keepalived/scripts/redis_check.sh"   ### health-check script
        interval 2                                        ### check interval in seconds
}
vrrp_instance VI_1 {
        state MASTER                            ### start as MASTER
        interface eth0                          ### network interface to monitor
        virtual_router_id 51
        priority 101                            ### priority
        authentication {
                auth_type PASS                  ### authentication type
                auth_pass redis                 ### password
        }
        track_script {
                chk_redis                       ### run the chk_redis script defined above
        }
        virtual_ipaddress {
                10.6.1.200                      ### VIP
        }
        notify_master /etc/keepalived/scripts/redis_master.sh
        notify_backup /etc/keepalived/scripts/redis_backup.sh
        notify_fault  /etc/keepalived/scripts/redis_fault.sh
        notify_stop   /etc/keepalived/scripts/redis_stop.sh
}

Then create the following configuration file on Slave:

$ sudo vim /etc/keepalived/keepalived.conf
vrrp_script chk_redis {
        script "/etc/keepalived/scripts/redis_check.sh"   ### health-check script
        interval 2                                        ### check interval in seconds
}
vrrp_instance VI_1 {
        state BACKUP                            ### start as BACKUP
        interface eth0                          ### network interface to monitor
        virtual_router_id 51
        priority 100                            ### lower than the priority on MASTER
        authentication {
                auth_type PASS
                auth_pass redis                 ### same password as on MASTER
        }
        track_script {
                chk_redis                       ### run the chk_redis script defined above
        }
        virtual_ipaddress {
                10.6.1.200                      ### VIP
        }
        notify_master /etc/keepalived/scripts/redis_master.sh
        notify_backup /etc/keepalived/scripts/redis_backup.sh
        notify_fault  /etc/keepalived/scripts/redis_fault.sh
        notify_stop   /etc/keepalived/scripts/redis_stop.sh
}
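The two configuration files only work together if they share the same `virtual_router_id` and `auth_pass` while the Master carries the higher `priority`. As a rough sanity check, one could compare the two files with a small script like the sketch below. The helper names `conf_value` and `check_pair` are hypothetical, and the parsing assumes one `key value` pair per line, as in the files above. The fallback of 100 for a missing `priority` is an assumption about Keepalived's default.

```shell
#!/bin/bash
# conf_value FILE KEY (hypothetical helper)
# Print the first value found on a "KEY value" line of a keepalived config.
conf_value() {
  awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_pair MASTER_CONF BACKUP_CONF (hypothetical helper)
# Returns 0 if the pair looks consistent, 1 otherwise.
check_pair() {
  local m="$1" b="$2" key rc=0 mp bp
  # These settings must be identical on both nodes.
  for key in virtual_router_id auth_pass; do
    if [ "$(conf_value "$m" "$key")" != "$(conf_value "$b" "$key")" ]; then
      echo "mismatch: $key"
      rc=1
    fi
  done
  # Assumption: a missing priority is treated as 100.
  mp=$(conf_value "$m" priority)
  bp=$(conf_value "$b" priority)
  if [ "${mp:-100}" -le "${bp:-100}" ]; then
    echo "priority on the master should be higher than on the backup"
    rc=1
  fi
  return "$rc"
}
```

For example, after copying the Slave's file next to the Master's, `check_pair master.conf slave.conf` would print any mismatch and return non-zero.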

Create the Redis health-check script on both Master and Slave:

$ sudo mkdir /etc/keepalived/scripts
$ sudo vim /etc/keepalived/scripts/redis_check.sh
#!/bin/bash
# Exit 0 if the local Redis answers PING with PONG, 1 otherwise.

ALIVE=$(/opt/redis/bin/redis-cli PING)
if [ "$ALIVE" = "PONG" ]; then
  echo "$ALIVE"
  exit 0
else
  echo "$ALIVE"
  exit 1
fi
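With the check above, a single failed PING is enough to report the node as unhealthy. If short hiccups are a concern, the check could retry a few times before giving up. The following is a sketch under that assumption; `ping_with_retries` is a hypothetical helper, not part of the original script. Keepalived's `vrrp_script` block also offers `fall` and `rise` settings that may achieve a similar effect without changing the script.

```shell
#!/bin/bash
# The redis-cli path is an assumption taken from the article's scripts.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# ping_with_retries ATTEMPTS DELAY_SECONDS (hypothetical helper)
# Returns 0 as soon as one PING answers PONG, 1 if every attempt fails.
ping_with_retries() {
  local attempts="$1" delay="$2" i=1 reply
  while [ "$i" -le "$attempts" ]; do
    reply=$("$REDISCLI" PING 2>/dev/null)
    if [ "$reply" = "PONG" ]; then
      return 0
    fi
    # Do not sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

A check script built on this might end with `ping_with_retries 3 0.5 || exit 1`. The total retry time should probably stay below the `interval 2` configured for `chk_redis`.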

Next, write the key scripts that drive the failover:
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh

Keepalived calls these scripts according to the state it transitions to:
On entering the Master state it calls notify_master
On entering the Backup state it calls notify_backup
On detecting a fault and entering the Fault state it calls notify_fault
When the Keepalived process terminates it calls notify_stop
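To reason about these transitions, it can help to write the state-to-command mapping down in one place. The sketch below is a dry-run helper that only prints the Redis commands this article's scripts issue for each state; it does not run them and it leaves out the sleeps. `plan_for_state` and `PEER_IP` are hypothetical names. It is written with Keepalived's generic `notify` hook in mind, which, as far as I know, passes the new state as the third argument (after `GROUP` or `INSTANCE` and the instance name).

```shell
#!/bin/bash
# PEER_IP is a hypothetical variable holding the other node's address.
PEER_IP="${PEER_IP:-10.6.1.144}"

# plan_for_state STATE (hypothetical helper)
# Print, one per line, the Redis commands the article's scripts run for STATE.
plan_for_state() {
  case "$1" in
    MASTER)
      # Sync from the peer first, then drop replication and serve as master.
      printf 'SLAVEOF %s 6379\nSLAVEOF NO ONE\n' "$PEER_IP"
      ;;
    BACKUP)
      # Become a replica of the peer.
      printf 'SLAVEOF %s 6379\n' "$PEER_IP"
      ;;
    FAULT)
      # The article's fault script only writes to the log.
      :
      ;;
    *)
      echo "unknown state: $1" >&2
      return 1
      ;;
  esac
}
```

Inside a generic notify script, `plan_for_state "$3"` would show what is about to happen for the reported state.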

First, create the notify_master and notify_backup scripts on the Redis Master:

$ sudo vim /etc/keepalived/scripts/redis_master.sh
#!/bin/bash
# Runs on Master when Keepalived enters the MASTER state.

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[master]" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being master...." >> "$LOGFILE" 2>&1

echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 10.6.1.144 6379 >> "$LOGFILE" 2>&1
sleep 10 # wait 10 seconds for the data sync to finish before turning replication off

echo "Run SLAVEOF NO ONE cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF NO ONE >> "$LOGFILE" 2>&1

$ sudo vim /etc/keepalived/scripts/redis_backup.sh
#!/bin/bash
# Runs on Master when Keepalived enters the BACKUP state.

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[backup]" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being slave...." >> "$LOGFILE" 2>&1

sleep 15 # wait 15 seconds for the peer to finish syncing from this node before switching roles
echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 10.6.1.144 6379 >> "$LOGFILE" 2>&1

Next, create the notify_master and notify_backup scripts on the Redis Slave:

$ sudo vim /etc/keepalived/scripts/redis_master.sh
#!/bin/bash
# Runs on Slave when Keepalived enters the MASTER state.

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[master]" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being master...." >> "$LOGFILE" 2>&1

echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 10.6.1.143 6379 >> "$LOGFILE" 2>&1
sleep 10 # wait 10 seconds for the data sync to finish before turning replication off

echo "Run SLAVEOF NO ONE cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF NO ONE >> "$LOGFILE" 2>&1

$ sudo vim /etc/keepalived/scripts/redis_backup.sh
#!/bin/bash
# Runs on Slave when Keepalived enters the BACKUP state.

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[backup]" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being slave...." >> "$LOGFILE" 2>&1

sleep 15 # wait 15 seconds for the peer to finish syncing from this node before switching roles
echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 10.6.1.143 6379 >> "$LOGFILE" 2>&1
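The fixed `sleep 10` in redis_master.sh is a guess at how long the sync takes: too short for a large dataset, needlessly long for a small one. A possible alternative, sketched here as an assumption rather than as the article's method, is to poll the `master_link_status` field that Redis reports in `INFO` on a replica, and stop waiting once it reads `up`. The helper name `wait_for_sync` is hypothetical.

```shell
#!/bin/bash
# The redis-cli path is an assumption taken from the article's scripts.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# wait_for_sync TIMEOUT_SECONDS (hypothetical helper)
# Polls the local Redis once per second until it reports master_link_status:up.
# Returns 0 once the link is up, 1 if TIMEOUT_SECONDS pass first.
wait_for_sync() {
  local timeout="$1" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # INFO lines end in CRLF, so strip the carriage return before matching.
    if "$REDISCLI" INFO | tr -d '\r' | grep -q '^master_link_status:up$'; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}
```

In redis_master.sh, `wait_for_sync 10` could stand in for `sleep 10`: if the peer is reachable the script moves on as soon as the link is up, and if the peer is down it simply waits out the same 10 seconds before running SLAVEOF NO ONE.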

Then create the following identical scripts on both Master and Slave:

$ sudo vim /etc/keepalived/scripts/redis_fault.sh
#!/bin/bash
# Records the transition to the FAULT state.

LOGFILE=/var/log/keepalived-redis-state.log

echo "[fault]" >> "$LOGFILE"
date >> "$LOGFILE"

$ sudo vim /etc/keepalived/scripts/redis_stop.sh
#!/bin/bash
# Records that Keepalived is stopping.

LOGFILE=/var/log/keepalived-redis-state.log

echo "[stop]" >> "$LOGFILE"
date >> "$LOGFILE"

Make all the scripts executable:
$ sudo chmod +x /etc/keepalived/scripts/*.sh

With the scripts in place, test the setup in the following order:
1. Start Redis on Master
$ sudo /etc/init.d/redis start

2. Start Redis on Slave
$ sudo /etc/init.d/redis start

3. Start Keepalived on Master
$ sudo /etc/init.d/keepalived start

4. Start Keepalived on Slave
$ sudo /etc/init.d/keepalived start

5. Try connecting to Redis through the VIP:
$ redis-cli -h 10.6.1.200 INFO

The connection succeeds, and Slave has connected as a replica:
role:master
slave0:10.6.1.144,6379,online
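Reading the full `INFO` output by eye gets tedious when the same check is repeated after every switch. A small helper can pull out a single field instead. This is a sketch; `info_field` is a hypothetical name, and the `redis-cli` path is assumed to match the one in the scripts above.

```shell
#!/bin/bash
# The redis-cli path is an assumption taken from the article's scripts.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# info_field HOST FIELD (hypothetical helper)
# Print the value of FIELD (for example "role") from INFO on HOST.
info_field() {
  # INFO lines look like "field:value" and end in CRLF.
  "$REDISCLI" -h "$1" INFO | tr -d '\r' |
    awk -F: -v field="$2" '$1 == field { print $2; exit }'
}
```

For instance, `info_field 10.6.1.200 role` would print the role of whichever node currently holds the VIP, and `info_field 10.6.1.200 connected_slaves` the number of replicas attached to it.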

6. Try writing some data:
$ redis-cli -h 10.6.1.200 SET Hello Redis
OK

Read the data through the VIP
$ redis-cli -h 10.6.1.200 GET Hello
"Redis"

Read the data from Master
$ redis-cli -h 10.6.1.143 GET Hello
"Redis"

Read the data from Slave
$ redis-cli -h 10.6.1.144 GET Hello
"Redis"

Next, simulate a failure:
Kill the Redis process on Master:
$ sudo killall -9 redis-server

Check the Keepalived state log on Master
$ tail -f /var/log/keepalived-redis-state.log
[fault]
Thu Sep 27 08:29:01 CST 2012

Meanwhile, the log on Slave shows:
$ tail -f /var/log/keepalived-redis-state.log
[master]
Fri Sep 28 14:14:09 CST 2012
Being master....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

We can see that Slave has taken over the service and now acts as the master.
$ redis-cli -h 10.6.1.200 INFO
$ redis-cli -h 10.6.1.144 INFO
role:master
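After a switch, exactly one node should report `role:master`. Two masters at once would mean a split brain, a scenario readers raise in the comments at the end of this post. The sketch below counts the masters among a list of hosts. `count_masters` is a hypothetical helper, and unreachable hosts are simply not counted.

```shell
#!/bin/bash
# The redis-cli path is an assumption taken from the article's scripts.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# count_masters HOST... (hypothetical helper)
# Print how many of the given hosts are reachable and report role:master.
count_masters() {
  local host n=0
  for host in "$@"; do
    # INFO lines end in CRLF, so strip the carriage return before matching.
    if "$REDISCLI" -h "$host" INFO 2>/dev/null | tr -d '\r' | grep -q '^role:master$'; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```

Running `count_masters 10.6.1.143 10.6.1.144` after each step of the test and comparing the result with 1 would flag both the "no master" and the "two masters" cases.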

Now bring the Redis process on Master back:
$ sudo /etc/init.d/redis start

Check the Keepalived state log on Master
$ tail -f /var/log/keepalived-redis-state.log
[master]
Thu Sep 27 08:31:33 CST 2012
Being master....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

Meanwhile, the log on Slave shows:
$ tail -f /var/log/keepalived-redis-state.log
[backup]
Fri Sep 28 14:16:37 CST 2012
Being slave....
Run SLAVEOF cmd ...
OK

The original Master has resumed the Master role, so both the failover and the automatic recovery succeeded.
Source: http://heylinux.com/archives/1942.html

Posted by 默北 on 02/04/2014 09:11:17
When reposting, please keep the link to this article: https://www.ttlsa.com/redis/redis-keepalived-achieve-high-availability/
Tags: Failover, keepalived, redis, automatic failover, high availability
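Comments (8)
    • dj

      Does "enabling the local storage policy" in the article mean enabling persistence?

      • ︷老丁头ノ

        redis_master.sh should be running SLAVEOF NO ONE; this is clearly written wrong.

        • DongLinNa

          When redis_master.sh runs, there is no need for the "Run SLAVEOF cmd" step, because by then your slave Redis is already down and you cannot connect to it; just run SLAVEOF NO ONE directly.
          When redis_backup.sh runs, the sleep is unnecessary; just run SLAVEOF ip port directly. A replica does not serve requests while it is syncing.

          • DongLinNa

            Keepalived should be configured in non-preemptive mode. When the master recovers it should quietly become a replica and must not be made master again, because it has no data. The testing is also incomplete: if keepalived on the master node dies, keepalived on the slave enters the MASTER state and the Redis on the slave is promoted to master as well, so now there are two masters. What if the whole master machine goes down? What if the network flaps frequently?

              • 布兜o

                @DongLinNa That won't happen. When keepalived dies the VIP moves as well, so the old master is no longer reachable.

              • 木平

                A two-replica setup would actually be better here, so that MASTER does not grab the master role back after it restarts. If it does preempt, the data has to be synced all over again, which is rather painful.

                • 木平

                  There is a problem here: the "Run SLAVEOF cmd" step should not run when BACKUP transitions to the MASTER state. Otherwise, when the master goes down and the slave takes over as master, the slave actually runs SLAVEOF, and all the data it held before is lost.
                  So when the slave becomes MASTER, running only SLAVEOF NO ONE is enough.

                    • 默北

                      @木平 The slave does run SLAVEOF NO ONE.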
