高可用实现方法汇总

Tags: 系统设计 

目录

摘要

高可用有两种场景: “单活”与”多活”。分析实际的场景,考虑是用”主备”,还是用”负载均衡”。

Keepalived

Keepalived

Keepalived是用于实施两台或者两组LVS服务器之间的热备, 当一组LVS服务器失活的时候,另一组LVS服务器自动接管工作。

keepalived is a userspace daemon for LVS cluster nodes healthchecks and LVS directors failover.

Keepalived的目的只是用于LVS,可以通过巧妙的配置,拓展Keepalived的应用范围。

Keepalive配置文件的语法: man keepalived.conf。 文献5中介绍了Keepalived使用和配置。

Keepalived注意事项

初次使用Keepalived的时候很容易遇到一个让人哭笑不得的大坑,所以非常有必要把这个单列为一章。

Keepalive不会对配置文件的格式进行检查,而且配置文件中的”{“与前面的字符之间必须至少有一个空格分隔。

vrrp_sync_group VG_REDIS {  !注意VG_REDIS后面必须有个空格,不然会被当作"VG_REDIS{"
	group {               
		VG_REDIS                    
	}                   
}

至少在1.2.7的版本中还是这样的。

使用参数”-d”(在/etc/sysconfig/keepalived中添加)时,keepalived会把配置文件的解析结果记录到日志文件(/var/log/messages)中,可以通过查看解析后的结果,来判断配置文件是否被正确解读。例如文献6中:

Sep 12 14:13:11 example-01 Keepalived: ------< Global definitions >------
Sep 12 14:13:11 example-01 Keepalived:  LVS ID = LVS_EXAMPLE_01
Sep 12 14:13:11 example-01 Keepalived:  Smtp server = 127.0.0.1
Sep 12 14:13:11 example-01 Keepalived:  Smtp server connection timeout = 100
Sep 12 14:13:11 example-01 Keepalived:  Email notification from = [email protected], [email protected]
Sep 12 14:13:11 example-01 Keepalived:  Email notification = [email protected]
Sep 12 14:13:11 example-01 Keepalived: ------< SSL definitions >------
Sep 12 14:13:11 example-01 Keepalived:  Using autogen SSL context
Sep 12 14:13:11 example-01 Keepalived: ------< VRRP Topology >------
Sep 12 14:13:11 example-01 Keepalived:  VRRP Instance = VI_1
Sep 12 14:13:11 example-01 Keepalived:    Want State = MASTER
Sep 12 14:13:11 example-01 Keepalived:    Runing on device = eth0
Sep 12 14:13:11 example-01 Keepalived:    Virtual Router ID = 51
Sep 12 14:13:11 example-01 Keepalived:    Priority = 150
Sep 12 14:13:11 example-01 Keepalived:    Advert interval = 1sec
Sep 12 14:13:11 example-01 Keepalived:    Preempt Active
Sep 12 14:13:11 example-01 Keepalived:    Authentication type = SIMPLE_PASSWORD
Sep 12 14:13:11 example-01 Keepalived:    Password = example
Sep 12 14:13:11 example-01 Keepalived:    VIP count = 1
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 192.168.1.11/32
Sep 12 14:13:11 example-01 Keepalived:  VRRP Instance = VI_GATEWAY
Sep 12 14:13:11 example-01 Keepalived:    Want State = MASTER
Sep 12 14:13:11 example-01 Keepalived:    Runing on device = eth1
Sep 12 14:13:11 example-01 Keepalived:    Virtual Router ID = 52
Sep 12 14:13:11 example-01 Keepalived:    Priority = 150
Sep 12 14:13:11 example-01 Keepalived:    Advert interval = 1sec
Sep 12 14:13:11 example-01 Keepalived:    Preempt Active
Sep 12 14:13:11 example-01 Keepalived:    Authentication type = SIMPLE_PASSWORD
Sep 12 14:13:11 example-01 Keepalived:    Password = example
Sep 12 14:13:11 example-01 Keepalived:    VIP count = 1
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 10.20.40.1/32
Sep 12 14:13:11 example-01 Keepalived: ------< VRRP Sync groups >------
Sep 12 14:13:11 example-01 Keepalived:  VRRP Sync Group = VG1, MASTER
Sep 12 14:13:11 example-01 Keepalived:    monitor = VI_1
Sep 12 14:13:11 example-01 Keepalived:    monitor = VI_GATEWAY
Sep 12 14:13:11 example-01 Keepalived: ------< LVS Topology >------
Sep 12 14:13:11 example-01 Keepalived:  System is compiled with LVS v1.0.4
Sep 12 14:13:11 example-01 Keepalived:  VIP = 192.168.1.11, VPORT = 22
Sep 12 14:13:11 example-01 Keepalived:    delay_loop = 10, lb_algo = rr
Sep 12 14:13:11 example-01 Keepalived:    protocol = TCP
Sep 12 14:13:11 example-01 Keepalived:    lb_kind = NAT
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.11, RPORT = 22, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived: ------< Health checkers >------
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.11:22
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = TCP_CHECK
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 22 

最简应用-可抢占

Simplest-preempt

最简应用-不可抢占

Simplest-nopreempt

最简应用-VRRP-script

可以通过vrrp-script自行定义存活的条件。

Simplest-VRRP-script

拓展场景-高可用的Redis

Keepalived的配置文件(man keepalived.conf)中,可以设置Master和Backup在进行角色转变时执行的脚本。如下:

notify_master /path/to_master.sh    
notify_backup /path/to_backup.sh   
notify_fault "/path/fault.sh VG_1"

我们可以丰富脚本中的内容,拓展Keepalived的应用范围。

我们需要找到一种方法使keepalived可以检测的是Redis进程的存活,而不是机器的存活,这可以利用上一节的中的vrrp-script。

最终效果:

两台物理机器为主备,主为master,备为slave,主死掉后,备成为master,主恢复后成为备。主备切换只会有瞬间的服务不可用,且不丢失数据。

              _|_
              \ /
         +-----'-----+            +-----------+  
         |   VIP     | keepalived |   VIP     |
         |   LVS1    +------------+   LVS2    |
         |  Master   |            |  Backup   |
         +-----------+            +-----------+   
        192.168.192.37            192.168.192.38

具体实现: Keepalived - Redis Master Slaver

常规用法-Keepalvied + LVS

2015-07-09 16:43:28 @ 暂时不需要这种场景,以后需要的时候,再补充。

如下图所示:

1. 两台LVS服务器,一主一备
2. 三台Machine拥有同样的VIP,且不会对VIP的Arp报文进行回应(这很重要!)
3. 同一时刻, LVS1和LVS2之间只有“Master”才会被设置VIP,“Master”会回应VIP的Arp报文(所以发送VIP的报文会被“Master”接收)

                    _|_
                    \ /
               +-----'-----+            +-----------+             VIP:192.168.1.10
               |   VIP     | keepalived |   VIP     |
               |   LVS1    +------------+   LVS2    |
               |  Master   |            |   Backup   |
               +-----------+            +-----------+   
                     |
          +----------+----------+---------------------+
         _|_                   _|_                   _|_
         \ /                   \ /                   \ /
   +------'-----+        +------'-----+        +------'-----+ 
   |    VIP     |        |    VIP     |        |    VIP     | 
   |  Machine1  |        |  Machine2  |        |  Machine3  | 
   +------------+        +------------+        +------------+ 
    192.168.1.11         192.168.1.12          192.168.1.13

LVS1与LVS2会互相检测对方的存在,根据配置文件中的内容(默认角色、优先级),决定谁作为Master。当Backup发现Master失活,就会自动转变成Master,并设置VIP,开始接管工作。

如果在配置文件中,设置了Master是”可以抢占的”,当拥有高优先级的Master复活后,将会抢回Master的身份。

问题: 如果Master和Backup之间的网络断开,是不是两个都会成为Master,导致VIP冲突?因该是会的。所以keepalived的应用前提: Master与Backup之间的网络要是可靠的。

HeartBeat & Pacemaker

2015-07-09 16:43:28 @ 暂时不需要这种场景,以后需要的时候,再补充,

HeartBeat解决结点的存活监测的问题。

Pacemaker

参考

  1. http://www.ibm.com/developerworks/cn/cloud/library/cl-cloudhighavailability/ “ 实现基于云的高可用性的实用方法”
  2. http://www.ibm.com/developerworks/cn/linux/l-haccmdb/ “实现复合应用程序的高可用性”
  3. http://www.ibm.com/developerworks/cn/linux/l-drbd/ “Distributed Replicated Block Device的高可用性”
  4. http://www.ibm.com/developerworks/cn/opensource/os-cn-RabbitMQ/index.html “高可用RabbitMQ集群自动化部署解决方案”
  5. http://keepalived.org/pdf/sery-lvs-cluster.pdf “负载均衡及服务器集群(lvs)”
  6. http://keepalived.org/LVS-NAT-Keepalived-HOWTO.html LVS-NAT-Keepalived-HOWTO”
  7. http://abcve.com/redis-keepalived/ “ Redis主从配置及使用Keepalived实现高可用”
  8. https://tobrunet.ch/2013/07/keepalived-check-and-notify-scripts/ “Keepalived Check and Notify Scripts”

系统设计

  1. 各大云厂商的 API 设计风格
  2. Google 是如何实践 RESTful API 设计的?
  3. Netflix 的异地多活设计: Active-Active for Multi-Regional Resiliency
  4. Facebook 的缓存系统实践经验《Scaling Memcache at Facebook》
  5. 多机数据系统的正确性与一致性
  6. 《大型网站技术架构: 核心原理与案例分析》阅读摘录
  7. 《分布式金融架构课》阅读笔记2: 线性一致的分布式数据系统的实现过程
  8. 《分布式金融架构课》阅读笔记1: 单机&多机并发/多副本读写正确性和一致性
  9. 《消息队列高手课》阅读笔记: Rabbit/Rocket/Kafka/模型/消息事务/保序等
  10. 《消息队列高手课》阅读笔记: Rabbit/Rocket/Kafka/模型/消息事务/保序等
  11. 《Redis核心技术与实践》阅读笔记: 数据类型/存储开销/Rehash/案例等
  12. 《Redis核心技术与实践》阅读笔记: 数据类型/存储开销/Rehash/案例等
  13. 《高并发系统设计40问》阅读笔记: 数据库/缓存/消息队列/分布式服务
  14. 《高并发系统设计40问》阅读笔记: 数据库/缓存/消息队列/分布式服务
  15. 《MySQL实战45讲》阅读笔记: 索引类型/数据可靠性/事务/间隙锁/临时表等
  16. 系统性能分析方法论: 统计图谱工具
  17. 张磊《深入剖析Kubernetes》专栏的阅读笔记
  18. 代理服务软件haproxy、nginx、envoy对比,以及开源的API网关项目对比
  19. 蓝绿部署、金丝雀发布(灰度发布)、A/B测试的准确定义
  20. 阿里巴巴的应用限流和服务降级是怎样实现的?|如何打造平台稳定能力
  21. 陈皓《左耳听风》专栏的阅读笔记(持续更新)
  22. 好雨云帮,一款不错的国产开源PaaS
  23. 怎样为软件的不同版本命名?
  24. 怎样选择开源项目的license?
  25. Glusterfs的架构
  26. 怎样设计一个企业级的PaaS平台?
  27. 几种常见的LDAP系统
  28. DNS SRV介绍(一种用DNS做服务发现的方法)
  29. DNS,DNS-Domain Name System
  30. 思科的网络设备
  31. 虚拟化技术汇总
  32. 认证与授权系统的汇总
  33. 高可用实现方法汇总
  34. 编译器汇总
  35. Linux系统的优化方法
  36. CentOS7的一些变化
  37. 分布式系统的一些知识
  38. 计算机编程语言的特性汇总
  39. 网络通信的一些基础知识
  40. PCIE总线的一些知识
  41. 操作系统的API
  42. 网卡的一些知识
  43. Linux系统的构建过程
  44. 数据结构与算法
  45. CPU的相关知识

推荐阅读

Copyright @2011-2019 All rights reserved. 转载请添加原文连接,合作请加微信lijiaocn或者发送邮件: [email protected],备注网站合作

友情链接:  系统软件  程序语言  运营经验  水库文集  网络课程  微信网文  发现知识星球