MySql-MHA

[TOC]

MHA介绍

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
    该软件由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署
在一台slave节点上。MHA Node运行在每台MySQL服务器上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的
master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明。

在MHA自动故障切换过程中,MHA试图从宕机的主服务器上保存二进制日志,最大程度的保证数据的不丢失,但这并不总是可行的。例如,如果主服务器硬件故障或无法通过
ssh访问,MHA没法保存二进制日志,只进行故障转移而丢失了最新的数据。使用MySQL 5.5的半同步复制,可以大大降低数据丢失的风险。MHA可以与半同步复制结合起来。如果
只有一个slave已经收到了最新的二进制日志,MHA可以将最新的二进制日志应用于其他所有的slave服务器上,因此可以保证所有节点的数据一致性。

目前MHA主要支持一主多从的架构,要搭建MHA,要求一个复制集群中必须最少有三台数据库服务器,一主二从,即一台充当master,一台充当备用master,另外一台充当从
库,因为至少需要三台服务器,出于机器成本的考虑,淘宝也在该基础上进行了改造,目前淘宝TMHA已经支持一主一从。

故障切换原理:
假如MySQL MHA架构运行正常,突然在上午1000一个复制组中的A的MySQL服务 down了!
因为AB复制是异步复制,所以可能有一些数据尚没有被B拉到其relay_log中,即AB数据不一致,MHA是怎样解决这种情况的呢?

1.mha_manager 使用scp命令将A当前binlog拷贝到mha_manager

2.待新A(选举:依据谁的relay_log新)产生后,mha_manager将旧A的binlog和新的relay_log做比对,并进行差异还原以保证新A和旧A数据的一致性

3.mha_manager将拿到的旧A的binlog去找复制组中其他B 做差异还原,保证数据的一致性

注意:mha_manager 每执行一次 failover 后,该进程自动退出。如果还想测试failover需要重新开启---开启前要将以下文件删掉:(lab03)/etc/mha/mha.failover.complete

环境准备

系统:centos7_1804
lab01: 10.0.0.11  master  mha4mysql-node
lab02: 10.0.0.12  slave1  mha4mysql-node
lab03: 10.0.0.13  slave2  mha4mysql-node mha4mysql-manager
lab04: 10.0.0.14  haproxy [node2, node3] (备读)
  VIP: 10.0.0.10  [node1, node2](主写)

配置步骤:
  1. 配置相互ssh密钥登陆
  2. 主从复制 master:[lab01]  slave:[lab02, lab03]
  3. 安装 mha node:[lab01, lab02, lab03]  manager:[lab03]
  4. 测试

配置ssh密钥登陆

生成密钥,安装自动交互工具所有节点执行

[ -f ~/.ssh/id_dsa ] || { ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa;}
[ -f /usr/bin/expect ] || { yum install -y expect;}

发送公钥到远程主机所有节点执行(实际上只需要控制节点能无需密码登陆从节点即可)

# 远程主机root密码
rootpass=123.com

SSH_KEY(){
/usr/bin/expect <<EOF
set timeout 30
spawn ssh-copy-id -i /root/.ssh/id_dsa.pub root@$1
expect {
    "(yes/no)?" { send "yes\r"; exp_continue }
    "password:" { send "$rootpass\r" }
}
expect eof
EOF
}

for ip in 10.0.0.{11..13};do
    SSH_KEY $ip
done

安装MySql

配置安装源所有节点

yum install -y https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpm    # 安装mysql软件安装源
yum repolist all | grep mysql                     # 查看所有 mysql 子存储库
yum install -y yum-utils                          # 安装 yum 配置工具管理 YUM 源
yum-config-manager --disable mysql80-community    # 禁用 mysql80 子存储库
yum-config-manager --enable mysql57-community     # 启用 mysql57 子存储库
yum repolist enabled | grep mysql                 # 查看已启用的 mysql 子存储库

安装 mysql-5.7 启动服务所有节点

yum install -y mysql-community-server
systemctl start mysqld

配置主从复制

lab01 操作; 开启二进制日志,设置 server-id,启动服务,更改root密码及授权,创建同步用户及授权

# 开启二进制日志(注意 server-id 唯一)
cp /etc/my.cnf{,.bak}
cat <<EOF  >>/etc/my.cnf

# binlog
server-id = 1
log-bin = mysql-bin
log-bin-index = master-bin.index
EOF

# 重启服务,跟随系统启动
systemctl restart mysqld
systemctl enable mysqld
systemctl status mysqld

# 更改root密码,开启root用户远程登陆授权(密码:Hc.1234456)
grep 'temporary password' /var/log/mysqld.log    # 查看初始化密码
def_pass=$(grep 'temporary password' /var/log/mysqld.log | awk '{print $NF}')
ChangePass="alter user 'root'@'localhost' identified by 'Hc.123456';"
mysql -uroot -p$def_pass -e "$ChangePass" --connect-expired-password
Author="grant all privileges on *.* to root@'%' identified by 'Hc.123456' with grant option;"
mysql -uroot -pHc.123456 -e "$Author"

# 创建同步账号,授权
mysql -uroot -pHc.123456
create user 'sync'@'10.0.0.%' identified by 'Hc.123456';
grant replication slave on *.* to 'sync'@'10.0.0.%' identified by 'Hc.123456';
flush privileges;
show master status;
exit

# mysql> show master status;  # 记录下二进制日志文件名及偏移量
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |     1390 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

lab02 操作; 开启二进制日志,设置 server-id,启动服务,更改root密码及授权,创建同步用户及授权,配置连接lab01

# 开启二进制日志(注意 server-id 唯一)
cp /etc/my.cnf{,.bak}
cat <<EOF  >>/etc/my.cnf

# binlog
server-id = 2
log-bin = mysql-bin
log-bin-index = master-bin.index
EOF

# 重启服务,跟随系统启动
systemctl restart mysqld
systemctl enable mysqld
systemctl status mysqld

# 更改root密码,开启root用户远程登陆授权(密码:Hc.1234456)
grep 'temporary password' /var/log/mysqld.log    # 查看初始化密码
def_pass=$(grep 'temporary password' /var/log/mysqld.log | awk '{print $NF}')
ChangePass="alter user 'root'@'localhost' identified by 'Hc.123456';"
mysql -uroot -p$def_pass -e "$ChangePass" --connect-expired-password
Author="grant all privileges on *.* to root@'%' identified by 'Hc.123456' with grant option;"
mysql -uroot -pHc.123456 -e "$Author"

# 创建同步账号,授权,配置连接lab01
mysql -uroot -pHc.123456
create user 'sync'@'10.0.0.%' identified by 'Hc.123456';
grant replication slave on *.* to 'sync'@'10.0.0.%' identified by 'Hc.123456';
flush privileges;
change master to master_host='10.0.0.11',  # lab01数据库 IP或主机名(需解析通)
master_port=3306,                          # lab01数据库端口号
master_user='sync',                        # 同步账号
master_password='Hc.123456',               # 同步账号的密码
master_log_file='mysql-bin.000001',        # lab01数据库执行 show master status; 获取的二进制日志文件名
master_log_pos=1390;                       # lab01数据库执行 show master status; 获取的二进制日志偏移量
start slave;                               # 启动slave
show slave status\G;                       # 查看同步状态
exit

lab03 操作; 开启二进制日志,设置 server-id,启动服务,更改root密码及授权,创建同步用户及授权,配置连接lab01

# 开启二进制日志(注意 server-id 唯一)
cp /etc/my.cnf{,.bak}
cat <<EOF  >>/etc/my.cnf

# binlog
server-id = 3
log-bin = mysql-bin
log-bin-index = master-bin.index
EOF

# 重启服务,跟随系统启动
systemctl restart mysqld
systemctl enable mysqld
systemctl status mysqld

# 更改root密码,开启root用户远程登陆授权(密码:Hc.1234456)
grep 'temporary password' /var/log/mysqld.log    # 查看初始化密码
def_pass=$(grep 'temporary password' /var/log/mysqld.log | awk '{print $NF}')
ChangePass="alter user 'root'@'localhost' identified by 'Hc.123456';"
mysql -uroot -p$def_pass -e "$ChangePass" --connect-expired-password
Author="grant all privileges on *.* to root@'%' identified by 'Hc.123456' with grant option;"
mysql -uroot -pHc.123456 -e "$Author"

# 创建同步账号,授权,配置连接lab01
mysql -uroot -pHc.123456
create user 'sync'@'10.0.0.%' identified by 'Hc.123456';
grant replication slave on *.* to 'sync'@'10.0.0.%' identified by 'Hc.123456';
flush privileges;
change master to master_host='10.0.0.11',  # lab01数据库 IP或主机名(需解析通)
master_port=3306,                          # lab01数据库端口号
master_user='sync',                        # 同步账号
master_password='Hc.123456',               # 同步账号的密码
master_log_file='mysql-bin.000001',        # lab01数据库执行 show master status; 获取的二进制日志文件名
master_log_pos=1390;                       # lab01数据库执行 show master status; 获取的二进制日志偏移量
start slave;                               # 启动slave
show slave status\G;                       # 查看同步状态
exit

安装 MHA

所有节点安装mha4mysql-node

# yum install -y https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
curl http://117.50.70.211:88/attach_files/sql/12 -o mha4mysql-node-0.58-0.el7.centos.noarch.rpm
yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm

lab03安装mha4mysql-manager

# yum install -y https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
curl http://117.50.70.211:88/attach_files/sql/12 -o mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
yum install -y mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

配置 MHA(lab03)

创建配置文件

# 创建日志路径及文件
mkdir /etc/mha
touch /etc/mha/manager.log

# 创建配置文件
cat <<EOF   >/etc/mha/mha.cnf
[server default]
# mysql admin account and password
user=root
password=Hc.123456

# mha workdir  and worklog
manager_workdir=/etc/mha
manager_log=/etc/mha/manager.log

# mysql A/B account and pw
repl_user=sync
repl_password=Hc.123456

# check_mha_node time
ping_interval=1

# ssh account
ssh_user=root

# mha change VIP
#report_script=/etc/mha/send_report
#master_ip_failover_script=/etc/mha/master_ip_failover
#master_ip_online_change_script=/etc/mha/master_ip_online_change

[server1]
hostname=10.0.0.11
ssh_port=22
master_binlog_dir=/var/lib/mysql
candidate_master=1

[server2]
hostname=10.0.0.12
ssh_port=22
master_binlog_dir=/var/lib/mysql
candidate_master=1

[server3]
hostname=10.0.0.13
ssh_port=22
master_binlog_dir=/var/lib/mysql
candidate_master=2
EOF

创建服务器启动脚本

cat <<EOF   >/etc/mha/mha_start
#!/bin/bash
/usr/bin/nohup /usr/bin/masterha_manager --conf=/etc/mha/mha.cnf >/tmp/mha_manager.log 2>&1 &
EOF
chmod +x /etc/mha/mha_start

检测ssh密钥登陆配置

masterha_check_ssh --conf=/etc/mha/mha.cnf

检测数据库复制

masterha_check_repl --conf=/etc/mha/mha.cnf

测试

测试故障恢复;

1. lab03 启动 masterha_manager 服务
2. lab01 关闭 mysqld 服务
3. lab03 上验证当前 master
4. 恢复 lab01(slave)
5. 设置 lab01 为 master
6. lab03 启动 masterha_manager 服务

1.lab03启动 masterha_manager服务

/etc/mha/mha_start              # 启动 masterha_manager 服务
ps aux | grep masterha
tail -f /tmp/mha_manager.log    # 查看 masterha_manager 实时日志

2.lab01关闭mysql服务

systemctl stop mysqld

3.lab03验证

mysql -uroot -pHc.123456 -e 'show slave status\G;' | grep Master_Host

4.lab01恢复为slave

systemctl start mysqld
ssh 10.0.0.13  'grep -i change /etc/mha/manager.log'    # 查看管理节点上的恢复参数

mysql -uroot -pHc.123456
CHANGE MASTER TO MASTER_HOST='10.0.0.12',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000001',    # 替换此处为 manager.log 对应参数
MASTER_LOG_POS=1390,                   # 替换此处为 manager.log 对应参数
MASTER_USER='sync',
MASTER_PASSWORD='Hc.123456';           # 替换此处为同步用户密码
start slave;
show slave status\G;

5.设置 lab01 为 master

# lab03 执行
rm -f /etc/mha/mha.failover.complete
/etc/mha/mha_start              # 启动 masterha_manager 服务
ps aux | grep masterha
tail -f /tmp/mha_manager.log    # 查看 masterha_manager 实时日志

# lab02关闭mysld服务(lab02执行)
systemctl stop mysqld

# lab03 验证(lab03执行)
mysql -uroot -pHc.123456 -e 'show slave status\G;' | grep Master_Host

# 恢复lab02(lab02执行)
systemctl start mysqld
ssh 10.0.0.13  'grep -i change /etc/mha/manager.log'    # 查看管理节点上的恢复参数

mysql -uroot -pHc.123456
CHANGE MASTER TO MASTER_HOST='10.0.0.11',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000002',    # 替换此处为 manager.log 对应参数
MASTER_LOG_POS=154,                    # 替换此处为 manager.log 对应参数
MASTER_USER='sync',
MASTER_PASSWORD='Hc.123456';           # 替换此处为同步用户密码
start slave;
show slave status\G;

6.lab03启动masterha_manager服务

rm -f /etc/mha/mha.failover.complete
/etc/mha/mha_start              # 启动 masterha_manager 服务
ps aux | grep masterha
tail -f /tmp/mha_manager.log    # 查看 masterha_manager 实时日志

创建VIP切换脚本

lab03 当主从切换时触发脚本执行删除/添加 VIP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
cat <<'EOF'   >/etc/mha/master_ip_failover
#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);

my $vip = '10.0.0.10/24';
my $nic = 'ens32';
my $key = '6';
my $ssh_start_vip = "/sbin/ifconfig $nic:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig $nic:$key down";

GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}

sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip
--orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
EOF

chmod +x /etc/mha/master_ip_failover

开启切换主从时触发脚本VIP

sed -i 's/#master_ip_failover_script/master_ip_failover_script/' /etc/mha/mha.cnf

节点快速恢复

systemctl start mysqld
mysql  -uroot -pHc.123456 -e 'stop slave;'
mysql  -uroot -pHc.123456 -e 'reset slave;'
echo  $(ssh 10.0.0.13  'grep -i change /etc/mha/manager.log' | awk -F 'be: ' '{print $2}' | tail -n 2) >/tmp/bin.log
sed -i 's/xxx/Hc.123456/' /tmp/bin.log
sql=$(cat /tmp/bin.log)
mysql  -uroot -pHc.123456 -e "$sql"
mysql  -uroot -pHc.123456 -e 'start slave;'
mysql  -uroot -pHc.123456 -e 'show slave status\G;'

slave集群

lab04安装 haproxy ; 代理 lab02 lab03 数据库读请求

yum -y install haproxy

配置haproxy

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
cp /etc/haproxy/haproxy.cfg{,.bak}
cat <<! >/etc/haproxy/haproxy.cfg
############ 全局配置 ############
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
daemon
nbproc 1 # 进程数量
maxconn 4096 # 最大连接数
user haproxy # 运行用户
group haproxy # 运行组
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid

############ 默认配置 ############
defaults
log global
mode http # 默认模式{ tcp|http|health }
option httplog # 日志类别,采用httplog
option dontlognull # 不记录健康检查日志信息
retries 2 # 2次连接失败不可用
option forwardfor except 127.0.0.0/8 # 后端服务获得真实ip
option httpclose # 请求完毕后主动关闭http通道
option abortonclose # 服务器负载很高,自动结束比较久的链接
option redispatch
maxconn 4096 # 最大连接数
timeout connect 5m # 连接超时
timeout client 1m # 客户端超时
timeout server 31m # 服务器超时
timeout check 10s # 心跳检测超时
balance roundrobin # 负载均衡方式,轮询

########## 统计页面配置 ##########
listen stats
bind 0.0.0.0:1080
mode http
option httplog
log 127.0.0.1 local0 err
maxconn 10 # 最大连接数
stats refresh 30s
stats uri /admin #状态页面 http//ip:1080/admin 访问
stats realm Haproxy\ Statistics
stats auth admin:admin # 用户和密码:admin
stats hide-version # 隐藏版本信息
stats admin if TRUE # 设置手工启动/禁用

listen MYSQL-HA
mode tcp
option tcplog
bind 10.0.0.14:3306
balance leastconn
option mysql-check user root
server 10.0.0.12 10.0.0.12:3306 inter 1s rise 2 fall 2
server 10.0.0.13 10.0.0.13:3306 inter 1s rise 2 fall 2
!

配置日志

1
2
3
4
5
6
7
8
9
10
cat <<'!' >/etc/rsyslog.d/haproxy.conf
$ModLoad imudp
$UDPServerRun 514
$template Haproxy,"%rawmsg% \n"
local0.=info -/var/log/haproxy.log;Haproxy
local0.notice -/var/log/haproxy-status.log;Haproxy
!

# 重启日志服务
systemctl restart rsyslog

设置启动方式

# 允许没VIP时启动(如果没有配置此项且应用会导致 haproxy 服务无法启动)
echo "net.ipv4.ip_nonlocal_bind = 1" >>/etc/sysctl.conf
sysctl -p

服务管理

systemctl start haproxy     # 启动服务
systemctl enable haproxy    # 随系统启动
systemctl status haproxy    # 运行状态

配置手动切换主从

创建 VIP 切换脚本(使用 perl 脚本 1)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
cat <<'EOF'   >/etc/mha/master_ip_online_change
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';

use Getopt::Long;

my $vip = '10.0.0.10/24'; # Virtual IP
my $nic = 'ens32';
my $key = "6";
my $ssh_start_vip = "/sbin/ifconfig $nic:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig $nic:$key down";
my $exit_code = 0;

my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,
);

GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,
);


exit &main();

sub main {

#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}

# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=po
rt –new_master_host=host –new_master_ip=ip –new_master_port=port\n";
}
EOF

# 添加可执行权限
chmod +x /etc/mha/master_ip_online_change

创建 VIP 切换脚本(使用 shell 脚本 2)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
cat <<EOF   >/etc/mha/master_ip_online_change.sh
#/bin/bash
source /root/.bash_profile

vip='10.0.0.10/24' # Virtual IP
nic='ens32'
key=6

command=`echo "$1" | awk -F = '{print $2}'`
orig_master_host=`echo "$2" | awk -F = '{print $2}'`
new_master_host=`echo "$7" | awk -F = '{print $2}'`
orig_master_ssh_user=`echo "${12}" | awk -F = '{print $2}'`
new_master_ssh_user=`echo "${13}" | awk -F = '{print $2}'`

stop_vip="ssh root@$orig_master_host /sbin/ifconfig $nic:$key down"
start_vip="ssh root@$new_master_host /sbin/ifconfig $nic:$key $vip"

if [ $command = 'stop' ]
then
echo -e "\n\n\n***************************************************************\n"
echo -e "Disabling the VIP - $vip on old master: $orig_master_host\n"
$stop_vip
if [ $? -eq 0 ]
then
echo "Disabled the VIP successfully"
else
echo "Disabled the VIP failed"
fi
echo -e "***************************************************************\n\n\n\n"
fi

if [ $command = 'start' -o $command = 'status' ]
then
echo -e "\n\n\n***************************************************************\n"
echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
$start_vip
if [ $? -eq 0 ]
then
echo "Enabled the VIP successfully"
else
echo "Enabled the VIP failed"
fi
echo -e "***************************************************************\n\n\n\n"
fi
EOF

# 添加可执行权限
chmod +x /etc/mha/master_ip_online_change.sh

开启手动切换VIP功能 shell, perl 方式二选一(注意:手动切换时不要开启masterha_manager服务)

# shell 方式
shell='/etc/mha/master_ip_online_change.sh'
sed -i '23d' /etc/mha/mha.cnf
sed -i "22a master_ip_online_change_script=$shell" /etc/mha/mha.cnf

# perl 方式
perl='/etc/mha/master_ip_online_change'
sed -i '23d' /etc/mha/mha.cnf
sed -i "22a master_ip_online_change_script=$perl" /etc/mha/mha.cnf

切换10.0.0.11为主

masterha_master_switch --conf=/etc/mha/mha.cnf -master_state=alive --new_master_host=10.0.0.11 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000

触发切换时发送邮件

安装mailx

yum install -y mailx

配置mail.rc

cat <<EOF    >/etc/mail.rc
# mail $(date '+%F %T')
set from=gaojing_2018@163.com smtp=smtp.163.com
set smtp-auth-user=gaojing_2018@163.com
set smtp-auth-password=password
set smtp-auth=login
EOF

systemctl restart postfix

测试配置

echo "服务器已恢复正常,恢复时间:$(date '+%F %T')"| mailx -s "MHA服务监控" 544025211@qq.com

触发发送邮件脚本

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
cat <<'EOF'   >/etc/mha/send_report
#/bin/bash
source /root/.bash_profile

orig_master_host=`echo "$1" | awk -F = '{print $2}'`
new_master_host=`echo "$2" | awk -F = '{print $2}'`
new_slave_hosts=`echo "$3" | awk -F = '{print $2}'`
subject=`echo "$4" | awk -F = '{print $2}'`
body=`echo "$5" | awk -F = '{print $2}'`

#判断日志结尾是否有successfully,有则表示切换成功,成功与否都发邮件。
tac /etc/mha/manager.log | sed -n 2p | grep 'successfully' > /dev/null
if [ $? -eq 0 ]
then
echo -e "MHA $subject 主从切换成功\n master:$orig_master_host --> $new_master_host \n $body \n 当前从库:$new_slave_hosts" | mailx -s "MySQL实例宕掉,MHA $subject 切换成功" 544025211@qq.com
else
echo -e "MHA $subject 主从切换失败\n master:$orig_master_host --> $new_master_host \n $body" | mailx -s "MySQL实例宕掉,MHA $subject 切换失败" 544025211@qq.com
fi
EOF

# 添加可执行权限
chmod +x /etc/mha/send_report

开启切换触发邮件

sed -i 's/#report_script/report_script/' /etc/mha/mha.cnf

测试

管理节点(确认集群运行正常 主从从)

rm -f mha.failover.complete
masterha_manager --conf=/etc/mha/mha.cnf &
ps aux | grep mha
mysql  -uroot -pHc.123456 -e 'show slave status\G;'    # 查看主节点是谁

关闭主节点的mysqld服务

systemctl stop mysqld

状态

此时 master 已经切换到其他节点 VIP 已经转移 指定邮箱也收到邮件包含集群信息

说明

数据库写入使用VIP
数据库读使用 haproxy 代理的IP 或者使用VIP(haproxy可以不安装)

本文标题:MySql-MHA

文章作者:亦 漩

发布时间:2018年09月24日 - 01:09

最后更新:2018年09月27日 - 20:09

原始链接:https://home.onlycloud.xin/posts/mysql-mha.html

许可协议: 署名4.0国际 (CC BY 4.0) 转载请保留原文链接及作者。