1. Configure multiple master nodes
Set up three nodes for this walkthrough.
Install containerd

```bash
# Confirm that docker and containerd are installed and running
systemctl status docker containerd
```
Change the sandbox image

```bash
# Export the default config; config.toml does not exist by default
containerd config default > /etc/containerd/config.toml
grep sandbox_image /etc/containerd/config.toml
sed -i "s#k8s.gcr.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
# or, on newer containerd versions that default to registry.k8s.io:
sed -i "s#registry.k8s.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
# Verify
grep sandbox_image /etc/containerd/config.toml
```
Set the containerd cgroup driver to systemd

```bash
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
# Restart containerd to apply all the changes
systemctl restart containerd
```
Join the master nodes to the k8s cluster

```bash
# Upload the control-plane certificates and capture the certificate key
CERT_KEY=`kubeadm init phase upload-certs --upload-certs | tail -1`
# Print the full join command for additional control-plane nodes
echo `kubeadm token create --print-join-command --ttl=0` " --control-plane --certificate-key $CERT_KEY --v=5"
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
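The pipeline above relies on `kubeadm init phase upload-certs` printing the certificate key on its last line, which `tail -1` isolates. A safe illustration with mocked output (the key shown is a made-up placeholder, not a real one):

```bash
# Mocked kubeadm output: the certificate key is always the last line it prints
sample_output='[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef'
CERT_KEY=$(echo "$sample_output" | tail -1)
echo "$CERT_KEY"   # prints only the placeholder key line
```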
Check node information

```bash
kubectl get nodes
kubectl get nodes -o wide
```
Next, replace the master's IP with a VIP.
Install keepalived on each master node (a simple one-line install):

```bash
yum install keepalived -y
```
Next, the configuration files. On the first master node (priority 100):

```bash
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from fage@qq.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51 # VRRP router ID; unique per VRRP instance
    priority 100    # priority; set 90 on the backup server
    advert_int 1    # VRRP heartbeat advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP -- this is the IP that replaces your node IP
    virtual_ipaddress {
        10.7.182.220/24
    }
    track_script {
        check_nginx
    }
}
EOF
```
 
On the second master node (backup, priority 90):

```bash
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from fage@qq.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51 # VRRP router ID; unique per VRRP instance
    priority 90    # priority; this backup server uses 90
    advert_int 1    # VRRP heartbeat advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        10.7.182.220/24
    }
    track_script {
        check_nginx
    }
}
EOF
```
 
On the third master node (priority 80):

```bash
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from fage@qq.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51 # VRRP router ID; unique per VRRP instance
    priority 80    # priority; this backup server uses 80
    advert_int 1    # VRRP heartbeat advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        10.7.182.220/24
    }
    track_script {
        check_nginx
    }
}
EOF
```
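All three configs reference a `track_script` named `check_nginx`, but the matching `vrrp_script` definition is not shown above. A minimal sketch of what it might look like, assuming a hypothetical health-check script at `/etc/keepalived/check_nginx.sh` that you would have to provide yourself (place this block above `vrrp_instance` in keepalived.conf, or drop the `track_script` section entirely if you do not need health-check-driven failover):

```
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"  # hypothetical health check; non-zero exit lowers priority
    interval 2     # run every 2 seconds
    weight -20     # subtract 20 from priority while the check fails
}
```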
 
Enable it at boot

```bash
systemctl daemon-reload
systemctl restart keepalived && systemctl enable keepalived && systemctl status keepalived
# Check -- the VIP should appear on the highest-priority node
ip a
```
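To see at a glance which node currently holds the VIP, a small grep over the `ip` output is enough; this sketch assumes the interface `ens33` and the VIP used above:

```bash
# Report whether the VIP is bound on this host (assumes interface ens33, VIP 10.7.182.220)
if ip -4 addr show ens33 2>/dev/null | grep -q '10\.7\.182\.220'; then
  echo "VIP is on this node"
else
  echo "VIP is elsewhere"
fi
```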
 
Configure the virtual IP
Update the IP in the configuration files on the master

```bash
# Batch-replace with sed
cd /etc/kubernetes/
# Check first
grep -rn '10.7.182' *
# Replace the IP (note: grep -rl needs the pattern as its first argument)
sed -i 's/10.7.182.110/10.7.182.220/g' `grep -rl '10.7.182.110' ./`
# Replace the hostname
sed -i 's/local-7-182-110/cluster-endpoint/g' `grep -rl 'local-7-182-110' ./`
# Verify
grep -r '10.7.182' *
```
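Because a wrong pattern here would corrupt the manifests under /etc/kubernetes, it can help to rehearse the `grep -rl` + `sed` batch replace in a scratch directory first. A minimal sketch with throwaway files:

```bash
# Rehearse the batch replace on throwaway files before touching /etc/kubernetes
tmp=$(mktemp -d)
echo 'server: https://10.7.182.110:6443' > "$tmp/kubelet.conf"
echo 'advertise-address=10.7.182.110'   > "$tmp/apiserver.yaml"
# grep -rl prints only the filenames that contain the pattern; sed then edits exactly those
sed -i 's/10\.7\.182\.110/10.7.182.220/g' $(grep -rl '10.7.182.110' "$tmp")
grep -r '10.7.182.220' "$tmp"   # both files should now show the VIP
rm -rf "$tmp"
```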
 
Generate a new admin kubeconfig file

```bash
cd /etc/kubernetes
mv admin.conf admin.conf_bak
# Generate a new admin.conf with the following command
kubeadm init phase kubeconfig admin --apiserver-advertise-address 10.7.182.220
# Point the server entry at cluster-endpoint instead of the raw VIP
sed -i 's/10.7.182.220/cluster-endpoint/g' admin.conf
```
 
Delete the old certificates and generate new ones

```bash
cd /etc/kubernetes/pki
# Back up first
mv apiserver.key apiserver.key.bak
mv apiserver.crt apiserver.crt.bak
# Regenerate with the following command
kubeadm init phase certs apiserver --apiserver-advertise-address 10.7.182.220 --apiserver-cert-extra-sans "10.7.182.220,cluster-endpoint"
# --apiserver-cert-extra-sans "10.7.182.220,cluster-endpoint": with this set,
# certificate verification will not fail when nodes join later.
```
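After regenerating apiserver.crt you can confirm the VIP and `cluster-endpoint` actually made it into the SAN list with `openssl x509`. The demo below builds a throwaway self-signed cert with the same SANs so it is safe to run anywhere (it assumes OpenSSL 1.1.1+ for `-addext`); against the real cluster, inspect `/etc/kubernetes/pki/apiserver.crt` instead:

```bash
# Build a throwaway cert carrying the same SANs, then print its SAN list;
# on a real master you would point the second command at /etc/kubernetes/pki/apiserver.crt.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout "$tmp/key.pem" -out "$tmp/cert.pem" -subj "/CN=kube-apiserver" \
  -addext "subjectAltName=DNS:cluster-endpoint,IP:10.7.182.220"
openssl x509 -in "$tmp/cert.pem" -noout -text | grep -A1 'Subject Alternative Name'
rm -rf "$tmp"
```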
 
Restart containerd and kubelet

```bash
systemctl restart docker containerd kubelet
# Check -- the master node should now be up
cd /etc/kubernetes
kubectl get nodes --kubeconfig=admin.conf
# Update the default kubeconfig so plain `kubectl get nodes` works from now on
cp admin.conf ~/.kube/config
kubectl get nodes
```
 
Inspect etcd

```bash
# List the etcd pods
kubectl get pods -n kube-system | grep etcd
# Exec into one of them
POD_NAME=`kubectl get pods -n kube-system | grep etcd | head -1 | awk '{print $1}'`
kubectl exec -it $POD_NAME -n kube-system -- sh
## Set up the environment inside the pod
alias etcdctl='etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
## List the etcd cluster members
etcdctl member list
```
 
Fix nodes stuck in NotReady
Copy the master's CA certificate to the corresponding directory on each node

```bash
scp /etc/kubernetes/pki/ca.crt local-7-182-111:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.crt local-7-182-112:/etc/kubernetes/pki/
```
 
Update kubelet.conf on each node

```bash
sed -i 's/local-7-182-110/cluster-endpoint/g' /etc/kubernetes/kubelet.conf
```
 
Restart containerd and kubelet

```bash
systemctl restart containerd kubelet
```
 
Failure-mode testing
Stop keepalived

```bash
systemctl stop keepalived
```
 
Check that kubectl still works.
After starting keepalived again on the original master, the VIP floats back to that node.
Simulating a dead node
Power off one master manually and check whether the k8s cluster still works. The result shows that with one node down the cluster stays healthy.
[Recommendation] A multi-master HA deployment therefore needs at least 3 master nodes; only then can the cluster lose one master without being affected.
With only two master nodes, losing one makes the cluster unavailable. The reason is quorum: a two-member cluster still needs 2 members up to form a majority, so only with more than two masters can one be lost. If you check the running containers, you will see the api-server container constantly restarting or simply down; the logs tell the same story. See the related StackOverflow answer.
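The quorum arithmetic behind that recommendation is easy to check directly: an etcd cluster of n members needs floor(n/2)+1 members up, so the tolerated failures are n minus that. Note how 2 members tolerate zero failures while 3 tolerate one:

```bash
# etcd quorum: a cluster of n members needs floor(n/2)+1 up; the rest may fail
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerated_failures=$(( n - quorum ))"
done
# members=2 -> tolerated_failures=0, members=3 -> tolerated_failures=1
```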