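Written in 2021-03. Background: K8s 1.21 was at GA, and binary (non-kubeadm) high-availability deployment was still mainstream at large Chinese companies. This article uses a source-built Nginx 1.24 plus Keepalived 2.2 to give the API Server a single entry point: a VIP that fails over between nodes.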
1. Why a VIP is needed
In a 3-master cluster, the apiserver listens on three different IP:6443 endpoints:

    master1 → 192.168.139.133:6443
    master2 → 192.168.139.134:6443
    master3 → 192.168.139.135:6443

But kubectl accepts only one --server:

    kubectl --server=https://192.168.139.133:6443 get nodes
    # What if master1 goes down?

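The fix: put a "virtual IP" (VIP) in front of the master nodes. The 3 masters elect the VIP holder through the VRRP protocol, so the VIP always sits on one healthy master. Nginx on the VIP holder listens on 8443 and forwards to the apiservers on 6443 (see the port conventions in section 3).

    kubectl → VIP:8443 (192.168.139.150)
                  ↓ VRRP
        ┌─────────┼─────────┐
     master1   master2   master3
      :6443     :6443     :6443
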
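2. Choosing an approach
| Option | Pros | Cons |
|---|---|---|
| Nginx + Keepalived | Mature and stable; handles both layer 4 and layer 7 | Has to be compiled from source |
| HAProxy + Keepalived | Often reported to outperform Nginx as a TCP proxy | Slightly more complex configuration |
| Envoy + Keepalived | Supports advanced routing | Too heavyweight |
| Public-cloud SLB | Zero operations work | Vendor lock-in |

This article uses Nginx stream + Keepalived: the most common choice, with a clear configuration.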
3. Cluster plan
| Hostname | IP | Role |
|---|---|---|
| master1 | 192.168.139.133 | nginx + keepalived MASTER |
| master2 | 192.168.139.134 | nginx + keepalived BACKUP |
| master3 | 192.168.139.135 | nginx + keepalived BACKUP |
| VIP | 192.168.139.150 | Floating address |

Port conventions:
- apiserver: 6443
- nginx stream listener: 8443 (clients connect to VIP:8443, which is forwarded to an apiserver on 6443)
- keepalived VRRP: IP protocol number 112
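4. Build Nginx (master1 + master2)
Installing nginx + keepalived on two nodes is enough (master3 can be skipped to save resources), and the steps below configure master1 and master2. If you install them on all 3 nodes, as in the cluster plan, the keepalived election still works.
4.1 Install build dependencies

    apt install -y make g++ openssl libssl-dev libpcre3 libpcre3-dev zlib1g-dev libgd-dev
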
4.2 Build Nginx

    cd /k8s/softs
    tar xf nginx-1.24.0.tar.gz && cd nginx-1.24.0
    ./configure \
      --prefix=/usr/local/nginx \
      --with-http_ssl_module \
      --with-http_stub_status_module \
      --with-pcre \
      --with-http_gzip_static_module \
      --with-stream \
      --with-stream_ssl_module
    make -j$(nproc) && make install

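Key flags: **--with-stream** enables the TCP (layer 4) proxy used here. **--with-stream_ssl_module** adds TLS support to the stream module; the configuration in this article passes the apiserver's TLS traffic through untouched, so the flag only matters if you later terminate or originate TLS inside Nginx.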
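To confirm that the installed binary really contains the stream module, one option is to inspect the configure arguments that `nginx -V` prints. The helper below is a minimal sketch: `has_configure_flag` is a hypothetical name, and it reads the `nginx -V` text from stdin so it can be tried without Nginx installed.

```shell
#!/bin/sh
# has_configure_flag FLAG
# Reads `nginx -V` output on stdin and succeeds only if FLAG appears
# as a whole configure argument, so --with-stream does not match
# --with-stream_ssl_module.
has_configure_flag() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Typical use on a node where Nginx is installed (nginx -V writes to stderr):
#   /usr/local/nginx/sbin/nginx -V 2>&1 | has_configure_flag --with-stream && echo "stream: ok"
```

Matching whole arguments rather than substrings avoids a false positive when only the SSL variant of the flag is present.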
4.3 Main Nginx configuration

    cat > /usr/local/nginx/conf/nginx.conf << 'EOF'
    worker_processes 1;
    events {
        worker_connections 1024;
    }
    http {
        include mime.types;
        default_type application/octet-stream;
        log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';
        sendfile on;
        keepalive_timeout 65;
        autoindex on;
        gzip on;
        gzip_buffers 32 4K;
        gzip_comp_level 6;
        gzip_min_length 1k;
        gzip_types application/javascript text/css text/xml;
        gzip_disable "MSIE [1-6]\.";
        gzip_http_version 1.1;
        gzip_vary on;
        client_body_buffer_size 1024k;
        client_max_body_size 2048m;
        client_header_buffer_size 128k;
        large_client_header_buffers 4 128k;
        proxy_buffer_size 256k;
        proxy_buffering on;
        proxy_buffers 64 128k;
        proxy_busy_buffers_size 512k;
        include conf.d/*.conf;
    }
    stream {
        log_format proxy '[$time_local] $protocol status:$status, '
                         'client: $remote_addr, upstream:"$upstream_addr", '
                         'bytes: $bytes_sent/$bytes_received '
                         'time: $session_time/$upstream_connect_time';
        include stream/*.conf;
    }
    EOF
    mkdir -p /usr/local/nginx/conf/{conf.d,stream}

4.4 stream configuration

    cat > /usr/local/nginx/conf/stream/stream.conf << 'EOF'
    upstream k8sapiserver {
        hash $remote_addr consistent;
        server 192.168.139.133:6443 max_fails=3 fail_timeout=30s;
        server 192.168.139.134:6443 max_fails=3 fail_timeout=30s;
        server 192.168.139.135:6443 max_fails=3 fail_timeout=30s;
    }
    server {
        error_log logs/8443_error.log;
        access_log logs/8443_access.log proxy;
        listen 8443;
        proxy_connect_timeout 1s;
        proxy_pass k8sapiserver;
    }
    EOF

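Key parameters:
- hash $remote_addr consistent: the same client IP always maps to the same backend (session affinity).
- max_fails=3 fail_timeout=30s: after 3 failed attempts within 30 seconds the backend is marked unavailable, and it is retried after 30 seconds.
- proxy_connect_timeout 1s: if a backend does not accept the connection within 1 second, Nginx gives up on it and tries the next server in the upstream. This is failover between apiservers inside Nginx; keepalived is not involved.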
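The `max_fails` / `fail_timeout` checks are passive: Nginx only notices a bad backend when a real client connection to it fails. For a quick manual look at all three apiservers, a probe like the following can help. It is a sketch under assumptions: the function names are made up for illustration, and it assumes `/healthz` is readable without credentials, which is the Kubernetes default but may be disabled in a hardened cluster (in which case every backend would show as down).

```shell
#!/bin/sh
# backend_status HOST:PORT
# Prints "HOST:PORT up" if the apiserver answers /healthz with a 2xx
# status within the timeouts, otherwise "HOST:PORT down".
# -k skips certificate verification: this checks reachability only.
backend_status() {
    if curl -ksf --connect-timeout 1 --max-time 3 -o /dev/null "https://$1/healthz" 2>/dev/null; then
        echo "$1 up"
    else
        echo "$1 down"
    fi
}

# probe_backends HOST:PORT...
# Checks every backend passed on the command line, one result per line.
probe_backends() {
    for backend in "$@"; do
        backend_status "$backend"
    done
}

# Example with the three apiservers from this article:
#   probe_backends 192.168.139.133:6443 192.168.139.134:6443 192.168.139.135:6443
```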
5. systemd unit
Save the following as kube-nginx.service (for example /etc/systemd/system/kube-nginx.service); the name matches the systemctl commands below.

    [Unit]
    Description=kube-apiserver nginx proxy
    After=network.target network-online.target
    Wants=network-online.target
    [Service]
    Type=forking
    # Default pid file for --prefix=/usr/local/nginx; lets systemd track the master process
    PIDFile=/usr/local/nginx/logs/nginx.pid
    ExecStartPre=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf -p /usr/local/nginx -t
    ExecStart=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf -p /usr/local/nginx
    ExecReload=/bin/kill -s HUP $MAINPID
    PrivateTmp=true
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    LimitNOFILE=65536
    [Install]
    WantedBy=multi-user.target

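Start it:

    systemctl daemon-reload
    systemctl enable --now kube-nginx
    systemctl status kube-nginx
    # Any HTTPS response from the apiserver (version JSON, or a 401/403) shows the proxy path works
    curl -k https://192.168.139.133:8443/version
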
6. Keepalived configuration
6.1 Build Keepalived (master1 + master2)

    cd /k8s/softs
    tar xf keepalived-2.2.8.tar.gz && cd keepalived-2.2.8
    ./configure --prefix=/usr/local/keepalived
    make -j$(nproc) && make install
    mkdir -p /usr/local/keepalived/conf

6.2 Health check script

    cat > /usr/local/keepalived/conf/check_nginx.sh << 'EOF'
    #!/bin/bash
    # Count running nginx processes
    A=$(ps -C nginx --no-header | wc -l)
    if [ "$A" -eq 0 ]; then
        # nginx is down: try to restart it once
        /usr/bin/systemctl start kube-nginx
        sleep 2
        # Still down: stop keepalived so the VIP moves to another node
        if [ "$(ps -C nginx --no-header | wc -l)" -eq 0 ]; then
            /usr/bin/systemctl stop kube-keepalived
        fi
    fi
    EOF
    chmod +x /usr/local/keepalived/conf/check_nginx.sh

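Logic: nginx dies → try to restart it → if it will not start, stop keepalived so the VIP moves to another node. The script itself normally exits 0, so the failover is driven by stopping keepalived, not by the script's exit status.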
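An alternative, shown here only as a sketch and not as the script this article deploys, is to let the check report health through its exit status and leave the reaction to keepalived. The function name `nginx_listening` and the choice to probe the 8443 listener instead of counting processes are assumptions for illustration.

```shell
#!/bin/sh
# nginx_listening PORT
# Succeeds if something accepts TCP connections on 127.0.0.1:PORT.
# Uses bash's /dev/tcp pseudo-device, wrapped in `timeout` so a hung
# connect cannot stall the caller. The exit status is what a
# keepalived vrrp_script would see: 0 = healthy, non-zero = failed.
nginx_listening() {
    timeout 1 bash -c "exec 3<>/dev/tcp/127.0.0.1/$1" 2>/dev/null
}

# As a standalone check script, the last line would be:
#   nginx_listening 8443; exit $?
```

A port probe also catches the case where nginx processes exist but the stream listener is not accepting connections. How keepalived reacts to a non-zero exit depends on the `weight` setting of the `vrrp_script`; according to the keepalived.conf man page, a weight of 0 puts the instance into FAULT state on failure, which releases the VIP.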
6.3 master1 configuration (priority 100 = MASTER)

    cat > /usr/local/keepalived/conf/kube-keepalived.conf << 'EOF'
    global_defs {
        router_id LVS_DEVEL
    }
    vrrp_script check_nginx {
        script "/usr/local/keepalived/conf/check_nginx.sh"
        interval 5
        weight -5
        fall 2
        rise 1
    }
    vrrp_instance VI_1 {
        # nopreempt only takes effect when the initial state is BACKUP;
        # priority 100 still makes master1 win the first election
        state BACKUP
        interface ens33
        mcast_src_ip 192.168.139.133
        virtual_router_id 51
        priority 100
        nopreempt
        advert_int 2
        authentication {
            auth_type PASS
            auth_pass K8SHA_KA_AUTH
        }
        virtual_ipaddress {
            192.168.139.150
        }
        track_script {
            check_nginx
        }
    }
    EOF

6.4 master2 configuration (priority 90 = BACKUP)

    cat > /usr/local/keepalived/conf/kube-keepalived.conf << 'EOF'
    global_defs {
        router_id LVS_DEVEL
    }
    vrrp_script check_nginx {
        script "/usr/local/keepalived/conf/check_nginx.sh"
        interval 5
        weight -5
        fall 2
        rise 1
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface ens33
        mcast_src_ip 192.168.139.134
        virtual_router_id 51
        priority 90
        nopreempt
        advert_int 2
        authentication {
            auth_type PASS
            auth_pass K8SHA_KA_AUTH
        }
        virtual_ipaddress {
            192.168.139.150
        }
        track_script {
            check_nginx
        }
    }
    EOF

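Key parameters:
- virtual_router_id 51: must be identical on every node of this cluster, and must differ from any other VRRP group on the same network segment.
- priority 100 > 90: the larger number wins the election.
- nopreempt: a recovered higher-priority node does not take the VIP back, which avoids a second, unnecessary switchover. Keepalived only honors it when the instance's initial state is BACKUP.
- track_script: a failing check (non-zero exit) lowers the priority by 5 (weight -5). That alone cannot move the VIP here: 100 - 5 = 95 is still above 90, and check_nginx.sh normally exits 0 anyway. The VIP moves because the script stops keepalived.
- auth_pass K8SHA_KA_AUTH: plaintext VRRP authentication. Keepalived uses only the first 8 characters, so in production use a random 8-character string.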
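The arithmetic behind `weight -5` is easy to check by hand. The sketch below uses a hypothetical helper, `effective_priority`, that applies a negative `vrrp_script` weight the way keepalived documents it: the weight is added to the base priority only while the check is failing.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT CHECK_RESULT
# CHECK_RESULT is "ok" or "fail". A negative weight is applied only
# while the tracked script is failing.
effective_priority() {
    base=$1
    weight=$2
    result=$3
    if [ "$result" = "fail" ] && [ "$weight" -lt 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

# master1 with a failing check versus master2 with a passing one
m1=$(effective_priority 100 -5 fail)
m2=$(effective_priority 90 -5 ok)
echo "master1=$m1 master2=$m2"
if [ "$m1" -gt "$m2" ]; then
    echo "master1 still has the higher priority"
fi
```

With the values from this article the comparison is 95 against 90, so a weight-based failover would need a weight larger than the 10-point gap between the two nodes.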
6.5 systemd unit
Save the following as kube-keepalived.service; the name matches the systemctl commands below and the one used in check_nginx.sh.

    [Unit]
    Description=VRRP High Availability Monitor
    After=network-online.target syslog.target
    Wants=network-online.target
    Documentation=https://keepalived.org/manpage.html
    [Service]
    Type=forking
    KillMode=process
    ExecStart=/usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/conf/kube-keepalived.conf
    ExecReload=/bin/kill -HUP $MAINPID
    [Install]
    WantedBy=multi-user.target

Start it:

    systemctl daemon-reload
    systemctl enable --now kube-keepalived
    systemctl status kube-keepalived
    journalctl -f -u kube-keepalived

7. Verification

    # On master1, check whether the VIP is bound
    ip addr show ens33
    # inet 192.168.139.133/24 ...
    # inet 192.168.139.150/32 ... ← this line means the VIP is up
    # Check that the apiserver is reachable through the VIP
    curl -k https://192.168.139.150:8443/version
    # {
    # "major": "1",
    # "gitVersion": "v1.28.5",
    # ...
    # }

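When the same check has to run on several nodes, or from a monitoring script, reading the `ip addr` output by eye does not scale. A small helper can do the match instead. This is a sketch: `holds_vip` is a hypothetical name, and it reads `ip -4 -o addr show` output from stdin so it can be tested without the real interface.

```shell
#!/bin/sh
# holds_vip VIP
# Reads `ip -4 -o addr show` output on stdin and succeeds if VIP is
# assigned to an interface. Matching "inet <VIP>/" keeps
# 192.168.139.15 from matching 192.168.139.150.
holds_vip() {
    grep -qF -- "inet $1/"
}

# Typical use on a master:
#   ip -4 -o addr show dev ens33 | holds_vip 192.168.139.150 && echo "VIP is on this node"
```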
Configure kubectl to use the VIP. The apiserver certificate must list the VIP (192.168.139.150) among its SANs, otherwise kubectl rejects the connection with a certificate error.

    mkdir -p ~/.kube
    cat > ~/.kube/config << EOF
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        server: https://192.168.139.150:8443
        certificate-authority-data: <base64-encoded ca.pem>
      name: kubernetes
    contexts:
    - context:
        cluster: kubernetes
        user: kubernetes-admin
      name: default
    current-context: default
    users:
    - name: kubernetes-admin
      user:
        client-certificate-data: <base64-encoded admin.pem>
        client-key-data: <base64-encoded admin-key.pem>
    EOF
    kubectl get nodes

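The three `*-data` fields expect each PEM file as one unbroken base64 string. One way to produce them is sketched below; `b64_oneline` is a hypothetical helper name, and the file names are the ones used in the kubeconfig above.

```shell
#!/bin/sh
# b64_oneline FILE
# Prints FILE as a single base64 line. Stripping newlines makes the
# result the same whether or not the local base64 wraps its output.
b64_oneline() {
    base64 < "$1" | tr -d '\n'
}

# Example, run in the directory that holds the certificates:
#   b64_oneline ca.pem          # value for certificate-authority-data
#   b64_oneline admin.pem       # value for client-certificate-data
#   b64_oneline admin-key.pem   # value for client-key-data
```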
8. Failure drills
8.1 Simulate a full master1 outage

    # On master1
    shutdown -h now

Watch master2:

    # On master2
    ip addr show ens33
    # inet 192.168.139.134/24 ...
    # inet 192.168.139.150/32 ... ← the VIP has moved to master2
    # Check that the VIP still answers
    curl -k https://192.168.139.150:8443/version

Expect the VIP to move to master2 within 5 to 10 seconds. kubectl commands issued after that work without any configuration change.
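The lower end of that range can be estimated from the VRRP timers. The sketch below assumes keepalived's default VRRP version 2 and a master that disappears without sending a final advertisement; under RFC 3768 a backup then waits for the Master_Down_Interval before taking over. The function name is made up for illustration.

```shell
#!/bin/sh
# master_down_interval ADVERT_INT PRIORITY
# VRRPv2 (RFC 3768):
#   Skew_Time            = (256 - Priority) / 256 seconds
#   Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time
master_down_interval() {
    awk -v adv="$1" -v prio="$2" 'BEGIN { printf "%.2f\n", 3 * adv + (256 - prio) / 256 }'
}

# master2 in this article: advert_int 2, priority 90
master_down_interval 2 90   # prints 6.65
```

So master2 should claim the VIP roughly 6.65 seconds after master1's last advertisement; the time for neighbors to learn the VIP's new MAC address comes on top of that.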
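8.2 Simulate the nginx process dying on master1

    # On master1
    pkill nginx

What to watch for:
- check_nginx.sh detects that nginx is dead and tries `systemctl start kube-nginx`
- If nginx comes back: keepalived keeps running and the VIP stays put
- If it does not (unlikely): the script runs `systemctl stop kube-keepalived`, which forces the VIP to move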
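8.3 Recover master1

    # On master1
    systemctl start kube-apiserver kube-nginx kube-keepalived

Because of nopreempt, master1 does not take the VIP back (it stays on master2). This is intentional: it avoids a second switchover and the connection drops that come with it.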
9. Production hardening
Replace mcast_src_ip with unicast (cloud environments usually block VRRP multicast):

    vrrp_instance VI_1 {
        ...
        unicast_src_ip 192.168.139.133
        unicast_peer {
            192.168.139.134
        }
    }

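Each node's unicast settings differ only in which address is "self" and which are peers, so the fragment can be generated instead of hand-edited. The following is a sketch with a hypothetical helper, `unicast_block`; it prints the lines to paste into `vrrp_instance` and does not touch any file.

```shell
#!/bin/sh
# unicast_block SELF_IP NODE_IP...
# Prints the unicast_src_ip / unicast_peer lines for SELF_IP, listing
# every other node in NODE_IP... as a peer.
unicast_block() {
    self=$1
    shift
    echo "    unicast_src_ip $self"
    echo "    unicast_peer {"
    for node in "$@"; do
        [ "$node" = "$self" ] || echo "        $node"
    done
    echo "    }"
}

# Fragment for master2 in a 3-node keepalived setup
unicast_block 192.168.139.134 192.168.139.133 192.168.139.134 192.168.139.135
```

Passing the full node list and filtering out the local address means the same command line can be reused on every node, changing only the first argument.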
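Use a random authentication password. Keepalived only uses the first 8 characters of auth_pass, so 8 random characters are enough:

    openssl rand -hex 4

Run keepalived on 3 nodes rather than 2: with 2 nodes the VIP survives one node failure, with 3 it survives two. VRRP has no quorum, so the extra node adds redundancy but does not by itself prevent split brain.
Monitor the VIP: add 192.168.139.150 to the Prometheus blackbox_exporter probes.
DNS: point kubernetes-api.example.com at the VIP and have kubectl use the domain name instead of the IP.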
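10. Common pitfalls
- Mismatched virtual_router_id: the nodes do not join the same VRRP group, so the election breaks
- Multicast blocked by a firewall: VRRP is IP protocol 112, and a restrictive iptables or firewalld policy drops it unless it is explicitly allowed
- nopreempt missing: master1 takes the VIP back when it recovers, causing another brief interruption
- check_nginx.sh not executable: keepalived cannot run the script, so the VIP does not move when nginx dies
- VIP bound to the wrong interface: if `interface ens33` names the wrong NIC, the VIP never comes up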
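For the firewall pitfall, the rules that let VRRP through look roughly like the ones printed below. This is a dry-run sketch: `vrrp_allow_rules` is a hypothetical helper that only prints iptables commands for review, and it assumes the multicast setup from section 6 (with unicast, the destination would be the peer address rather than 224.0.0.18).

```shell
#!/bin/sh
# vrrp_allow_rules IFACE
# Prints (does not run) iptables commands that accept VRRP on IFACE:
# IP protocol 112, addressed to the VRRP multicast group 224.0.0.18.
vrrp_allow_rules() {
    echo "iptables -I INPUT -i $1 -p 112 -d 224.0.0.18 -j ACCEPT"
    echo "iptables -I OUTPUT -o $1 -p 112 -d 224.0.0.18 -j ACCEPT"
}

vrrp_allow_rules ens33
```

After reviewing the output, the commands can be run as root on each keepalived node.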
11. Prerequisites / next steps
Next steps:
- K8s cluster add-ons (2021-09-15): CNI / CoreDNS / Metrics / Dashboard
- K8s cluster management (2021-12-15): upgrades, node isolation, troubleshooting
References