High-Availability Load Balancing: Building Nginx + Keepalived and VIP Failover

A layer-4 Nginx stream reverse proxy in front of kube-apiserver, plus Keepalived failover for two VIPs, compiled from source on CentOS 7 and Ubuntu 22.04

Why Nginx + Keepalived

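In a K8s 1.28 cluster, kube-apiserver listens on port 6443 on each of the 3 masters, but:

  1. kubelets on worker nodes should not hardcode the IP of a single master; they need one virtual entry point
  2. kubectl clients also want to configure a single server address
  3. load has to be balanced across the masters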

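A common solution is an Nginx layer-4 stream reverse proxy combined with Keepalived VIP failover:

  • Nginx uses a stream block for layer-4 TCP proxying (not layer-7 HTTP) and distributes connections arriving on port 10443 to port 6443 on the 3 masters
  • Keepalived uses VRRP to move the VIP between nodes (nopreempt must be set); at any time only one node holds the MASTER role and the others are BACKUP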

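Applies to: nginx 1.24.0 / keepalived 2.2.8 / Ubuntu 22.04
Deployed on: master1 and master2 (only these 2 nodes are required; master3 is optional, but if it is installed its stream configuration must be identical)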


1. Compiling and Installing Nginx

1.1 Build dependencies

apt install -y make g++ openssl libssl-dev libpcre3 libpcre3-dev zlib1g-dev libgd-dev

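1.2 Configure options

Key flags: --with-stream (layer-4 proxying), --with-stream_ssl_module (TLS termination and TLS to upstreams inside the stream block; plain TCP passthrough of the apiserver's TLS traffic works with --with-stream alone), --with-http_ssl_module (layer-7 HTTPS).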

cd /data/softs
tar xf nginx-1.24.0.tar.gz && cd nginx-1.24.0

./configure \
  --prefix=/usr/local/nginx \
  --with-http_ssl_module \
  --with-http_stub_status_module \
  --with-pcre \
  --with-http_gzip_static_module \
  --with-stream \
  --with-stream_ssl_module

make && make install
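
After the build, it can be worth confirming that the installed binary really contains the stream modules, since nginx refuses to load a stream block without them. The sketch below relies on `nginx -V`, which prints the configure arguments to stderr; `has_flag` and `check_nginx_build` are hypothetical helper names, not part of nginx.

```shell
#!/usr/bin/env bash
# Sketch: check an nginx binary for required configure flags.
# has_flag and check_nginx_build are illustrative helpers (assumptions).

# has_flag FLAG TEXT: succeed if TEXT contains FLAG as a whole word.
has_flag() {
  local flag="$1" text
  # Flatten newlines so the flag list can be matched as space-separated words.
  text="$(printf '%s' "$2" | tr '\n' ' ')"
  case " $text " in
    *" $flag "*) return 0 ;;
    *)           return 1 ;;
  esac
}

# check_nginx_build [BINARY]: report each required flag missing from `nginx -V`.
check_nginx_build() {
  local bin="${1:-/usr/local/nginx/sbin/nginx}" out flag rc=0
  out="$("$bin" -V 2>&1)"   # configure arguments are written to stderr
  for flag in --with-stream --with-stream_ssl_module; do
    if ! has_flag "$flag" "$out"; then
      echo "missing: $flag" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

On a real host, `check_nginx_build` with no argument inspects /usr/local/nginx/sbin/nginx. Whole-word matching matters here because --with-stream is a prefix of --with-stream_ssl_module.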

To also support --with-stream_ssl_preread_module (routing by SNI), rebuild with the extra flag and replace the binary:

./configure \
  --prefix=/usr/local/nginx \
  --with-http_ssl_module \
  --with-http_stub_status_module \
  --with-pcre \
  --with-http_gzip_static_module \
  --with-stream \
  --with-stream_ssl_module \
  --with-stream_ssl_preread_module

make
# Back up the old binary
mv /usr/local/nginx/sbin/nginx /usr/local/nginx/sbin/nginx_bak
# Install the new one
cp objs/nginx /usr/local/nginx/sbin/

1.3 Fixing libpcre.so.1 not found

# Check shared-library dependencies
ldd /usr/local/nginx/sbin/nginx
# libpcre.so.1 => not found
# Fix: compile with --with-pcre=../pcre-xxx (path to the PCRE source), or install libpcre3-dev
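
When a binary has many dependencies, the unresolved ones are easy to miss in the full `ldd` listing. A minimal sketch that filters them out, assuming the usual `name => not found` format of ldd output; `missing_libs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# missing_libs: read `ldd` output on stdin and print the name of every
# library that the dynamic linker could not resolve.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage on a real host:
#   ldd /usr/local/nginx/sbin/nginx | missing_libs
```

An empty result means every shared library was resolved.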

2. Nginx Configuration

2.1 Main configuration nginx.conf

worker_processes  1;
events {
    worker_connections  1024;
}
http {
    include       mime.types;
    default_type  application/octet-stream;
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    sendfile        on;
    keepalive_timeout  65;
    autoindex       on;
    gzip            on;
    gzip_buffers    32 4K;
    gzip_comp_level 6;
    gzip_min_length 1k;
    gzip_types      application/javascript text/css text/xml;
    gzip_disable    "MSIE [1-6]\.";
    gzip_http_version 1.1;
    gzip_vary       on;

    client_body_buffer_size    1024k;
    client_max_body_size       2048m;
    client_header_buffer_size  128k;
    large_client_header_buffers 4 128k;

    proxy_buffer_size    256k;
    proxy_buffering       on;
    proxy_buffers        64 128k;
    proxy_busy_buffers_size 512k;

    fastcgi_buffers              6 512k;
    fastcgi_buffer_size          512k;
    fastcgi_busy_buffers_size    512k;
    fastcgi_temp_file_write_size 512k;
    fastcgi_intercept_errors     on;

    include conf.d/*.conf;
}
stream {
    log_format proxy '[$time_local] $protocol status:$status, bytes client:$bytes_sent/$bytes_received $session_time, '
                 'client: $remote_addr, upstream:"$upstream_addr", '
                 'bytes upstream:$upstream_bytes_sent $upstream_bytes_received $upstream_connect_time';
    include stream/*.conf;
}

2.2 Stream proxy for kube-apiserver

stream/stream.conf

upstream k8sapiserver {
    hash $remote_addr consistent;
    server <master1-ip>:6443 max_fails=3 fail_timeout=30s;
    server <master2-ip>:6443 max_fails=3 fail_timeout=30s;
    server <master3-ip>:6443 max_fails=3 fail_timeout=30s;
}
server {
    error_log  logs/10443_error.log;
    access_log logs/10443_access.log proxy;
    listen 10443;
    proxy_connect_timeout 1s;
    proxy_pass k8sapiserver;
}
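
Because the same stream configuration has to exist on every proxy node, one option is to generate it from a list of master IPs instead of editing it by hand. The following is a sketch only: `render_stream_conf` is a hypothetical helper that reproduces the file above, and the IPs in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# render_stream_conf PORT IP...: print a stream config like the one above
# for the given listen port and kube-apiserver addresses.
render_stream_conf() {
  local port="$1" ip
  shift
  printf 'upstream k8sapiserver {\n'
  printf '    hash $remote_addr consistent;\n'
  for ip in "$@"; do
    printf '    server %s:6443 max_fails=3 fail_timeout=30s;\n' "$ip"
  done
  printf '}\n'
  printf 'server {\n'
  printf '    error_log  logs/%s_error.log;\n' "$port"
  printf '    access_log logs/%s_access.log proxy;\n' "$port"
  printf '    listen %s;\n' "$port"
  printf '    proxy_connect_timeout 1s;\n'
  printf '    proxy_pass k8sapiserver;\n'
  printf '}\n'
}

# Example (placeholder IPs):
#   render_stream_conf 10443 192.0.2.11 192.0.2.12 192.0.2.13 \
#     > /usr/local/nginx/conf/stream/stream.conf
```

Running `nginx -t` after writing the file catches syntax errors before a reload.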

2.3 Test page conf.d/default.conf

server {
    listen       10088;
    server_name  localhost;
    error_log    logs/10088_error.log;
    access_log   logs/10088_access.log;

    location / {
        root   html;
        index  index.html index.htm;
    }
    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root html;
    }
}

2.4 systemd unit

[Unit]
Description=kube-apiserver nginx proxy
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=forking
ExecStartPre=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf -p /usr/local/nginx -t
ExecStart=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf -p /usr/local/nginx
ExecReload=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf -p /usr/local/nginx -s reload
PrivateTmp=true
Restart=always
RestartSec=5
StartLimitInterval=0
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Start:

systemctl daemon-reload
systemctl enable --now nginx.service
systemctl status nginx.service
journalctl -f -u nginx.service

# Verify (edit the page so the two machines can be told apart)
sed -i "s#Welcome to nginx#Welcome to nginx master1#g" /usr/local/nginx/html/index.html
curl <master1-ip>:10088
curl <master2-ip>:10088
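
Besides fetching the test page, it can help to confirm that both listeners (10443 for the stream proxy, 10088 for the test page) are actually bound. A small sketch that parses `ss -H -lnt` output, assuming the local address is the fourth column; `is_listening` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# is_listening PORT: read `ss -H -lnt` output on stdin; succeed if any
# listening TCP socket has a local address ending in :PORT.
is_listening() {
  awk -v port="$1" '
    { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Usage on a real host:
#   ss -H -lnt | is_listening 10443 && echo "stream proxy is listening"
#   ss -H -lnt | is_listening 10088 && echo "test page is listening"
```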

3. Building and Configuring Keepalived

3.1 Build and install

cd /data/softs
tar xf keepalived-2.2.8.tar.gz && cd keepalived-2.2.8

./configure --prefix=/usr/local/keepalived
make && make install
cd .. && rm -rf keepalived-2.2.8

mkdir -p /usr/local/keepalived/conf

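3.2 Health check script

check_nginx.sh runs every 2 seconds to check nginx. If nginx is down, it first tries to start it again; if that fails, it stops keepalived so that the VIP moves to another node.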

cat << "EOF" > /usr/local/keepalived/conf/check_nginx.sh
#!/bin/bash
# Probe the local nginx test page; try to start nginx once if the probe fails.
url="http://127.0.0.1:10088"
code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
if [ "$code" -ne 200 ]; then
  /usr/bin/systemctl start nginx
  sleep 2
  code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
  if [ "$code" -ne 200 ]; then
    # Still failing: stop keepalived so this node releases the VIP.
    /usr/bin/systemctl stop keepalived
  fi
fi
EOF

chmod +x /usr/local/keepalived/conf/check_nginx.sh

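Required: the script must decide on an HTTP 200 response, not on ps -C nginx. With multiple worker processes ps is unreliable, and a process that exists is not necessarily serving requests.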

3.3 keepalived.conf (master1)

global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_script check_nginx {
    script "/usr/local/keepalived/conf/check_nginx.sh"
    interval 2
    weight -2
}
vrrp_instance VI_1 {
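    state BACKUP                # nopreempt only takes effect when the initial state is BACKUP; priority decides the first MASTER
    interface eno1              # change to the NIC name on your machine
    mcast_src_ip <master1-ip>
    virtual_router_id 51
    priority 100
    advert_int 2
    nopreempt                   # key setting: no preemption, so a recovered node does not take the VIP back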
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        <vip-internal>
        <vip-public>
    }
    track_script {
        check_nginx
    }
}

3.4 keepalived.conf (master2)

# Abbreviated: expand each block one keyword per line, as in the master1 file
global_defs { router_id LVS_DEVEL ... }
vrrp_script check_nginx { script ".../check_nginx.sh"; interval 2; weight -2; }
vrrp_instance VI_1 {
    state BACKUP                # BACKUP role
    interface eno1
    mcast_src_ip <master2-ip>
    virtual_router_id 51        # must match master1
    priority 80                 # lower than master1
    advert_int 2
    nopreempt
    authentication { auth_type PASS; auth_pass K8SHA_KA_AUTH; }
    virtual_ipaddress { <vip-internal> <vip-public> }
    track_script { check_nginx; }
}
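
How priority and weight interact in the two files above can be sketched with a little arithmetic. This assumes keepalived's documented behaviour for a negative vrrp_script weight: while the tracked script fails, the node's priority is lowered by that amount. `effective_priority` is a hypothetical helper for illustration, not a keepalived command, and it does not model positive weights.

```shell
#!/usr/bin/env bash
# effective_priority BASE WEIGHT RC
#   BASE   configured priority
#   WEIGHT vrrp_script weight (only negative weights are modelled here)
#   RC     exit code of the tracked script (0 = healthy)
# Prints the priority the node would advertise.
effective_priority() {
  local base="$1" weight="$2" rc="$3"
  if [ "$rc" -ne 0 ] && [ "$weight" -lt 0 ]; then
    echo $(( base + weight ))
  else
    echo "$base"
  fi
}

# Values from the master1 and master2 files above:
#   effective_priority 100 -2 1   # master1 with a failing check
#   effective_priority 80  -2 0   # master2 with a passing check
```

With these values a failing check would leave master1 at 98, still above master2's 80, so the weight alone would not move the VIP. Stopping keepalived from check_nginx.sh is what actually releases it.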

3.5 keepalived.conf (master3, optional)

state BACKUP
priority 60
virtual_router_id 51
... (the rest is the same as master2)

3.6 systemd unit

[Unit]
Description=VRRP High Availability Monitor
After=network-online.target syslog.target
Wants=network-online.target
Documentation=https://keepalived.org/manpage.html

[Service]
Type=forking
KillMode=process
ExecStart=/usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/conf/keepalived.conf
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Start:

systemctl daemon-reload
systemctl enable --now keepalived
systemctl restart keepalived
systemctl status keepalived
journalctl -f -u keepalived

Package the installed keepalived for use on the other nodes:

cd /usr/local
tar -zcvf keepalived.tar.gz keepalived
scp keepalived.tar.gz master2:/usr/local/

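4. Cross-Data-Center VIP (Unicast Mode)

If the two data centers are connected over L3 routing (not the same L2 segment), VRRP multicast is dropped, so switch to unicast mode: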

vrrp_instance VI_1 {
    state MASTER
    interface eno1
    virtual_router_id 51
    priority 100
    advert_int 1
    unicast_src_ip <this-node-ip>
    unicast_peer {
       <peer-node-ip>          # IP of the peer in the other data center
    }
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        <vip-cross-dc>
    }
}

unicast_src_ip + unicast_peer name the peers explicitly, which bypasses VRRP multicast.
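
Each node's unicast settings differ only in which address counts as "self", so the fragment can be generated from one shared node list. A minimal sketch, assuming the indentation used in the file above; `render_unicast` is a hypothetical helper and the IPs in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# render_unicast SELF_IP NODE_IP...: print the unicast_src_ip / unicast_peer
# lines for the node SELF_IP, listing every other node as a peer.
render_unicast() {
  local src="$1" peer
  shift
  printf '    unicast_src_ip %s\n' "$src"
  printf '    unicast_peer {\n'
  for peer in "$@"; do
    # A node never lists itself as a peer.
    if [ "$peer" != "$src" ]; then
      printf '        %s\n' "$peer"
    fi
  done
  printf '    }\n'
}

# Example with placeholder addresses, run once per node:
#   render_unicast 192.0.2.11 192.0.2.11 198.51.100.11
```

Passing the full node list every time keeps the call identical on all nodes except for the first argument.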


5. Verification

# Check the VIP
curl <vip-internal>:10088

# Stop keepalived on master1 to simulate a failure
ssh master1 "systemctl stop keepalived"
# The VIP should move to master2 within 1-3 seconds

# Start keepalived on master1 again
ssh master1 "systemctl start keepalived"
# Because of nopreempt, the VIP does not return to master1; it stays on master2
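
To see which node currently holds the VIP during this test, one option is to look for the address in `ip -o -4 addr show` on each node. The sketch below parses that one-line-per-address output; `holds_vip` is a hypothetical helper name, and the address in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# holds_vip VIP: read `ip -o -4 addr show` output on stdin; succeed if VIP
# is assigned to any interface (the prefix length is ignored).
holds_vip() {
  awk -v vip="$1" '
    {
      for (i = 1; i < NF; i++) {
        if ($i == "inet") {
          split($(i + 1), addr, "/")
          if (addr[1] == vip) found = 1
        }
      }
    }
    END { exit found ? 0 : 1 }
  '
}

# Usage (placeholder VIP):
#   for node in master1 master2; do
#     ssh "$node" "ip -o -4 addr show" | holds_vip 192.0.2.100 && echo "$node holds the VIP"
#   done
```

Exactly one node should report the VIP; none or more than one is worth investigating.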

6. Common Issues

6.1 Cannot ping the VIP

The ARP cache has not been refreshed. On the client:

# Flush the ARP cache
arp -n | awk '/^[1-9]/{print "arp -d " $1}' | sh

Or wait for the stale entry to age out (roughly 30 seconds by default).

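6.2 Cannot ping from a different subnet

A firewall or routing restriction. Check iptables -L -n and ip route.

6.3 nginx is not started automatically after a VIP failover

The health check script did not succeed. Run it manually first: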

bash /usr/local/keepalived/conf/check_nginx.sh

6.4 Adding master3 to VIP failover

Install nginx + keepalived on master3 as well, and configure state BACKUP + priority 60.


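7. Summary

Nginx layer-4 stream + Keepalived VIP is a stable, widely used apiserver HA setup for K8s 1.20+. Key points:

  1. nginx must be compiled with --with-stream
  2. keepalived must use nopreempt (otherwise packets are dropped while master and backup switch back and forth)
  3. the health check relies on HTTP 200, not ps
  4. across data centers, use unicast_src_ip + unicast_peer instead of multicast

Next: binary deployment of an etcd 3.5 cluster and TLS certificate issuance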
