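Container Runtime: Binary Deployment of Docker + cri-dockerd in Practice

K8s 1.24+ no longer ships dockershim: offline cri-dockerd deployment, containerd unit file, and registry mirrors

Why cri-dockerd is needed

Kubernetes 1.24 (May 2022) removed the built-in dockershim. A 1.24+ cluster that still wants Docker as its container runtime therefore has to install cri-dockerd, an adapter that translates Kubernetes CRI (Container Runtime Interface) requests into Docker API calls.

Versions covered: Kubernetes 1.28.5 / Docker 24.0.7 / cri-dockerd 0.3.9 / Ubuntu 22.04
Deploy on: all master and worker nodes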


1. Docker offline installation

1.1 Download the package

# Official download page
https://download.docker.com/linux/static/stable/x86_64/

# Specific version
https://download.docker.com/linux/static/stable/x86_64/docker-24.0.7.tgz
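On a machine that does have internet access, the download URL can be assembled from the version and architecture. The helper below is only a minimal sketch: `docker_tarball_url` is a made-up name, and the assumption is that other architectures (such as `aarch64`) follow the same directory layout as `x86_64`.

```shell
# Build the static-binary download URL for a Docker version.
# $1 = version (e.g. 24.0.7), $2 = architecture directory (defaults to x86_64).
# Hypothetical helper, not part of any Docker tooling.
docker_tarball_url() {
  local version="$1" arch="${2:-x86_64}"
  printf 'https://download.docker.com/linux/static/stable/%s/docker-%s.tgz\n' "$arch" "$version"
}
```

For example, `curl -fLO "$(docker_tarball_url 24.0.7)"` fetches the tarball used in this post, which can then be copied to /data/softs on each node.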

1.2 Extract on every node

cd /data/softs
tar xf docker-*.tgz

# The extracted directory contains these binaries
ls docker
# containerd  containerd-shim-runc-v2  ctr  docker  dockerd  docker-init  docker-proxy  runc

cp docker/* /usr/local/bin/
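After copying, it is worth confirming that every binary actually landed and is executable. A small sketch, assuming the binary list shown in the `ls docker` output; `missing_docker_binaries` is a hypothetical helper name:

```shell
# Print the name of each expected Docker binary that is missing
# (or not executable) in the given directory. Prints nothing if all are present.
# $1 = directory to check (defaults to /usr/local/bin).
missing_docker_binaries() {
  local dir="${1:-/usr/local/bin}" bin
  for bin in containerd containerd-shim-runc-v2 ctr docker dockerd docker-init docker-proxy runc; do
    [ -x "$dir/$bin" ] || echo "$bin"
  done
}
```

Running `missing_docker_binaries` with no arguments checks /usr/local/bin; empty output means the copy is complete.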

1.3 containerd unit file

cat << "EOF" > /etc/systemd/system/containerd.service
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=1048576
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now containerd.service
systemctl status containerd.service

1.4 docker.service unit

cat << "EOF" > /etc/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket containerd.service

[Service]
Type=notify
ExecStart=/usr/local/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
OOMScoreAdjust=-500

[Install]
WantedBy=multi-user.target
EOF

1.5 docker.socket unit

cat << "EOF" > /etc/systemd/system/docker.socket
[Unit]
Description=Docker Socket for the API

[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
EOF

1.6 daemon.json configuration

Registry mirrors, container log rotation, and the systemd cgroup driver:

mkdir -p /etc/docker

cat << "EOF" > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "50m", "max-file": "1" },
  "registry-mirrors": [
    "https://<your-mirror>.mirror.aliyuncs.com",
    "https://docker.m.daocloud.io",
    "https://hub-mirror.c.163.com",
    "https://mirror.baidubce.com",
    "https://docker.nju.edu.cn",
    "https://docker.mirrors.sjtug.sjtu.edu.cn",
    "https://dockerproxy.com"
  ],
  "insecure-registries": ["<private-registry>:13001"]
}
EOF

Key point: exec-opts must be set to native.cgroupdriver=systemd. Otherwise the kubelet and Docker end up with different cgroup drivers, and the node goes NotReady.
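Before starting Docker, the configured driver can be read back from the file as a quick sanity check. This is a rough sketch that relies on a plain `sed` pattern match rather than a real JSON parser; `daemon_json_cgroup_driver` is a name invented here.

```shell
# Print the cgroup driver configured via "native.cgroupdriver=..." in daemon.json.
# Prints nothing if the option is absent.
# $1 = path to daemon.json (defaults to /etc/docker/daemon.json).
# Note: simple text match, not a JSON parser; it does not validate the file.
daemon_json_cgroup_driver() {
  sed -n 's/.*native\.cgroupdriver=\([a-z]*\).*/\1/p' "${1:-/etc/docker/daemon.json}"
}
```

Empty output means the exec-opts entry is missing and Docker would fall back to its default driver.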

1.7 Start Docker

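# docker.socket sets SocketGroup=docker, so the group must exist; -f tolerates an existing group
groupadd -f docker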
systemctl daemon-reload
systemctl enable --now docker.socket
systemctl enable --now docker.service
docker info

2. cri-dockerd deployment

2.1 Download and install

https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.9/cri-dockerd-0.3.9.amd64.tgz

cd /data/softs
tar xf cri-dockerd-*.amd64.tgz
cp cri-dockerd/cri-dockerd /usr/local/bin/

2.2 cri-docker.service unit

cat << "EOF" > /etc/systemd/system/cri-docker.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket

[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=<private-registry>:13001/base/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

2.3 cri-docker.socket unit

cat << "EOF" > /etc/systemd/system/cri-docker.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service

[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
EOF

2.4 Start cri-dockerd

systemctl daemon-reload
systemctl enable --now cri-docker.socket
systemctl enable --now cri-docker.service
systemctl restart cri-docker.service
systemctl status cri-docker.service
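For a system unit, `%t` in cri-docker.socket expands to /run, so the CRI endpoint ends up at /run/cri-dockerd.sock, and that is the path the kubelet's container runtime endpoint has to point at (unix:///run/cri-dockerd.sock). A provisioning script can wait for the socket before starting the kubelet. This is a minimal sketch; `wait_for_socket` and the 30-second default are assumptions made for illustration.

```shell
# Wait until a Unix socket exists at the given path.
# $1 = socket path, $2 = timeout in seconds (defaults to 30, an arbitrary choice).
# Returns 0 as soon as the socket appears, 1 if the timeout runs out.
wait_for_socket() {
  local sock="$1" remaining="${2:-30}"
  while [ "$remaining" -gt 0 ]; do
    [ -S "$sock" ] && return 0
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}
```

A typical call would be `wait_for_socket /run/cri-dockerd.sock || echo "cri-dockerd socket did not appear"`.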

2.5 Verification

systemctl status containerd.service docker.socket docker.service cri-docker.socket cri-docker.service
journalctl -f -u containerd.service
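The same check can be scripted so that it reports exactly which unit is down. A sketch built on `systemctl is-active --quiet`; the function name `inactive_units` and the `IS_ACTIVE_CMD` override (handy for dry runs) are not part of systemd and are introduced here only for illustration.

```shell
# Print every unit from the argument list that is not currently active.
# IS_ACTIVE_CMD is a hypothetical override hook; by default the check
# is done with "systemctl is-active --quiet <unit>".
inactive_units() {
  local unit
  for unit in "$@"; do
    ${IS_ACTIVE_CMD:-systemctl is-active --quiet} "$unit" || echo "$unit"
  done
}
```

Calling `inactive_units containerd.service docker.socket docker.service cri-docker.socket cri-docker.service` prints nothing when all five units are active.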

3. Common problems

3.1 containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT

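The container runtime's configuration or on-disk state is corrupted. Clear it and restart the units; be aware that removing /var/lib/containerd discards everything containerd has stored on that node: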

rm -rf /var/lib/containerd
systemctl restart containerd

rm -rf /var/run/docker.sock
systemctl restart docker.socket

rm -rf /run/cri-dockerd.sock
systemctl restart cri-docker.socket

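3.2 Node NotReady with RuntimeReady=False

The kubelet reports that it failed to validate the CRI v1 runtime API. The cause is the start order of cri-dockerd: it came up before the services it depends on. Make sure cri-docker.service declares After=docker.service in its [Unit] section (the unit in 2.2 does), then run systemctl daemon-reload and restart cri-docker.service.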
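If the unit file comes from somewhere else and lacks that ordering, a drop-in can add it without editing the original. A sketch, assuming the standard systemd drop-in directory layout; the file name 10-after-docker.conf and the helper name are arbitrary choices.

```shell
# Write a systemd drop-in that orders cri-docker.service after docker.service.
# $1 = drop-in directory (defaults to the standard location for this unit).
# Hypothetical helper; the file name 10-after-docker.conf is an arbitrary choice.
write_cri_docker_ordering_dropin() {
  local dropin_dir="${1:-/etc/systemd/system/cri-docker.service.d}"
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/10-after-docker.conf" << 'EOF'
[Unit]
After=docker.service
EOF
}
```

Follow it with `systemctl daemon-reload` and `systemctl restart cri-docker.service` so the new ordering takes effect.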

3.3 cgroup driver mismatch

# -i is needed: docker info prints "Cgroup Driver" with a capital C
docker info | grep -i 'cgroup driver'
# Cgroup Driver: systemd

# The kubelet must use systemd as well
grep cgroupDriver /etc/kubernetes/kubelet-conf.yml
# cgroupDriver: systemd
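The comparison itself is easy to automate. The function below is an illustrative sketch (`cgroup_drivers_match` is a hypothetical name) that takes the two driver names as arguments, so it stays independent of how they are obtained.

```shell
# Compare the cgroup driver reported by Docker with the one configured for the kubelet.
# $1 = Docker's driver, $2 = the kubelet's driver.
# Prints a one-line verdict; returns 0 on match, 1 on mismatch.
cgroup_drivers_match() {
  local docker_driver="$1" kubelet_driver="$2"
  if [ "$docker_driver" = "$kubelet_driver" ]; then
    echo "OK: both use $docker_driver"
  else
    echo "MISMATCH: docker=$docker_driver kubelet=$kubelet_driver"
    return 1
  fi
}
```

On a node, the arguments could come from `docker info --format '{{.CgroupDriver}}'` and `awk '/cgroupDriver/ {print $2}' /etc/kubernetes/kubelet-conf.yml`, assuming the kubelet config lives at the path used in this post.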

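4. Summary

The container runtime is the foundation of every node: the kubelet talks to it over the CRI socket to start each Pod, so no Pod can run on a node whose runtime is broken. The most common pitfalls:

  1. cri-dockerd is not installed, or the pause:3.9 image cannot be pulled
  2. The cgroup drivers do not match
  3. systemd-resolved is holding port 53 (disabled in the prerequisite post, but easy to forget when adding a new node)

Next: High-availability load balancing: building Nginx + Keepalived and VIP failover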
