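Why cri-dockerd is needed
Kubernetes 1.24 (2022-05) removed the built-in dockershim. A 1.24+ cluster that still uses Docker as its container runtime therefore has to install cri-dockerd separately. It is an adapter that translates Kubernetes CRI (Container Runtime Interface) requests into Docker API calls.
Versions covered: Kubernetes 1.28.5 / Docker 24.0.7 / cri-dockerd 0.3.9 / Ubuntu 22.04
Deploy on: all master and worker nodes
1. Offline Docker installation
1.1 Download the package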
# Official download page
https://download.docker.com/linux/static/stable/x86_64/
# Specific version
https://download.docker.com/linux/static/stable/x86_64/docker-24.0.7.tgz
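For an offline install, the archive is normally fetched on a machine with internet access and then copied to the nodes. A minimal sketch, assuming `curl` is available there; the function names are illustrative, and `/data/softs` is simply the directory the later steps use:

```shell
# Build the static-bundle URL for a given Docker version and architecture.
# (Hypothetical helper; only the URL pattern comes from the download page above.)
docker_static_url() {
  local version="$1" arch="${2:-x86_64}"
  printf 'https://download.docker.com/linux/static/stable/%s/docker-%s.tgz\n' "$arch" "$version"
}

# Download the bundle into a staging directory (default: /data/softs).
fetch_docker_static() {
  local version="$1" dest="${2:-/data/softs}"
  mkdir -p "$dest" || return 1
  # -f: fail on HTTP errors, -L: follow redirects, -o: output file
  curl -fL -o "$dest/docker-$version.tgz" "$(docker_static_url "$version")"
}

# Example:
#   fetch_docker_static 24.0.7
```

Keeping the version as a parameter makes it easier to stage a different release later without editing URLs by hand.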
1.2 Extract on every node
cd /data/softs
tar xf docker-*.tgz
# The extracted directory contains these binaries
ls docker
# containerd containerd-shim-runc-v2 ctr docker dockerd docker-init docker-proxy runc
cp docker/* /usr/local/bin/
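A plain `cp` keeps whatever permissions the archive carried. As an alternative sketch, `install -m 0755` sets the executable mode explicitly; the function name here is hypothetical and the paths in the example are the ones used in this step:

```shell
# Copy every regular file from an extracted bundle into a bin directory,
# forcing mode 0755 so the binaries are executable.
install_binaries() {
  local src_dir="$1" dest_dir="$2" f
  mkdir -p "$dest_dir" || return 1
  for f in "$src_dir"/*; do
    [ -f "$f" ] || continue          # skip directories and unmatched globs
    install -m 0755 "$f" "$dest_dir/" || return 1
  done
}

# Example:
#   install_binaries /data/softs/docker /usr/local/bin
```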
1.3 containerd unit file
cat << "EOF" > /etc/systemd/system/containerd.service
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=1048576
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
systemctl enable --now containerd.service
systemctl status containerd.service
1.4 docker.service unit
cat << "EOF" > /etc/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket containerd.service
[Service]
Type=notify
ExecStart=/usr/local/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
OOMScoreAdjust=-500
[Install]
WantedBy=multi-user.target
EOF
1.5 docker.socket unit
cat << "EOF" > /etc/systemd/system/docker.socket
[Unit]
Description=Docker Socket for the API
[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
1.6 daemon.json configuration
Registry mirrors, container log rotation, and the systemd cgroup driver:
mkdir -p /etc/docker
cat << "EOF" > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "50m", "max-file": "1" },
  "registry-mirrors": [
    "https://<your-mirror>.mirror.aliyuncs.com",
    "https://docker.m.daocloud.io",
    "https://hub-mirror.c.163.com",
    "https://mirror.baidubce.com",
    "https://docker.nju.edu.cn",
    "https://docker.mirrors.sjtug.sjtu.edu.cn",
    "https://dockerproxy.com"
  ],
  "insecure-registries": ["<private-registry>:13001"]
}
EOF
Key point: exec-opts must set native.cgroupdriver=systemd. Otherwise the kubelet and Docker use different cgroup drivers, which leaves the node NotReady.
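Before starting Docker, it can be worth confirming that the file on each node really carries this setting. The following is a minimal sketch using a plain-text match; the function name is hypothetical, and the check does not validate JSON syntax:

```shell
# Return 0 if the given daemon.json sets native.cgroupdriver=systemd in exec-opts.
# Assumes "exec-opts" and its array sit on one line, as in the file above.
uses_systemd_cgroup() {
  local file="${1:-/etc/docker/daemon.json}"
  grep -Eq '"exec-opts"[[:space:]]*:[[:space:]]*\[[^]]*"native\.cgroupdriver=systemd"' "$file"
}

# Example:
#   uses_systemd_cgroup /etc/docker/daemon.json && echo "cgroup driver OK"
```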
1.7 Start Docker
groupadd docker
systemctl daemon-reload
systemctl enable --now docker.socket
systemctl enable --now docker.service
docker info
2. Deploying cri-dockerd
2.1 Download and install
https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.9/cri-dockerd-0.3.9.amd64.tgz

cd /data/softs
tar xf cri-dockerd-*.amd64.tgz
cp cri-dockerd/cri-dockerd /usr/local/bin/
2.2 cri-docker.service unit
cat << "EOF" > /etc/systemd/system/cri-docker.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=<private-registry>:13001/base/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
2.3 cri-docker.socket unit
cat << "EOF" > /etc/systemd/system/cri-docker.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
2.4 Start cri-dockerd
systemctl daemon-reload
systemctl enable --now cri-docker.socket
systemctl enable --now cri-docker.service
systemctl restart cri-docker.service
systemctl status cri-docker.service
2.5 Verify
systemctl status containerd.service docker.socket docker.service cri-docker.socket cri-docker.service
journalctl -f -u containerd.service
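Besides the unit status, a quick look at the three Unix sockets shows whether each layer is actually listening. A minimal sketch; the function name is illustrative, and the paths in the example are the ones configured in the unit files of this post:

```shell
# Report each expected socket path; return non-zero if any is not a Unix socket.
check_sockets() {
  local missing=0 s
  for s in "$@"; do
    if [ -S "$s" ]; then
      echo "ok       $s"
    else
      echo "missing  $s"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   check_sockets /run/containerd/containerd.sock /var/run/docker.sock /run/cri-dockerd.sock
```

A non-zero exit status makes the function easy to call from a node bootstrap script that should stop when a layer is not up.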
3. Common problems
3.1 containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
The container runtime's configuration or on-disk state is corrupted. Clear it and restart each layer in turn:
# Warning: this deletes containerd's entire data directory
rm -rf /var/lib/containerd
systemctl restart containerd
rm -rf /var/run/docker.sock
systemctl restart docker.socket
rm -rf /run/cri-dockerd.sock
systemctl restart cri-docker.socket
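3.2 Node NotReady with RuntimeReady=False
The kubelet reports that validating the CRI v1 runtime API failed. The cause is the start order: cri-dockerd is not yet up when the kubelet tries to reach it. Add an After= dependency in systemd so that startup is ordered correctly (for example, ordering the kubelet unit after cri-docker.service).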
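One way to express that ordering is a systemd drop-in for the kubelet. This is only a sketch under assumptions the post does not state: that the kubelet runs as `kubelet.service`, and that a drop-in (rather than editing the unit file directly) is acceptable. The function name and the drop-in file name are arbitrary.

```shell
# Write a drop-in that starts the kubelet only after cri-dockerd.
# Assumption: the kubelet unit is named kubelet.service.
write_kubelet_ordering_dropin() {
  local dropin_dir="${1:-/etc/systemd/system/kubelet.service.d}"
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/10-after-cri-docker.conf" << 'EOF'
[Unit]
After=cri-docker.service
Wants=cri-docker.service
EOF
}

# After writing it on a node:
#   systemctl daemon-reload && systemctl restart kubelet
```

`After=` only controls ordering; `Wants=` additionally pulls cri-docker.service in when the kubelet starts.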
3.3 cgroup driver mismatch
# -i is needed: docker info prints "Cgroup Driver" with a capital C
docker info | grep -i 'cgroup driver'
# Cgroup Driver: systemd
# The kubelet must use systemd as well
grep cgroupDriver /etc/kubernetes/kubelet-conf.yml
# cgroupDriver: systemd
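The two checks can be combined into a single comparison that fails loudly on a mismatch. A minimal sketch with hypothetical function names; it assumes `cgroupDriver` is a top-level key in the kubelet config file:

```shell
# Extract the cgroupDriver value from a KubeletConfiguration YAML file.
kubelet_cgroup_driver() {
  awk -F': *' '$1 == "cgroupDriver" { gsub(/[" ]/, "", $2); print $2; exit }' "$1"
}

# Compare two driver names; return non-zero when they differ.
cgroup_drivers_match() {
  if [ "$1" = "$2" ]; then
    echo "match: $1"
  else
    echo "MISMATCH: docker=$1 kubelet=$2"
    return 1
  fi
}

# On a node (docker info --format prints only the driver name):
#   cgroup_drivers_match "$(docker info --format '{{.CgroupDriver}}')" \
#                        "$(kubelet_cgroup_driver /etc/kubernetes/kubelet-conf.yml)"
```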
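4. Summary
The container runtime is the foundation of a K8s cluster: the kubelet talks to it through the CRI socket, and every Pod on a node depends on it. The most common pitfalls:
- cri-dockerd is not installed, or the pause:3.9 image cannot be pulled
- cgroup driver mismatch
- systemd-resolved occupying port 53 (already disabled in the prerequisite post, but easy to forget when adding a new node)
Next: High-availability load balancing: compiling Nginx + Keepalived and VIP failover.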