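What kubelet and kube-proxy are
kubelet is the agent on every Node. It:
- receives Pod specs from the apiserver
- starts containers through the container runtime (here Docker Engine, reached via the cri-dockerd CRI adapter)
- reports node status, heartbeats, and resource metrics
kube-proxy is the network proxy on every Node. It:
- watches the apiserver for Service / Endpoints changes
- maintains the iptables or IPVS rules on the node
- implements traffic forwarding for ClusterIP and NodePort Services
Applies to: Kubernetes 1.28.5 / IPVS
Deployed on: all master + worker nodes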
1. Deploying kubelet
1.1 systemd unit
```shell
cat << "EOF" > /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
[Service]
ExecStart=/usr/local/bin/kubelet \
  --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --config=/etc/kubernetes/kubelet-conf.yml \
  --container-runtime-endpoint=unix:///run/cri-dockerd.sock \
  --node-labels=node.kubernetes.io/node=
Restart=always
RestartSec=5
StartLimitInterval=0
[Install]
WantedBy=multi-user.target
EOF
```
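Because the unit points kubelet at unix:///run/cri-dockerd.sock, it can help to confirm that the socket exists before starting the service. The following is a minimal sketch; `check_cri_socket` is a hypothetical helper name, not something shipped with Kubernetes or cri-dockerd.

```shell
# check_cri_socket: hypothetical helper. Succeeds only if the endpoint
# (with or without the unix:// scheme) points at an existing Unix socket.
check_cri_socket() {
  endpoint="$1"
  path="${endpoint#unix://}"   # strip the scheme used in the kubelet flag
  if [ -S "$path" ]; then
    echo "ok: $path is a socket"
  else
    echo "missing: $path is not a socket (is cri-docker.service running?)" >&2
    return 1
  fi
}

# Example (on a node):
#   check_cri_socket unix:///run/cri-dockerd.sock
```

A non-zero exit status here usually means the fix lies with cri-dockerd rather than with kubelet.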
1.2 kubelet-conf.yml (key settings)
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: 0.0.0.0
port: 10250
readOnlyPort: 10255              # unauthenticated read-only port; set to 0 to disable it
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd            # must match the cgroup driver in docker's daemon.json
cgroupsPerQOS: true
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerLogMaxFiles: 5
containerLogMaxSize: 10Mi
contentType: application/vnd.kubernetes.protobuf
cpuCFSQuota: true
cpuManagerPolicy: none
cpuManagerReconcilePeriod: 10s
enableControllerAttachDetach: true
enableDebuggingHandlers: true
enforceNodeAllocatable:
- pods
eventBurst: 10
eventRecordQPS: 5
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
evictionPressureTransitionPeriod: 5m0s
failSwapOn: true                 # kubelet refuses to start unless swap is disabled
fileCheckFrequency: 20s
hairpinMode: promiscuous-bridge
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 20s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMinimumGCAge: 2m0s
iptablesDropBit: 15
iptablesMasqueradeBit: 14
kubeAPIBurst: 10
kubeAPIQPS: 5
makeIPTablesUtilChains: true
maxOpenFiles: 1000000
maxPods: 110
nodeStatusUpdateFrequency: 10s
oomScoreAdj: -999
podPidsLimit: -1
registryBurst: 10
registryPullQPS: 5
resolvConf: /etc/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 2m0s
serializeImagePulls: true
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
volumeStatsAggPeriod: 1m0s
```
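Two settings in this file tend to stop kubelet from starting: cgroupDriver (it has to match Docker) and failSwapOn (swap has to be off). The sketch below shows one way to check both before starting the service. The helper names are hypothetical, and the parsing assumes cgroupDriver is a top-level key, as it is in the file above.

```shell
# kubelet_cgroup_driver: hypothetical helper. Prints the cgroupDriver value
# from a KubeletConfiguration file, dropping any trailing "# comment".
kubelet_cgroup_driver() {
  awk -F': *' '$1 == "cgroupDriver" { sub(/[ \t]*#.*/, "", $2); print $2 }' "$1"
}

# swap_is_active: hypothetical helper. Succeeds if a /proc/swaps-style file
# lists at least one swap device (its first line is a header).
swap_is_active() {
  awk 'NR > 1 { n++ } END { exit (n > 0 ? 0 : 1) }' "${1:-/proc/swaps}"
}

# Example (on a node):
#   [ "$(kubelet_cgroup_driver /etc/kubernetes/kubelet-conf.yml)" = \
#     "$(docker info --format '{{.CgroupDriver}}')" ] || echo "cgroup driver mismatch"
#   swap_is_active && echo "swap is on; kubelet will not start with failSwapOn: true"
```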
1.3 Copy to all nodes
```shell
for NODE in master1 master2 master3 worker1 worker2 worker3 worker4 worker5 worker6 worker7 worker8 worker9; do
  scp /etc/kubernetes/kubelet-conf.yml "$NODE":/etc/kubernetes/
  scp /etc/kubernetes/bootstrap-kubelet.kubeconfig "$NODE":/etc/kubernetes/
  scp /etc/kubernetes/kube-proxy.kubeconfig "$NODE":/etc/kubernetes/
  scp /etc/systemd/system/kubelet.service "$NODE":/etc/systemd/system/
done
```
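The scp calls above fail if /etc/kubernetes does not yet exist on a target node. A slightly more defensive variant is sketched below. `copy_to_node` and the `DRY_RUN` variable are hypothetical names introduced for illustration; the helper assumes each file goes to the same path on the remote side, which holds for the four files copied above.

```shell
# copy_to_node: hypothetical helper. Creates /etc/kubernetes on the target
# node, then copies each listed file to the same path there.
# When DRY_RUN is non-empty, the commands are only printed.
copy_to_node() {
  node="$1"; shift
  run=${DRY_RUN:+echo}
  $run ssh "$node" "mkdir -p /etc/kubernetes" || return 1
  for f in "$@"; do
    $run scp "$f" "$node:$f" || return 1
  done
}

# Example:
#   for NODE in master1 master2 master3; do
#     copy_to_node "$NODE" /etc/kubernetes/kubelet-conf.yml \
#       /etc/systemd/system/kubelet.service || echo "copy to $NODE failed"
#   done
```

Running it once with DRY_RUN=1 prints the ssh and scp commands without touching any node.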
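1.4 Start kubelet
```shell
# Run on every node.
systemctl daemon-reload
systemctl enable --now kubelet.service
systemctl restart kubelet.service   # picks up config changes if kubelet was already running
systemctl status kubelet.service
journalctl -f -u kubelet            # follow the logs; Ctrl-C to stop
```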
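kubelet-conf.yml binds the health endpoint to 127.0.0.1:10248, so a script can poll it after startup instead of watching the logs. This is a minimal sketch: `wait_until` is a hypothetical helper, and the example assumes curl is installed on the node.

```shell
# wait_until: hypothetical helper. Retries a command up to <attempts> times,
# sleeping <delay> seconds after each failure; succeeds as soon as the command does.
wait_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (on a node): wait up to about a minute for kubelet to report healthy.
#   wait_until 30 2 curl -fsS http://127.0.0.1:10248/healthz || journalctl -u kubelet -n 50
```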
1.5 Command completion
```shell
cat << "EOF" >> /root/.bashrc
export KUBECONFIG=/etc/kubernetes/admin.kubeconfig
alias k=kubectl
source <(kubectl completion bash)
EOF
source /root/.bashrc
# list the nodes
kubectl get node
# show the container runtime of each node
kubectl describe node | grep Runtime
```
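2. Deploying kube-proxy
2.1 Distribute the kubeconfig
```shell
# Step 1.3 already copied this file to every node; this loop refreshes it on the masters.
for NODE in master1 master2 master3; do
  scp /etc/kubernetes/kube-proxy.kubeconfig "$NODE":/etc/kubernetes/
done
```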
2.2 kube-proxy.service
```ini
[Unit]
Description=Kubernetes Kube Proxy
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-proxy \
  --config=/etc/kubernetes/kube-proxy.yaml \
  --v=2
Restart=always
RestartSec=10s
[Install]
WantedBy=multi-user.target
```
2.3 kube-proxy.yaml
```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
  acceptContentTypes: ""
  burst: 10
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
  qps: 5
# Must match the cluster's Pod CIDR. Note that 172.218.0.0 is not on a /12
# boundary (the enclosing /12 network is 172.208.0.0/12).
clusterCIDR: 172.218.0.0/12,fc00:2222::/112
configSyncPeriod: 15m0s
conntrack:
  max: null
  maxPerCore: 32768
  min: 131072
  tcpCloseWaitTimeout: 1h0m0s
  tcpEstablishedTimeout: 24h0m0s
enableProfiling: false
healthzBindAddress: 0.0.0.0:10256
hostnameOverride: ""
iptables:
  masqueradeAll: false
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
ipvs:
  masqueradeAll: true   # v1alpha1 appears to define masqueradeAll only under iptables; verify this key takes effect
  minSyncPeriod: 5s
  scheduler: "rr"
  syncPeriod: 30s
kind: KubeProxyConfiguration
metricsBindAddress: 127.0.0.1:10249
mode: "ipvs"
nodePortAddresses: null
oomScoreAdj: -999
portRange: ""
udpIdleTimeout: 250ms
```
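With conntrack.max left as null, the conntrack table limit is, as far as I understand kube-proxy's behaviour, derived from maxPerCore and min: the per-core value is multiplied by the CPU count and min acts as a floor. The sketch below reproduces that arithmetic with the values from the file above as defaults; `conntrack_limit` is a hypothetical helper, not a kube-proxy command.

```shell
# conntrack_limit: hypothetical helper. Computes
#   limit = max(maxPerCore * cores, min)
# Arguments: <cores> [maxPerCore, default 32768] [min, default 131072]
conntrack_limit() {
  cores="$1"; per_core="${2:-32768}"; floor="${3:-131072}"
  scaled=$((cores * per_core))
  if [ "$scaled" -gt "$floor" ]; then
    echo "$scaled"
  else
    echo "$floor"
  fi
}

# Example (on a node): compare the computed value with the kernel setting.
#   conntrack_limit "$(nproc)"
#   sysctl net.netfilter.nf_conntrack_max
```

On a 2-core node the floor applies (2 × 32768 = 65536 is below 131072), while on an 8-core node the scaled value 262144 is used.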
2.4 Start
The loop assumes that kube-proxy.service and /etc/kubernetes/kube-proxy.yaml are already present on every node.
```shell
for NODE in master1 master2 master3 worker1 worker2 worker3 worker4 worker5 worker6 worker7 worker8 worker9; do
  ssh "$NODE" "systemctl daemon-reload"
  ssh "$NODE" "systemctl enable --now kube-proxy.service"
  ssh "$NODE" "systemctl restart kube-proxy.service"
done
```
2.5 Verify IPVS
```shell
ipvsadm -ln
# Expect a list of virtual servers whose IPs fall inside the Service range 10.96.0.0/12
```
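If ipvsadm shows no virtual servers, one thing worth checking is whether the IPVS kernel modules are loaded. The sketch below scans a /proc/modules-style file for a list of module names. `missing_ipvs_modules` is a hypothetical helper, and the module list in the example is an assumption based on the rr scheduler configured above; modules compiled into the kernel do not appear in /proc/modules, so a reported name is a hint rather than proof.

```shell
# missing_ipvs_modules: hypothetical helper. Reads a /proc/modules-style file
# and prints every requested module name that is not listed in it.
missing_ipvs_modules() {
  file="$1"; shift
  for m in "$@"; do
    awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$file" || echo "$m"
  done
}

# Example (on a node):
#   missing_ipvs_modules /proc/modules ip_vs ip_vs_rr nf_conntrack
#   curl -s http://127.0.0.1:10249/proxyMode   # mode kube-proxy is actually running in
```

A reported module can be loaded with modprobe, followed by a restart of kube-proxy.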
3. Verifying that nodes are Ready
```shell
# After waiting about 30 seconds
kubectl get node
# NAME      STATUS   ROLES    AGE   VERSION
# master1   Ready    <none>   5m    v1.28.5
# master2   Ready    <none>   5m    v1.28.5
# master3   Ready    <none>   5m    v1.28.5
# worker1   Ready    <none>   3m    v1.28.5
# ...
```
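With twelve nodes it is easy to miss one that is not Ready. The sketch below filters the kubectl output down to the nodes that need attention; `not_ready_nodes` is a hypothetical helper that only parses text, so it works on any saved output as well.

```shell
# not_ready_nodes: hypothetical helper. Reads `kubectl get node --no-headers`
# output on stdin and prints the names of nodes whose STATUS does not start
# with "Ready" (so "Ready,SchedulingDisabled" still counts as Ready).
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Example:
#   kubectl get node --no-headers | not_ready_nodes
```

An alternative that blocks until every node is Ready is `kubectl wait --for=condition=Ready node --all --timeout=120s`.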
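If a node is NotReady:
```shell
kubectl describe node <node-name>
# Check the Ready entry under Conditions
```
Common causes:
- cgroupDriver mismatch → kubelet fails to start
- Wrong CRI socket path → check container-runtime-endpoint
- Wrong clusterDNS in kubelet-conf.yml → it must be the CoreDNS Service IP, 10.96.0.10 (a wrong value mainly shows up as DNS failures inside Pods)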
4. Troubleshooting quick reference
| Symptom | Cause | Fix |
|---|---|---|
| kubelet: failed to run Kubelet: validate service connection: validate CRI v1 runtime API | cri-dockerd is not running | Start cri-docker.service |
| kubelet: cgroup driver "cgroupfs" is different from docker "systemd" | Docker and kubelet use different cgroup drivers | Edit daemon.json and restart docker |
| kubelet: node "master1" not found | The apiserver has no such Node object | Run kubectl get csr and check for certificate requests awaiting approval |
| kubelet: unauthorized | Wrong bootstrap token | Regenerate bootstrap.secret.yaml |
| failed to start CRI service: timeout | The cri-dockerd socket does not exist | Check /run/cri-dockerd.sock |
| kubelet.service: Main process exited, code=exited, status=255 | Error in kubelet-conf.yml | Run kubelet --config=... --v=4 to see the specific error |
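For the "node not found" row, the certificate signing requests created during TLS bootstrapping are the first thing to inspect. The sketch below picks out the pending ones. `pending_csrs` is a hypothetical helper, and it assumes the last column of `kubectl get csr` is CONDITION, as in recent kubectl versions.

```shell
# pending_csrs: hypothetical helper. Reads `kubectl get csr --no-headers`
# output on stdin and prints the names of requests whose last column
# (CONDITION) is exactly "Pending".
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}

# Example (review the list before approving anything):
#   kubectl get csr --no-headers | pending_csrs
#   kubectl certificate approve <csr-name>
```

If the cluster is set up to approve bootstrap CSRs automatically, this list should normally be empty, and a persistent Pending entry points at the approval bindings instead.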
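5. Summary
The node components are the "limbs" of a K8s cluster:
- kubelet and cri-dockerd are both required; neither works without the other
- cgroupDriver must be the same for kubelet and docker (both systemd)
- IPVS mode performs better than iptables mode in large clusters
- TLS Bootstrapping uses a token, so certificates do not have to be signed by hand for every node
Next: deploying the K8s core add-ons: Calico networking + CoreDNS + Metrics-server.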