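Kubelet fails to start
Alert: kubelet down. The machine room had lost power, and the service did not come back up on its own. Normally, logging in to the node and running systemctl restart kubelet is enough, but this time every restart attempt failed.
# 1. Check docker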
[root@k8s-node31 ~]# systemctl status docker -l
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Mon 2023-11-06 09:09:04 CST; 4min 20s ago
Docs: https://docs.docker.com
Nov 06 09:09:01 k8s-node31 systemd[1]: Failed to start Docker Application Container Engine.
Nov 06 09:09:01 k8s-node31 systemd[1]: Unit docker.service entered failed state.
Nov 06 09:09:01 k8s-node31 systemd[1]: docker.service failed. # startup failed
Nov 06 09:09:04 k8s-node31 systemd[1]: docker.service holdoff time over, scheduling restart.
Nov 06 09:09:04 k8s-node31 systemd[1]: Stopped Docker Application Container Engine.
Nov 06 09:09:04 k8s-node31 systemd[1]: start request repeated too quickly for docker.service
Nov 06 09:09:04 k8s-node31 systemd[1]: Failed to start Docker Application Container Engine.
Nov 06 09:09:04 k8s-node31 systemd[1]: Unit docker.service entered failed state.
Nov 06 09:09:04 k8s-node31 systemd[1]: docker.service failed.
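`Result: start-limit` means systemd stopped retrying after too many failures in quick succession. The lines above are systemd's own messages and do not say why docker failed. A minimal sketch for printing the unit's own recent log and then clearing the failed state, so that a later manual start is not refused by the rate limit. The function name `inspect_failed_unit` and the default of 50 lines are illustrative and not part of the original notes.

```shell
# Print the last N journal lines for a unit, then clear its failed state.
# Assumption: reset-failed also resets the start rate limit counter,
# which is the documented behaviour on current systemd versions.
inspect_failed_unit() {
    unit="$1"
    lines="${2:-50}"
    journalctl -u "$unit" --no-pager -n "$lines"
    systemctl reset-failed "$unit"
}

# Usage (on the affected node):
#   inspect_failed_unit docker.service
```

This only clears the symptom. If a dependency of docker is still failing, docker is likely to end up in the same state again after the next start.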
# 2. Check the error log with journalctl -xe
[root@k8s-node31 ~]# journalctl -xe
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
--
-- Unit docker.service has failed.
--
-- The result is failed.
Nov 06 09:13:39 k8s-node31 systemd[1]: Unit docker.service entered failed state.
Nov 06 09:13:39 k8s-node31 systemd[1]: docker.service failed. # docker failed to start
Nov 06 09:13:40 k8s-node31 systemd[1]: flanneld.service holdoff time over, scheduling restart.
Nov 06 09:13:40 k8s-node31 systemd[1]: Stopped Flanneld overlay address etcd agent. # flanneld failed to start
-- Subject: Unit flanneld.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has finished shutting down.
Nov 06 09:13:40 k8s-node31 systemd[1]: Starting Flanneld overlay address etcd agent...
-- Subject: Unit flanneld.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
--
--
-- Unit flanneld.service has begun starting up.
Nov 06 09:13:41 k8s-node31 flanneld[16267]: I1106 09:13:41.038990 16267 main.go:210] Could not find valid interface matching ens3f0: failed to find IPv4 address for interface # Problem: no IPv4 address found on this NIC
Nov 06 09:13:41 k8s-node31 flanneld[16267]: E1106 09:13:41.039062 16267 main.go:234] Failed to find interface to use that matches the interfaces and/or regexes provided
Nov 06 09:13:41 k8s-node31 systemd[1]: flanneld.service: main process exited, code=exited, status=1/FAILURE
Nov 06 09:13:41 k8s-node31 systemd[1]: Failed to start Flanneld overlay address etcd agent.
-- Subject: Unit flanneld.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has failed.
--
-- The result is failed.
Nov 06 09:13:41 k8s-node31 systemd[1]: Unit flanneld.service entered failed state.
Nov 06 09:13:41 k8s-node31 systemd[1]: flanneld.service failed.
Nov 06 09:13:46 k8s-node31 systemd[1]: flanneld.service holdoff time over, scheduling restart.
Nov 06 09:13:46 k8s-node31 systemd[1]: Stopped Flanneld overlay address etcd agent.
-- Subject: Unit flanneld.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has finished shutting down.
Nov 06 09:13:46 k8s-node31 systemd[1]: Starting Flanneld overlay address etcd agent...
-- Subject: Unit flanneld.service has begun start-up
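The `-x` output mixes systemd's state messages and its `-- ...` explanatory lines with the few lines flanneld itself logged, which is where the real cause shows up. A small sketch that keeps only the lines written by one process. The helper name `journal_lines_for` is hypothetical, and the pattern assumes journalctl's default short output format as seen in the log above.

```shell
# Read journal text on stdin and keep only lines logged by the given
# process name, e.g. "flanneld[16267]: ...". Lines from systemd and the
# "-- ..." explanations added by -x are dropped.
journal_lines_for() {
    grep -E "^[A-Z][a-z]{2} [0-9]{2} [0-9:]{8} [^ ]+ $1\[[0-9]+\]: "
}

# Usage (on the affected node):
#   journalctl -xe --no-pager | journal_lines_for flanneld
```

Running `journalctl -u flanneld --no-pager` is a simpler alternative when only one unit is of interest.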
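# 3. Check the address of the ens3f0 NIC. A colleague had set this machine up as a KVM host, so the IP address now sits on the bridge br0 instead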
[root@k8s-node31 ~]# ifconfig
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.88.33.184 netmask 255.255.255.0 broadcast 10.88.33.255
ether b4:05:5d:b9:0e:d4 txqueuelen 1000 (Ethernet)
RX packets 1474909 bytes 1876206426 (1.7 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1355428 bytes 941456074 (897.8 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ens3f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether b4:05:5d:b9:0e:d4 txqueuelen 1000 (Ethernet)
RX packets 2428529 bytes 1963491825 (1.8 GiB)
RX errors 0 dropped 331 overruns 0 frame 0
TX packets 1710315 bytes 964893494 (920.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xbba20000-bba3ffff
ens3f1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether b4:05:5d:b9:0e:d5 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xbba00000-bba1ffff
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 172.17.21.0 netmask 255.255.255.255 broadcast 0.0.0.0
ether f2:be:df:6c:48:98 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
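In the output above, ens3f0 has no `inet` line, while br0 carries 10.88.33.184 and shares the same MAC address, which is what one would expect when the physical NIC has been attached to a bridge. A sketch for finding which interface holds a given IPv4 address, using the one-line output of `ip -o -4 addr show`. The helper name `iface_for_ip` is hypothetical.

```shell
# Read `ip -o -4 addr show` output on stdin and print the name of the
# interface that carries the given IPv4 address.
# Fields per line: index, interface name, "inet", address/prefix, ...
iface_for_ip() {
    awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) print $2 }'
}

# Usage (on the affected node):
#   ip -o -4 addr show | iface_for_ip 10.88.33.184
```

With a reasonably recent iproute2, `ip link show master br0` should additionally list ens3f0 as a port of the bridge.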
# 4. Edit the flanneld.service unit file
[root@k8s-node31 ~]# cat /usr/lib/systemd/system/flanneld.service
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service
[Service]
Type=notify
ExecStart=/opt/kubernetes/bin/flanneld \
-etcd-cafile=/opt/kubernetes/ssl/ca.pem \
-etcd-certfile=/opt/kubernetes/ssl/flanneld.pem \
-etcd-keyfile=/opt/kubernetes/ssl/flanneld-key.pem \
-etcd-endpoints=https://10.88.33.218:2379,https://10.88.33.219:2379,https://10.88.33.220:2379 \
-etcd-prefix=/kubernetes/network \
-iface=ens3f0 \ # change to br0
-ip-masq
ExecStartPost=/opt/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/subnet.env
Restart=always
RestartSec=5
LimitNOFILE=65536
StartLimitInterval=0
[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
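The inline note on the `-iface` line is an annotation on the listing. In the real file nothing may follow the trailing backslash, so the edited line should read `-iface=br0 \`. A minimal sketch of making that change non-interactively while keeping a backup copy. The function name `set_flannel_iface` is illustrative, and `sed -i.bak -E` assumes GNU sed.

```shell
# Replace the value of -iface= in a flanneld unit file.
# The original file is kept next to it with a .bak suffix.
set_flannel_iface() {
    unit_file="$1"
    new_iface="$2"
    sed -i.bak -E "s/-iface=[^ ]+/-iface=${new_iface}/" "$unit_file"
}

# Usage (on the affected node):
#   set_flannel_iface /usr/lib/systemd/system/flanneld.service br0
#   grep -- '-iface=' /usr/lib/systemd/system/flanneld.service
```

systemd only picks up the edited unit file after `systemctl daemon-reload`.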
# 5. Restart flanneld, docker, and kubelet in that order
systemctl daemon-reload
systemctl restart flanneld
systemctl restart docker
systemctl restart kubelet
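The order matters because the unit file declares flanneld as `Before=docker.service` and `RequiredBy=docker.service`, and kubelet in turn needs a working docker. A sketch that restarts the units in that order and stops at the first one that does not come up, so the failing layer is obvious. The function name `restart_in_order` is hypothetical.

```shell
# Restart each unit in the order given. Stop and report as soon as one
# fails to restart or is not active afterwards.
restart_in_order() {
    for unit in "$@"; do
        systemctl restart "$unit" || { echo "restart failed: $unit" >&2; return 1; }
        systemctl is-active --quiet "$unit" || { echo "not active: $unit" >&2; return 1; }
    done
}

# Usage (on the affected node):
#   systemctl daemon-reload && restart_in_order flanneld docker kubelet
```

Once all three are active, `kubectl get node` run from a machine with cluster access should show the node as Ready again.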
Last updated: 2025/04/25, 03:40:17