System Reboot Engineer
    小布江
    2024-04-17

    Kubelet fails to start


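    Alert: kubelet is down. The machine room had lost power, and the services did not come back up properly afterwards. Normally, logging in to the node and running systemctl restart kubelet is enough. This time, however, every restart attempt failed.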


    # 1. Check docker
    [root@k8s-node31 ~]# systemctl status docker -l
    ● docker.service - Docker Application Container Engine
       Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
       Active: failed (Result: start-limit) since Mon 2023-11-06 09:09:04 CST; 4min 20s ago
         Docs: https://docs.docker.com
    
    Nov 06 09:09:01 k8s-node31 systemd[1]: Failed to start Docker Application Container Engine.
    Nov 06 09:09:01 k8s-node31 systemd[1]: Unit docker.service entered failed state.
    Nov 06 09:09:01 k8s-node31 systemd[1]: docker.service failed. # failed to start
    Nov 06 09:09:04 k8s-node31 systemd[1]: docker.service holdoff time over, scheduling restart.
    Nov 06 09:09:04 k8s-node31 systemd[1]: Stopped Docker Application Container Engine.
    Nov 06 09:09:04 k8s-node31 systemd[1]: start request repeated too quickly for docker.service
    Nov 06 09:09:04 k8s-node31 systemd[1]: Failed to start Docker Application Container Engine.
    Nov 06 09:09:04 k8s-node31 systemd[1]: Unit docker.service entered failed state.
    Nov 06 09:09:04 k8s-node31 systemd[1]: docker.service failed.
    
    # 2. Inspect the error log with journalctl -xe
    [root@k8s-node31 ~]# journalctl -xe
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- 
    -- 
    -- Unit docker.service has failed.
    -- 
    -- The result is failed.
    Nov 06 09:13:39 k8s-node31 systemd[1]: Unit docker.service entered failed state. 
    Nov 06 09:13:39 k8s-node31 systemd[1]: docker.service failed. # docker failed to start
    Nov 06 09:13:40 k8s-node31 systemd[1]: flanneld.service holdoff time over, scheduling restart.
    Nov 06 09:13:40 k8s-node31 systemd[1]: Stopped Flanneld overlay address etcd agent. # flanneld failed to start
    -- Subject: Unit flanneld.service has finished shutting down
    -- Defined-By: systemd
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- 
    -- Unit flanneld.service has finished shutting down.
    Nov 06 09:13:40 k8s-node31 systemd[1]: Starting Flanneld overlay address etcd agent...
    -- Subject: Unit flanneld.service has begun start-up
    -- Defined-By: systemd
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- 
    -- 
    -- 
    -- Unit flanneld.service has begun starting up.
    Nov 06 09:13:41 k8s-node31 flanneld[16267]: I1106 09:13:41.038990   16267 main.go:210] Could not find valid interface matching ens3f0: failed to find IPv4 address for interface # the problem: this NIC has no IPv4 address
    Nov 06 09:13:41 k8s-node31 flanneld[16267]: E1106 09:13:41.039062   16267 main.go:234] Failed to find interface to use that matches the interfaces and/or regexes provided
    Nov 06 09:13:41 k8s-node31 systemd[1]: flanneld.service: main process exited, code=exited, status=1/FAILURE
    Nov 06 09:13:41 k8s-node31 systemd[1]: Failed to start Flanneld overlay address etcd agent.
    -- Subject: Unit flanneld.service has failed
    -- Defined-By: systemd
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- 
    -- Unit flanneld.service has failed.
    -- 
    -- The result is failed.
    Nov 06 09:13:41 k8s-node31 systemd[1]: Unit flanneld.service entered failed state.
    Nov 06 09:13:41 k8s-node31 systemd[1]: flanneld.service failed.
    Nov 06 09:13:46 k8s-node31 systemd[1]: flanneld.service holdoff time over, scheduling restart.
    Nov 06 09:13:46 k8s-node31 systemd[1]: Stopped Flanneld overlay address etcd agent.
    -- Subject: Unit flanneld.service has finished shutting down
    -- Defined-By: systemd
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- 
    -- Unit flanneld.service has finished shutting down.
    Nov 06 09:13:46 k8s-node31 systemd[1]: Starting Flanneld overlay address etcd agent...
    -- Subject: Unit flanneld.service has begun start-up
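
Most of the journalctl -xe output above consists of systemd explanation lines (the ones starting with `-- `), which bury the two flanneld messages that actually matter. Below is a minimal sketch of narrowing the log down to those messages. `journalctl -u` and `--no-pager` are standard journalctl options; the helper name `flanneld_errors` and the grep pattern are assumptions picked for this particular incident, not something taken from the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only lines logged by the flanneld process itself
# that report an interface lookup failure. It reads journal text on stdin,
# so it also works on a saved log file.
flanneld_errors() {
  grep -E 'flanneld\[[0-9]+\]: .*(Could not find|Failed to find)'
}

# Typical use on the node (assumes the unit is called flanneld, as in this post):
#   journalctl -u flanneld --no-pager -n 50 | flanneld_errors
```

Filtering by unit first keeps the docker and systemd restart noise out of the way; the grep pattern would need widening for other kinds of flanneld failures.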
    
    # 3. Check the address on ens3f0. A colleague had set up KVM on this machine, so the IP address now sits on the bridge br0 instead of ens3f0
    [root@k8s-node31 ~]# ifconfig 
    br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 10.88.33.184  netmask 255.255.255.0  broadcast 10.88.33.255
            ether b4:05:5d:b9:0e:d4  txqueuelen 1000  (Ethernet)
            RX packets 1474909  bytes 1876206426 (1.7 GiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 1355428  bytes 941456074 (897.8 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    ens3f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            ether b4:05:5d:b9:0e:d4  txqueuelen 1000  (Ethernet)
            RX packets 2428529  bytes 1963491825 (1.8 GiB)
            RX errors 0  dropped 331  overruns 0  frame 0
            TX packets 1710315  bytes 964893494 (920.1 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
            device memory 0xbba20000-bba3ffff  
    
    ens3f1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
            ether b4:05:5d:b9:0e:d5  txqueuelen 1000  (Ethernet)
            RX packets 0  bytes 0 (0.0 B)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 0  bytes 0 (0.0 B)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
            device memory 0xbba00000-bba1ffff  
    
    flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
            inet 172.17.21.0  netmask 255.255.255.255  broadcast 0.0.0.0
            ether f2:be:df:6c:48:98  txqueuelen 0  (Ethernet)
            RX packets 0  bytes 0 (0.0 B)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 0  bytes 0 (0.0 B)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
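
In the ifconfig output above, the node address 10.88.33.184 is on br0 and ens3f0 has no inet line at all, which is exactly what flanneld complained about. A small sketch of finding the interface that holds a given address without reading the whole listing by eye. It assumes the `ip` tool from iproute2 is installed on the node (the post itself uses ifconfig); `iface_for_ip` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of the interface that carries the
# given IPv4 address.
#   $1    = IPv4 address to look for
#   stdin = output of `ip -o -4 addr show` (one address per line)
iface_for_ip() {
  awk -v ip="$1" '
    # field 2 is the interface name, field 4 is "address/prefix"
    $3 == "inet" { split($4, a, "/"); if (a[1] == ip) print $2 }
  '
}

# Typical use on the node:
#   ip -o -4 addr show | iface_for_ip 10.88.33.184
```

The function prints nothing when no interface carries the address, so an empty result is itself a useful signal that the value configured for flanneld can no longer work.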
    
    # 4. Edit the flanneld.service unit file: change -iface from ens3f0 to br0
    [root@k8s-node31 ~]# cat /usr/lib/systemd/system/flanneld.service
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service
    
    [Service]
    Type=notify
    ExecStart=/opt/kubernetes/bin/flanneld \
      -etcd-cafile=/opt/kubernetes/ssl/ca.pem \
      -etcd-certfile=/opt/kubernetes/ssl/flanneld.pem \
      -etcd-keyfile=/opt/kubernetes/ssl/flanneld-key.pem \
      -etcd-endpoints=https://10.88.33.218:2379,https://10.88.33.219:2379,https://10.88.33.220:2379 \
      -etcd-prefix=/kubernetes/network \
      -iface=br0 \
      -ip-masq
    ExecStartPost=/opt/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/subnet.env
    Restart=always
    RestartSec=5
    LimitNOFILE=65536
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service
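
The only change needed in the unit file above is the value of `-iface`. A minimal sketch of making that edit with sed instead of by hand, keeping a backup copy next to the file. `set_flannel_iface` is a hypothetical helper name; the unit file path in the usage comment is the one shown in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the value of the -iface= flag in a unit file.
#   $1 = path to the unit file
#   $2 = new interface name
# The original file is kept as <file>.bak.
set_flannel_iface() {
  # Replace only the non-space characters that follow "-iface=",
  # so the trailing " \" line continuation is left untouched.
  sed -i.bak -E "s/-iface=[^[:space:]]+/-iface=$2/" "$1"
}

# Typical use on the node:
#   set_flannel_iface /usr/lib/systemd/system/flanneld.service br0
#   systemctl daemon-reload
```

Note that a trailing `# ...` remark after the backslash on an ExecStart line is not treated as a comment by systemd, so any annotation is best kept out of the unit file itself.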
    
    # 5. Restart flanneld, docker, and kubelet, in that order
    systemctl daemon-reload
    systemctl restart flanneld
    systemctl restart docker
    systemctl restart kubelet
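
The order of the restarts above matters: the unit file orders flanneld before docker, and kubelet needs docker. A sketch of the same sequence that stops at the first unit that does not come up, so a failure is not hidden by the next command. `restart_chain` and the `SYSTEMCTL` override variable are hypothetical names introduced here; `systemctl restart` and `systemctl is-active --quiet` are standard systemctl commands.

```shell
#!/usr/bin/env bash
# SYSTEMCTL can be pointed at another command for a dry run;
# by default the real systemctl is used.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Hypothetical helper: restart the given units in order and
# return non-zero at the first one that fails or is not active afterwards.
restart_chain() {
  for unit in "$@"; do
    "$SYSTEMCTL" restart "$unit" || { echo "restart failed: $unit" >&2; return 1; }
    "$SYSTEMCTL" is-active --quiet "$unit" || { echo "not active: $unit" >&2; return 1; }
    echo "ok: $unit"
  done
}

# Typical use on the node, after systemctl daemon-reload:
#   restart_chain flanneld docker kubelet
```

When the chain stops early, the name printed on stderr is the unit to inspect next with systemctl status and journalctl.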
    
    #Kubelet
    Last updated: 2025/04/25, 03:40:17