After installing the k8s cluster, the machines can no longer reach the external network, and the DNS smoke test fails

Background:

Three Hong Kong nodes bought from a cloud provider. No proxy/VPN service is configured, since HK traffic is not behind the Great Firewall.


Issue 1: after installing the k8s cluster, the hosts could no longer reach the external network.

`ping` fails for every site I try, ordinary websites included.


https://img1.sycdn.imooc.com//szimg/61683de609ba511e10000466.jpg


Did the install script break the machine's network or DNS resolution?

After I added `nameserver 8.8.8.8` to /etc/resolv.conf on the host, it worked again.
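For reference, the quick fix above can be sketched as a guarded one-liner. This only appends when no `nameserver` line exists at all; the `RESOLV` variable is an illustration so the path can be pointed at a test copy, and on hosts managed by NetworkManager or systemd-resolved the file may be regenerated, so a persistent fix belongs in those tools:

```shell
# Append a public resolver only if the file has no nameserver entry at all.
# RESOLV defaults to the real file; override it to test against a copy.
RESOLV="${RESOLV:-/etc/resolv.conf}"
grep -q '^nameserver' "$RESOLV" || echo 'nameserver 8.8.8.8' >> "$RESOLV"
```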



Issue 2: the DNS smoke test fails.

https://img1.sycdn.imooc.com//szimg/61684c6c09792db310000480.jpg


Issue 3: several of the DNS pods are failing.

https://img1.sycdn.imooc.com//szimg/61684cc909a740db10001000.jpg

The error details for the failing pods are as follows.

Viewed with `kubectl describe pod/nodelocaldns-4qp7k -n kube-system`:

Name:                 nodelocaldns-4qp7k

Namespace:            kube-system

Priority:             2000000000

Priority Class Name:  system-cluster-critical

Node:                 node-1/10.7.190.74

Start Time:           Thu, 14 Oct 2021 15:54:58 +0800

Labels:               controller-revision-hash=666697fc9

                      k8s-app=nodelocaldns

                      pod-template-generation=1

Annotations:          prometheus.io/port: 9253

                      prometheus.io/scrape: true

Status:               Running

IP:                   10.7.190.74

IPs:

  IP:           10.7.190.74

Controlled By:  DaemonSet/nodelocaldns

Containers:

  node-cache:

    Container ID:  containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f

    Image:         k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0

    Image ID:      k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f

    Ports:         53/UDP, 53/TCP, 9253/TCP

    Host Ports:    53/UDP, 53/TCP, 9253/TCP

    Args:

      -localip

      169.254.25.10

      -conf

      /etc/coredns/Corefile

      -upstreamsvc

      coredns

    State:          Waiting

      Reason:       CrashLoopBackOff

    Last State:     Terminated

      Reason:       Error

      Exit Code:    1

      Started:      Thu, 14 Oct 2021 23:46:52 +0800

      Finished:     Thu, 14 Oct 2021 23:46:53 +0800

    Ready:          False

    Restart Count:  97

    Limits:

      memory:  170Mi

    Requests:

      cpu:        100m

      memory:     70Mi

    Liveness:     http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10

    Readiness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10

    Environment:  <none>

    Mounts:

      /etc/coredns from config-volume (rw)

      /run/xtables.lock from xtables-lock (rw)

      /var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)

Conditions:

  Type              Status

  Initialized       True

  Ready             False

  ContainersReady   False

  PodScheduled      True

Volumes:

  config-volume:

    Type:      ConfigMap (a volume populated by a ConfigMap)

    Name:      nodelocaldns

    Optional:  false

  xtables-lock:

    Type:          HostPath (bare host directory volume)

    Path:          /run/xtables.lock

    HostPathType:  FileOrCreate

  nodelocaldns-token-djcd4:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  nodelocaldns-token-djcd4

    Optional:    false

QoS Class:       Burstable

Node-Selectors:  <none>

Tolerations:     :NoSchedule op=Exists

                 :NoExecute op=Exists

                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists

                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists

                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists

                 node.kubernetes.io/not-ready:NoExecute op=Exists

                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists

                 node.kubernetes.io/unreachable:NoExecute op=Exists

                 node.kubernetes.io/unschedulable:NoSchedule op=Exists

Events:

  Type     Reason   Age                      From     Message

  ----     ------   ----                     ----     -------

  Warning  BackOff  2m6s (x2231 over 7h52m)  kubelet  Back-off restarting failed container




Are these three issues related?


3 answers

刘果国 2021-10-22 09:21:36

Look at the coredns problem first: nodelocaldns is a local DNS cache and depends on coredns. Is that single line really all that coredns logs?

  • Asker 慕少8521559 #1
    Yes, that's the only line.
    2021-10-22 15:05:24
  • Asker 慕少8521559 #2
    See this answer: https://github.com/k3s-io/k3s/issues/2919

    Could it be caused by my host's /etc/resolv.conf having no nameserver configured?

    If so, how do I restart the coredns service?
    2021-10-22 15:10:15
  • 刘果国 replied to asker 慕少8521559 #3
    kubectl delete pod xxx
    2021-10-23 10:33:42
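In practice, "restarting" the DNS components means deleting the pods so their controllers recreate them. A sketch under assumed label selectors: `k8s-app=nodelocaldns` appears in the describe output above, while `k8s-app=kube-dns` is the conventional label on coredns pods and may differ in your deployment:

```shell
# Delete the DNS pods; the Deployment/DaemonSet recreates them, and the new
# containers re-read the host's /etc/resolv.conf on startup.
kubectl -n kube-system delete pod -l k8s-app=kube-dns        # coredns
kubectl -n kube-system delete pod -l k8s-app=nodelocaldns    # node-local cache
```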
刘果国 2021-10-15 15:43:52

Hmm, so far I haven't seen deploying a cluster break the host's external network access; that shouldn't happen.

As for the DNS problem: the pod failed to start, so look at its full startup log. `describe` is mainly for diagnosing pods stuck in Pending; `logs` is for pods that crash.
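The advice above as concrete commands, using a pod name from this thread; `--previous` fetches the log of the last crashed container instance, which is what you want for a CrashLoopBackOff pod:

```shell
# Logs of the current container attempt:
kubectl -n kube-system logs coredns-85967d65-jrnc2
# Logs of the previous, crashed attempt:
kubectl -n kube-system logs --previous coredns-85967d65-jrnc2
```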

  • Asker 慕少8521559 #1
    The coredns error, from

    kubectl logs coredns-85967d65-jrnc2 -n kube-system

    is:
    plugin/forward: no nameservers found
    2021-10-21 17:28:00
  • Asker 慕少8521559 #2
    The nodelocaldns error, from

    kubectl logs nodelocaldns-r58k2 -n kube-system

    is:

    2021/10/21 09:28:04 [INFO] Starting node-cache image: 1.16.0
    2021/10/21 09:28:04 [INFO] Using Corefile /etc/coredns/Corefile
    2021/10/21 09:28:04 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
    2021/10/21 09:28:04 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
    plugin/forward: no nameservers found
    [root@node-1 ~]# kubectl logs nodelocaldns-5dd8l -n kube-system
    2021/10/21 09:29:45 [INFO] Starting node-cache image: 1.16.0
    2021/10/21 09:29:45 [INFO] Using Corefile /etc/coredns/Corefile
    2021/10/21 09:29:45 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
    2021/10/21 09:29:45 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
    plugin/forward: no nameservers found
    2021-10-21 17:32:08
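This error ties back to the host-side finding: the default Corefile typically ends with `forward . /etc/resolv.conf`, so a node whose /etc/resolv.conf contains no `nameserver` line makes both coredns and node-cache exit with "plugin/forward: no nameservers found". A quick check across the nodes (hostnames and passwordless SSH are assumptions based on this cluster; adjust as needed):

```shell
# Find nodes missing a nameserver entry; a node that prints the fallback
# message is one whose DNS pods will crash-loop.
for host in node-1 node-2 node-3; do
  ssh "$host" 'grep ^nameserver /etc/resolv.conf || echo "no nameserver configured"'
done
```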
Asker 慕少8521559 2021-10-14 22:30:53

test