Kubernetes DNS resolution failure

Today I ran into a pitfall: the node could not reach the CoreDNS endpoint.

 kubectl get endpoints coredns -n kube-system                                                                                                           
NAME      ENDPOINTS                                           AGE
coredns   10.234.238.2:53,10.234.238.2:53,10.234.238.2:9153   2y289d
 ping  10.234.238.2
PING 10.234.238.2 (10.234.238.2) 56(84) bytes of data.
^C
--- 10.234.238.2 ping statistics ---
304 packets transmitted, 0 received, 100% packet loss, time 302975ms

 traceroute 10.234.238.2
traceroute to 10.234.238.2 (10.234.238.2), 30 hops max, 60 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Note the symptom above: the CoreDNS pod IP cannot be reached from the node.
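The manual ping can be repeated for every CoreDNS endpoint rather than one hand-copied IP. Below is a minimal sketch, assuming a POSIX shell with `awk` and `sort` available. The function name `endpoint_ips` is hypothetical and not part of kubectl. It only parses the table layout shown in the `kubectl get endpoints` output, and kubectl may truncate long endpoint lists in that view.

```shell
# Read `kubectl get endpoints <name>` output on stdin and print each
# unique endpoint IP on its own line (ports are stripped).
# NOTE: endpoint_ips is a hypothetical helper, not a kubectl feature.
endpoint_ips() {
  awk 'NR > 1 {
         n = split($2, eps, ",")          # ENDPOINTS column: ip:port,ip:port,...
         for (i = 1; i <= n; i++) {
           sub(/:[0-9]+$/, "", eps[i])    # drop the :port suffix
           print eps[i]
         }
       }' | sort -u
}

# Usage (needs cluster access, so it is left as a comment):
#   kubectl get endpoints coredns -n kube-system | endpoint_ips |
#     while read -r ip; do ping -c 3 -W 2 "$ip"; done
```

Deduplicating first avoids pinging the same pod once per port, which matters here because all three listed endpoints share one IP.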

 route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.7.15.254     0.0.0.0         UG    100    0        0 ens192
10.7.0.0        0.0.0.0         255.255.240.0   U     100    0        0 ens192
10.7.7.0        0.0.0.0         255.255.255.0   U     0      0        0 ens192
10.234.36.0     10.7.7.6        255.255.254.0   UG    0      0        0 tunl0
10.234.238.0    10.7.7.5        255.255.254.0   UG    0      0        0 tunl0
10.234.242.0    10.7.7.4        255.255.254.0   UG    0      0        0 tunl0
10.234.248.0    10.7.7.2        255.255.254.0   UG    0      0        0 tunl0
10.234.250.0    0.0.0.0         255.255.254.0   U     0      0        0 *
10.234.251.37   0.0.0.0         255.255.255.255 UH    0      0        0 cali8748803e93d
10.234.251.40   0.0.0.0         255.255.255.255 UH    0      0        0 cali73d8903582e
10.234.251.42   0.0.0.0         255.255.255.255 UH    0      0        0 calie17843e7288
10.234.251.43   0.0.0.0         255.255.255.255 UH    0      0        0 cali1fa7203314c
10.234.251.44   0.0.0.0         255.255.255.255 UH    0      0        0 calibb90a85a5f3
10.234.251.45   0.0.0.0         255.255.255.255 UH    0      0        0 cali6d1f80af7f7

Running ip route get 10.234.238.2 returns the following route:

10.234.238.2 via 10.7.7.5 dev tunl0 src 10.234.248.0
    cache expires 380sec mtu 1440
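The lookup sends pod traffic through tunl0 with 10.7.7.5 as the next hop, so the encapsulated packet to 10.7.7.5 must itself be routed on the node. The sketch below lists every entry in the `route -n` table that covers a given address, which makes overlapping routes easy to spot. It is a simplified model that assumes a POSIX shell and the `route -n` column layout shown earlier, and it ignores metrics and policy routing. The helper names `ip_to_int` and `routes_matching` are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Read `route -n` output on stdin and print "destination genmask iface"
# for every route whose network contains the address given as $1.
routes_matching() {
  target=$(ip_to_int "$1")
  while read -r dest gw mask flags metric ref use iface; do
    case $dest in
      [0-9]*.[0-9]*.[0-9]*.[0-9]*) ;;   # looks like an IPv4 address
      *) continue ;;                    # skip header and blank lines
    esac
    m=$(ip_to_int "$mask")
    n=$(ip_to_int "$dest")
    if [ $(( target & m )) -eq $(( n & m )) ]; then
      echo "$dest $mask $iface"
    fi
  done
}

# Usage on a live node (left as a comment):
#   route -n | routes_matching 10.7.7.5
```

Against the table above, 10.7.7.5 is covered by the default route, by 10.7.0.0/20 and by 10.7.7.0/24, all on ens192. The kernel prefers the most specific match, so the /24 entry is the one actually used for the tunnel peer. The post does not say why that entry breaks connectivity, so this only narrows down where to look.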

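In this situation the pod network is unreachable: traceroute runs through every hop without ever finding the node, so nslookup fails. I later found that the culprit was the `10.7.7.0` route, so let's try deleting it.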

 ip route del 10.7.7.0/24 dev ens192

Now let's check the result:

# traceroute 10.234.238.2
traceroute to 10.234.238.2 (10.234.238.2), 30 hops max, 60 byte packets
 1  10.234.238.0 (10.234.238.0)  0.365 ms  0.322 ms  0.310 ms
 2  10.234.238.2 (10.234.238.2)  0.687 ms  0.876 ms  0.885 ms
 nslookup kubernetes.default
;; Got recursion not available from 10.233.0.3, trying next server
Server:         10.233.0.3
Address:        10.233.0.3#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.233.0.1
Licensed under CC BY-NC-SA 4.0
Last updated on May 09, 2025 06:14 UTC