Environment
Client Version: v1.23.7-0.46+42c05a54746880
Server Version: v1.23.7-0.46+42c05a54746880
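Problem
While checking a Kubernetes cluster that has been running for a long time, it was reported that changes to the steamer-prometheus DaemonSet had no effect, even though the Pod had been recreated and was in a normal state: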
[root@abc sre]# kubectl edit ds -n kube-system steamer-prometheus
daemonset.apps/steamer-prometheus edited
[root@abc sre]# kubectl get pods -n kube-system -l app=steamer-prometheus -w
NAME READY STATUS RESTARTS AGE
steamer-prometheus-rfm6d 1/1 Running 0 64s
[root@abc steamer]# kubectl get pod -n kube-system | grep prometheus
steamer-prometheus-adapter-7d985544-djrjl 1/1 Running 0 391d
steamer-prometheus-rfm6d 1/1 Running 1 (22m ago) 25m
The logs show nothing abnormal:
[root@abc steamer]# kubectl logs -f -n kube-system steamer-prometheus-rfm6d
SUCCESS: /etc/prometheus/master/prometheus.yaml is valid
reload success
Completed loading of configuration file
Access through the NodePort, however, never connects:
[root@abc steamer]# curl -v http://172.27.200.161:19990/-/ready
* About to connect() to 172.27.200.161 port 19990 (#0)
* Trying 172.27.200.161...
^C
The NodePort forwarding rules already exist in iptables:
[root@abc steamer]# iptables -t nat -L -n | grep 19990
KUBE-SVC-VUBGG4VPKG32A763 tcp -- 0.0.0.0/0 0.0.0.0/0 /* kube-system/steamer-prometheus:dns-tcp */ tcp dpt:19990
KUBE-MARK-MASQ tcp -- 0.0.0.0/0 0.0.0.0/0 /* kube-system/steamer-prometheus:dns-tcp */ tcp dpt:19990
[root@abc steamer]# iptables -t nat -L KUBE-SVC-VUBGG4VPKG32A763 -n -v
Chain KUBE-SVC-VUBGG4VPKG32A763 (2 references)
pkts bytes target prot opt in out source destination
189 11340 KUBE-MARK-MASQ tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* kube-system/steamer-prometheus:dns-tcp */ tcp dpt:19990
1863 112K KUBE-SEP-LCRKIBON22UWKUVO all -- * * 0.0.0.0/0 0.0.0.0/0 /* kube-system/steamer-prometheus:dns-tcp */
Checking the Service and its Endpoints shows Selector: <none>:
[root@abc steamer]# kubectl get endpoints -n kube-system steamer-prometheus -o wide
NAME ENDPOINTS AGE
steamer-prometheus 172.18.62.224:9090 391d
[root@abc steamer]# kubectl get pod -nkube-system -owide |grep prometheus
steamer-prometheus-adapter-7d985544-djrjl 1/1 Running 0 391d 172.18.62.225 172.27.200.161 <none> <none>
steamer-prometheus-rfm6d 1/1 Running 1 (35m ago) 38m 172.18.62.246 172.27.200.161 <none> <none>
[root@abc steamer]# kubectl get endpoints -n kube-system steamer-prometheus -o wide
NAME ENDPOINTS AGE
steamer-prometheus 172.18.62.224:9090 391d
[root@abc steamer]# kubectl describe svc steamer-prometheus -n kube-system
Name: steamer-prometheus
Namespace: kube-system
Labels: <none>
Annotations: <none>
Selector: <none>
Type: NodePort
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.96.0.6
IPs: 10.96.0.6
Port: dns-tcp 9090/TCP
TargetPort: 9090/TCP
NodePort: dns-tcp 19990/TCP
Endpoints: <none>
Session Affinity: None
External Traffic Policy: Cluster
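Events: <none>
So although iptables already contains the NodePort forwarding rules, the Service has no usable backend Endpoint: with no selector, the Endpoints object is not updated and still lists 172.18.62.224:9090, while the rebuilt Pod runs at 172.18.62.246. Traffic therefore never reaches the Prometheus Pod.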
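The comparison between the Endpoints addresses and the running Pod IPs can also be done mechanically. The following is a minimal sketch and was not part of the original session: stale_endpoints is a hypothetical helper name, it only compares two space-separated lists, and the kubectl queries in the comments assume the app=steamer-prometheus label used earlier.

```shell
# stale_endpoints ENDPOINT_IPS POD_IPS
# Prints every IP from the first space-separated list that does not
# appear in the second one. Hypothetical helper for illustration only.
stale_endpoints() {
  se_endpoint_ips="$1"
  se_pod_ips="$2"
  for se_ip in $se_endpoint_ips; do
    case " $se_pod_ips " in
      *" $se_ip "*) ;;        # address belongs to a running Pod
      *) echo "$se_ip" ;;     # no running Pod owns this address
    esac
  done
}

# Example usage (requires cluster access):
#   eps=$(kubectl get endpoints -n kube-system steamer-prometheus \
#     -o jsonpath='{.subsets[*].addresses[*].ip}')
#   pods=$(kubectl get pods -n kube-system -l app=steamer-prometheus \
#     -o jsonpath='{.items[*].status.podIP}')
#   stale_endpoints "$eps" "$pods"
```

With the values from the output above, stale_endpoints "172.18.62.224" "172.18.62.246" prints 172.18.62.224, the address that no longer belongs to any Pod.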
Adding selector: app: steamer-prometheus to the Service fixes this:
kubectl get endpoints -n kube-system steamer-prometheus
kubectl edit svc steamer-prometheus -n kube-system
apiVersion: v1
kind: Service
metadata:
  ...
  labels:
    app: steamer-prometheus
  name: steamer-prometheus
  namespace: kube-system
  ...
spec:
  ...
  ports:
  - name: dns-tcp
    nodePort: 19990
    port: 9090
    protocol: TCP
    targetPort: 9090
  selector:
    app: steamer-prometheus
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
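The same change can be applied without opening an editor. This is a sketch under assumptions, not what the original session did: it uses kubectl patch with a JSON merge patch instead of kubectl edit, and selector_patch is a hypothetical helper that only builds the patch body for a single-label selector.

```shell
# selector_patch KEY VALUE
# Prints a JSON merge patch that sets a one-label selector on a Service.
# Hypothetical helper; it does no escaping, so pass plain label strings.
selector_patch() {
  printf '{"spec":{"selector":{"%s":"%s"}}}\n' "$1" "$2"
}

# Example usage (requires cluster access):
#   kubectl patch svc steamer-prometheus -n kube-system --type merge \
#     -p "$(selector_patch app steamer-prometheus)"
#   kubectl get endpoints -n kube-system steamer-prometheus
#   curl -v http://172.27.200.161:19990/-/ready
```

If the selector matches the Pod labels, the Endpoints object should then list the current Pod IP rather than the stale address, and the NodePort should become reachable.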