Deleting evicted Pods in Kubernetes

Symptom

The Kubernetes cluster contains Pods in the Evicted state that have not been cleaned up.
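One way to spot these Pods is to filter the plain `kubectl get pods` listing on its STATUS column. The sketch below is only an illustration: `filter_evicted` is a hypothetical helper name, and it assumes the default column order of `kubectl get pods --all-namespaces --no-headers` (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/bin/sh
# Hypothetical helper: keep only rows whose STATUS column is "Evicted"
# and print "<namespace> <pod name>" for each of them.
# Assumes stdin is the output of:
#   kubectl get pods --all-namespaces --no-headers
filter_evicted() {
  awk '$4 == "Evicted" { print $1, $2 }'
}

# Usage against a live cluster (needs kubectl access, so it is left commented out):
#   kubectl get pods --all-namespaces --no-headers | filter_evicted
```

Because the helper only reads standard input, it can be tried on saved output before pointing it at a real cluster.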

Cause:

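Node-pressure eviction is the process by which the kubelet proactively terminates Pods to reclaim resources on a node.
The kubelet monitors resources such as memory, disk space, and filesystem inodes on the cluster's nodes. When one or more of these resources reach specific consumption levels, the kubelet can proactively fail one or more Pods on the node to reclaim resources and prevent starvation.
During a node-pressure eviction, the kubelet sets the phase of the selected Pods to Failed and terminates them.
Node-pressure eviction is not the same as API-initiated eviction. The kubelet does not respect your configured PodDisruptionBudget or the Pod's terminationGracePeriodSeconds.
In practice there is no need to worry about these Evicted entries; they can simply be deleted.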

To solve this, use a CronJob that deletes them periodically:

[root@k-m1 deletePodAuto]# cat ./*
apiVersion: v1
kind: Namespace
metadata:
  name: delete-evicted-pods
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: delete-evicted-pods
  namespace: delete-evicted-pods
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: delete-evicted-pods
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: delete-evicted-pods
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: delete-evicted-pods
subjects:
  - kind: ServiceAccount
    name: delete-evicted-pods
    namespace: delete-evicted-pods
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: delete-evicted-pods
  namespace: delete-evicted-pods
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: delete-evicted-pods
          containers:
          - name: kubectl-runner
            image: bitnami/kubectl:1.21.8
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - |
              # List every Failed Pod as "<name> <namespace> <creationTimestamp> <reason>",
              # then delete those whose reason is Evicted and that were created
              # more than 259200 seconds (3 days) ago.
              kubectl get pods --all-namespaces -o go-template='{{range .items}}{{if eq .status.phase "Failed"}}{{.metadata.name}} {{.metadata.namespace}} {{.metadata.creationTimestamp}} {{.status.reason}}{{"\n"}}{{end}}{{end}}' |
              while read -r epod namespace ct reason; do
                if [ "$reason" = "Evicted" ] && [ $(( $(date +%s) - $(date -d "$ct" +%s) )) -gt 259200 ]; then
                  echo "$(date '+%Y-%m-%d %H:%M:%S') delete $namespace $reason $epod"
                  kubectl -n "$namespace" delete pod "$epod"
                fi
              done
          restartPolicy: OnFailure
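
The decision the CronJob makes for each Pod (reason is Evicted, and the Pod was created more than 259200 seconds, i.e. 3 days, ago) can be isolated and tried without a cluster. The following is a minimal sketch under assumptions: `should_delete` is a hypothetical helper name that is not part of the manifest, and the `date -d` syntax assumes GNU date, as in the manifest's own command.

```shell
#!/bin/sh
# Hypothetical helper mirroring the age/reason test used by the CronJob.
# Arguments:
#   $1  Pod status reason (e.g. Evicted)
#   $2  Pod creationTimestamp in ISO 8601 form (e.g. 2023-05-01T00:00:00Z)
#   $3  current time as epoch seconds
#   $4  maximum age in seconds (the manifest uses 259200)
# Returns 0 when the Pod should be deleted, non-zero otherwise.
should_delete() {
  reason=$1
  created=$2
  now=$3
  max_age=$4
  # Only Pods evicted by the kubelet are candidates.
  [ "$reason" = "Evicted" ] || return 1
  # Convert the creation timestamp to epoch seconds (GNU date syntax).
  created_epoch=$(date -u -d "$created" +%s) || return 1
  # Delete only when the Pod is older than the threshold.
  [ $(( now - created_epoch )) -gt "$max_age" ]
}

# Example call (left commented out):
#   should_delete Evicted 2023-05-01T00:00:00Z "$(date +%s)" 259200 && echo "would delete"
```

Note that the age is measured from the Pod's creation time rather than from the moment of eviction, which matches what the manifest does.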


References:

https://www.jianshu.com/p/19dcf715bb28

https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/

https://blog.51cto.com/u_14035463/5627073

Licensed under CC BY-NC-SA 4.0
Last updated on Jan 06, 2025 05:52 UTC