This alert calculates the average cgroup CPU utilization over the past 10 minutes in a Kubernetes cluster. If you receive this alert at the warning or critical levels, it means that your cgroup is heavily utilizing the available CPU resources.
Cgroups (control groups) are a Linux kernel feature that limits and isolates the resource usage (CPU, memory, disk I/O, etc.) of a collection of processes. Kubernetes uses cgroups to enforce the resource requests and limits of pods and containers. The cgroup CPU utilization measures the percentage of available CPU resources consumed by the processes within a cgroup.
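As a rough illustration of how such a percentage can be derived, the sketch below computes utilization from two samples of the `usage_usec` counter in a cgroup v2 `cpu.stat` file. The function names, the example path, and the choice of "number of CPUs available" as the denominator are assumptions for illustration; the alert's own calculation may differ (for example, it may be measured against a configured CPU limit).

```shell
#!/bin/sh
# Sketch only: function names are illustrative and not part of any tool.

# read_usage_usec CPU_STAT_FILE
# Print the cumulative CPU time (microseconds) recorded in a cgroup v2 cpu.stat file.
read_usage_usec() {
    awk '$1 == "usage_usec" { print $2 }' "$1"
}

# cgroup_cpu_percent START_USEC END_USEC INTERVAL_SEC NUM_CPUS
# Utilization = CPU time consumed / CPU time available during the interval.
cgroup_cpu_percent() {
    awk -v s="$1" -v e="$2" -v t="$3" -v n="$4" \
        'BEGIN { printf "%.1f\n", (e - s) / (t * 1000000 * n) * 100 }'
}

# Example (the path is an assumption; use the cgroup named in the alert):
#   a=$(read_usage_usec /sys/fs/cgroup/kubepods.slice/cpu.stat); sleep 10
#   b=$(read_usage_usec /sys/fs/cgroup/kubepods.slice/cpu.stat)
#   cgroup_cpu_percent "$a" "$b" 10 "$(nproc)"
```

A cgroup that consumed 5 seconds of CPU time over a 10-second window on a single CPU would be reported as 50.0.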
Check the alert message for the specific cgroup that is causing high CPU utilization.
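To find the cgroups with the highest CPU usage, you can use systemd-cgtop on the Kubernetes nodes. The -c option orders the output by CPU load; CPU figures only appear after the first refresh, so run more than one iteration:
systemd-cgtop -c -n 10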
Use kubectl top to get an overview of the resource usage in your Kubernetes cluster:
kubectl top nodes
kubectl top pods
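To narrow the overview down to the heaviest consumers, the output of kubectl top pods can be filtered. The helper below is a minimal sketch: `pods_over_cpu` is a hypothetical name, and it assumes the three-column `NAME CPU MEMORY` layout printed with `--no-headers` for a single namespace, with CPU expressed either in millicores (`250m`) or whole cores (`2`).

```shell
#!/bin/sh
# pods_over_cpu THRESHOLD_MILLICORES
# Reads `kubectl top pods --no-headers` output on stdin and prints the pods
# whose CPU usage is at or above the threshold, normalized to millicores.
pods_over_cpu() {
    awk -v limit="$1" '{
        cpu = $2
        if (cpu ~ /m$/) { sub(/m$/, "", cpu) }   # "250m" -> 250 millicores
        else            { cpu = cpu * 1000 }     # "2" cores -> 2000 millicores
        if (cpu + 0 >= limit + 0) print $1, cpu "m"
    }'
}

# Example (requires cluster access):
#   kubectl top pods -n <namespace> --no-headers | pods_over_cpu 500
```

Sorting with `kubectl top pods --sort-by=cpu` is an alternative when a fixed threshold is not needed.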
Examine the events and logs of the Kubernetes cluster and the specific resources that are causing the high CPU utilization:
kubectl get events --sort-by='.metadata.creationTimestamp'
kubectl logs <pod-name> -n <namespace> --timestamps -f
You may need to scale your cluster by adding more resources, adjusting the resource limits, or optimizing the application code to minimize CPU usage.
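When deciding whether a CPU limit needs adjusting, it can help to check how often the cgroup is being throttled. The sketch below assumes cgroup v2 with the cpu controller enabled, where `cpu.stat` exposes the `nr_periods` and `nr_throttled` counters; `throttled_percent` is a hypothetical helper name and the example path is a placeholder.

```shell
#!/bin/sh
# throttled_percent CPU_STAT_FILE
# Print the share of enforcement periods in which the cgroup was throttled.
# A persistently high value suggests the CPU limit is too low for the workload.
throttled_percent() {
    awk '$1 == "nr_periods"   { p = $2 }
         $1 == "nr_throttled" { t = $2 }
         END { if (p > 0) printf "%.1f\n", t / p * 100; else print "0.0" }' "$1"
}

# Example (the path is an assumption; use the cgroup named in the alert):
#   throttled_percent /sys/fs/cgroup/kubepods.slice/cpu.stat
```

If throttling is frequent, the limit on the affected workload can be raised, for example with `kubectl set resources deployment <name> --limits=cpu=<value>`, or the workload can be spread across more replicas with `kubectl scale`.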