This alert, load_cpu_number, calculates the base trigger point for load average alarms, which helps identify when the system is overloaded. The alert checks the maximum number of CPUs in the system over the past minute. If there is only one CPU, the trigger is set at 2.
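The rule above can be sketched in a few lines of shell. This is only an illustration, not the alert's actual definition: the function name base_trigger is hypothetical, and the sketch assumes that on a machine with more than one CPU the trigger simply equals the CPU count.

```shell
# Hypothetical helper: derive a base trigger from a CPU count.
# Assumption: with two or more CPUs the trigger equals the CPU count;
# with a single CPU it is fixed at 2, as described above.
base_trigger() {
  cpus="$1"
  if [ "$cpus" -lt 2 ]; then
    echo 2
  else
    echo "$cpus"
  fi
}

# Usage: nproc (coreutils) prints the number of available processing units.
base_trigger "$(nproc)"
```

On a single-CPU machine this prints 2; on an 8-CPU machine it prints 8.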
On a Linux machine, the system load average measures the number of threads that are currently working and those waiting to work (on CPU, disk, or uninterruptible locks). In simpler terms, the load average measures the number of threads that aren't idle.
A system is overloaded when the demand on its resources (CPUs, disks, etc.) exceeds its capacity to handle tasks. This can lead to increased wait times, slower processing, and, in the worst cases, system crashes.
Use the uptime command in the terminal to see the current load average:
uptime
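To judge whether a load value is high for a given machine, it can help to divide it by the number of CPUs. The following is a minimal sketch, assuming a Linux system where /proc/loadavg is readable; the function name load_per_cpu is hypothetical and is not part of any tool mentioned here.

```shell
# Hypothetical helper: divide a load value by a CPU count.
load_per_cpu() {
  awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# The first field of /proc/loadavg is the 1-minute load average (Linux only).
if [ -r /proc/loadavg ]; then
  read -r one_min _rest < /proc/loadavg
  load_per_cpu "$one_min" "$(nproc)"
fi
```

A result that stays well above 1.00 suggests that more threads are demanding resources than the CPUs can serve at once.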
Use vmstat (or vmstat 1, to set a delay between updates in seconds) to get a report on system statistics.
The procs column shows:
r: The number of runnable processes (running or waiting for run time).
b: The number of processes blocked waiting for I/O to complete.
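The r and b values can be pulled out of the vmstat report for quick inspection. The sketch below assumes the default vmstat layout, with two header lines followed by one line per sample and r and b as the first two fields; procs_columns is a hypothetical helper name.

```shell
# Hypothetical helper: read vmstat output on stdin and print the r and b
# values of each sample, skipping the two header lines.
procs_columns() {
  awk 'NR > 2 { print "r=" $1, "b=" $2 }'
}

# Take three samples one second apart, if vmstat is installed.
if command -v vmstat >/dev/null 2>&1; then
  vmstat 1 3 | procs_columns
fi
```

An r value that stays above the number of CPUs points to CPU contention, while a persistently nonzero b value points to processes waiting on I/O.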
a. Use top to see the processes that are the main CPU consumers:
top -o +%CPU -i
b. Use iotop to monitor disk I/O usage (install it if not available):
sudo iotop