This alert is triggered when a systemd service unit enters the failed state. If you receive this alert, it means that a critical service on your system has stopped working, and it requires immediate attention.
A systemd service unit is, simply put, a configuration file that describes how a specific service should be controlled and managed on a Linux system. It includes information about service dependencies, the order in which the service should start, and more. Systemd is responsible for managing these services and making sure they are functioning as intended.
When a systemd service unit enters the failed state, it indicates that the service has encountered a fault, such as an invalid configuration file, a crash, or a failure to start because of a problem with one of its dependencies. When this occurs, the service is non-functional, and you should troubleshoot the issue to restore normal functionality.
Use the following command to list all failed service units:
systemctl --state=failed
Take note of the failed service unit name as you will use it in the next steps.
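If the unit names need to be passed to a script, the human-readable table is awkward to parse. The following is a minimal sketch rather than a definitive approach. The function name failed_units is hypothetical, and the sketch assumes the --no-legend and --plain options of systemctl list-units are available to drop the header, footer, and status glyphs:

```shell
# Print the name of every failed service unit, one per line.
# failed_units is a hypothetical helper name, not part of systemd.
failed_units() {
    systemctl list-units --type=service --state=failed --no-legend --plain \
        | awk '{ print $1 }'
}
```

Calling failed_units prints nothing when no service unit is in the failed state, so an empty result can be treated as "nothing to investigate".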
Use the following command to investigate the status and any error messages:
systemctl status <failed_service_unit>
Replace <failed_service_unit> with the name of the failed service unit you identified earlier.
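The status output is meant for reading. When only the reason for the failure is needed, systemctl show can print selected properties as KEY=VALUE lines. This is a sketch under assumptions: the helper name unit_failure_summary is hypothetical, and the chosen properties (ActiveState, SubState, Result, ExecMainStatus) are one reasonable selection, not the only one:

```shell
# Print the key failure properties of a unit as KEY=VALUE lines.
# unit_failure_summary is a hypothetical helper name.
# Result holds the failure reason (for example exit-code or timeout);
# ExecMainStatus holds the exit status of the main process.
unit_failure_summary() {
    systemctl show "$1" --property=ActiveState,SubState,Result,ExecMainStatus
}
```

For example, unit_failure_summary <failed_service_unit> would print one line per property for the unit you identified.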
Use the following command to inspect the logs for any clues:
journalctl -u <failed_service_unit> --since "1 hour ago"
Adjust the --since parameter to view logs from a specific timeframe.
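On a busy service the log can be long. One way to narrow it, sketched here under assumptions, is to filter by priority so that only error-level and more severe messages are shown. The helper name unit_errors is hypothetical, and the default window of "1 hour ago" simply mirrors the command above:

```shell
# Show only error-level (and more severe) messages for a unit.
# unit_errors is a hypothetical helper name.
# $1 is the unit name; $2 is an optional --since value.
unit_errors() {
    journalctl -u "$1" --since "${2:-1 hour ago}" -p err --no-pager
}
```

For example, unit_errors <failed_service_unit> "today" would limit the output to errors logged since midnight.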
Based on the information gathered from the status and logs, try to resolve the issue causing the failure. This can involve updating configuration files, installing missing dependencies, or addressing issues with other services that the failed service unit depends on.
Once the issue has been addressed, start the service again to restore functionality:
systemctl start <failed_service_unit>
Verify that the service has started successfully:
systemctl status <failed_service_unit>
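The start and verify steps can also be combined for use in a script. This is a minimal sketch, not a definitive implementation: the helper name start_and_verify is hypothetical, it assumes the caller has sufficient privileges to start the unit, and it only checks the state immediately after starting, so a service that crashes a few seconds later would still be reported as active:

```shell
# Start a unit and report whether it reached the active state.
# start_and_verify is a hypothetical helper name.
start_and_verify() {
    # Stop early if the start request itself fails.
    systemctl start "$1" || return 1
    # is-active --quiet prints nothing and returns 0 only when the unit is active.
    if systemctl is-active --quiet "$1"; then
        echo "$1 is active"
    else
        echo "$1 is still not active" >&2
        return 1
    fi
}
```

A non-zero return value means the unit did not come up, in which case the status and log commands above are the place to look next.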