Netdata Logging

This document describes how Netdata generates its own logs, not how Netdata manages and queries log databases.

Log sources

Netdata supports the following log sources:

  1. daemon, logs generated by the Netdata daemon.
  2. collector, logs generated by Netdata collectors, including internal and external ones.
  3. access, API requests received by Netdata.
  4. health, all alert transitions and notifications.

Log outputs

For each log source, Netdata supports the following output methods:

  • off, to disable this log source.
  • journal, to send the logs to systemd-journal.
  • syslog, to send the logs to syslog.
  • system, to send the output to stderr or stdout depending on the log source.
  • stdout, to write the logs to Netdata's stdout.
  • stderr, to write the logs to Netdata's stderr.
  • filename, to send the logs to a file.

For daemon and collector the default is journal when systemd-journal is available. To decide if systemd-journal is available, Netdata checks:

  1. stderr is connected to systemd-journald
  2. /run/systemd/journal/socket exists
  3. /host/run/systemd/journal/socket exists (/host is configurable in containers)

If any of the above is detected, Netdata will select journal for daemon and collector sources.

All other sources default to a file.
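
The three checks above can be sketched in Python as follows. This is only an illustration, not Netdata's implementation: the stderr check assumes the `JOURNAL_STREAM` convention that systemd exports to services (a `device:inode` pair describing the stream connected to journald), and the function names and the `host_prefix` parameter are hypothetical.

```python
import os

JOURNAL_SOCKET = "/run/systemd/journal/socket"


def stderr_is_journal_stream(environ=None, fd=2):
    """Check 1: is `fd` the stream that systemd connected to journald?

    Assumption: uses the JOURNAL_STREAM variable ("device:inode");
    Netdata's own check may work differently.
    """
    environ = os.environ if environ is None else environ
    try:
        device, inode = (
            int(part) for part in environ.get("JOURNAL_STREAM", "").split(":")
        )
        stat = os.fstat(fd)
    except (ValueError, OSError):
        # Variable missing or malformed, or fd not open.
        return False
    return (stat.st_dev, stat.st_ino) == (device, inode)


def journal_available(host_prefix="/host", environ=None, exists=os.path.exists):
    """Return True if any of the three checks succeeds."""
    return (
        stderr_is_journal_stream(environ)          # check 1
        or exists(JOURNAL_SOCKET)                  # check 2
        or exists(host_prefix + JOURNAL_SOCKET)    # check 3 (containers)
    )
```

Passing `exists` and `environ` as parameters keeps the sketch testable without a running systemd.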

Log formats

Format Description
journal journald-specific log format. Automatically selected when logging to systemd-journal.
logfmt logs data as a series of key/value pairs. The default when logging to any output other than journal.
json logs data in JSON format.
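
To make the difference between logfmt and json concrete, here is a minimal Python sketch that renders the same event both ways. The quoting rule used for logfmt values is an assumption, and the helper names are hypothetical; Netdata's exact escaping may differ.

```python
import json


def to_logfmt(fields):
    """Render a dict as space-separated key=value pairs."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        # Assumed rule: quote empty values and values containing
        # spaces, double quotes or "=".
        if text == "" or any(ch in text for ch in ' "='):
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


def to_json(fields):
    """Render the same dict as a single JSON object."""
    return json.dumps(fields)


event = {"comm": "netdata", "source": "daemon", "msg": "hello world"}
print(to_logfmt(event))  # comm=netdata source=daemon msg="hello world"
print(to_json(event))    # {"comm": "netdata", "source": "daemon", "msg": "hello world"}
```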

Log levels

Each time Netdata logs a message, it assigns it a priority, which is one of the following (in decreasing order of importance):

Level Description
emergency a fatal condition, Netdata will most likely exit immediately after.
alert a very important issue that may affect how Netdata operates.
critical a very important issue the user should know about, which Netdata thinks it can survive.
error an error condition: Netdata tried to do something, but it failed.
warning something unexpected has happened that may or may not affect the operation of Netdata.
notice something that does not affect the operation of Netdata, but the user should notice.
info the default log level about information the user should know.
debug these are more verbose logs that can be ignored.
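
The ordering of the levels can be expressed as a small Python sketch. The numeric values follow the standard syslog severities (0 for emergency through 7 for debug); that Netdata's numeric priorities match them exactly is an assumption here, and `should_log` is a hypothetical helper showing how a minimum level filters messages.

```python
# Levels from most to least important; the index is the syslog severity.
LEVELS = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "info", "debug",
]
PRIORITY = {name: number for number, name in enumerate(LEVELS)}


def should_log(message_level, minimum_level="info"):
    """A message passes when it is at least as important as the minimum.

    Lower numbers are more important, hence the <= comparison.
    """
    return PRIORITY[message_level] <= PRIORITY[minimum_level]


print(should_log("error"))           # True
print(should_log("debug"))           # False
print(should_log("debug", "debug"))  # True
```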

Logs Configuration

In netdata.conf, there are the following settings:

[logs]
	# logs to trigger flood protection = 1000
	# logs flood protection period = 60
	# facility = daemon
	# level = info
	# daemon = journal
	# collector = journal
	# access = /var/log/netdata/access.log
	# health = /var/log/netdata/health.log

  • logs to trigger flood protection and logs flood protection period enable log flood protection for the daemon and collector sources. They can also be configured per log source.
  • facility is used only when Netdata logs to syslog.
  • level defines the minimum level a message must have to be logged. This setting applies only to the daemon and collector sources. It can also be configured per source.
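
One way to picture flood protection is as a counter over a fixed time window: up to a given number of logs pass in each period, and the rest are dropped until the next period starts. The Python sketch below illustrates that idea with the default values shown above; it is an assumption about the general technique, not a description of Netdata's actual algorithm, and the class name is hypothetical.

```python
import time


class FloodProtection:
    """Fixed-window limiter: allow `logs` entries per `period` seconds."""

    def __init__(self, logs=1000, period=60, clock=time.monotonic):
        self.logs = logs
        self.period = period
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def allow(self):
        """Return True if the next log entry should be written."""
        now = self.clock()
        if now - self.window_start >= self.period:
            # A new window begins: reset the counter.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.logs


limiter = FloodProtection(logs=2, period=60)
print([limiter.allow() for _ in range(3)])  # [True, True, False]
```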

Configuring log sources

Each of the sources (daemon, collector, access, health) accepts the following:

source = {FORMAT},level={LEVEL},protection={LOGS}/{PERIOD}@{OUTPUT}

Where:

  • {FORMAT} is one of the log formats,
  • {LEVEL} is the minimum log level to be logged,
  • {LOGS} is the number of logs to trigger flood protection, configured per output,
  • {PERIOD} is the equivalent of logs flood protection period, configured per output,
  • {OUTPUT} is one of the log outputs.

All parameters can be omitted, except {OUTPUT}. If {OUTPUT} is the only given parameter, @ can be omitted.
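
To illustrate how such a value breaks down, here is a small Python sketch of a parser for the syntax described above. It is not Netdata's parser: the function name and the returned dictionary layout are hypothetical, the example values are illustrative, and it assumes the output itself contains no "@" when options are given.

```python
def parse_source(spec):
    """Split '{FORMAT},level=..,protection=LOGS/PERIOD@{OUTPUT}' into parts."""
    options, sep, output = spec.strip().partition("@")
    if not sep:
        # "@" omitted: the whole value is the output.
        options, output = "", options
    result = {"format": None, "level": None, "logs": None,
              "period": None, "output": output.strip()}
    if not result["output"]:
        raise ValueError("an output is required")
    for token in (t.strip() for t in options.split(",")):
        if not token:
            continue
        key, eq, value = token.partition("=")
        if not eq:
            result["format"] = key           # bare token: the log format
        elif key == "level":
            result["level"] = value
        elif key == "protection":
            logs, _, period = value.partition("/")
            result["logs"], result["period"] = int(logs), int(period)
        else:
            raise ValueError(f"unknown option: {key}")
    return result


print(parse_source("journal")["output"])  # journal
print(parse_source("json,level=debug,protection=100/30@stderr"))
```

Every option is optional in the sketch, so `level=warning@syslog` and a bare `stderr` are both accepted, matching the rule that only the output is mandatory.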

Logs rotation

Netdata comes with a logrotate configuration to rotate its log files periodically.

The default is usually found in /etc/logrotate.d/netdata.

Sending a SIGHUP to Netdata will instruct it to re-open all its log files.
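
As a minimal sketch, the signal can be sent from Python with the standard library. The helper name is hypothetical, and how the Netdata process id is obtained (for example from your service manager) is left to the caller.

```python
import os
import signal


def request_log_reopen(pid):
    """Send SIGHUP to the process with the given pid.

    `pid` must be the process id of the running Netdata daemon; the
    caller needs permission to signal that process.
    """
    os.kill(pid, signal.SIGHUP)
```

A custom rotation script would typically call this right after moving the old log files aside, so that Netdata starts writing to fresh files.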

Log Fields

Netdata exposes the following fields in its logs:

journal logfmt json Description
_SOURCE_REALTIME_TIMESTAMP time time the timestamp of the event
SYSLOG_IDENTIFIER comm comm the program logging the event
ND_LOG_SOURCE source source one of the log sources
PRIORITY (numeric) level (text) level (numeric) one of the log levels
ERRNO errno errno the numeric value of errno
INVOCATION_ID - - a unique UUID of the Netdata session, reset on every Netdata restart, inherited from systemd when available
CODE_LINE - - the line number of the source code logging this event
CODE_FILE - - the filename of the source code logging this event
CODE_FUNCTION - - the function name of the source code logging this event
TID tid tid the thread id of the thread logging this event
THREAD_TAG thread thread the name of the thread logging this event
MESSAGE_ID msg_id msg_id see message IDs
ND_MODULE module module the Netdata module logging this event
ND_NIDL_NODE node node the hostname of the node the event is related to
ND_NIDL_INSTANCE instance instance the instance of the node the event is related to
ND_NIDL_CONTEXT context context the context the event is related to (this is usually the chart name, as shown on Netdata dashboards)
ND_NIDL_DIMENSION dimension dimension the dimension the event is related to
ND_SRC_TRANSPORT src_transport src_transport when the event happened during a request, this is the request transport
ND_SRC_IP src_ip src_ip when the event happened during an inbound request, this is the IP the request came from
ND_SRC_PORT src_port src_port when the event happened during an inbound request, this is the port the request came from
ND_SRC_CAPABILITIES src_capabilities src_capabilities when the request came from a child, this is the communication capabilities of the child
ND_DST_TRANSPORT dst_transport dst_transport when the event happened during an outbound request, this is the outbound request transport
ND_DST_IP dst_ip dst_ip when the event happened during an outbound request, this is the IP of the request destination
ND_DST_PORT dst_port dst_port when the event happened during an outbound request, this is the port of the request destination
ND_DST_CAPABILITIES dst_capabilities dst_capabilities when the request goes to a parent, this is the communication capabilities of the parent
ND_REQUEST_METHOD req_method req_method when the event happened during an inbound request, this is the method of the received request
ND_RESPONSE_CODE code code when responding to a request, this is the response code
ND_CONNECTION_ID conn conn when there is a connection id for an inbound connection, this is the connection id
ND_TRANSACTION_ID transaction transaction the transaction id (UUID) of all API requests
ND_RESPONSE_SENT_BYTES sent_bytes sent_bytes the bytes we sent to API responses
ND_RESPONSE_SIZE_BYTES size_bytes size_bytes the uncompressed bytes of the API responses
ND_RESPONSE_PREP_TIME_USEC prep_ut prep_ut the time needed to prepare a response
ND_RESPONSE_SENT_TIME_USEC sent_ut sent_ut the time needed to send a response
ND_RESPONSE_TOTAL_TIME_USEC total_ut total_ut the total time needed to complete a response
ND_ALERT_ID alert_id alert_id the alert id this event is related to
ND_ALERT_EVENT_ID alert_event_id alert_event_id a sequential number of the alert transition (per host)
ND_ALERT_UNIQUE_ID alert_unique_id alert_unique_id a sequential number of the alert transition (per alert)
ND_ALERT_TRANSITION_ID alert_transition_id alert_transition_id the unique UUID of this alert transition
ND_ALERT_CONFIG alert_config alert_config the alert configuration hash (UUID)
ND_ALERT_NAME alert alert the alert name
ND_ALERT_CLASS alert_class alert_class the alert classification
ND_ALERT_COMPONENT alert_component alert_component the alert component
ND_ALERT_TYPE alert_type alert_type the alert type
ND_ALERT_EXEC alert_exec alert_exec the alert notification program
ND_ALERT_RECIPIENT alert_recipient alert_recipient the alert recipient(s)
ND_ALERT_VALUE alert_value alert_value the current alert value
ND_ALERT_VALUE_OLD alert_value_old alert_value_old the previous alert value
ND_ALERT_STATUS alert_status alert_status the current alert status
ND_ALERT_STATUS_OLD alert_status_old alert_status_old the previous alert status
ND_ALERT_UNITS alert_units alert_units the units of the alert
ND_ALERT_SUMMARY alert_summary alert_summary the summary text of the alert
ND_ALERT_INFO alert_info alert_info the info text of the alert
ND_ALERT_DURATION alert_duration alert_duration the duration the alert was in its previous state
ND_ALERT_NOTIFICATION_TIMESTAMP_USEC alert_notification_timestamp alert_notification_timestamp the timestamp at which the notification delivery is scheduled
ND_REQUEST request request the full request during which the event happened
MESSAGE msg msg the event message
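
When logs are written in logfmt, the fields above can be recovered from a line with a few lines of Python. The sketch below assumes shell-like double quoting of values that contain spaces, which is an assumption about the output and not a guarantee; the sample line and the function name are illustrative.

```python
import shlex


def parse_logfmt(line):
    """Turn 'key=value key="quoted value"' into a dict of strings."""
    fields = {}
    # shlex.split honours double quotes, so a quoted value with spaces
    # stays in one token and its surrounding quotes are removed.
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


line = 'comm=netdata source=health level=warning alert=disk_space msg="alert raised"'
event = parse_logfmt(line)
print(event["source"])  # health
print(event["msg"])     # alert raised
```

An unquoted value containing an apostrophe would confuse `shlex`, so a production parser would need its own tokenizer.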

Message IDs

Netdata assigns specific message IDs to certain events:

  • ed4cdb8f1beb4ad3b57cb3cae2d162fa when a Netdata child connects to this Netdata
  • 6e2e3839067648968b646045dbf28d66 when this Netdata connects to a Netdata parent
  • 9ce0cb58ab8b44df82c4bf1ad9ee22de when alerts change state
  • 6db0018e83e34320ae2a659d78019fb7 when notifications are sent

You can view these events in the Netdata systemd-journal.plugin by using the MESSAGE_ID filter, or with journalctl like this:

# query children connection
journalctl MESSAGE_ID=ed4cdb8f1beb4ad3b57cb3cae2d162fa

# query parent connection
journalctl MESSAGE_ID=6e2e3839067648968b646045dbf28d66

# query alert transitions
journalctl MESSAGE_ID=9ce0cb58ab8b44df82c4bf1ad9ee22de

# query alert notifications
journalctl MESSAGE_ID=6db0018e83e34320ae2a659d78019fb7
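
The same queries can be assembled from a script. In this Python sketch, the dictionary keys are hypothetical labels chosen for illustration; only the message IDs and the `MESSAGE_ID=` match syntax come from the text above.

```python
# Message IDs listed above, keyed by illustrative (hypothetical) labels.
MESSAGE_IDS = {
    "child_connection": "ed4cdb8f1beb4ad3b57cb3cae2d162fa",
    "parent_connection": "6e2e3839067648968b646045dbf28d66",
    "alert_transition": "9ce0cb58ab8b44df82c4bf1ad9ee22de",
    "alert_notification": "6db0018e83e34320ae2a659d78019fb7",
}


def journalctl_command(event, extra_args=()):
    """Build the argument list for a journalctl query on one event type."""
    return ["journalctl", f"MESSAGE_ID={MESSAGE_IDS[event]}", *extra_args]


print(journalctl_command("alert_transition"))
# ['journalctl', 'MESSAGE_ID=9ce0cb58ab8b44df82c4bf1ad9ee22de']
```

The resulting list can be passed to `subprocess.run` on a host where journalctl is installed; `extra_args` is the place for options such as `--since`.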