
Netdata Logging

This document describes how Netdata generates its own logs, not how Netdata manages and queries log databases.

Log sources

Netdata supports the following log sources:

  1. daemon, logs generated by the Netdata daemon.
  2. collector, logs generated by Netdata collectors, including internal and external ones.
  3. access, API requests received by Netdata.
  4. health, all alert transitions and notifications.

Log outputs

For each log source, Netdata supports the following output methods:

  • off, to disable this log source.
  • journal, to send the logs to systemd-journal.
  • syslog, to send the logs to syslog.
  • system, to send the output to stderr or stdout depending on the log source.
  • stdout, to write the logs to Netdata's stdout.
  • stderr, to write the logs to Netdata's stderr.
  • filename, to send the logs to a file.

For daemon and collector the default is journal when systemd-journal is available. To decide if systemd-journal is available, Netdata checks:

  1. stderr is connected to systemd-journald
  2. /run/systemd/journal/socket exists
  3. /host/run/systemd/journal/socket exists (/host is configurable in containers)

If any of the above is detected, Netdata will select journal for daemon and collector sources.

All other sources default to a file.
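
The two socket checks above can be sketched in shell. This is only an illustration, not Netdata's actual detection code: `journal_socket_exists` is a hypothetical helper, its optional root prefix exists only to make it easy to test, and the check for stderr being connected to systemd-journald is not covered.

```shell
# Hypothetical helper mirroring the socket checks described above.
# $1: optional root prefix (empty by default), useful for testing.
# $2: host prefix, assumed to be /host as in containers.
journal_socket_exists() {
  root="${1:-}"
  host_prefix="${2:-/host}"
  [ -e "${root}/run/systemd/journal/socket" ] ||
    [ -e "${root}${host_prefix}/run/systemd/journal/socket" ]
}

if journal_socket_exists; then
  echo "journal socket found: daemon and collector would default to journal"
else
  echo "no journal socket found"
fi
```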

Log formats

| Format  | Description |
|---------|-------------|
| journal | journald-specific log format. Automatically selected when logging to systemd-journal. |
| logfmt  | logs data as a series of key/value pairs. The default when logging to any output other than journal. |
| json    | logs data in JSON format. |

Log levels

Each time Netdata logs, it assigns a priority to the log. It can be one of these (in order of importance):

| Level     | Description |
|-----------|-------------|
| emergency | a fatal condition, Netdata will most likely exit immediately after. |
| alert     | a very important issue that may affect how Netdata operates. |
| critical  | a very important issue the user should know about, which Netdata thinks it can survive. |
| error     | an error condition indicating that Netdata is trying to do something, but it fails. |
| warning   | something unexpected has happened that may or may not affect the operation of Netdata. |
| notice    | something that does not affect the operation of Netdata, but the user should notice. |
| info      | the default log level, for information the user should know. |
| debug     | these are more verbose logs that can be ignored. |
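
Netdata's level names differ slightly from the names `journalctl -p` accepts (for example `emergency` versus `emerg`), so a numeric priority can be easier to pass around. The sketch below assumes Netdata's levels follow the standard syslog numbering (0 for the most important, 7 for debug); `level_to_priority` is a hypothetical helper, not part of Netdata.

```shell
# Map a Netdata level name to a syslog priority number (0-7).
# Assumption: the levels listed above follow the usual syslog ordering.
level_to_priority() {
  case "$1" in
    emergency) echo 0 ;;
    alert)     echo 1 ;;
    critical)  echo 2 ;;
    error)     echo 3 ;;
    warning)   echo 4 ;;
    notice)    echo 5 ;;
    info)      echo 6 ;;
    debug)     echo 7 ;;
    *)         echo "unknown level: $1" >&2; return 1 ;;
  esac
}

# Example use (needs systemd-journal): show entries at warning or more important.
# journalctl --namespace=netdata -p "$(level_to_priority warning)"
```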

Logs Configuration

In netdata.conf, there are the following settings:

[logs]
	# logs to trigger flood protection = 1000
	# logs flood protection period = 60
	# facility = daemon
	# level = info
	# daemon = journal
	# collector = journal
	# access = /var/log/netdata/access.log
	# health = /var/log/netdata/health.log

  • logs to trigger flood protection and logs flood protection period enable logs flood protection for daemon and collector sources. They can also be configured per log source.
  • facility is used only when Netdata logs to syslog.
  • level defines the minimum log level that will be logged. This setting is applied only to daemon and collector sources. It can also be configured per source.

Configuring log sources

Each of the sources (daemon, collector, access, health) accepts the following:

source = {FORMAT},level={LEVEL},protection={LOGS}/{PERIOD}@{OUTPUT}

Where:

  • {FORMAT} is one of the log formats,
  • {LEVEL} is the minimum log level to be logged,
  • {LOGS} is the number of logs to trigger flood protection, configured per output,
  • {PERIOD} is the equivalent of logs flood protection period, configured per output,
  • {OUTPUT} is one of the log outputs.

All parameters can be omitted, except {OUTPUT}. If {OUTPUT} is the only given parameter, @ can be omitted.
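
To make the syntax concrete, the sketch below splits such a value into its parts: optional comma-separated parameters, then `@`, then the output. `parse_log_source` is a hypothetical helper written for illustration (Netdata parses the setting itself), and the example values are illustrative, not defaults.

```shell
# Hypothetical parser for a log source value, e.g.
#   json,level=warning,protection=100/60@stderr
parse_log_source() {
  value="$1"
  case "$value" in
    *@*) params="${value%%@*}"; output="${value#*@}" ;;  # split at the first @
    *)   params=""; output="$value" ;;                   # output only, @ omitted
  esac
  echo "output=$output"

  old_ifs="$IFS"; IFS=','
  for p in $params; do
    case "$p" in
      level=*)      echo "level=${p#level=}" ;;
      protection=*) prot="${p#protection=}"
                    echo "logs=${prot%%/*}"
                    echo "period=${prot#*/}" ;;
      *)            echo "format=$p" ;;
    esac
  done
  IFS="$old_ifs"
}

parse_log_source 'json,level=warning,protection=100/60@stderr'
# output=stderr
# format=json
# level=warning
# logs=100
# period=60

parse_log_source 'syslog'
# output=syslog
```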

Logs rotation

Netdata comes with a logrotate configuration to rotate its log files periodically.

The default is usually found in /etc/logrotate.d/netdata.

Sending a SIGHUP to Netdata will instruct it to re-open all its log files.
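
A minimal sketch of sending that signal by hand, for example after rotating files manually. `reopen_logs` is a hypothetical helper, and the PID file path in the usage line is an assumption: where (or whether) your installation writes a PID file depends on how Netdata was installed.

```shell
# Hypothetical helper: send SIGHUP to the process whose PID is stored in a file.
# $1: path to a PID file (its location is installation-specific).
reopen_logs() {
  pidfile="$1"
  if [ ! -r "$pidfile" ]; then
    echo "cannot read PID file: $pidfile" >&2
    return 1
  fi
  kill -HUP "$(cat "$pidfile")"
}

# Example use (the path is an assumption, adjust it to your system):
# reopen_logs /run/netdata/netdata.pid
```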

Log Fields

All fields exposed by Netdata

| journal | logfmt | json | Description |
|:---:|:---:|:---:|:---:|
| `_SOURCE_REALTIME_TIMESTAMP` | `time` | `time` | the timestamp of the event |
| `SYSLOG_IDENTIFIER` | `comm` | `comm` | the program logging the event |
| `ND_LOG_SOURCE` | `source` | `source` | one of the [log sources](#log-sources) |
| `PRIORITY` (numeric) | `level` (text) | `level` (numeric) | one of the [log levels](#log-levels) |
| `ERRNO` | `errno` | `errno` | the numeric value of `errno` |
| `INVOCATION_ID` | - | - | a unique UUID of the Netdata session, reset on every Netdata restart, inherited from systemd when available |
| `CODE_LINE` | - | - | the line number of the source code logging this event |
| `CODE_FILE` | - | - | the filename of the source code logging this event |
| `CODE_FUNCTION` | - | - | the function name of the source code logging this event |
| `TID` | `tid` | `tid` | the thread id of the thread logging this event |
| `THREAD_TAG` | `thread` | `thread` | the name of the thread logging this event |
| `MESSAGE_ID` | `msg_id` | `msg_id` | see [message IDs](#message-ids) |
| `ND_MODULE` | `module` | `module` | the Netdata module logging this event |
| `ND_NIDL_NODE` | `node` | `node` | the hostname of the node the event is related to |
| `ND_NIDL_INSTANCE` | `instance` | `instance` | the instance of the node the event is related to |
| `ND_NIDL_CONTEXT` | `context` | `context` | the context the event is related to (this is usually the chart name, as shown on Netdata dashboards) |
| `ND_NIDL_DIMENSION` | `dimension` | `dimension` | the dimension the event is related to |
| `ND_SRC_TRANSPORT` | `src_transport` | `src_transport` | when the event happened during a request, this is the request transport |
| `ND_SRC_IP` | `src_ip` | `src_ip` | when the event happened during an inbound request, this is the IP the request came from |
| `ND_SRC_PORT` | `src_port` | `src_port` | when the event happened during an inbound request, this is the port the request came from |
| `ND_SRC_FORWARDED_HOST` | `src_forwarded_host` | `src_forwarded_host` | the contents of the HTTP header `X-Forwarded-Host` |
| `ND_SRC_FORWARDED_FOR` | `src_forwarded_for` | `src_forwarded_for` | the contents of the HTTP header `X-Forwarded-For` |
| `ND_SRC_CAPABILITIES` | `src_capabilities` | `src_capabilities` | when the request came from a child, this is the communication capabilities of the child |
| `ND_DST_TRANSPORT` | `dst_transport` | `dst_transport` | when the event happened during an outbound request, this is the outbound request transport |
| `ND_DST_IP` | `dst_ip` | `dst_ip` | when the event happened during an outbound request, this is the IP of the request destination |
| `ND_DST_PORT` | `dst_port` | `dst_port` | when the event happened during an outbound request, this is the port of the request destination |
| `ND_DST_CAPABILITIES` | `dst_capabilities` | `dst_capabilities` | when the request goes to a parent, this is the communication capabilities of the parent |
| `ND_REQUEST_METHOD` | `req_method` | `req_method` | when the event happened during an inbound request, this is the method the request was received with |
| `ND_RESPONSE_CODE` | `code` | `code` | when responding to a request, this is the response code |
| `ND_CONNECTION_ID` | `conn` | `conn` | when there is a connection id for an inbound connection, this is the connection id |
| `ND_TRANSACTION_ID` | `transaction` | `transaction` | the transaction id (UUID) of all API requests |
| `ND_RESPONSE_SENT_BYTES` | `sent_bytes` | `sent_bytes` | the bytes we sent in API responses |
| `ND_RESPONSE_SIZE_BYTES` | `size_bytes` | `size_bytes` | the uncompressed bytes of the API responses |
| `ND_RESPONSE_PREP_TIME_USEC` | `prep_ut` | `prep_ut` | the time needed to prepare a response |
| `ND_RESPONSE_SENT_TIME_USEC` | `sent_ut` | `sent_ut` | the time needed to send a response |
| `ND_RESPONSE_TOTAL_TIME_USEC` | `total_ut` | `total_ut` | the total time needed to complete a response |
| `ND_ALERT_ID` | `alert_id` | `alert_id` | the alert id this event is related to |
| `ND_ALERT_EVENT_ID` | `alert_event_id` | `alert_event_id` | a sequential number of the alert transition (per host) |
| `ND_ALERT_UNIQUE_ID` | `alert_unique_id` | `alert_unique_id` | a sequential number of the alert transition (per alert) |
| `ND_ALERT_TRANSITION_ID` | `alert_transition_id` | `alert_transition_id` | the unique UUID of this alert transition |
| `ND_ALERT_CONFIG` | `alert_config` | `alert_config` | the alert configuration hash (UUID) |
| `ND_ALERT_NAME` | `alert` | `alert` | the alert name |
| `ND_ALERT_CLASS` | `alert_class` | `alert_class` | the alert classification |
| `ND_ALERT_COMPONENT` | `alert_component` | `alert_component` | the alert component |
| `ND_ALERT_TYPE` | `alert_type` | `alert_type` | the alert type |
| `ND_ALERT_EXEC` | `alert_exec` | `alert_exec` | the alert notification program |
| `ND_ALERT_RECIPIENT` | `alert_recipient` | `alert_recipient` | the alert recipient(s) |
| `ND_ALERT_VALUE` | `alert_value` | `alert_value` | the current alert value |
| `ND_ALERT_VALUE_OLD` | `alert_value_old` | `alert_value_old` | the previous alert value |
| `ND_ALERT_STATUS` | `alert_status` | `alert_status` | the current alert status |
| `ND_ALERT_STATUS_OLD` | `alert_status_old` | `alert_status_old` | the previous alert status |
| `ND_ALERT_UNITS` | `alert_units` | `alert_units` | the units of the alert |
| `ND_ALERT_SUMMARY` | `alert_summary` | `alert_summary` | the summary text of the alert |
| `ND_ALERT_INFO` | `alert_info` | `alert_info` | the info text of the alert |
| `ND_ALERT_DURATION` | `alert_duration` | `alert_duration` | the duration the alert was in its previous state |
| `ND_ALERT_NOTIFICATION_TIMESTAMP_USEC` | `alert_notification_timestamp` | `alert_notification_timestamp` | the timestamp the notification delivery is scheduled |
| `ND_REQUEST` | `request` | `request` | the full request during which the event happened |
| `MESSAGE` | `msg` | `msg` | the event message |

Message IDs

Netdata assigns specific message IDs to certain events:

  • ed4cdb8f1beb4ad3b57cb3cae2d162fa when a Netdata child connects to this Netdata
  • 6e2e3839067648968b646045dbf28d66 when this Netdata connects to a Netdata parent
  • 9ce0cb58ab8b44df82c4bf1ad9ee22de when alerts change state
  • 6db0018e83e34320ae2a659d78019fb7 when notifications are sent

You can view these events using the Netdata systemd-journal.plugin at the MESSAGE_ID filter, or using journalctl like this:

# query children connection
journalctl MESSAGE_ID=ed4cdb8f1beb4ad3b57cb3cae2d162fa

# query parent connection
journalctl MESSAGE_ID=6e2e3839067648968b646045dbf28d66

# query alert transitions
journalctl MESSAGE_ID=9ce0cb58ab8b44df82c4bf1ad9ee22de

# query alert notifications
journalctl MESSAGE_ID=6db0018e83e34320ae2a659d78019fb7

Using journalctl to query Netdata logs

The Netdata service's processes execute within the netdata journal namespace. To view the Netdata logs, you should specify the --namespace=netdata option.

# Netdata logs since the last time the service was started
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata

# All Netdata logs, the oldest entries are displayed first
journalctl -u netdata --namespace=netdata

# All Netdata logs, the newest entries are displayed first
journalctl -u netdata --namespace=netdata -r
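
Any journal field from the Log Fields table can be used as a `journalctl` match in the same way as `MESSAGE_ID`. The sketch below wraps a match on `ND_LOG_SOURCE`; `nd_logs_by_source` is a hypothetical helper, and it assumes the field values are the four source names listed under Log sources.

```shell
# Hypothetical wrapper: query the netdata journal namespace by log source.
# $1: one of daemon, collector, access, health; any further arguments are
# passed to journalctl unchanged.
nd_logs_by_source() {
  src="${1:-}"
  [ $# -gt 0 ] && shift
  case "$src" in
    daemon|collector|access|health) ;;
    *) echo "unknown log source: $src" >&2; return 2 ;;
  esac
  journalctl --namespace=netdata "ND_LOG_SOURCE=$src" "$@"
}

# Example use (needs systemd-journal): the 20 newest health entries, newest first.
# nd_logs_by_source health -r -n 20
```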