Netdata v2.3.0 changes how ephemeral nodes are defined and managed in distributed monitoring environments. This update enhances monitoring reliability while providing flexibility for dynamic infrastructure management.
Key Changes:
Netdata now defines ephemeral nodes as "nodes that are expected to disconnect without raising alerts," replacing the previous definition of nodes that are forgotten after one day of disconnection. This change provides three major benefits:
Netdata supports two types of nodes:
Type | Description | Common Examples |
---|---|---|
Ephemeral | Nodes expected to disconnect or reconnect frequently | • Auto-scaling cloud instances • Dynamic containers and VMs • IoT devices with intermittent connectivity • Development/test environments with frequent restarts |
Permanent | Nodes expected to maintain continuous connectivity | • Production servers • Core infrastructure nodes • Critical monitoring systems • Stable database servers |
Note: Disconnections in permanent nodes indicate potential system failures requiring immediate attention.
By default, Netdata treats all nodes as permanent. To mark a node as ephemeral:

1. Edit `netdata.conf` on the target node.
2. Add the following configuration:

    [global]
    is ephemeral node = yes

3. Restart the node.

This configuration sets the `_is_ephemeral` host label, which propagates to Netdata Parents and Netdata Cloud.
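As a quick check before restarting, the following Python sketch parses `netdata.conf`-style text and reports whether the node would be marked ephemeral. It assumes the file is simple enough for Python's `configparser` (a real `netdata.conf` may contain constructs it rejects); the function name `is_ephemeral` and the sample text are illustrative and not part of Netdata.

```python
import configparser


def is_ephemeral(conf_text: str) -> bool:
    """Return True if [global] sets 'is ephemeral node' to a truthy value."""
    # strict=False tolerates repeated sections/options; interpolation is
    # disabled so '%' characters in other values are left untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Nodes are permanent by default, so a missing section or key means False.
    return parser.getboolean("global", "is ephemeral node", fallback=False)


sample = """
[global]
is ephemeral node = yes
"""

print(is_ephemeral(sample))        # True
print(is_ephemeral("[global]\n"))  # False
```

To check a real file, read its text (for example with `pathlib.Path(...).read_text()`) and pass it to the function; the file's location depends on the installation.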
Netdata v2.3.0 adds two alerts specifically for permanent nodes:
Alert | Triggers |
---|---|
streaming_never_connected | When permanent nodes have never connected to a Netdata Parent |
streaming_disconnected | When previously connected permanent nodes disconnect |
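
The table can be read as a small decision rule. The Python sketch below is only a model of that rule for illustration, not Netdata's alert implementation; the `NodeState` fields and the `streaming_alert` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeState:
    ephemeral: bool       # True if the node is marked as ephemeral
    ever_connected: bool  # has the node ever connected to a Parent?
    connected_now: bool   # is the node currently connected?


def streaming_alert(node: NodeState) -> Optional[str]:
    """Return the name of the alert from the table that applies, or None."""
    if node.ephemeral:
        return None  # both alerts apply to permanent nodes only
    if not node.ever_connected:
        return "streaming_never_connected"
    if not node.connected_now:
        return "streaming_disconnected"
    return None


print(streaming_alert(NodeState(ephemeral=False, ever_connected=True, connected_now=False)))
```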
To investigate an alert:

1. Open the `Top` tab in your dashboard.
2. Select the `Netdata-streaming` function.
3. Filter by `Ephemerality` to focus on permanent nodes.
4. Use the `InStatus`, `InReason`, and `InAge` columns to analyze nodes connecting to this Parent.
5. Use the `OutStatus`, `OutReason`, and `OutAge` columns to analyze this Parent's restreaming to other Parent nodes.

To clear alerts for permanently offline nodes:
netdatacli mark-stale-nodes-ephemeral <node_id | machine_guid | hostname | ALL_NODES>
Note: Nodes will revert to permanent status if they reconnect, unless they are configured as ephemeral in their `netdata.conf`.
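
For scripted cleanup, the command can be wrapped so that the target is validated before anything runs. The Python sketch below builds exactly the argument list shown above and defaults to a dry run; the helper name `mark_stale` and the example hostname are hypothetical, and a real run requires `netdatacli` to be available on the host.

```python
import subprocess
from typing import List


def mark_stale(target: str, dry_run: bool = True) -> List[str]:
    """Build (and optionally run) the mark-stale-nodes-ephemeral command.

    target is a node_id, machine_guid, hostname, or the literal ALL_NODES.
    """
    target = target.strip()
    if not target:
        raise ValueError("target must be a node_id, machine_guid, hostname, or ALL_NODES")
    cmd = ["netdatacli", "mark-stale-nodes-ephemeral", target]
    if not dry_run:
        # Raises CalledProcessError if netdatacli exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run with a hypothetical hostname: prints the command without executing it.
print(" ".join(mark_stale("web-01.example.internal")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual hostnames.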
Starting with v2.3.0, Netdata Cloud sends node-unreachable notifications exclusively for permanent nodes, improving alert relevance.
The automatic removal of disconnected ephemeral nodes is disabled by default in v2.3.0+. To enable this feature:

1. Edit the `netdata.conf` file on Netdata Parent nodes.
2. Add the following configuration:

    [db]
    cleanup ephemeral hosts after = 1d

3. Restart the node.

This setting removes ephemeral nodes from queries 24 hours after disconnection. When all parent nodes remove a node, Netdata Cloud automatically deletes it too.
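
The retention rule can be illustrated with a short Python sketch. It models only what the paragraph above states: a Parent drops an ephemeral node once the configured period has passed since disconnection, and Netdata Cloud deletes the node once every Parent has dropped it. The duration parser accepts only a simplified `<number><unit>` form and is an assumption rather than Netdata's own parser, treating the boundary as inclusive is also an assumption, and all function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict

UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse a simplified duration such as '1d' or '12h' (assumed format)."""
    text = text.strip()
    if len(text) < 2 or text[-1] not in UNITS or not text[:-1].isdigit():
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{UNITS[text[-1]]: int(text[:-1])})


def parent_removes(ephemeral: bool, disconnected_at: datetime,
                   now: datetime, cleanup_after: str = "1d") -> bool:
    """True if a Parent would drop the node under the modeled rule."""
    if not ephemeral:
        return False  # the setting applies to ephemeral nodes only
    return now - disconnected_at >= parse_duration(cleanup_after)


def cloud_deletes(removed_by_parent: Dict[str, bool]) -> bool:
    """True once every Parent that knew the node has removed it."""
    return bool(removed_by_parent) and all(removed_by_parent.values())


t0 = datetime(2025, 1, 1, 12, 0)
print(parent_removes(True, t0, t0 + timedelta(hours=23)))  # False
print(parent_removes(True, t0, t0 + timedelta(hours=25)))  # True
print(cloud_deletes({"parent-a": True, "parent-b": False}))  # False
```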