Netdata Parents generally scale well. According to our tests Netdata Parents scale better than Prometheus for the same workload: -35% CPU utilization, -49% Memory Consumption, -12% Network Bandwidth, -98% Disk I/O, -75% Disk footprint.
For more information, check Sizing Netdata Parents.
No. When you set up an active-active cluster, even if child nodes connect randomly to one or the other, all the parent nodes receive all the metrics of all the child nodes. So, all of them do all the work.
Child nodes need to have only the retention required to connect to another Parent if one fails or stops for maintenance.
alloc mode is usually enough. Use dbengine when the child nodes should have enough retention to back-fill the parent node if it stops for maintenance.
Yes. You can configure your parent nodes to enable TLS at their web server and configure the child nodes to connect to them with TLS. The streaming connection is also compressed, on top of TLS.
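
As a rough illustration of the TLS setup described above, the fragments below show one parent-side and one child-side configuration. This is a minimal sketch, not a definitive setup: the option names are given as I understand them from the Netdata configuration files and should be checked against your installed version, and the hostname, certificate paths, and API key are hypothetical placeholders.

```
# Parent: netdata.conf (enable TLS on the web server, which also serves streaming)
# The certificate and key paths below are placeholders.
[web]
    ssl key = /etc/netdata/ssl/key.pem
    ssl certificate = /etc/netdata/ssl/cert.pem

# Child: stream.conf (the :SSL suffix asks the child to connect with TLS)
# parent.example.com and the API key are placeholders.
[stream]
    enabled = yes
    destination = parent.example.com:19999:SSL
    api key = 11111111-2222-3333-4444-555555555555
```

The child connects to the same port as the parent's web server, so no separate listener should be needed for encrypted streaming.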
No. The streaming protocol works on the same port as the internal web server of Netdata Agents, but the protocol is not HTTP-friendly and cannot be understood by HTTP proxy servers.
Although this can be done, and it could work for streaming between child and parent nodes, we recommend against it. It can lead to several kinds of problems.
It is better to configure all the parent nodes directly in the child nodes' stream.conf. The child nodes will do everything in their power to find a parent node to connect to, and they will never give up.
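
A child-side stream.conf that lists every parent directly might look like the sketch below. It assumes the destination option accepts a space-separated list of parents that are tried in order; the hostnames and API key are hypothetical placeholders.

```
# Child: stream.conf
# parent-a / parent-b and the API key are placeholders.
[stream]
    enabled = yes
    # Parents are tried in the order listed; the child keeps retrying
    # until one of them accepts the connection.
    destination = parent-a.example.com:19999 parent-b.example.com:19999
    api key = 11111111-2222-3333-4444-555555555555
```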
If all parents are configured to run health checks and trigger alerts, yes.
We recommend using Netdata Cloud to avoid receiving duplicate alert notifications. Netdata Cloud deduplicates alert notifications so that you will receive them only once.
Yes. Function requests will be received by the Parents and forwarded to the Child via their streaming connection. Function requests are propagated between parents, so this will work even if multiple levels of Netdata Parents are involved.
Check Restoring a Netdata Parent after maintenance.
When there are multiple data sources for the same node, Netdata Cloud follows this strategy:
live data.
Yes. When configuring the Parents in the Children's stream.conf, list them in a different order on each Child. Children connect to the first Parent they find available, so if the order given to them differs, they will spread their connections across the available Parents.
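
One way to picture the ordering trick is two children that list the same parents in opposite order. This is only a sketch under the assumption that destination takes a space-separated, ordered list; the hostnames are hypothetical, and the other [stream] settings (enabled, api key) are omitted for brevity.

```
# Child 1: stream.conf (prefers parent-a, falls back to parent-b)
[stream]
    destination = parent-a.example.com:19999 parent-b.example.com:19999

# Child 2: stream.conf (prefers parent-b, falls back to parent-a)
[stream]
    destination = parent-b.example.com:19999 parent-a.example.com:19999
```

With both parents up, roughly half of the children stream to each one; if a parent stops, its children move to the next entry in their list.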
It depends on the ephemerality setting of each Netdata Child.
Permanent nodes: These are nodes that should be available permanently and if they disconnect, an alert should be triggered to notify you. By default, all nodes are considered permanent (not ephemeral).
Ephemeral nodes: These are nodes that are ephemeral by nature, and they may shut down at any point in time without any impact on the services you run.
To set the ephemeral flag on a node, edit its netdata.conf and, in the [global] section, set is ephemeral node = yes. This setting is propagated to parent nodes and Netdata Cloud.
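
Written out as a configuration fragment, the setting described above would look like this (only the relevant line of the [global] section is shown):

```
# Child: netdata.conf
[global]
    # Mark this node as ephemeral so its disconnection is not treated
    # like that of a permanent node.
    is ephemeral node = yes
```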
A parent node tracks connections and disconnections. When a node is marked as ephemeral and stops connecting for more than 24 hours, the parent will delete it from its memory and local administration, and tell Cloud that it is neither live nor stale. Data for the node can no longer be accessed, but if the node connects again later, it will be "revived" and its previous data becomes available again.
A node can be forced into this "forgotten" state with the Netdata CLI tool on the parent the node is connected to (if still connected) or one of the parent Agents it was previously connected to. The state will be propagated upwards and sideways in case of an HA setup.
netdatacli remove-stale-node <node_id | machine_guid | hostname | ALL_NODES>
When using Netdata Cloud (via a parent or directly), and a permanent node gets disconnected, Netdata Cloud sends node disconnection notifications.