- ---
- title: "Streaming reference"
- description: "Each node running Netdata can stream the metrics it collects, in real time, to another node. See all of the available settings in this reference document."
- type: "reference"
- custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/metrics-storage-management/reference-streaming.mdx"
- sidebar_label: "Streaming reference"
- learn_status: "Published"
- learn_topic_type: "References"
- learn_rel_path: "References/Configuration"
- ---
- # Streaming reference
- Each node running Netdata can stream the metrics it collects, in real time, to another node. To learn more, read about
- [how streaming works](https://github.com/netdata/netdata/blob/master/docs/metrics-storage-management/how-streaming-works.mdx).
- For a quickstart guide for enabling a simple `parent-child` streaming relationship, see our [stream metrics between
- nodes](https://github.com/netdata/netdata/blob/master/docs/metrics-storage-management/enable-streaming.mdx) doc. All other configuration options and scenarios are
- covered in the sections below.
- ## Configuration
- There are two files responsible for configuring Netdata's streaming capabilities: `stream.conf` and `netdata.conf`.
- From within your Netdata config directory (typically `/etc/netdata`), [use `edit-config`](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md) to
- open either `stream.conf` or `netdata.conf`.
- ```
- sudo ./edit-config stream.conf
- sudo ./edit-config netdata.conf
- ```
- ## Settings
- As mentioned above, both `stream.conf` and `netdata.conf` contain settings relevant to streaming.
- ### `stream.conf`
- The `stream.conf` file contains three sections. The `[stream]` section is for configuring child nodes.
- The `[API_KEY]` and `[MACHINE_GUID]` sections are both for configuring parent nodes, and share the same settings.
- `[API_KEY]` settings affect every child node using that key, whereas `[MACHINE_GUID]` settings affect only the child
- node with a matching GUID.
- The file `/var/lib/netdata/registry/netdata.public.unique.id` contains a random GUID that **uniquely identifies each
- node**. This file is automatically generated by Netdata the first time it is started and remains unaltered forever.
- #### `[stream]` section
- | Setting | Default | Description |
- | :---------------------------------------------- | :------------------------ | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
- | `enabled` | `no` | Whether this node streams metrics to any parent. Change to `yes` to enable streaming. |
- | [`destination`](#destination) | ` ` | A space-separated list of parent nodes to attempt to stream to, with the first available parent receiving metrics, using the following format: `[PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]`. [Read more →](#destination) |
- | `ssl skip certificate verification` | `no` | Set to `yes` and uncomment to accept self-signed or expired certificates. By default, the child verifies the parent's certificate. |
- | `CApath` | `/etc/ssl/certs/` | The directory where known certificates are found. Defaults to OpenSSL's default path. |
- | `CAfile` | `/etc/ssl/certs/cert.pem` | Add a parent node certificate to the list of known certificates in `CApath`. |
- | `api key` | ` ` | The `API_KEY` to use as the child node. |
- | `timeout seconds` | `60` | The timeout to connect and send metrics to a parent. |
- | `default port` | `19999` | The port to use if `destination` does not specify one. |
- | [`send charts matching`](#send-charts-matching) | `*` | A space-separated list of [Netdata simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) to filter which charts are streamed. [Read more →](#send-charts-matching) |
- | `buffer size bytes` | `10485760` | The size of the buffer to use when sending metrics. The default `10485760` equals a buffer of 10MB, which is good for 60 seconds of data. Increase this if you expect latencies higher than that. The buffer is flushed on reconnect. |
- | `reconnect delay seconds` | `5` | How long to wait until retrying to connect to the parent node. |
- | `initial clock resync iterations` | `60` | How many seconds to keep synchronizing the clock of charts after starting. |
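- For a child on a high-latency or unreliable link, the `buffer size bytes` and `reconnect delay seconds` settings above can be raised together. The following is a minimal sketch: the values are illustrative assumptions rather than recommendations, and the parent address and API key are placeholders.
- ```conf
- [stream]
- enabled = yes
- destination = 203.0.113.0:19999
- api key = 11111111-2222-3333-4444-555555555555
- # 20971520 bytes is twice the 10485760-byte default (illustrative value)
- buffer size bytes = 20971520
- # wait longer between reconnection attempts (illustrative value)
- reconnect delay seconds = 15
- ```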
- #### `[API_KEY]` and `[MACHINE_GUID]` sections
- | Setting | Default | Description |
- | :---------------------------------------------- | :------------------------ | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
- | `enabled` | `no` | Whether this `API_KEY` is enabled or disabled. |
- | [`allow from`](#allow-from) | `*` | A space-separated list of [Netdata simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) matching the IPs of nodes that will stream metrics using this API key. [Read more →](#allow-from) |
- | `default history` | `3600` | The default amount of child metrics history to retain when using the `save`, `map`, or `ram` memory modes. |
- | [`default memory mode`](#default-memory-mode) | `ram` | The [database](https://github.com/netdata/netdata/blob/master/database/README.md) to use for all nodes using this `API_KEY`. Valid settings are `dbengine`, `map`, `save`, `ram`, or `none`. [Read more →](#default-memory-mode) |
- | `health enabled by default` | `auto` | Whether alarms and notifications should be enabled for nodes using this `API_KEY`. `auto` enables alarms when the child is connected. `yes` enables alarms always, and `no` disables alarms. |
- | `default postpone alarms on connect seconds` | `60` | Postpone alarms and notifications for a period of time after the child connects. |
- | `default proxy enabled` | ` ` | Route metrics through a proxy. |
- | `default proxy destination` | ` ` | Space-separated list of `IP:PORT` for proxies. |
- | `default proxy api key` | ` ` | The `API_KEY` of the proxy. |
- | `default send charts matching` | `*` | See [`send charts matching`](#send-charts-matching). |
- #### `destination`
- A space-separated list of parent nodes to attempt to stream to, with the first available parent receiving metrics, using
- the following format: `[PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]`.
- - `PROTOCOL`: `tcp`, `udp`, or `unix`. Parent nodes support only `tcp` and `unix`.
- - `HOST`: An IPv4 or IPv6 address, a hostname, or a Unix domain socket path. IPv6 addresses should be given in brackets:
- `[ip:address]`.
- - `INTERFACE` (IPv6 only): The network interface to use.
- - `PORT`: The port number or service name (`/etc/services`) to use.
- - `SSL`: To enable TLS/SSL encryption of the streaming connection.
- To enable TCP streaming to a parent node at `203.0.113.0` on port `20000` and with TLS/SSL encryption:
- ```conf
- [stream]
- destination = tcp:203.0.113.0:20000:SSL
- ```
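- Because `destination` accepts a space-separated list, a child can also name a fallback parent. The sketch below assumes two hypothetical parents, one reached over IPv4 with TLS/SSL and one reached over IPv6 (a documentation-range address given in brackets). Both addresses and ports are placeholders.
- ```conf
- [stream]
- # parents are tried left to right; the first available one receives the metrics
- destination = tcp:203.0.113.0:20000:SSL [2001:db8::10]:19999
- ```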
- #### `send charts matching`
- A space-separated list of [Netdata simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) to filter which charts are streamed.
- The default is a single wildcard `*`, which streams all charts.
- To send only a few charts, list them explicitly, or list a group using a wildcard. To send _only_ the `apps.cpu` chart
- and charts with contexts beginning with `system.`:
- ```conf
- [stream]
- send charts matching = apps.cpu system.*
- ```
- To send all but a few charts, use `!` to create a negative match. To send _all_ charts _but_ `apps.cpu`:
- ```conf
- [stream]
- send charts matching = !apps.cpu *
- ```
- #### `allow from`
- A space-separated list of [Netdata simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) matching the IPs of nodes that
- will stream metrics using this API key. The order is important, left to right, as the first positive or negative match is used.
- The default is `*`, which accepts all requests including the `API_KEY`.
- To allow from only a specific IP address:
- ```conf
- [API_KEY]
- allow from = 203.0.113.10
- ```
- To allow all IPs starting with `10.*`, except `10.1.2.3`:
- ```conf
- [API_KEY]
- allow from = !10.1.2.3 10.*
- ```
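- Because the first positive or negative match is used, the order of the patterns changes the result. As an illustration of that rule, reversing the previous example no longer excludes `10.1.2.3`, since `10.*` matches before the negative pattern is evaluated:
- ```conf
- [API_KEY]
- # 10.1.2.3 is accepted here: `10.*` matches first, so `!10.1.2.3` is never reached
- allow from = 10.* !10.1.2.3
- ```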
- > If you set specific IP addresses here, and also use the `allow connections` setting in the `[web]` section of
- > `netdata.conf`, be sure to add the IP address there so that it can access the API port.
- #### `default memory mode`
- The [database](https://github.com/netdata/netdata/blob/master/database/README.md) to use for all nodes using this `API_KEY`. Valid settings are `dbengine`, `ram`,
- `save`, `map`, or `none`.
- - `dbengine`: The default, recommended time-series database (TSDB) for Netdata. Stores recent metrics in memory, then
- efficiently spills them to disk for long-term storage.
- - `ram`: Stores metrics _only_ in memory, which means metrics are lost when Netdata stops or restarts. Ideal for
- streaming configurations that use ephemeral nodes.
- - `save`: Stores metrics in memory, but saves metrics to disk when Netdata stops or restarts, and loads historical
- metrics on start.
- - `map`: Stores metrics in memory-mapped files, like swap, with constant disk write.
- - `none`: No database.
- When using `default memory mode = dbengine`, the parent node creates a separate instance of the TSDB to store metrics
- from child nodes. The [size of _each_ instance is configurable](https://github.com/netdata/netdata/blob/master/docs/store/change-metrics-storage.md) with the `page
- cache size` and `dbengine multihost disk space` settings in the `[global]` section in `netdata.conf`.
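- As a sketch of where these settings live, the parent's `netdata.conf` might contain something like the following. The values are illustrative and are assumed here to be expressed in MiB; check the linked storage documentation before relying on them.
- ```conf
- [global]
- memory mode = dbengine
- # page cache for each database instance (illustrative value, assumed MiB)
- page cache size = 32
- # disk space for metrics received from child nodes (illustrative value, assumed MiB)
- dbengine multihost disk space = 256
- ```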
- ### `netdata.conf`
- | Setting | Default | Description |
- | :----------------------------------------- | :---------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
- | **`[global]` section** | | |
- | `memory mode` | `dbengine` | Determines the [database type](https://github.com/netdata/netdata/blob/master/database/README.md) to be used on that node. Other options include `none`, `ram`, `save`, and `map`. `none` disables the database on this host. This also disables alarms and notifications, as those can't run without a database. |
- | **`[web]` section** | | |
- | `mode` | `static-threaded` | Determines the [web server](https://github.com/netdata/netdata/blob/master/web/server/README.md) type. The other option is `none`, which disables the dashboard, API, and registry. |
- | `accept a streaming request every seconds` | `0` | Set a limit on how often a parent node accepts streaming requests from child nodes. `0` equals no limit. If this is set, you may see `... too busy to accept new streaming request. Will be allowed in X secs` in Netdata's `error.log`. |
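- For example, a parent that serves many child nodes could throttle new streaming requests so that a mass reconnect does not overwhelm it. This is a minimal sketch; the 5-second interval is an illustrative assumption.
- ```conf
- [web]
- mode = static-threaded
- # accept at most one new streaming request every 5 seconds (illustrative value)
- accept a streaming request every seconds = 5
- ```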
- ## Examples
- ### Per-child settings
- While the `[API_KEY]` section applies settings for any child node using that key, you can also use per-child settings
- with the `[MACHINE_GUID]` section.
- In the following example, metrics streamed from the child node identified by `MACHINE_GUID` are stored using the `save` memory mode
- instead of the `dbengine` default set for the `API_KEY`, and alarms are disabled for that child.
- ```conf
- [API_KEY]
- enabled = yes
- default memory mode = dbengine
- health enabled by default = auto
- allow from = *
- [MACHINE_GUID]
- enabled = yes
- memory mode = save
- health enabled = no
- ```
- ### Securing streaming with TLS/SSL
- Netdata does not activate TLS encryption by default. To encrypt streaming connections, you first need to [enable TLS
- support](https://github.com/netdata/netdata/blob/master/web/server/README.md#enabling-tls-support) on the parent. With encryption enabled on the receiving side, you
- need to instruct the child to use TLS/SSL as well. On the child's `stream.conf`, configure the destination as follows:
- ```conf
- [stream]
- destination = host:port:SSL
- ```
- The word `SSL` appended to the end of the destination tells the child that connections must be encrypted.
- > While Netdata uses Transport Layer Security (TLS) 1.2 to encrypt communications rather than the obsolete SSL protocol,
- > it's still common practice to refer to encrypted web connections as `SSL`. Many vendors, like Nginx and even Netdata
- > itself, use `SSL` in configuration files, whereas documentation will always refer to encrypted communications as `TLS`
- > or `TLS/SSL`.
- #### Certificate verification
- When TLS/SSL is enabled on the child, the default behavior will be to not connect with the parent unless the server's
- certificate can be verified via the default chain. In case you want to avoid this check, add the following to the
- child's `stream.conf` file:
- ```conf
- [stream]
- ssl skip certificate verification = yes
- ```
- #### Trusted certificate
- If you've enabled [certificate verification](#certificate-verification), you might see errors from the OpenSSL library
- when there's a problem with checking the certificate chain (`X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY`). More
- importantly, OpenSSL will reject self-signed certificates.
- Given these known issues, you have two options: skip certificate verification as described above, or, if you trust your
- certificate, set the `CApath` and `CAfile` options to tell Netdata where your certificates, and the trusted certificate file, are stored.
- For more details about these options, you can read about [verify
- locations](https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_load_verify_locations.html).
- Before you change your streaming configuration, you need to copy your trusted certificate to your child system and add
- the certificate to OpenSSL's list.
- On most Linux distributions, the `update-ca-certificates` command searches inside the `/usr/share/ca-certificates`
- directory for certificates. You should double-check by reading the `update-ca-certificates` manual (`man
- update-ca-certificates`), and then change the directory in the commands below if needed.
- If you have `sudo` configured on your child system, you can use that to run the following commands. If not, you'll have
- to log in as `root` to complete them.
- ```
- # mkdir /usr/share/ca-certificates/netdata
- # cp parent_cert.pem /usr/share/ca-certificates/netdata/parent_cert.crt
- # chown -R netdata:netdata /usr/share/ca-certificates/netdata/
- ```
- The first command creates a new directory to store your certificates for Netdata. The second copies your certificate and changes
- its extension from `.pem` to `.crt` so it's compatible with `update-ca-certificates`. The third changes
- ownership so the user that runs Netdata can access the directory where you copied in your certificate.
- Next, edit the file `/etc/ca-certificates.conf` and add the following line:
- ```
- netdata/parent_cert.crt
- ```
- Now update the list of certificates by running the following, again either with `sudo` or as `root`:
- ```
- # update-ca-certificates
- ```
- > Some Linux distributions have different methods of updating the certificate list. For more details, please read this
- > guide on [adding trusted root certificates](https://github.com/Busindre/How-to-Add-trusted-root-certificates).
- Once you update your certificate list, you can set the stream parameters for Netdata to trust the parent certificate.
- Open `stream.conf` for editing and change the following lines:
- ```conf
- [stream]
- CApath = /etc/ssl/certs/
- CAfile = /etc/ssl/certs/parent_cert.pem
- ```
- With this configuration, the `CApath` option tells Netdata to search for trusted certificates inside `/etc/ssl/certs`.
- The `CAfile` option specifies the Netdata parent certificate is located at `/etc/ssl/certs/parent_cert.pem`. With this
- configuration, you can skip using the system's entire list of certificates and use Netdata's parent certificate instead.
- #### Expected behaviors
- With the introduction of TLS/SSL, the parent-child communication behaves as shown in the table below, depending on the
- following configurations:
- - **Parent TLS (Yes/No)**: Whether the `[web]` section in `netdata.conf` has `ssl key` and `ssl certificate`.
- - **Parent port TLS (-/force/optional)**: Depends on whether the `[web]` section `bind to` contains a `^SSL=force` or
- `^SSL=optional` directive on the port(s) used for streaming.
- - **Child TLS (Yes/No)**: Whether the destination in the child's `stream.conf` has `:SSL` at the end.
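- - **Child TLS Verification (yes/no)**, shown as `Child SSL Ver.` in the table: The value of the child's `stream.conf` `ssl skip certificate verification`
- parameter (default is `no`). With `yes`, the child skips verification of the parent's certificate.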
- | Parent TLS enabled | Parent port SSL | Child TLS | Child SSL Ver. | Behavior |
- | :----------------- | :--------------- | :-------- | :------------- | :--------------------------------------------------------------------------------------------------------------------------------------- |
- | No | - | No | no | Legacy behavior. The parent-child stream is unencrypted. |
- | Yes | force | No | no | The parent rejects the child connection. |
- | Yes | -/optional | No | no | The parent-child stream is unencrypted (expected situation for legacy child nodes and newer parent nodes) |
- | Yes | -/force/optional | Yes | no | The parent-child stream is encrypted, provided that the parent has a valid TLS/SSL certificate. Otherwise, the child refuses to connect. |
- | Yes | -/force/optional | Yes | yes | The parent-child stream is encrypted. |
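- On the parent side, the settings named above belong in the `[web]` section of `netdata.conf`. The following is a minimal sketch: the key and certificate paths are hypothetical, and the `bind to` value is an assumed example of attaching `^SSL=force` to the listener, so verify the exact syntax against the web server documentation.
- ```conf
- [web]
- # hypothetical paths to the parent's private key and certificate
- ssl key = /etc/netdata/ssl/key.pem
- ssl certificate = /etc/netdata/ssl/cert.pem
- # assumed syntax: require TLS/SSL for everything served on this listener, including streaming
- bind to = *=dashboard|registry|badges|management|streaming|netdata.conf^SSL=force
- ```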
- ### Proxy
- A proxy is a node that receives metrics from a child, then streams them onward to a parent. To configure a proxy,
- configure it as a receiving and a sending Netdata at the same time.
- Netdata proxies may or may not maintain a database for the metrics passing through them. When they maintain a database,
- they can also run health checks (alarms and notifications) for the remote host that is streaming the metrics.
- In the following example, the proxy receives metrics from a child node using the `API_KEY` of
- `66666666-7777-8888-9999-000000000000`, then stores metrics using `dbengine`. It then uses the `API_KEY` of
- `11111111-2222-3333-4444-555555555555` to proxy those same metrics on to a parent node at `203.0.113.0`.
- ```conf
- [stream]
- enabled = yes
- destination = 203.0.113.0
- api key = 11111111-2222-3333-4444-555555555555
- [66666666-7777-8888-9999-000000000000]
- enabled = yes
- default memory mode = dbengine
- ```
- ### Ephemeral nodes
- Netdata can help you monitor ephemeral nodes, such as containers in an auto-scaling infrastructure, by always streaming
- metrics to any number of permanently-running parent nodes.
- On the parent, set the following in `stream.conf`:
- ```conf
- [11111111-2222-3333-4444-555555555555]
- # enable/disable this API key
- enabled = yes
- # one hour of data for each of the child nodes
- default history = 3600
- # do not save child metrics on disk
- default memory mode = ram
- # alarm checks, only while the child is connected
- health enabled by default = auto
- ```
- On the child nodes, set the following in `stream.conf`:
- ```conf
- [stream]
- # stream metrics to another Netdata
- enabled = yes
- # the IP and PORT of the parent
- destination = 10.11.12.13:19999
- # the API key to use
- api key = 11111111-2222-3333-4444-555555555555
- ```
- In addition, edit `netdata.conf` on each child node to disable the database and alarms.
- ```conf
- [global]
- # disable the local database
- memory mode = none
- [health]
- # disable health checks
- enabled = no
- ```
- ## Troubleshooting
- Both parent and child nodes log information at `/var/log/netdata/error.log`.
- If the child manages to connect to the parent, you will see something like this on the parent:
- ```
- 2017-03-09 09:38:52: netdata: INFO : STREAM [receive from [10.11.12.86]:38564]: new client connection.
- 2017-03-09 09:38:52: netdata: INFO : STREAM xxx [10.11.12.86]:38564: receive thread created (task id 27721)
- 2017-03-09 09:38:52: netdata: INFO : STREAM xxx [receive from [10.11.12.86]:38564]: client willing to stream metrics for host 'xxx' with machine_guid '1234567-1976-11e6-ae19-7cdd9077342a': update every = 1, history = 3600, memory mode = ram, health auto
- 2017-03-09 09:38:52: netdata: INFO : STREAM xxx [receive from [10.11.12.86]:38564]: initializing communication...
- 2017-03-09 09:38:52: netdata: INFO : STREAM xxx [receive from [10.11.12.86]:38564]: receiving metrics...
- ```
- and something like this on the child:
- ```
- 2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: connecting...
- 2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: initializing communication...
- 2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: waiting response from remote netdata...
- 2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: established communication - sending metrics...
- ```
- The following sections describe the most common issues you might encounter when connecting parent and child nodes.
- ### Slow connections between parent and child
- When you have a slow connection between parent and child, Netdata raises a few different errors. Most of the
- errors will appear in the child's `error.log`.
- ```
- netdata ERROR : STREAM_SENDER[CHILD HOSTNAME] : STREAM CHILD HOSTNAME [send to PARENT IP:PARENT PORT]: too many data pending - buffer is X bytes long,
- Y unsent - we have sent Z bytes in total, W on this connection. Closing connection to flush the data.
- ```
- On the parent side, you may see various error messages, most commonly the following:
- ```
- netdata ERROR : STREAM_PARENT[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : read failed: end of file
- ```
- Another common problem in slow connections is the child sending a partial message to the parent. In this case, the
- parent will write the following to its `error.log`:
- ```
- ERROR : STREAM_RECEIVER[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : sent command 'B' which is not known by netdata, for host 'HOSTNAME'. Disabling it.
- ```
- In this example, `B` was part of a `BEGIN` message that was cut due to connection problems.
- Slow connections can also cause problems when the parent misses a message and then receives a command related to the
- missed message. For example, a parent might miss a message containing the child's charts, and then doesn't know
- what to do with the `SET` message that follows. When that happens, the parent will show a message like this:
- ```
- ERROR : STREAM_RECEIVER[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : requested a SET on chart 'CHART NAME' of host 'HOSTNAME', without a dimension. Disabling it.
- ```
- ### Child cannot connect to parent
- When the child can't connect to a parent for any reason (misconfiguration, networking, firewalls, parent
- down), you will see the following in the child's `error.log`.
- ```
- ERROR : STREAM_SENDER[HOSTNAME] : Failed to connect to 'PARENT IP', port 'PARENT PORT' (errno 113, No route to host)
- ```
- ### 'Is this a Netdata?'
- This question can appear when Netdata starts the stream and receives an unexpected response. This error can appear when
- the parent is using SSL and the child tries to connect using plain text. You will also see this message when
- Netdata connects to another server that isn't Netdata. The complete error message will look like this:
- ```
- ERROR : STREAM_SENDER[CHILD HOSTNAME] : STREAM child HOSTNAME [send to PARENT HOSTNAME:PARENT PORT]: server is not replying properly (is it a netdata?).
- ```
- ### Stream charts wrong
- Chart data needs to be consistent between child and parent nodes. If there are differences between chart data on
- a parent and a child, such as gaps in metrics collection, it most often means your child's `memory mode`
- does not match the parent's. To learn more about the different ways Netdata can store metrics, and thus keep chart
- data consistent, read our [memory mode documentation](https://github.com/netdata/netdata/blob/master/database/README.md).
- ### Forbidding access
- You may see errors about "forbidding access" for a number of reasons. It could be because of a slow connection between
- the parent and child nodes, but it could also be due to other failures. Look in your parent's `error.log` for errors
- that look like this:
- ```
- STREAM [receive from [child HOSTNAME]:child IP]: `MESSAGE`. Forbidding access.
- ```
- `MESSAGE` will have one of the following patterns:
- - `request without KEY`: The message received is incomplete. `KEY` can be the API key, hostname, or machine GUID.
- - `API key 'VALUE' is not valid GUID`: The UUID received from the child does not have the format defined in [RFC
- 4122](https://tools.ietf.org/html/rfc4122).
- - `machine GUID 'VALUE' is not GUID.`: The same problem as the previous error, but for the machine GUID.
- - `API key 'VALUE' is not allowed`: This stream has a wrong API key.
- - `API key 'VALUE' is not permitted from this IP`: The IP is not allowed to stream to this parent with this API key.
- - `machine GUID 'VALUE' is not allowed.`: The GUID that is trying to stream is not allowed.
- - `Machine GUID 'VALUE' is not permitted from this IP. `: The IP does not match the pattern or IP allowed to
- stream with this machine GUID.
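- Several of these messages trace back to the parent's `stream.conf`. As a hedged checklist in config form, a parent section that accepts a child would look roughly like the sketch below. The API key and IP pattern are placeholders, and the comments describe likely causes rather than a definitive mapping of each message.
- ```conf
- # the section name should be the same valid UUID the child sets in `api key`
- [11111111-2222-3333-4444-555555555555]
- # the key has to be enabled for the parent to accept it
- enabled = yes
- # should match the child's IP, or the key is not permitted from that IP (placeholder pattern)
- allow from = 10.*
- ```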
- ### Netdata could not create a stream
- The connection between parent and child is a stream. When the parent can't convert the initial connection into
- a stream, it will write the following message inside `error.log`:
- ```
- file descriptor given is not a valid stream
- ```
- After logging this error, Netdata will close the stream.