![]() |
9 months ago | |
---|---|---|
.. | ||
README.md | 9 months ago | |
bandwidth-requirements.md | 1 year ago | |
cpu-requirements.md | 1 year ago | |
disk-requirements-and-retention.md | 1 year ago | |
ram-requirements.md | 9 months ago |
Netdata automatically adjusts its resources utilization based on the workload offered to it.
This is a map of how Netdata features impact resources utilization:
Feature | CPU | RAM | Disk I/O | Disk Space | Retention | Bandwidth |
---|---|---|---|---|---|---|
Metrics collected | X | X | X | X | X | - |
Samples collection frequency | X | - | X | X | X | - |
Database mode and tiers | - | X | X | X | X | - |
Machine learning | X | X | - | - | - | - |
Streaming | X | X | - | - | - | X |
When you need to lower the resources used by Netdata, this is an obvious first step.
Lowering the data collection frequency from every-second to every-2-seconds, will make Netdata use half the CPU utilization. So, CPU utilization is proportional to the data collection frequency.
The number of database tiers affects the memory requirements of Netdata. Going from 3-tiers to 1-tier, will make Netdata use half the memory. Of course metrics retention will also be limited to 1 tier.
Machine Learning: Byt default Netdata trains multiple machine learning models for every metric collected, to learn its behavior and detect anomalies. Machine Learning is a CPU intensive process and affects the overall CPU utilization of Netdata.
Streaming Compression: When using Netdata in Parent-Child configurations to create Metrics Centralization Points, the compression algorithm used greatly affects CPU utilization and bandwidth consumption.
Netdata supports multiple streaming compressions algorithms, allowing the optimization of either CPU utilization or Network Bandwidth. The default algorithm zstd
provides the best balance among them.
To minimize the resources used by Netdata Agents, we suggest to configure Netdata Parents for centralizing metric samples, and disabling most of the features on Netdata Children. This will provide minimal resources utilization at the edge, while all the features of Netdata are available at the Netdata Parents.
The following guides provide instructions on how to do this.
Netdata Parents automatically size resource utilization based on the workload they receive. The only possible option for improving query performance is to dedicate more RAM to them, by increasing their caches efficiency.
Check RAM Requirements for more information.
The following are some of the innovations the open-source Netdata agent has, that contribute to its excellent performance, and scalability.
When Netdata saves data on-disk, it stores them at their final place, eliminating the need to reorganize this data.
Netdata is organizing its data structures in such a way that samples are committed to disk as evenly as possible across time, without affecting its memory requirements.
Furthermore, Netdata Agents use direct-I/O for saving and loading metric samples. This prevents Netdata from polluting system caches with metric data. Netdata maintains its own caches for this data.
All these features make Netdata an nice partner and a polite citizen for production applications running on the same systems Netdata runs.
To achieve optimal memory and disk footprint, Netdata uses a custom 32-bit floating point number. This floating point number is used to store the samples collected, together with their anomaly bit. The database of Netdata is fixed-step, so it has predefined slots for every sample, allowing Netdata to store timestamps once every several hundreds samples, minimizing both its memory requirements and the disk footprint.
The final disk footprint of Netdata varies due to compression efficiency. It is usually about 0.6 bytes per sample for the high-resolution tier (per-second), 6 bytes per sample for the mid-resolution tier (per-minute) and 18 bytes per sample for the low-resolution tier (per-hour).
Alerting, Machine Learning, Streaming and Replication, rely on metric queries. When multiple queries are running in parallel, Netdata assigns priorities to all of them, favoring interactive queries over background tasks. This means that queries do not compete equally for resources. Machine learning or replication may slow down when interactive queries are running and the system starves for resources.
Apart from metric samples, metric labels and their cardinality is the biggest memory consumer, especially in highly ephemeral environments, like kubernetes. Netdata uses a single pointer for any label key-value pair that is reused. Keys and values are also deduplicated, providing the best possible memory footprint for metric labels.
The streaming protocol of Netdata allows minimizing the resources consumed on production systems by delegating features of to other Netdata agents (Parents), without compromising monitoring fidelity or responsiveness, enabling the creation of a highly distributed observability platform.
Netdata outperforms Prometheus in every aspect. -35% CPU Utilization, -49% RAM usage, -12% network bandwidth, -98% disk I/O, -75% in disk footprint for high resolution data, while providing more than a year of retention.
Read the full comparison here.
University of Amsterdam contacted a research on the impact monitoring systems have on docker based systems.
The study found that Netdata excels in CPU utilization, RAM usage, Execution Time and concluded that Netdata is the most energy efficient tool.
Read the full study here.