SMusatov/netdata: Monitor your servers, containers, and applications, in high-resolution and in real-time! @ a2bb44c61089a5adf877623d4f210ae8c0233545

Monitor your servers, containers, and applications, in high-resolution and in real-time! https://www.netdata.cloud/


Monitor your servers, containers, and applications
in high-resolution and in real-time.

Visit the Project's Home Page

Important :bulb:
People get addicted to Netdata. Once you use it on your systems, there's no going back!

Netdata is a high-performance, cloud-native, and on-premises observability platform designed to monitor metrics and logs with unparalleled efficiency. It delivers a simpler, faster, and significantly easier approach to real-time, low-latency monitoring for systems, containers, and applications. Netdata requires zero configuration to get started, offering a powerful and comprehensive monitoring experience out of the box.

Netdata is also known for its cost-efficient, distributed design. Unlike traditional monitoring solutions that centralize data, Netdata distributes the code. Instead of funneling all data into a few central databases, Netdata processes data at the edge, keeping it close to the source. The smart open-source Netdata Agent acts as a distributed database, enabling the construction of complex observability pipelines with modular, Lego-like simplicity.

Netdata provides A.I. insights for all monitored data, training machine learning models directly at the edge. This allows for fully automated and unsupervised anomaly detection, and with its intuitive APIs and UIs, users can quickly perform root cause analysis and troubleshoot issues, identifying correlations and gaining deeper insights into their infrastructure.

The Netdata Ecosystem

Netdata is built on three core parts:

Netdata Agent (usually called just "Netdata"): This open-source component is the heart of the Netdata ecosystem, handling data collection, storage (embedded database), querying, machine learning, exporting, and alerting of observability data. All observability data and features that the Netdata ecosystem offers are managed by the Netdata Agent. It runs in physical and virtual servers, cloud environments, Kubernetes clusters, and edge/IoT devices, and is carefully optimized to have zero impact on production systems and applications.

Netdata Cloud: Enhancing the Netdata Agent, Netdata Cloud offers enterprise features such as user management, role-based access control, horizontal scalability, alert and notification management, access from anywhere, and more. Netdata Cloud does not centralize or store observability data.

Netdata Cloud is a commercial product, available as an on-premises installation, or a SaaS solution, with a free community tier.

Netdata UI: The user interface that powers all dashboards, data visualization, and configuration.

While closed-source, it is free to use with both Netdata Agents and Netdata Cloud, via their public APIs. It is included in the binary packages offered by Netdata, and its latest version is publicly available via CDN.

Netdata scales effortlessly from a single server to thousands, even in complex, multi-cloud or hybrid environments, with the ability to retain data for years.

Key characteristics of the Netdata Agent

:boom: Collects data from 800+ integrations
Operating system metrics, container metrics, virtual machines, hardware sensors, application metrics, OpenMetrics exporters, StatsD, and logs. OpenTelemetry support is currently being developed.
:muscle: Real-Time, Low-Latency, High-Resolution
All data are collected per second and are made available on the APIs for visualization, immediately after data collection (1-second latency, data collection to visualization).
:face_in_clouds: AI across the board
Trains multiple Machine-Learning (ML) models at the edge for each metric collected, and uses AI to detect anomalies based on the past behavior of each metric.
:scroll: systemd-journald Logs
Includes tools to efficiently convert plain-text log files (text, csv, logfmt, json) to structured systemd-journald entries (log2journal, systemd-cat-native), and queries systemd-journal files directly, enabling powerful logs visualization dashboards. The Netdata Agents eliminate the need to centralize logs and provide all the functions to work with logs directly at the edge.
:star: Lego-like Observability Pipelines
Netdata Agents can be linked together (in parent-child relationships) to build observability centralization points within your infrastructure, allowing you to control data replication and retention at multiple levels.
:fire: Fully Automated Powerful Visualization
Using the NIDL (Nodes, Instances, Dimensions & Labels) data model, the Netdata Agent enables the creation of fully automated dashboards that provide correlated visualization of all metrics. This lets you understand any dataset at first sight, and also filter, slice and dice the data directly on the dashboards, without the need to learn a query language.

Note: the Netdata UI is closed-source, but free to use with Netdata Agents and Netdata Cloud.

:bell: Out-of-the-box Alerts
Comes with hundreds of alerts out of the box to detect common issues and pitfalls, revealing issues that can easily go unnoticed. It supports several notification methods to let you know when your attention is needed.
:sunglasses: Low Maintenance
Fully automated in every aspect: automated dashboards, out-of-the-box alerts, auto-detection and auto-discovery of metrics, zero-touch machine-learning, easy scalability and high availability, and CI/CD friendly.
:star: Open and Extensible
Netdata is a modular platform that can be extended in all possible ways, and it also integrates nicely with other monitoring solutions.

What can be monitored with the Netdata Agent

Netdata monitors all the following:

When the Netdata Agent runs on Linux, it monitors every kernel feature available, providing full coverage of all kernel technologies. It also offers full enterprise hardware coverage, monitoring all components that provide hardware error reporting, like PCI AER, RAM EDAC, IPMI, S.M.A.R.T., NVMe, Fans, Power, Voltages, and more.

:star: Netdata is the most energy-efficient monitoring tool :star:

Dec 11, 2023: The University of Amsterdam published a study on the impact of monitoring tools on Docker-based systems, aiming to answer two questions:

The impact of monitoring on the energy efficiency of Docker-based systems
The impact of monitoring on Docker-based systems?

🚀 Netdata excels in energy efficiency: "... Netdata is the most energy-efficient tool ...", as the study says.
🚀 Netdata excels in CPU Usage, RAM Usage and Execution Time, and has a similar impact on Network Traffic as Prometheus.

The study didn’t normalize the results based on the number of metrics collected. Given that Netdata usually collects significantly more metrics than the other tools, Netdata managed to outperform the other tools, while ingesting a much higher number of metrics. Read the full study here.

Netdata vs Prometheus

On the same workload, Netdata uses 35% less CPU, 49% less RAM, 12% less bandwidth, 98% less disk I/O, and is 75% more disk space efficient on high resolution metrics storage, while providing more than a year of overall retention on the same disk footprint where Prometheus offers 7 days of retention. Read the full analysis in our blog.

Netdata actively supports and is a member of the Cloud Native Computing Foundation (CNCF)

...and due to your love :heart:, it is one of the most :star:'d projects in the CNCF landscape!

Below is an animated image, but you can see Netdata live!
FRANKFURT | NEWYORK | ATLANTA | SANFRANCISCO | TORONTO | SINGAPORE | BANGALORE
They are clustered Netdata Agent Parents. They all have the same data. Select the one closest to you.
All these run with the default configuration. We only clustered them to have multi-node dashboards.
Note: These demos include the Netdata UI,
which, while closed-source, is free to use with Netdata Agents and Netdata Cloud.

Getting Started

1. Install Netdata everywhere :v:

Netdata can be installed on all Linux, macOS, FreeBSD (and soon on Windows) systems. We provide binary packages for the most popular operating systems and package managers.

Install on Ubuntu, Debian, CentOS, Fedora, SUSE, Red Hat, Arch, Alpine, Gentoo, even BusyBox.
Install with Docker.
Netdata is a Verified Publisher on DockerHub and our users enjoy free unlimited DockerHub pulls :heart_eyes:.
Install on macOS :metal:.
Install on FreeBSD and pfSense.
Install from source
For Kubernetes deployments check here.

Check also the Netdata Deployment Guides to decide how to deploy it in your infrastructure.

By default, a local dashboard is available immediately. Netdata starts a web server for its dashboard at port 19999. Open your web browser of choice and navigate to http://NODE:19999, replacing NODE with the IP address or hostname of your Agent. If installed on localhost, you can access it through http://localhost:19999.
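
As a rough illustration of the URL convention above, the following Python sketch builds the dashboard address for a node and checks whether anything accepts TCP connections on that port. The helper names `dashboard_url` and `port_is_open` are hypothetical and not part of Netdata; only the default port 19999 comes from the text.

```python
import socket

NETDATA_PORT = 19999  # default dashboard port named in the text above


def dashboard_url(node: str, port: int = NETDATA_PORT) -> str:
    """Build the dashboard URL for a node (hypothetical helper, not a Netdata API)."""
    return f"http://{node}:{port}"


def port_is_open(host: str, port: int = NETDATA_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False


if __name__ == "__main__":
    node = "localhost"
    state = "reachable" if port_is_open(node) else "not reachable"
    print(f"{dashboard_url(node)} is {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the listener is a Netdata Agent.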

Note: the binary packages we provide install Netdata UI automatically. Netdata UI is closed-source, but free to use with Netdata Agents and Netdata Cloud.

2. Configure Collectors :boom:

Netdata auto-detects and auto-discovers most operating system data sources and applications. However, many data sources require some manual configuration, usually to allow Netdata to get access to the metrics.

For a detailed list of the 800+ collectors available, check this guide.
To monitor Windows servers and applications, use this guide.
Note that Netdata on Windows is in its final release stage, so the next Netdata release will support Windows natively.
To monitor SNMP devices, check this guide.

3. Configure Alert Notifications :bell:

Netdata comes with hundreds of pre-configured alerts that automatically check your metrics immediately after they start getting collected.

Netdata can dispatch alert notifications to multiple third-party systems, including: email, Alerta, AWS SNS, Discord, Dynatrace, flock, gotify, IRC, Matrix, MessageBird, Microsoft Teams, ntfy, OPSgenie, PagerDuty, Prowl, PushBullet, PushOver, RocketChat, Slack, SMS tools, Syslog, Telegram, Twilio.

By default, Netdata will send e-mail notifications if there is a configured MTA on the system.

4. Configure Netdata Parents :family:

Optionally, configure one or more Netdata Parents. A Netdata Parent is a Netdata Agent that has been configured to accept streaming connections from other Netdata Agents.

Netdata Parents provide:

Infrastructure level dashboards, at http://parent.server.ip:19999/.

Each Netdata Agent has an API listening on TCP port 19999 of each server. When you hit that port with a web browser (e.g. http://server.ip:19999/), the Netdata Agent UI is presented. When the Netdata Agent is also a Parent, the UI of the Parent includes data for all nodes that stream metrics to that Parent.

Increased retention for all metrics of all your nodes.

Each Netdata Agent maintains its own database of metrics, but Parents can be given additional resources to maintain a much longer database than individual Netdata Agents.

Central configuration of alerts and dispatch of notifications.

Using Netdata Parents, all alert notification integrations can be configured once, at the Parent, and disabled at the Netdata Agents.

You can also use Netdata Parents to:

Offload your production systems (the parents run ML, alerts, queries, etc. for all their children)
Secure your production systems (the parents accept user connections for all their children)

5. Connect to Netdata Cloud :cloud:

Sign in to Netdata Cloud and claim your Netdata Agents and Parents. If you connect your Netdata Parents, there is no need to connect your Netdata Agents. They will be connected via the Parents.

When your Netdata nodes are connected to Netdata Cloud, you can (on top of the above):

Access your Netdata Agents from anywhere
Access sensitive Netdata Agent features (like "Netdata Functions": processes, systemd-journal)
Organize your infra in Spaces and Rooms
Create, manage, and share custom dashboards
Invite your team and assign roles to them (Role-Based Access Control)
Get infinite horizontal scalability (multiple independent Netdata Agents are viewed as one infra)
Configure alerts from the UI
Configure data collection from the UI
Netdata Mobile App notifications

:love_you_gesture: Netdata Cloud doesn’t prevent you from using your Netdata Agents and Parents directly, and vice versa.

:ok_hand: Your metrics are still stored in your network when you connect your Netdata Agents and Parents to Netdata Cloud.

How it works

Netdata is built around a modular metrics processing pipeline.


Each Netdata Agent can perform the following functions:

1. **`COLLECT` metrics from their sources**

   Uses [internal](https://github.com/netdata/netdata/tree/master/src/collectors) and [external](https://github.com/netdata/go.d.plugin/tree/master/modules) plugins to collect data from their sources.

   Netdata auto-detects and collects almost everything from the operating system, including CPU, Interrupts, Memory, Disks, Mount Points, Filesystems, Network Stack, Network Interfaces, Containers, VMs, Processes, `systemd` units, Linux Performance Metrics, Linux eBPF, Hardware Sensors, IPMI, and more.

   It collects application metrics from applications: PostgreSQL, MySQL/MariaDB, Redis, MongoDB, Nginx, Apache, and hundreds more.

   Netdata also collects your custom application metrics by scraping OpenMetrics exporters, or via StatsD.

   It can convert web server log files to metrics and apply ML and alerts to them in real-time.

   It also supports synthetic tests / white box tests, so you can ping servers, check API responses, or even check filesystem files and directories to generate metrics, train ML and run alerts and notifications on their status.

2. **`STORE` metrics to a database**

   Uses database engine plugins to store the collected data, in memory and/or on disk. We have developed our own [`dbengine`](https://github.com/netdata/netdata/tree/master/src/database/engine#readme) for storing the data in a very efficient manner, allowing Netdata to have less than one byte per sample on disk and amazingly fast queries.

3. **`LEARN` the behavior of metrics** (ML)

   Trains multiple Machine-Learning (ML) models per metric to learn the behavior of each metric individually. Netdata uses the `kmeans` algorithm and creates by default a model per metric per hour, based on the values collected for that metric over the last 6 hours. The trained models are persisted to disk.

4. **`DETECT` anomalies in metrics** (ML)

   Uses the trained machine learning (ML) models to detect outliers and mark collected samples as **anomalies**. Netdata stores anomaly information together with each sample and also streams it to Netdata Parents so that the anomaly is also available at query time for the whole retention of each metric.

5. **`CHECK` metrics and trigger alert notifications**

   Uses its configured alerts (you can configure your own) to check the metrics for common issues and uses notification plugins to send alert notifications.

6. **`STREAM` metrics to other Netdata Agents**

   Pushes metrics in real-time to Netdata Parents.

7. **`ARCHIVE` metrics to third party databases**

   Exports metrics to industry standard time-series databases, like `Prometheus`, `InfluxDB`, `OpenTSDB`, `Graphite`, etc.

8. **`QUERY` metrics and present dashboards**

   Provides an API to query the data and present interactive dashboards to users.

9. **`SCORE` metrics to reveal similarities and patterns**

   Scores the metrics according to the given criteria, to find the needle in the haystack.

When using Netdata Parents, all the functions of a Netdata Agent (except data collection) can be delegated to Parents to offload production systems.

The core of Netdata is developed in C. We have our own `libnetdata`, which provides:

- **`DICTIONARY`**

  A high-performance algorithm to maintain both indexed and ordered pools of structures Netdata needs. It uses JudyHS arrays for indexing, although it is modular: any hashtable or tree can be integrated into it. Despite being in C, dictionaries follow object-oriented programming principles, so there are constructors, destructors, automatic memory management, garbage collection, and more. For more, see [here](https://github.com/netdata/netdata/tree/master/src/libnetdata/dictionary).

- **`ARAL`**

  ARray ALlocator (ARAL) is used to minimize the system allocations made by Netdata. ARAL is optimized for maximum multithreaded performance. It also allows all structures that use it to be allocated in memory-mapped files (shared memory) instead of RAM. For more, see [here](https://github.com/netdata/netdata/tree/master/src/libnetdata/aral).

- **`PROCFILE`**

  A high-performance `/proc` (but also any) file parser and text tokenizer. It achieves its performance by keeping files open and adjusting its buffers to read the entire file in one call (which is also required by the Linux kernel). For more, see [here](https://github.com/netdata/netdata/tree/master/src/libnetdata/procfile).

- **`STRING`**

  A string interning mechanism, for string deduplication and indexing (using JudyHS arrays), optimized for multithreaded usage. For more, see [here](https://github.com/netdata/netdata/tree/master/src/libnetdata/string).

- **`ARL`**

  Adaptive Resortable List (ARL) is a very fast list iterator that keeps the expected items on the list in the same order they are found in an input list. So, the first iteration is somewhat slower, but all the following iterations are perfectly aligned for the best performance. For more, see [here](https://github.com/netdata/netdata/tree/master/src/libnetdata/adaptive_resortable_list).

- **`BUFFER`**

  A flexible text buffer management system that allows Netdata to automatically handle dynamically sized text buffer allocations. The same mechanism is used for generating consistent JSON output by the Netdata APIs. For more, see [here](https://github.com/netdata/netdata/tree/master/src/libnetdata/buffer).

- **`SPINLOCK`**

  Like POSIX `MUTEX` and `RWLOCK` but a lot faster, based on atomic operations, with significantly smaller memory impact, while being portable.

- **`PGC`**

  A caching layer that can be used to cache any kind of time-related data, with automatic indexing (based on a tree of JudyL arrays), memory management, evictions, flushing, pressure management. This is extensively used in `dbengine`. For more, see [here](/src/database/engine/README.md).

The above, and many more, allow Netdata developers to work on the application fast and with confidence. Most of the business logic in Netdata is a work of mixing the above.

Netdata data collection plugins can be developed in any language. Most of our application collectors, though, are developed in [Go](https://github.com/netdata/go.d.plugin).
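
To make the `LEARN` and `DETECT` steps described earlier more concrete, here is a minimal Python sketch of per-metric k-means anomaly detection. It is not Netdata's implementation (the core is written in C), and most details are assumptions made for illustration: the use of raw 1-D values as features, `k=2`, the distance-based outlier rule with its `tolerance` factor, the rule that a sample is anomalous only when every retained model agrees, and the one-sample-per-minute demo data. Only the k-means algorithm, the hourly training cadence, and the 6-hour training window come from the text.

```python
import math
import random
from statistics import fmean

SAMPLES_PER_HOUR = 60  # assumption for the demo: one sample per minute
WINDOW_HOURS = 6       # from the text: each model uses the last 6 hours


def train_kmeans(values, k=2, iterations=20, seed=0):
    """Fit 1-D k-means; return (centroids, largest training distance)."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the previous centroid if a cluster ended up empty.
        centroids = [fmean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    max_dist = max(min(abs(v - c) for c in centroids) for v in values)
    return centroids, max_dist


def train_hourly_models(history, keep=3):
    """Train one model per hour boundary on the trailing 6-hour window."""
    window = WINDOW_HOURS * SAMPLES_PER_HOUR
    models = [train_kmeans(history[end - window:end])
              for end in range(window, len(history) + 1, SAMPLES_PER_HOUR)]
    return models[-keep:]


def is_outlier(model, value, tolerance=1.5):
    """Assumed rule: farther from every centroid than any training sample."""
    centroids, max_dist = model
    return min(abs(value - c) for c in centroids) > tolerance * max_dist


def is_anomaly(models, value):
    """Assumed rule: flag a sample only when every retained model agrees."""
    return all(is_outlier(m, value) for m in models)


if __name__ == "__main__":
    # Eight hours of a smooth synthetic metric oscillating around 50.
    history = [50 + 5 * math.sin(i / 10) for i in range(8 * SAMPLES_PER_HOUR)]
    models = train_hourly_models(history)
    for sample in (history[-1], 500.0):
        print(f"{sample:.2f} anomalous: {is_anomaly(models, sample)}")
```

Requiring agreement across several models is one way to keep a single freshly trained model from flagging behavior that older models have already seen; whether Netdata combines its models this way is not stated in the text above.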