The Helm charts run on Kubernetes as the local equivalent of the Netdata Cloud public offering, so no data is sent outside of your cluster. By default, the On-Prem installation reaches outside resources only when pulling the container images. There are 2 Helm charts in total:
* netdata-cloud-onprem
* netdata-cloud-dependency
The installation depends on components with the following roles:
* Main persistent data app
* MQTT broker that allows Agents to send messages to the On-Prem Cloud
* Central communication hub: applications exchange messages through Pulsar
* Internal communication (API gateway)
* Feed storage
* Cache
* Image pull authentication (our ECR repos are secured)
Available in the dependencies Helm chart for PoC applications.
There was no point in trying to connect more nodes, as we are covering PoC purposes.
For comparison, a Netdata Cloud On-Prem installation with just 100 nodes connected, without dependencies, consumes about 2 CPUs and about 2 GiB of memory (real usage, not Kubernetes resource requests).
The Helm chart for the Netdata Cloud On-Prem installation on Kubernetes is available in the ECR registry. The ECR registry is private, so you need to log in first. Credentials are sent by our Product Team. If you do not have them, contact our Product Team at info@netdata.cloud.
The machine used for the Helm chart installation also needs the AWS CLI installed.
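Before going further, it can help to confirm that the required command-line tools are present. The following is a minimal sketch, not part of the official procedure; the helper name `missing_tools` is hypothetical, and the tool list (`aws`, `helm`) is taken from the steps described in this guide.

```shell
# Print the names of the given tools that are not found on PATH.
# (missing_tools is an illustrative helper, not part of any Netdata tooling.)
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space left by the concatenation above.
  printf '%s\n' "${missing# }"
}

missing=$(missing_tools aws helm)
if [ -n "$missing" ]; then
  echo "Install these tools before continuing: $missing" >&2
else
  echo "aws and helm are available"
fi
```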
There are 2 options for configuring `aws cli` to work with the provided credentials. The first one is to set the environment variables:
export AWS_ACCESS_KEY_ID=<your_secret_id>
export AWS_SECRET_ACCESS_KEY=<your_secret_key>
The second one is to use an interactive shell:
aws configure
Using the `aws` command, generate a token that lets Helm access the secured ECR repository:
aws ecr get-login-password --region us-east-1 | helm registry login --username AWS --password-stdin 362923047827.dkr.ecr.us-east-1.amazonaws.com/netdata-cloud-onprem
After this step, you should be able to add the repository to Helm or simply pull the Helm charts:
helm pull oci://362923047827.dkr.ecr.us-east-1.amazonaws.com/netdata-cloud-dependency --untar #optional
helm pull oci://362923047827.dkr.ecr.us-east-1.amazonaws.com/netdata-cloud-onprem --untar
Local folders with the newest versions of the Helm charts should appear in your working directory.
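A quick way to confirm that the pull and untar step worked is to look for the `Chart.yaml` file that every Helm chart contains. This is a minimal sketch: the helper name `is_chart_dir` is hypothetical, and the directory names are assumed to match the chart names used in the `helm pull` commands. The dependency chart is optional, so a "not found" line for it is not necessarily an error.

```shell
# Succeed if the directory looks like an unpacked Helm chart,
# i.e. it contains the mandatory Chart.yaml file.
# (is_chart_dir is an illustrative helper name.)
is_chart_dir() {
  [ -f "$1/Chart.yaml" ]
}

# Directory names are assumed to equal the chart names pulled above.
for chart in netdata-cloud-dependency netdata-cloud-onprem; do
  if is_chart_dir "$chart"; then
    echo "found chart: $chart"
  else
    echo "not found: $chart"
  fi
done
```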
Netdata provides access to two Helm charts: netdata-cloud-dependency and netdata-cloud-onprem.
The netdata-cloud-dependency Helm chart is designed to install the applications that the On-Prem installation needs.
Every configuration option is available through `values.yaml` in the folder that contains your netdata-cloud-dependency Helm chart. All configuration options are described in README.md, which is part of the Helm chart. Each component can be enabled or disabled individually with true/false switches in `values.yaml`, which makes it easier to migrate to production-grade components gradually.
Unless you prefer a different solution to the problem, `k8s-ecr-login-renew` is responsible for calling the AWS API to regenerate the token. This token is then injected into the secret that every node uses to authenticate with the secured ECR when pulling the images.
The default setting in `values.yaml` of `netdata-cloud-onprem`, `.global.imagePullSecrets`, is configured to work out of the box with the dependency Helm chart.
To install the Helm chart, save your changes in `values.yaml` and execute:
cd [your helm chart location]
helm upgrade --wait --install netdata-cloud-dependency -n netdata-cloud --create-namespace -f values.yaml .
Every configuration option is available through `values.yaml` in the folder that contains your netdata-cloud-onprem Helm chart. All configuration options are described in README.md, which is part of the Helm chart.
cd [your helm chart location]
helm upgrade --wait --install netdata-cloud-onprem -n netdata-cloud --create-namespace -f values.yaml .
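Because the command above uses `--wait`, Helm returns only once the release resources report ready or the wait times out. To inspect the result afterwards, something like the following sketch may be useful. It assumes `kubectl` is installed and configured for the same cluster, which this guide does not state explicitly, and the function name `show_onprem_state` is hypothetical.

```shell
# Print the Helm release status, then the pods in the namespace.
# Assumes kubectl is installed and points at the same cluster as helm.
# (show_onprem_state is an illustrative helper name.)
show_onprem_state() {
  namespace="${1:-netdata-cloud}"
  helm status netdata-cloud-onprem -n "$namespace" &&
    kubectl get pods -n "$namespace"
}

# Usage (requires cluster access):
#   show_onprem_state netdata-cloud
```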
A secret named `netdata-cloud-common` is created. It contains several randomly generated entries. Deleting the Helm chart does not delete this secret, and neither does reinstalling the whole On-Prem; it is removed only if a Kubernetes administrator deletes it manually. The content of this secret is critical: the strings it contains are essential parts of encryption. Losing or changing them will result in data loss.
Responsible for user registration & authentication. Manages user account information.
Forwards requests from the cloud to the relevant agents. The requests include:
* Fetching function data from the agent
Forwards MQTT messages emitted by the agent related to the agent entities to the internal Pulsar broker. These include agent connection state updates.
Forwards Pulsar messages emitted in the cloud related to the agent entities to the MQTT broker. From there, the messages reach the relevant agent.
Forwards MQTT messages emitted by the agent related to the alarm-config entities to the internal Pulsar broker. These include the data for the alarm configuration as seen by the agent.
Forwards MQTT messages emitted by the agent related to the alarm-log entities to the internal Pulsar broker. These contain data about the alarm transitions that occurred in an agent.
Forwards Pulsar messages emitted in the cloud related to the alarm entities to the MQTT broker. From there, the messages reach the relevant agent.
Persists latest alert statuses received from the agent in the cloud. Aggregates alert statuses from relevant node instances. Exposes API endpoints to fetch alert data for visualization on the cloud. Determines if notifications need to be sent when alert statuses change and emits relevant messages to Pulsar. Exposes API endpoints to store and return notification-silencing data.
Responsible for starting the alert stream between the agent and the cloud. Ensures that messages are processed in the correct order, and starts a reconciliation process between the cloud and the agent if out-of-order processing occurs.
Forwards MQTT messages emitted by the agent related to the chart entities to the internal Pulsar broker. These include the chart metadata that is used to display relevant charts on the cloud.
Forwards Pulsar messages emitted in the cloud related to the charts entities to the MQTT broker. From there, the messages reach the relevant agent.
Exposes API endpoints to fetch the chart metadata.
Forwards data requests via the `cloud-agent-data-ctrl-service` to the relevant agents to fetch chart data points.
Exposes API endpoints to call various other endpoints on the agent, for instance, functions.
Exposes API endpoints to fetch and store custom dashboard data.
Serves as the first contact point between the agent and the cloud. Returns authentication and MQTT endpoints to connecting agents.
Processes incoming feed events and stores them in Elasticsearch. Exposes API endpoints to fetch feed events from Elasticsearch.
Contains the on-prem cloud website. Serves static content.
Acts as a middleware for authentication on most of the API endpoints. Validates incoming token headers, injects the relevant ones, and forwards the requests.
Exports various metrics from an On-Prem Cloud installation. Uses the Prometheus metric exposition format.
Exposes API endpoints to fetch a human-friendly explanation of various Netdata configuration options, namely the alerts.
Forwards MQTT messages emitted by the agent related to the node entities to the internal Pulsar broker. These include the node metadata as well as their connectivity state, either direct or via parents.
Forwards Pulsar messages emitted in the cloud related to the node entities to the MQTT broker. From there, the messages reach the relevant agent.
Exposes API endpoints to handle integrations. Handles incoming notification messages and uses the relevant channels (email, Slack, ...) to notify relevant users.
Exposes API endpoints to fetch and store relations between agents, nodes, spaces, users, and rooms. Acts as a provider of authorization for other cloud endpoints. Exposes API endpoints to authenticate agents connecting to the cloud.