portcheck_connection_timeouts.md 2.4 KB

Understand the alert

The portcheck_connection_timeouts alert calculates the average ratio of connection timeouts when trying to connect to a TCP endpoint over the last 5 minutes. If you receive this alert, it means that the monitored TCP endpoint is unreachable, potentially due to networking issues or an overloaded host/service.

This alert triggers a warning state when the ratio of timeouts is between 10-40% and a critical state if the ratio is greater than 40%.

Troubleshoot the alert

  1. Check the network connectivity

    • Use the ping command to check network connectivity between your system and the monitored TCP endpoint.

      ping <tcp_endpoint_ip>
      

      If the connectivity is intermittent or not established, it indicates network issues. Reach out to your network administrator for assistance.

  2. Check the status of the monitored TCP service

    • Identify the service running on the monitored TCP endpoint by checking the port number.
    • Use the netstat command to check the service status:

      netstat -tnlp | grep <port_number>
      

If the service is not running or unresponsive, restart the service or investigate further into the application logs for any issues.

  1. Verify the load on the TCP endpoint host

    • Connect to the host and analyze its resource consumption (CPU, memory, disk I/O, and network bandwidth) with tools like top, vmstat, iostat, and iftop.
    • Identify resource-consuming processes or applications and apply corrective measures (kill/restart the process, allocate more resources, etc.).
  2. Examine the firewall rules and security groups

    • Ensure that there are no blocking rules or security groups for your incoming connections to the TCP endpoint.
    • If required, update the rules or create new allow rules for the required ports and IP addresses.
  3. Check the Netdata configuration

    • Review the Netdata configuration file /etc/netdata/netdata.conf to ensure the portcheck plugin settings are correctly configured for monitoring the TCP endpoint.
    • If necessary, update and restart the Netdata agent.

Useful resources

  1. Netstat Command in Linux
  2. Iostat Command Usage and Examples
  3. Iftop Guide