The riakkv_kv_put_slow alert is triggered when the average processing time for PUT requests in the Riak KV database increases significantly compared to the last hour's average, suggesting that the server is overloaded.
An overloaded server cannot handle incoming requests efficiently, which leads to longer processing times and degraded performance. In severe cases, it can result in request timeouts or even crashes.
To troubleshoot this alert, follow the steps below:
Use the riak-admin tool's status command to check the current performance of the Riak KV node:
riak-admin status
Look for the following key performance indicators (KPIs) for PUT requests:
If any of these values are significantly higher than their historical values, it may indicate an issue with the node's performance.
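The comparison against historical values can be sketched in Python as follows. This is a minimal sketch, not part of the alert itself: it assumes the relevant indicators are Riak's PUT FSM latency statistics (node_put_fsm_time_mean and its percentile variants, reported in microseconds) and that riak-admin status prints them as "name : value" lines. The baseline values and the 2x threshold are hypothetical and should be replaced with figures from your own monitoring history.

```python
import re

# Assumed stat names: Riak's PUT FSM latency statistics, in microseconds.
PUT_STATS = (
    "node_put_fsm_time_mean",
    "node_put_fsm_time_median",
    "node_put_fsm_time_95",
    "node_put_fsm_time_99",
    "node_put_fsm_time_100",
)

# Matches lines of the form "stat_name : 12345".
LINE_RE = re.compile(r"^\s*(\w+)\s*:\s*(\d+)\s*$")


def parse_put_stats(status_text):
    """Extract the PUT latency stats from `riak-admin status` output text."""
    stats = {}
    for line in status_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(1) in PUT_STATS:
            stats[match.group(1)] = int(match.group(2))
    return stats


def slow_stats(current, baseline, factor=2.0):
    """Return the stats whose current value exceeds factor x baseline."""
    return {
        name: value
        for name, value in current.items()
        if name in baseline and value > factor * baseline[name]
    }


# Illustrative sample; real input would come from `riak-admin status`.
sample = """\
node_gets : 120
node_put_fsm_time_mean : 45000
node_put_fsm_time_95 : 90000
"""
# Hypothetical historical values for the same node.
baseline = {"node_put_fsm_time_mean": 10000, "node_put_fsm_time_95": 60000}

current = parse_put_stats(sample)
flagged = slow_stats(current, baseline)
print(flagged)
```

In practice, the sample text would be replaced by the captured output of the status command, and the baseline by values recorded while the node was healthy.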
Examine the application logs or Riak KV logs for recent activity, such as a high volume of PUT requests, bulk updates or deletions, or other intensive database operations that could cause the slowdown.
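One way to spot a burst of writes is to count PUT entries per minute in a log. The sketch below assumes a hypothetical application log in which each line starts with an ISO-style timestamp (for example 2024-01-01T12:00:05) followed by the request method; this is not a Riak KV log format, so the slicing and matching would need to be adapted to your own logs.

```python
from collections import Counter


def puts_per_minute(log_lines):
    """Count lines that mention a PUT request, bucketed by minute.

    Assumes each line begins with a timestamp like '2024-01-01T12:00:05'
    (hypothetical format), so the first 16 characters identify the minute.
    """
    counts = Counter()
    for line in log_lines:
        if " PUT " in line and len(line) >= 16:
            counts[line[:16]] += 1  # 'YYYY-MM-DDTHH:MM'
    return counts


# Illustrative log lines in the assumed format.
sample_log = [
    "2024-01-01T12:00:05 PUT /buckets/users/keys/1",
    "2024-01-01T12:00:40 PUT /buckets/users/keys/2",
    "2024-01-01T12:01:02 GET /buckets/users/keys/1",
    "2024-01-01T12:01:30 PUT /buckets/users/keys/3",
]

counts = puts_per_minute(sample_log)
# The minutes with the most PUT requests come first.
print(counts.most_common(3))
```

Minutes whose count is far above the rest are candidates for the workload that triggered the alert.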
Check the server's CPU, memory, and disk I/O usage to identify any resource constraints that could be affecting the performance of the Riak KV node.
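A rough resource check can be scripted with the Python standard library. The sketch below covers only CPU load and available memory, assumes a Linux host (it reads /proc/meminfo), and uses hypothetical thresholds; disk I/O is not covered and is better inspected with a dedicated tool such as iostat.

```python
import os


def load_per_cpu():
    """1-minute load average divided by the CPU count (Unix only)."""
    load1, _, _ = os.getloadavg()
    return load1 / (os.cpu_count() or 1)


def memory_available_fraction(meminfo_text):
    """Fraction of memory still available, from Linux /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # values are in kB
    return fields["MemAvailable"] / fields["MemTotal"]


def resource_warnings(load, mem_fraction, max_load=1.0, min_mem=0.1):
    """Return warnings for values outside the (hypothetical) thresholds."""
    warnings = []
    if load > max_load:
        warnings.append("CPU load is high")
    if mem_fraction < min_mem:
        warnings.append("available memory is low")
    return warnings


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        mem = memory_available_fraction(handle.read())
    print(resource_warnings(load_per_cpu(), mem))
```

A load per CPU that stays above 1.0 suggests the node has more runnable work than cores, which is consistent with the overload this alert points to.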
Analyze the Riak KV configuration settings to ensure that they are optimized for your specific use case. Improperly configured settings can lead to performance issues.
If the current Riak KV cluster is not able to handle the increasing workload, consider adding new nodes to the cluster to distribute the load and improve performance.
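Adding a node is typically a staged operation with the riak-admin cluster subcommands: join, plan, then commit. The sketch below only builds and prints the command sequence rather than running it; the node name riak@192.0.2.10 is a hypothetical placeholder for an existing cluster member, and the commands are meant to be run on the new node after reviewing the plan output.

```python
def join_commands(existing_node):
    """Build the commands to run on the NEW node to join an existing cluster."""
    return [
        ["riak-admin", "cluster", "join", existing_node],  # stage the join
        ["riak-admin", "cluster", "plan"],                 # review staged changes
        ["riak-admin", "cluster", "commit"],               # apply them
    ]


# Hypothetical name of a node that is already a cluster member.
commands = join_commands("riak@192.0.2.10")
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the plan step between join and commit gives a chance to check the proposed ownership changes before any data is moved.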