plugin_name: python.d.plugin
modules:
  - meta:
      plugin_name: python.d.plugin
      module_name: alarms
      monitored_instance:
        name: Netdata Agent alarms
        link: https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/alarms/README.md
        categories:
          - data-collection.other
        icon_filename: ""
      related_resources:
        integrations:
          list: []
      info_provided_to_referring_integrations:
        description: ""
      keywords:
        - alarms
        - netdata
      most_popular: false
    overview:
      data_collection:
        metrics_description: |
          This collector creates an 'Alarms' menu with one line plot of `alarms.status`.
        method_description: |
          Alarm status is read from the Netdata Agent REST API [`/api/v1/alarms?all`](https://learn.netdata.cloud/api#/alerts/alerts1).
      supported_platforms:
        include: []
        exclude: []
      multi_instance: true
      additional_permissions:
        description: ""
      default_behavior:
        auto_detection:
          description: |
            It discovers instances of Netdata running on localhost and gathers metrics from `http://127.0.0.1:19999/api/v1/alarms?all`. `CLEAR` status is mapped to `0`, `WARNING` to `1`, and `CRITICAL` to `2`. By default, all alarms are monitored.
        limits:
          description: ""
        performance_impact:
          description: ""
    setup:
      prerequisites:
        list: []
      configuration:
        file:
          name: python.d/alarms.conf
          description: ""
        options:
          description: |
            There are 2 sections:
            * Global variables
            * One or more JOBS that can define multiple different instances to monitor.
            The following options can be defined globally: priority, penalty, autodetection_retry, update_every. They can also be defined per JOB to override the global values.
            Additionally, the following collapsed table contains all the options that can be configured inside a JOB definition.
            Every configuration JOB starts with a `job_name` value, which will appear in the dashboard unless a `name` parameter is specified.
          folding:
            title: Config options
            enabled: true
          list:
            - name: url
              description: Netdata Agent alarms endpoint to collect from. It can be local or remote, as long as the agent can reach it.
              default_value: http://127.0.0.1:19999/api/v1/alarms?all
              required: true
            - name: status_map
              description: Mapping of alarm status to the integer that is collected as the metric value.
              default_value: '{"CLEAR": 0, "WARNING": 1, "CRITICAL": 2}'
              required: true
            - name: collect_alarm_values
              description: Set to true to include a chart with calculated alarm values over time.
              default_value: false
              required: true
            - name: alarm_status_chart_type
              description: Defines the type of chart used to plot status over time, e.g. 'line' or 'stacked'.
              default_value: "line"
              required: true
            - name: alarm_contains_words
              description: >
                A "," separated list of words to filter alarm names for. For example, 'cpu,load' would keep only alarms with "cpu" or "load" in the alarm name. The default includes all alarms.
              default_value: ""
              required: true
            - name: alarm_excludes_words
              description: >
                A "," separated list of words to exclude based on alarm name. For example, 'cpu,load' would exclude all alarms with "cpu" or "load" in the alarm name. The default excludes none.
              default_value: ""
              required: true
            - name: update_every
              description: Sets the default data collection frequency.
              default_value: 10
              required: false
            - name: priority
              description: Controls the order of charts on the Netdata dashboard.
              default_value: 60000
              required: false
            - name: autodetection_retry
              description: Sets the job re-check interval in seconds.
              default_value: 0
              required: false
            - name: penalty
              description: Indicates whether to apply a penalty to update_every in case of failures.
              default_value: yes
              required: false
            - name: name
              description: Job name. This value will overwrite the `job_name` value. JOBS with the same name are mutually exclusive, so only one of them is allowed to run at any time. This allows autodetection to try several alternatives and pick the one that works.
              default_value: ""
              required: false
        examples:
          folding:
            enabled: true
            title: Config
          list:
            - name: Basic
              folding:
                enabled: false
              description: A basic example configuration.
              config: |
                jobs:
                  url: 'http://127.0.0.1:19999/api/v1/alarms?all'
            - name: Advanced
              folding:
                enabled: true
              description: |
                An advanced example configuration with multiple jobs collecting different subsets of alarms for plotting on different charts.
                "ML" job will collect status and values for all alarms with "ml_" in the name. Default job will collect status for all other alarms.
              config: |
                ML:
                  update_every: 5
                  url: 'http://127.0.0.1:19999/api/v1/alarms?all'
                  status_map:
                    CLEAR: 0
                    WARNING: 1
                    CRITICAL: 2
                  collect_alarm_values: true
                  alarm_status_chart_type: 'stacked'
                  alarm_contains_words: 'ml_'
                Default:
                  update_every: 5
                  url: 'http://127.0.0.1:19999/api/v1/alarms?all'
                  status_map:
                    CLEAR: 0
                    WARNING: 1
                    CRITICAL: 2
                  collect_alarm_values: false
                  alarm_status_chart_type: 'stacked'
                  alarm_excludes_words: 'ml_'
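            # Illustrative sketch, not part of the original metadata: one more
            # example entry that combines the `url` and `alarm_contains_words`
            # options described above. The job name and the host below are
            # placeholders chosen for illustration only.
            - name: Remote agent with name filter
              folding:
                enabled: true
              description: |
                A sketch of a job that reads alarms from a remote Netdata Agent and keeps only alarms with "cpu" or "load" in the name.
                The job name `remote_cpu_load` and the host `remote-agent.example` are hypothetical; replace them with values from your own setup.
              config: |
                remote_cpu_load:
                  # hypothetical host; it must be reachable from the collecting agent
                  url: 'http://remote-agent.example:19999/api/v1/alarms?all'
                  alarm_contains_words: 'cpu,load'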
    troubleshooting:
      problems:
        list: []
    alerts: []
    metrics:
      folding:
        title: Metrics
        enabled: false
      description: ""
      availability: []
      scopes:
        - name: global
          description: |
            These metrics refer to the entire monitored application.
          labels: []
          metrics:
            - name: alarms.status
              description: Alarms ({status mapping})
              unit: "status"
              chart_type: line
              dimensions:
                - name: a dimension per alarm representing the latest status of the alarm.
            - name: alarms.values
              description: Alarm Values
              unit: "value"
              chart_type: line
              dimensions:
                - name: a dimension per alarm representing the latest collected value of the alarm.