
LibreNMS CPU Usage Monitoring Alert

Add an alert rule

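Severity: the alert level

Max alerts: the maximum number of notifications to send

Delay: how long to wait before the first notification is sent

Interval: the time between repeated notifications

Transports: the delivery channel(s) used to send the notification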
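To make the interplay of Delay, Interval, and Max alerts concrete, here is a minimal sketch of a simplified timing model. It is not how LibreNMS schedules alerts internally; the function name `notification_times` and all of its parameters are hypothetical, and it only assumes that the first notification goes out after the delay and is repeated every interval until the maximum count is reached or the fault clears.

```python
def notification_times(delay_s, interval_s, max_alerts, fault_duration_s):
    """Offsets (seconds after the fault starts) at which notifications are sent.

    Simplified model, assumed for illustration only:
    first notification after delay_s, then one every interval_s,
    stopping at max_alerts or when the fault has cleared.
    """
    times = []
    t = delay_s
    while t <= fault_duration_s and len(times) < max_alerts:
        times.append(t)
        t += interval_s
    return times


# Delay 5 min, interval 10 min, at most 3 notifications, fault lasts 1 hour.
print(notification_times(300, 600, 3, 3600))  # [300, 900, 1500]

# A fault that clears before the delay elapses produces no notification.
print(notification_times(300, 600, 3, 100))   # []
```

Under this model, a longer Delay filters out short CPU spikes, while Max alerts caps how many reminders a long-running fault can generate.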


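On a multi-core CPU, the default rule fires as soon as a single core crosses the threshold, so the rule is changed to evaluate the average usage across all cores instead.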
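The difference between the two checks can be seen with a little arithmetic. The sketch below uses made-up usage values for a hypothetical 4-core device and a 70% threshold; it only illustrates the idea and does not query LibreNMS.

```python
from statistics import mean

THRESHOLD = 70  # percent

# Hypothetical per-core usage: one busy core, three mostly idle ones.
core_usage = [95, 20, 15, 10]

# Per-core check: fires if any single core is above the threshold.
per_core_alert = any(u > THRESHOLD for u in core_usage)

# Average check: fires only if the mean over all cores is above the threshold.
average_alert = mean(core_usage) > THRESHOLD

print(per_core_alert)  # True
print(average_alert)   # False (the mean is 35)
```

A single busy core is enough to trigger the per-core check, while the average-based check stays quiet until the device as a whole is loaded.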

The official documentation provides a template for this at https://docs.librenms.org/Alerting/Rules/#advanced, but it produces a GROUP-related SQL syntax error.

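The following is a custom modified version, meant to be used together with the alert template further down. Remember to turn on Override SQL.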


SELECT devices.device_id, devices.hostname, devices.status, devices.disabled, devices.ignore,
       AVG(processors.processor_usage) AS cpu_avg,
       MIN(processors.processor_descr) AS processor_descr
FROM devices
INNER JOIN processors ON devices.device_id = processors.device_id
WHERE devices.device_id = ?
  AND devices.status = 1
  AND devices.disabled = 0
  AND devices.ignore = 0
GROUP BY devices.device_id, devices.hostname, devices.status, devices.disabled, devices.ignore
HAVING AVG(processors.processor_usage) > 70;
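
How the GROUP BY column list shapes the average can be checked outside LibreNMS. The sketch below is an assumption-laden stand-in: it uses SQLite in memory rather than the MySQL/MariaDB database LibreNMS runs on, a cut-down `processors` table with only three columns, and made-up usage values. It compares grouping by device only with grouping by device and `processor_descr`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the processors table.
conn.execute(
    "CREATE TABLE processors ("
    "device_id INTEGER, processor_descr TEXT, processor_usage INTEGER)"
)
conn.executemany(
    "INSERT INTO processors VALUES (?, ?, ?)",
    [(1, "CPU 0", 95), (1, "CPU 1", 75), (1, "CPU 2", 20), (1, "CPU 3", 10)],
)

# One group per device: a single average over all four cores.
per_device = conn.execute(
    "SELECT device_id, AVG(processor_usage) FROM processors "
    "WHERE device_id = ? GROUP BY device_id "
    "HAVING AVG(processor_usage) > 70",
    (1,),
).fetchall()

# One group per (device, description): each distinct description is averaged alone.
per_descr = conn.execute(
    "SELECT device_id, MIN(processor_descr), AVG(processor_usage) FROM processors "
    "WHERE device_id = ? GROUP BY device_id, processor_descr "
    "HAVING AVG(processor_usage) > 70 ORDER BY 2",
    (1,),
).fetchall()

print(per_device)  # []  (device-wide average is 50, below 70)
print(per_descr)   # [(1, 'CPU 0', 95.0), (1, 'CPU 1', 75.0)]
```

When every core reports the same description, both groupings give the same result. When descriptions differ per core, including `processor_descr` in GROUP BY brings back per-core triggering.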

Add an alert template


{{ $alert->title }}
Device name: {{ $alert->sysName }}
Severity: {{ $alert->severity }}
Uptime: {{ $alert->uptime_short }}
Alert time: {{ $alert->timestamp }}
@foreach ($alert->faults as $key => $value)
CPU name: {{ $value['processor_descr'] }}
Current CPU usage: {{ round($value['cpu_avg']) }} %
@endforeach

Testing

[Screenshots: test results]