
The self-diagnostics service implements rules that monitor various system statuses. The full list of rules available on a particular server is shown in the StatusRule section of the self-diagnostics web interface at http://127.0.0.1:20040/rules, where:

  • alert is a rule name;
  • expr is a rule triggering condition; 
  • actions are the actions performed when the rule triggers;
  • summary is a rule description.
Note

Some rules generate alarms but don't perform any actions. Such rules are labeled disabled: true.
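As a rough illustration of the four fields and the disabled label, a rule can be modelled as a small record. This is only a sketch: the `StatusRule` class, the `active_rules` helper, and the `no_action_example` entry are hypothetical names introduced here, and the format actually served by the self-diagnostics service may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StatusRule:
    """Hypothetical in-memory model of one rule; not the service's own schema."""
    alert: str                                        # rule name
    expr: str                                         # rule triggering condition
    actions: List[str] = field(default_factory=list)  # actions of the rule
    summary: str = ""                                 # rule description
    disabled: bool = False                            # True: alarms only, no actions


def active_rules(rules: List[StatusRule]) -> List[StatusRule]:
    """Return only the rules that are not labeled disabled: true."""
    return [r for r in rules if not r.disabled]


rules = [
    StatusRule(
        alert="Low disk free space (logs)",
        expr='wmi_logical_disk_free_bytes{volume="C:"} / (1024 * 1024) < 20480',
        actions=["ACTION_CLEANUP_LOGS"],
        summary="Clean up of the logs directory when there is insufficient "
                "space on the system disk",
    ),
    # Hypothetical alarm-only rule, included just to show the disabled label.
    StatusRule(alert="no_action_example", expr="up == 0", disabled=True),
]

print([r.alert for r in active_rules(rules)])
```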

Examples of rules (each entry lists the alert, expr, actions, and summary fields, in that order):

Low disk free space (logs)

If free space on the system disk is less than 20 GB, all server logs, including archived logs, are deleted to free up space:

wmi_logical_disk_free_bytes{volume="C:"} / (1024 * 1024) < 20480
ACTION_CLEANUP_LOGS

Cleanup of the logs directory when free space on the system disk is insufficient
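The threshold arithmetic in this rule can be checked with a short sketch. The expression divides the free byte count by 1024 * 1024, so the value 20480 is in megabytes and corresponds to the 20 GB limit (counting 1 GB as 1024 MB, which is an assumption about the units intended here). The function name `logs_cleanup_triggered` is hypothetical.

```python
MIB = 1024 * 1024  # the rule divides the free byte count by this value


def logs_cleanup_triggered(free_bytes: int, threshold_mb: int = 20480) -> bool:
    """Mirror `free_bytes / (1024 * 1024) < 20480` for a single sample."""
    return free_bytes / MIB < threshold_mb


print(20480 / 1024)                              # threshold expressed in GB
print(logs_cleanup_triggered(19 * 1024 * MIB))   # 19 GB free: rule triggers
print(logs_cleanup_triggered(25 * 1024 * MIB))   # 25 GB free: rule does not trigger
```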

Low disk free space (database)

If free disk space for the database is less than 15 GB, all events older than one week are deleted:

wmi_logical_disk_free_bytes{volume="C:"} / (1024 * 1024) < 15360


ACTION_CLEANUP_DB

Cleanup of the Postgres database when there is insufficient space on the disk. Stricter limits apply as free disk space for the database drops further:

  • less than 10 GB: all events older than one day are deleted;
  • less than 5 GB: all events older than one hour are deleted;
  • less than 3 GB: all events are deleted.
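The tiered cleanup described for this rule can be sketched as a lookup from free space to the retained event history. This is an illustration only: the names `TIERS` and `retention_for` are hypothetical, 1 GB is assumed to be 1024³ bytes (matching the 15360 MB threshold in the expression), and the real ACTION_CLEANUP_DB logic may be implemented differently.

```python
from datetime import timedelta
from typing import Optional

GIB = 1024 ** 3  # assumption: 1 GB is counted as 1024**3 bytes

# (threshold in GB, retention) pairs from the description, strictest first.
# A retention of timedelta(0) stands for "all events are deleted".
TIERS = [
    (3, timedelta(0)),
    (5, timedelta(hours=1)),
    (10, timedelta(days=1)),
    (15, timedelta(weeks=1)),
]


def retention_for(free_bytes: int) -> Optional[timedelta]:
    """Return how much event history is kept, or None if no cleanup is needed."""
    free_gb = free_bytes / GIB
    for threshold_gb, retention in TIERS:
        if free_gb < threshold_gb:
            return retention
    return None


for gb in (20, 12, 8, 4, 2):
    print(gb, "GB free ->", retention_for(gb * GIB))
```

Listing the strictest tier first means the first matching threshold is also the most aggressive one that applies.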
archive_no_samples

The rule checks whether new frames are written to the archive. If no new frames reach the archive within five minutes, the archive process is restarted:

(
  (changes(ngp_archive_channel_state_change{ep_name="hosts/SERVER/MultimediaStorage"}[5m])
    + ngp_archive_channel_current_state{ep_name="hosts/SERVER/MultimediaStorage"} > 0)
  unless
  (changes(ngp_input_sample_counter{ep_name="hosts/SERVER/MultimediaStorage"}[5m]) > 0)
)
and ignoring(ep_name) ngp_fps{ep_name="hosts/SERVER/DeviceIpint"}
ACTION_RESTART_NGP_UNIT

Restart of the archive service if new frames don't go to the archive
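The structure of this expression can be read as three conditions combined for one archive channel. The sketch below is a simplified reading, not the service's implementation: it ignores label matching between series, the function and parameter names are hypothetical, and calling the first condition "channel active" is an interpretation of the two state metrics.

```python
def archive_no_samples(state_changes_5m: float,
                       current_state: float,
                       input_sample_changes_5m: float,
                       source_fps_present: bool) -> bool:
    """Simplified single-channel reading of the archive_no_samples expression."""
    # changes(ngp_archive_channel_state_change[5m])
    #   + ngp_archive_channel_current_state > 0
    channel_active = state_changes_5m + current_state > 0
    # unless (changes(ngp_input_sample_counter[5m]) > 0)
    receiving_samples = input_sample_changes_5m > 0
    # and ignoring(ep_name) ngp_fps{ep_name="hosts/SERVER/DeviceIpint"}
    return channel_active and not receiving_samples and source_fps_present


# Channel active, no new samples in five minutes, source still reports fps.
print(archive_no_samples(0, 1, 0, True))
# Same channel, but the sample counter changed within the window.
print(archive_no_samples(0, 1, 12, True))
```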

detector_no_sample

The rule checks whether frames reach the detector. If no new frames reach the detector, the detector service is restarted:

(absent(ngp_fps{ep_name="hosts/SERVER/AVDetector"})
  * scalar(ngp_fps{ep_name="hosts/SERVER/DeviceIpint"})
  * scalar(changes(ngp_fps{ep_name="hosts/SERVER/AVDetector"}[3m]))
  * scalar((ngp_service_desired_state{ep_name="hosts/SERVER/AVDetector"} == 0) + 1)) > 0
ACTION_RESTART_NGP_UNIT

Restart of the detector service if the active detector doesn't receive new frames

statistics_server_unhealthy

If the statistics server doesn't update the item counter or becomes unavailable, the statistics service is automatically restarted:

absent(changes(ngp_work_item_counter{ep_name="hosts/SERVER/StatisticsServer"}[5m]))
or absent(up{job="node.SERVER"})
ACTION_RESTART_NGP_UNIT

Restart of the statistics service if there are no statistics events
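The two halves of this expression can be sketched as a pair of "no data" checks. In PromQL, absent() returns a value only when its argument contains no series, so the sketch treats each half as a test for missing data. The function and parameter names are hypothetical, and the inputs stand in for what the monitoring backend would return.

```python
from typing import Sequence


def statistics_server_unhealthy(counter_samples_5m: Sequence[float],
                                up_series_present: bool) -> bool:
    """Simplified reading of the statistics_server_unhealthy expression."""
    # absent(changes(ngp_work_item_counter[5m])):
    # holds when the five-minute window contains no samples at all.
    no_counter_data = len(counter_samples_5m) == 0
    # absent(up{job="node.SERVER"}): holds when the `up` series is missing.
    no_up_series = not up_series_present
    return no_counter_data or no_up_series


print(statistics_server_unhealthy([], True))         # no counter samples
print(statistics_server_unhealthy([5, 6, 7], True))  # counter reported, `up` present
print(statistics_server_unhealthy([5, 6, 7], False)) # `up` series missing
```

Under these semantics, a counter that is still reported but keeps a constant value gives changes() equal to 0 rather than an empty result, so this sketch does not flag it.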