
Monitoring

Monitoring is implemented by sending data to InfluxDB. It is enabled in the services by default.

It is also possible to use LUNA Dashboards (the "luna_index_module" directory) and Grafana Loki for LIM services. See detailed information about LUNA PLATFORM monitoring, LUNA Dashboards and Grafana Loki in the "Monitoring" section of the LUNA PLATFORM administrator manual.

LUNA Dashboards is based on the Grafana web application. It provides a set of dashboards for analyzing the state of individual services, as well as two summarized dashboards that can be used to evaluate the state of the system. Grafana Loki is a log aggregation system that enables you to work flexibly with LUNA PLATFORM logs in Grafana.

Data being sent

The types of monitoring events differ between services. The table below lists the event types for each service:

| Service | Types of events |
|---|---|
| Index Manager | all http requests, all failed http requests, index building |
| Indexer | all http requests, all failed http requests |
| Indexed Matcher | all http requests, all failed http requests, index reloading, matching request (pass through Redis) |

Every event is a point in the time series. The point is represented using the following data:

  • series name (requests or errors)
  • timestamp of the request start
  • tags
  • fields

Tags are indexed data in the storage. They are represented as a dictionary, where:

  • keys are string tag names,
  • values are strings, integers or floats.

Fields are non-indexed data in the storage. They are represented as a dictionary, where:

  • keys are string field names,
  • values are strings, integers or floats.

Data for the requests series is saved on every request. Each point contains data about the corresponding request (execution time, etc.).

  • tags

| Tag name | Description |
|---|---|
| service | always "lim-manager" |
| route | concatenation of a request method and a request resource (GET:/version) |
| status_code | HTTP status code of response |

  • fields

| Field name | Description |
|---|---|
| request_id | request ID |
| execution_time | request execution time |
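As a rough sketch of how a point of the requests series could be assembled from these tags and fields, consider the helper below. The function name, the `"series"` / `"time"` dictionary keys, and the use of seconds for `execution_time` are assumptions made for illustration; only the tag and field names come from the tables above.

```python
from typing import Any, Dict


def build_request_point(
    service: str,
    method: str,
    resource: str,
    status_code: int,
    request_id: str,
    started_at: float,
    finished_at: float,
) -> Dict[str, Any]:
    """Assemble a "requests" point (hypothetical helper, not part of LIM)."""
    return {
        "series": "requests",
        "time": started_at,  # timestamp of the request start
        "tags": {
            "service": service,
            # route = request method + ":" + request resource
            "route": f"{method}:{resource}",
            "status_code": status_code,
        },
        "fields": {
            "request_id": request_id,
            # unit assumed to be seconds
            "execution_time": finished_at - started_at,
        },
    }


request_point = build_request_point(
    service="lim-manager",
    method="GET",
    resource="/version",
    status_code=200,
    request_id="example-request-id",
    started_at=100.0,
    finished_at=100.25,
)
print(request_point["tags"]["route"])             # GET:/version
print(request_point["fields"]["execution_time"])  # 0.25
```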

Data for the errors series is saved when a request fails. Each point contains the error_code.

  • tags

| Tag name | Description |
|---|---|
| service | "lim-indexer", "lim-manager", or "lim-matcher" |
| route | concatenation of a request method and a request resource (GET:/version) |
| status_code | HTTP status code of response |
| error_code | LIM error code |

  • fields

| Field name | Description |
|---|---|
| request_id | request ID |
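The errors series differs from the requests series in two ways: a point is produced only for failed requests, and it carries an additional `error_code` tag. A minimal sketch is shown below. Treating any HTTP status of 400 or above as a failure is an assumption, as are the function name, the dictionary layout, and the placeholder error code.

```python
from typing import Any, Dict, Optional


def build_error_point(
    service: str,
    method: str,
    resource: str,
    status_code: int,
    error_code: int,
    request_id: str,
    started_at: float,
) -> Optional[Dict[str, Any]]:
    """Return an "errors" point, or None when the request did not fail."""
    if status_code < 400:  # assumption: 4xx and 5xx responses count as failures
        return None
    return {
        "series": "errors",
        "time": started_at,
        "tags": {
            "service": service,
            "route": f"{method}:{resource}",
            "status_code": status_code,
            "error_code": error_code,
        },
        "fields": {"request_id": request_id},
    }


# A successful request produces no point in the errors series.
ok = build_error_point("lim-matcher", "GET", "/version", 200, 0, "req-1", 100.0)

# error_code=1 is a placeholder, not a real LIM error code.
failed = build_error_point("lim-matcher", "GET", "/version", 500, 1, "req-2", 101.0)
print(ok, failed["tags"]["error_code"])
```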

Saving data for index processing is triggered when an error occurs during index building.

  • tags

| Tag name | Description |
|---|---|
| service | "lim-manager" or "lim-matcher" |
| socket_address | service address in the format <host>:<port> |
| stage | always "build_index" |
| label | index matching label (list_id) |
| error_code | LIM error code (0 - request was completed successfully) |

  • fields

| Field name | Description |
|---|---|
| index_id | index unique ID |
| pending | time spent in the internal queue, in seconds |
| duration | index processing (i.e. building / loading / dropping) time, in seconds |
| generation | index generation (unix timestamp) |
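The relationship between `pending` and `duration` can be illustrated with three timestamps: when the task entered the internal queue, when processing started, and when it finished. The sketch below assumes exactly that model. The function name, its parameters, the host and port, and the identifiers in the example are hypothetical.

```python
from typing import Any, Dict


def index_processing_point(
    service: str,
    host: str,
    port: int,
    list_id: str,
    index_id: str,
    generation: int,
    enqueued_at: float,
    started_at: float,
    finished_at: float,
    error_code: int,
) -> Dict[str, Any]:
    """Assemble tags and fields for an index processing event (illustrative)."""
    tags = {
        "service": service,
        "socket_address": f"{host}:{port}",  # <host>:<port>
        "stage": "build_index",
        "label": list_id,
        "error_code": error_code,
    }
    fields = {
        "index_id": index_id,
        "pending": started_at - enqueued_at,   # seconds spent in the queue
        "duration": finished_at - started_at,  # seconds spent processing
        "generation": generation,              # unix timestamp
    }
    return {"tags": tags, "fields": fields}


index_point = index_processing_point(
    service="lim-manager",
    host="127.0.0.1",
    port=5180,                      # hypothetical port
    list_id="example-list-id",
    index_id="example-index-id",
    generation=1700000000,
    enqueued_at=10.0,
    started_at=12.5,
    finished_at=20.0,
    error_code=0,                   # 0 means the request completed successfully
)
print(index_point["fields"]["pending"], index_point["fields"]["duration"])  # 2.5 7.5
```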

Saving data for index matching is triggered when a matching request is performed.

  • tags

| Tag name | Description |
|---|---|
| service | always "lim-matcher" |
| socket_address | service address in the format <host>:<port> |
| label | index matching label (list_id) |
| error_code | LIM error code (0 - request was completed successfully) |

  • fields

| Field name | Description |
|---|---|
| request_id | request ID |
| index_id | index unique ID |
| execution_time | matching request execution time, in seconds |
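Because `label` and `error_code` are tags, they are natural keys for filtering and grouping matching events. The sketch below shows one way such points could be analyzed after they have been read back from storage: it computes the mean `execution_time` per label for successful requests only. The in-memory dictionary layout and the function name are assumptions, and the sample values are made up for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List


def mean_matching_time_by_label(points: List[Dict[str, Any]]) -> Dict[str, float]:
    """Average execution_time (seconds) per label, ignoring failed requests."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for point in points:
        if point["tags"]["error_code"] != 0:  # 0 means success
            continue
        grouped[point["tags"]["label"]].append(point["fields"]["execution_time"])
    return {label: mean(times) for label, times in grouped.items()}


# Made-up sample data, not real measurements.
sample_points = [
    {"tags": {"label": "list-a", "error_code": 0}, "fields": {"execution_time": 0.5}},
    {"tags": {"label": "list-a", "error_code": 0}, "fields": {"execution_time": 1.5}},
    {"tags": {"label": "list-b", "error_code": 0}, "fields": {"execution_time": 0.25}},
    {"tags": {"label": "list-b", "error_code": 1}, "fields": {"execution_time": 9.0}},
]
averages = mean_matching_time_by_label(sample_points)
print(averages)  # {'list-a': 1.0, 'list-b': 0.25}
```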