Monitoring#

FaceStream monitoring sends data to InfluxDB and is disabled by default.

LUNA Streams supports two monitoring methods:

Sending data to InfluxDB (enabled by default)
Exporting metrics in Prometheus format via the /metrics resource (disabled by default)

InfluxDB#

To work with InfluxDB, you need a user account (username and password), a bucket name, an organization name and a token. All of these values are set through environment variables when the InfluxDB container is started.

To use FaceStream or LUNA Streams monitoring, set the "bucket", "organization" and "token" fields in the FaceStream or LUNA Streams settings to exactly the values specified when launching the InfluxDB container. For example, if the following settings were used when starting the InfluxDB container...:

-e DOCKER_INFLUXDB_INIT_BUCKET=luna_monitoring \
-e DOCKER_INFLUXDB_INIT_USERNAME=luna \
-e DOCKER_INFLUXDB_INIT_PASSWORD=password \
-e DOCKER_INFLUXDB_INIT_ORG=luna \
-e DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=kofqt4Pfqjn6o \

... then the following parameters should be specified in the FaceStream or LUNA Streams settings:

"influxdb": {
    "organization": "luna",
    "token": "kofqt4Pfqjn6o",
    "bucket": "luna_monitoring",

Login and password are used to access the InfluxDB user interface.
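
The correspondence between the container variables and the service settings can be sketched in Python. This is only an illustration under stated assumptions: the helper name `influx_settings_from_env` is hypothetical and is not part of FaceStream or LUNA Streams. Only the variable names, field names and values come from the example above.

```python
# Hypothetical helper: not part of FaceStream or LUNA Streams.
# Maps the variables used to start the InfluxDB container to the
# "bucket", "organization" and "token" fields of the service settings.
ENV_TO_SETTING = {
    "DOCKER_INFLUXDB_INIT_BUCKET": "bucket",
    "DOCKER_INFLUXDB_INIT_ORG": "organization",
    "DOCKER_INFLUXDB_INIT_ADMIN_TOKEN": "token",
}


def influx_settings_from_env(env):
    """Return the settings fields that must match the container variables."""
    missing = [name for name in ENV_TO_SETTING if name not in env]
    if missing:
        raise KeyError("missing container variables: " + ", ".join(missing))
    return {setting: env[name] for name, setting in ENV_TO_SETTING.items()}


container_env = {
    "DOCKER_INFLUXDB_INIT_BUCKET": "luna_monitoring",
    "DOCKER_INFLUXDB_INIT_USERNAME": "luna",
    "DOCKER_INFLUXDB_INIT_PASSWORD": "password",
    "DOCKER_INFLUXDB_INIT_ORG": "luna",
    "DOCKER_INFLUXDB_INIT_ADMIN_TOKEN": "kofqt4Pfqjn6o",
}
settings = influx_settings_from_env(container_env)
print(settings)
```

The username and password are deliberately not copied, because the service settings only need the bucket, organization and token.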

By default, the FaceStream and LUNA Streams settings contain different values for the "bucket", "organization" and "token" fields. To use monitoring for both services, set the same values in both. If necessary, you can save FaceStream and LUNA Streams data to different buckets (see below).

To separate FaceStream and LUNA Streams monitoring data, you can create separate buckets after launching the InfluxDB container. This can be done using one of the following methods:

Using the InfluxDB user interface (Explore tab > Create bucket).
Using the command influx bucket create -n <bucket_name> -o <organization_name> in the influx CLI.

The organization name must be the same as the one specified when creating the InfluxDB container.

The data sent to InfluxDB differs for LUNA Streams and FaceStream. See the relevant sections below for more details on the data sent.

FaceStream monitoring#

Enable monitoring#

To enable FaceStream monitoring, follow these steps:

Go to the Configurator user interface: http://<configurator_server_ip>:5070/.
Enter "FACE_STREAM_CONFIG" in the "Setting name" field and click "Apply Filters".
Enable the "send_data" setting in the "monitoring" section.
Set the "bucket", "organization" and "token" fields in the "monitoring" section to the values of the "DOCKER_INFLUXDB_INIT_BUCKET", "DOCKER_INFLUXDB_INIT_ORG" and "DOCKER_INFLUXDB_INIT_ADMIN_TOKEN" parameters that were used when launching the InfluxDB container.
Restart the FaceStream container: docker restart facestream.

Data being sent#

The following data is sent to InfluxDB:

"measurement" element. It is equal to the value of "fs-requests".
Tag set:
- "fs_ip" — IP address where FaceStream is deployed.
- "source" — The "name" field set when creating a stream in LUNA Streams (optional).
- "stream_id"
Field set:
- "track_id"
- "event_id"
- "request_id" — External ID for communication with monitoring of LUNA PLATFORM services.
- "track_start_time"
- "track_best_shot_time" — Time when the frame with the best shot being sent appeared in the system.
- "track_best_shot_min_size_time" (optional) — Time when the detection size reached the value specified in the "best_shot_min_size" parameter.
- "track_best_shot_proper_size_time" (optional) — Time when the detection size reached the value specified in the "best_shot_proper_size" parameter.
- "liveness_start_time" (optional) — Liveness start time.
- "liveness_end_time" (optional) — Liveness end time.
- "bestshot_count" — Number of best shots sent in one request to LP along with the current best shot. For example, if 2 sends of 10 best shots each were made, the value of this parameter is 10 and the value of the "track_send_count" parameter is 2.
- "time_from_first_frame_to_send" — Time that passed from the appearance of the first frame in FS to sending to LP.
- "track_send_count" — Sequence number of sending data from the track.
Fields containing time are sent as UTC with microsecond precision.
"timestamp" element. The time at which the best shot(s) were sent, in microseconds.

The frequency of sending data to InfluxDB is controlled by the "flashing_period" parameter of the FaceStream settings.

A single measurement may cover several best shots, because each send from a track counts as one measurement. To save this measurement, InfluxDB uses the data of the last best shot in the group. Data that is unique to each best shot (track_best_shot_time, liveness_start_time, liveness_end_time) is therefore lost for all best shots except the last one.

Optional fields that have no value are not sent to InfluxDB.
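
To make the structure of a measurement more concrete, the sketch below renders one "fs-requests" point in InfluxDB line protocol and drops optional values that are absent. It is an illustration only: how FaceStream actually serializes its data is not described here, the helper names are hypothetical, and the tag values, field types and timestamp are made-up sample values.

```python
# Illustrative sketch: the helper names and all sample values are assumptions.

def escape_key(value):
    """Escape commas, equals signs and spaces, as line protocol requires
    for tag keys, tag values and field keys."""
    text = str(value)
    for char in (",", "=", " "):
        text = text.replace(char, "\\" + char)
    return text


def format_field(value):
    """Render a field value: integers get an "i" suffix, strings are quoted."""
    if isinstance(value, int) and not isinstance(value, bool):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_us):
    """Build one line-protocol record, skipping optional values set to None."""
    tags = {k: v for k, v in tags.items() if v is not None}
    fields = {k: v for k, v in fields.items() if v is not None}
    if not fields:
        raise ValueError("a point needs at least one field")
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(k)}={format_field(v)}" for k, v in fields.items()
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_us}"


line = to_line(
    "fs-requests",
    tags={"fs_ip": "127.0.0.1", "source": None, "stream_id": "stream-1"},
    fields={"bestshot_count": 10, "track_send_count": 2, "liveness_start_time": None},
    timestamp_us=1700000000000000,  # microseconds; sample value
)
print(line)
```

The optional "source" tag and "liveness_start_time" field are set to None in this sample, so neither appears in the resulting line.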

During normal monitoring operation, no additional information is output to the FaceStream logs. If an error is detected during monitoring, the corresponding message will appear in the FaceStream logs.

LUNA Streams monitoring#

Data being sent to InfluxDB#

There are two types of events that are monitored:

Request (all requests)
Error (failed requests only)

Every event is a point in the time series. For the API service, the point is represented using the following data:

Series name (requests or errors)
Timestamp of the request start
Tags
Fields

For other services, the set of event types may differ. For example, the Handlers service also collects data on SDK usage, estimations, and licensing.

A tag is indexed data in storage. Tags are represented as a dictionary, where:

Keys — String tag names.
Values — String, integer or float.

A field is non-indexed data in storage. Fields are represented as a dictionary, where:

Keys — String field names.
Values — String, integer or float.

Requests series. Triggered on every request. Each point contains data about the corresponding request (execution time, etc.).

Tags

Tag name	Description
service	Always "luna-streams"
route	Concatenation of a request method and a request resource (POST:/streams)
status_code	HTTP status code of response

Fields

Field name	Description
request_id	Request ID
execution_time	Request execution time

Errors series. Triggered on every failed request. Each point contains the error_code of the LUNA error.

Tags

Tag name	Description
service	Always "luna-streams"
route	Concatenation of a request method and a request resource (POST:/streams)
status_code	HTTP status code of response
error_code	LUNA PLATFORM error code

Fields

Field name	Description
request_id	Request ID
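
The relationship between the two series can be sketched as follows: every request yields a "requests" point, and a failed request additionally yields an "errors" point. This is a simplified model rather than the service's implementation. The function name `build_points`, the dictionary layout and the sample values (request IDs, times, status and error codes) are assumptions made for illustration.

```python
# Simplified model of the points described above; names and values are illustrative.
SERVICE = "luna-streams"


def build_points(method, resource, status_code, request_id,
                 execution_time, start_time, error_code=None):
    """Return the points for one request: always a "requests" point,
    plus an "errors" point when an error code is present."""
    tags = {
        "service": SERVICE,
        "route": f"{method}:{resource}",  # request method + request resource
        "status_code": status_code,
    }
    points = [{
        "series": "requests",
        "time": start_time,  # timestamp of the request start
        "tags": dict(tags),
        "fields": {"request_id": request_id, "execution_time": execution_time},
    }]
    if error_code is not None:
        points.append({
            "series": "errors",
            "time": start_time,
            "tags": {**tags, "error_code": error_code},
            "fields": {"request_id": request_id},
        })
    return points


ok = build_points("GET", "/streams", 200, "req-1", 0.004, 1700000000)
failed = build_points("POST", "/streams", 401, "req-2", 0.002, 1700000001,
                      error_code=12010)
print([p["series"] for p in ok])
print([p["series"] for p in failed])
```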

Licensing series. Triggered at service start and every 60 seconds. Each point contains license verification data.

Tags

Tag name	Description
service	Always "luna-streams"
license_status	License status ("ok", "warning", "error", "exception")

Fields

Field name	Description
license_streams_limit_rate	Percentage of used streams
warnings	License warning messages
errors	License error messages

View InfluxDB data#

You can use the InfluxDB GUI to view monitoring data.

Go to the InfluxDB GUI at <server_ip>:<influx_port>. The default port is 8086. The default credentials are luna/password.
Select the "Explore" tab.
Select a way to display information in the drop-down list (graph, histogram, table, etc.).
Select a bucket at the bottom of the page.
Filter the necessary data.
Click "Submit".

Export metrics in Prometheus format#

The LUNA Streams service can collect and save metrics in Prometheus format as time series data that can be used to track the behavior of the service. Metrics can be integrated into the Prometheus monitoring system to track performance. See Prometheus official documentation for more information.

By default, metrics collection is disabled. It is enabled in the "LUNA_SERVICE_METRICS" section.

Note that all metric data is reset when the service is shut down.

Type of metrics#

Two types of metrics are available:

Counters, which increase with each event.
Cumulative histograms, which are used to measure the distribution of duration or size of events.

A cumulative histogram is a mapping that counts the cumulative number of observations in all of the bins up to the specified bin. See description in Wikipedia.

The following counter metrics are available:

request_count_total — Total number of requests
request_errors_total — Total number of errors

Each of them has at least two labels for sorting:

status_code (or error_code for error metrics)
path — Path consisting of a request method and an endpoint route.

Labels are key-value pairs, each consisting of a name and a value, that are assigned to metrics.

If necessary, you can add custom labels by specifying tag_name=tag_value pairs in the "extra_labels" parameter.

Note that the pair tag_name=tag_value will be added to each metric of the LUNA PLATFORM service.

A special manager distributes all requests passing through the service among the counters using these labels. This ensures that two successful requests sent to different endpoints, or to the same endpoint but with different status codes, are counted separately.

Unsuccessful requests are counted in both the request_count_total and request_errors_total metrics.
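
The way labels split requests across counters can be modelled with a few lines of Python. This is a conceptual sketch, not the service's code: the `record` function is hypothetical, each distinct label combination is simply a separate dictionary key, and the request mix is sample data.

```python
from collections import Counter

# Each distinct label combination is tracked as its own counter entry.
request_count_total = Counter()
request_errors_total = Counter()


def record(method, route, status_code, error_code=None):
    """Count one request; failed requests are counted in both metrics."""
    path = f"{method}:{route}"
    request_count_total[(path, status_code)] += 1
    if error_code is not None:
        request_errors_total[(path, error_code)] += 1


# Sample traffic (illustrative values).
record("GET", "/healthcheck", 200)
record("GET", "/docs/spec", 200)
record("GET", "/docs/spec", 200)
record("GET", "/docs/spec", 301)
record("POST", "/streams", 401, error_code=12010)

for (path, status_code), value in sorted(request_count_total.items()):
    print(f'request_count_total{{path="{path}",status_code="{status_code}"}} {float(value)}')
for (path, error_code), value in sorted(request_errors_total.items()):
    print(f'request_errors_total{{error_code="{error_code}",path="{path}"}} {float(value)}')
```

Requests to the same endpoint with status codes 200 and 301 end up in different entries, while the failed POST request increments one entry in each metric.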

The requests metric, a cumulative histogram, tracks the duration of requests to the service. The following buckets (upper bounds, in seconds) are defined for the histogram:

0.0001
0.00025
0.0005
0.001
0.0025
0.005
0.01
0.025
0.05
0.075
0.1
0.25
0.5
0.75
1.0
2.5
5.0
7.5
10.0
Inf

In this way the range of request times can be broken down into several intervals, ranging from very fast requests (0.0001 seconds) to very long requests (Inf - infinity). Histograms also have labels to categorize the data, such as status_code for the status of a request or route to indicate the route of a request.
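
The cumulative counting can be illustrated with a short sketch that uses the bucket bounds listed above. The function name `cumulative_histogram` and the two sample durations are assumptions. The comparison is inclusive, matching the "less than or equal" meaning of the Prometheus `le` label.

```python
import math

# Upper bounds of the buckets, in seconds, as listed above.
BUCKETS = [
    0.0001, 0.00025, 0.0005, 0.001, 0.0025, 0.005, 0.01, 0.025, 0.05, 0.075,
    0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0, math.inf,
]


def cumulative_histogram(durations):
    """Count, for every bucket, the observations less than or equal to its bound."""
    counts = {bound: 0 for bound in BUCKETS}
    for duration in durations:
        for bound in BUCKETS:
            if duration <= bound:
                counts[bound] += 1
    return counts


# Two sample request durations in seconds (illustrative values).
durations = [0.0008, 0.0024]
counts = cumulative_histogram(durations)
for bound in BUCKETS:
    print(f"le={bound}: {counts[bound]}")
```

Because the histogram is cumulative, a request that took 0.0008 seconds is counted in the 0.001 bucket and in every larger bucket, so the count for Inf always equals the total number of observations.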

Examples

If you send one request to the /healthcheck resource, followed by three requests to the /docs/spec resource, one of which is redirected (response status code 301), then a request to the /metrics resource returns the following response body:

# HELP request_count_total Counter of requests
# TYPE request_count_total counter
request_count_total{path="GET:/docs/spec",status_code="200"} 2.0
request_count_total{path="GET:/docs/spec",status_code="301"} 1.0
request_count_total{path="GET:/healthcheck",status_code="200"} 1.0

If you send one invalid POST request to the /streams resource, then a request to the /metrics resource returns the following response body:

# HELP request_count_total Counter of requests
# TYPE request_count_total counter
request_count_total{path="POST:/streams",status_code="401"} 1.0
# HELP request_errors_total Counter of request errors
# TYPE request_errors_total counter
request_errors_total{error_code="12010",path="POST:/streams"} 1.0
# HELP requests Histogram of request time metrics
# TYPE requests histogram
requests_sum{route="GET:/docs/spec",status_code="200"} 0.003174567842297907
requests_bucket{le="0.0001",route="GET:/docs/spec",status_code="200"} 0.0
requests_bucket{le="0.00025",route="GET:/docs/spec",status_code="200"} 0.0
requests_bucket{le="0.0005",route="GET:/docs/spec",status_code="200"} 0.0
requests_bucket{le="0.001",route="GET:/docs/spec",status_code="200"} 1.0
...
requests_count{route="GET:/docs/spec",status_code="200"} 2.0
requests_sum{route="GET:/docs/spec",status_code="301"} 0.002381476051209132
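
For quick checks without a Prometheus server, output of this kind can be read with a small parser. The sketch below is a simplified reader, not a complete implementation of the Prometheus exposition format: it does not handle escaped quotes inside label values or trailing timestamps, and the function name `parse_metrics` is hypothetical. The sample text is copied from the first example above.

```python
import re

# A sample line is a metric name, optional {labels}, and a numeric value.
SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE.match(line)
        if match is None:
            continue  # skip anything that is not a sample line
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


text = """\
# HELP request_count_total Counter of requests
# TYPE request_count_total counter
request_count_total{path="GET:/docs/spec",status_code="200"} 2.0
request_count_total{path="GET:/docs/spec",status_code="301"} 1.0
request_count_total{path="GET:/healthcheck",status_code="200"} 1.0
"""
samples = parse_metrics(text)
total = sum(value for name, _, value in samples if name == "request_count_total")
print(total)
```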

Configuring metrics collection for Prometheus#

Prometheus must be configured to collect LUNA PLATFORM metrics.

Example Prometheus configuration for collecting LP service metrics:

  - job_name: "luna-streams"
    static_configs:
      - targets: ["127.0.0.1:5160"]
  ...

  - job_name: "luna-configurator"
    static_configs:
      - targets: ["127.0.0.1:5070"]

See the official documentation for an example of running Prometheus.