Monitoring

Data for monitoring

We support two database options for collecting monitoring data: ClickHouse and InfluxDB. Depending on the database chosen, the structure and methodology for storing data vary.

Types of processed events

Our monitoring system processes the following event types:

  • request (any HTTP request)

  • error (failed HTTP request)

  • index processing (index building workflow)

  • indexed matching (indexed matching request)

Comparison of data formats for Clickhouse and InfluxDB:

InfluxDB: Each event is represented as a “point” in a time series. The structure of a point includes:

  • series name

  • event start time

  • tags: indexed data in storage; a dictionary whose keys are string tag names and whose values are strings, integers, or floats

  • fields: non-indexed data in storage; a dictionary whose keys are string field names and whose values are strings, integers, or floats

ClickHouse: In ClickHouse, the data structure resembles that of a traditional SQL table. Each event is represented as a record, where:

  • The `time` field contains the record’s creation timestamp;

  • The `data` field contains a JSON object with all the information that would otherwise be distributed across tags and fields in InfluxDB.

Important: In ClickHouse, there is no differentiation between “tags” and “fields”: all data is consolidated into a single JSON object within the `data` field.
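As an illustration (not the services' actual implementation), the same request event could be shaped for either backend like this; the tag and field names follow the series tables below, and the timestamps are made up:

```python
import json

# Hypothetical request event, split the way InfluxDB expects it.
tags = {"service": "lim-manager", "route": "GET:/tasks", "status_code": 200}
fields = {
    "request_id": "1536751345,6a5c2191-3e9b-f5a4-fc45-3abf43625c5f",
    "execution_time": 1.234,
}

# InfluxDB point: tags (indexed) and fields (non-indexed) stay separate.
influx_point = {
    "measurement": "requests",       # series name
    "time": "2018-09-12T12:02:25Z",  # event start time
    "tags": tags,
    "fields": fields,
}

# ClickHouse record: no tag/field split; everything is merged into a
# single JSON object stored in the `data` column next to the `time` column.
clickhouse_row = {
    "time": "2018-09-12 12:02:25",
    "data": json.dumps({**tags, **fields}),
}
```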

Every event is a point in one of the following series. For each series, the tags and fields below apply to InfluxDB; in ClickHouse the same keys are stored together in the `data` JSON object.

  • Requests series.

    Triggered on every HTTP request. Each point contains data about the corresponding request (execution time, etc.).

    InfluxDB:

    Requests series tags

    | tag name    | description                                                                      |
    |-------------|----------------------------------------------------------------------------------|
    | service     | “lim-indexer”, “lim-manager”, or “lim-matcher”                                   |
    | route       | concatenation of the request method and the request resource (e.g. GET:/version) |
    | status_code | HTTP status code of the response                                                 |

    Requests series fields

    | field name     | description            |
    |----------------|------------------------|
    | request_id     | request ID             |
    | execution_time | request execution time |

    ClickHouse JSON `data` field example:

    {
        "service": "lim-manager",
        "route": "GET:/tasks",
        "status_code": 201,
        "request_id": "1536751345,6a5c2191-3e9b-f5a4-fc45-3abf43625c5f",
        "execution_time": 1.234
    }
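As an illustration, per-route statistics can be derived from such records; the rows below are hypothetical samples in the ClickHouse layout shown above (a sketch, not a query the services provide):

```python
import json
from collections import defaultdict

# Hypothetical ClickHouse rows in the requests-series layout.
rows = [
    {"time": "2018-09-12 12:02:25",
     "data": json.dumps({"route": "GET:/tasks", "execution_time": 1.0})},
    {"time": "2018-09-12 12:02:26",
     "data": json.dumps({"route": "GET:/tasks", "execution_time": 2.0})},
    {"time": "2018-09-12 12:02:27",
     "data": json.dumps({"route": "GET:/version", "execution_time": 0.5})},
]

# Group execution times by route and average them.
by_route = defaultdict(list)
for row in rows:
    event = json.loads(row["data"])
    by_route[event["route"]].append(event["execution_time"])
avg_execution_time = {route: sum(t) / len(t) for route, t in by_route.items()}
```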
    
  • Errors series.

    Triggered on a failed request. Each point contains the error_code of the Luna error.

    InfluxDB:

    Errors series tags

    | tag name    | description                                                                      |
    |-------------|----------------------------------------------------------------------------------|
    | service     | “lim-indexer”, “lim-manager”, or “lim-matcher”                                   |
    | route       | concatenation of the request method and the request resource (e.g. GET:/version) |
    | status_code | HTTP status code of the response                                                 |
    | error_code  | Luna Platform error code                                                         |

    Errors series fields

    | field name | description |
    |------------|-------------|
    | request_id | request ID  |

    ClickHouse JSON `data` field example:

    {
        "service": "lim-manager",
        "route": "POST:/tasks",
        "status_code": 400,
        "error_code": 13037,
        "request_id": "1536751345,6a5c2191-3e9b-f5a4-fc45-3abf43625c5f"
    }
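For instance, failures can be tallied per error code from a batch of such events; the sample events are hypothetical, and the code 10001 is a placeholder rather than a real Luna error code:

```python
from collections import Counter

# Hypothetical errors-series events (the ClickHouse `data` objects).
errors = [
    {"service": "lim-manager", "route": "POST:/tasks",
     "status_code": 400, "error_code": 13037},
    {"service": "lim-manager", "route": "POST:/tasks",
     "status_code": 400, "error_code": 13037},
    {"service": "lim-manager", "route": "GET:/tasks",
     "status_code": 500, "error_code": 10001},
]

# Count occurrences of each error code and find the most frequent one.
errors_by_code = Counter(e["error_code"] for e in errors)
most_common_code, count = errors_by_code.most_common(1)[0]
```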
    
  • Index processing series.

    Triggered on an error in the index processing pipeline.

    InfluxDB:

    Index processing series tags

    | tag name       | description                                                    |
    |----------------|----------------------------------------------------------------|
    | service        | “lim-manager” or “lim-matcher”                                 |
    | socket_address | service address in the format <host>:<port> (for matcher only) |
    | stage          | “build_index”, “load_index”, or “drop_index”                   |
    | label          | index label (a unique index content ID)                        |
    | error_code     | Luna Platform error code (‘0’ means success)                   |

    Index processing series fields

    | field name | description                                                            |
    |------------|------------------------------------------------------------------------|
    | index_id   | index unique ID                                                        |
    | pending    | time spent in the pending queue, in seconds                            |
    | duration   | index processing (i.e. building / loading / dropping) time, in seconds |
    | generation | index generation (Unix timestamp)                                      |

    ClickHouse JSON `data` field example:

    {
        "service": "lim-manager",
        "stage": "build_index",
        "label": "4d1ae1f4-fbbd-49c8-be47-f6aec34449f3",
        "index_id": "e16e57de-3e15-4052-be9f-8e33f7629893",
        "error_code": 0,
        "generation": 1536751345,
        "pending": 1,
        "duration": 23456
    }
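Since `pending` and `duration` are both reported in seconds, the total wall time of an index operation can be derived by summing them; a minimal sketch with hypothetical events:

```python
# Hypothetical index-processing events; `pending` and `duration` are in
# seconds, as documented above.
events = [
    {"stage": "build_index", "error_code": 0, "pending": 1, "duration": 23456},
    {"stage": "load_index", "error_code": 0, "pending": 2, "duration": 15},
]

# Total wall time per stage: queue wait plus processing time.
total_seconds = {e["stage"]: e["pending"] + e["duration"] for e in events}
```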
    
  • Indexed matching series.

    Triggered when matching is performed.

    InfluxDB:

    Indexed matching series tags

    | tag name       | description                                  |
    |----------------|----------------------------------------------|
    | service        | always “lim-matcher”                         |
    | socket_address | service address in the format <host>:<port>  |
    | label          | index label (a unique index content ID)      |
    | error_code     | Luna Platform error code (‘0’ means success) |

    Indexed matching series fields

    | field name     | description                                 |
    |----------------|---------------------------------------------|
    | request_id     | request ID                                  |
    | index_id       | index unique ID                             |
    | execution_time | matching request execution time, in seconds |

    ClickHouse JSON `data` field example:

    {
        "service": "lim-matcher",
        "socket_address": "luna:5200",
        "label": "4d1ae1f4-fbbd-49c8-be47-f6aec34449f3",
        "index_id": "e16e57de-3e15-4052-be9f-8e33f7629893",
        "error_code": 0,
        "request_id": "1536751345,6a5c2191-3e9b-f5a4-fc45-3abf43625c5f",
        "execution_time": 0.123
    }
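A quick sketch of aggregating matching latency for one index label (the sample points are hypothetical; the label reuses the example above):

```python
# Hypothetical indexed-matching events; execution_time is in seconds.
points = [
    {"label": "4d1ae1f4-fbbd-49c8-be47-f6aec34449f3",
     "error_code": 0, "execution_time": 0.123},
    {"label": "4d1ae1f4-fbbd-49c8-be47-f6aec34449f3",
     "error_code": 0, "execution_time": 0.456},
]

# Only successful matches (error_code '0') are included in the stats.
times = [p["execution_time"] for p in points if p["error_code"] == 0]
mean_time = sum(times) / len(times)
max_time = max(times)
```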
    

TTL

You can configure the retention policy of base tables (for ClickHouse) and buckets (for InfluxDB) when running the monitoring migration via the -x monitoring-ttl parameter. See the Monitoring integration section in the integration manual. Note that monitoring-ttl is specified in days and does not affect aggregated data in any way; all aggregated data must be cleaned up manually. The default value is 30 days.
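The retention cutoff implied by a TTL in days can be computed like this (a sketch for illustration only; the databases apply their retention policies internally):

```python
from datetime import datetime, timedelta, timezone

# monitoring-ttl is specified in days; the default is 30.
ttl_days = 30
cutoff = datetime.now(timezone.utc) - timedelta(days=ttl_days)
# Points with a timestamp older than `cutoff` fall outside the retention
# window and are removed by the database's retention policy.
```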

Database

You can refer to the InfluxDB and ClickHouse documentation to compare the databases and choose the one that better fits your needs. Note that ClickHouse might be the better choice for aggregation. You can set up your database credentials in the configuration file, in the “monitoring” section.
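The exact key set depends on your installation, so consult the configuration reference for the real parameter names; a purely hypothetical “monitoring” section (every key name below is illustrative) might look like:

```json
{
    "monitoring": {
        "storage_type": "influxdb",
        "host": "127.0.0.1",
        "port": 8086,
        "database_name": "luna_monitoring",
        "user_name": "monitoring_user",
        "password": "secret"
    }
}
```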