Metrics
=======

The tool for tracking service metrics, available at the `/metrics` path.

Introduction
------------

The manager collects and stores metrics as time-series data, which can be used to track service behavior.

The `/metrics` route and the statistics collectors are disabled by default. To enable both, set the
`LUNA_SERVICE_METRICS.ENABLED` parameter to 1 via the `configurator `_ or the `configuration file `_.

Note that all metrics data is reset on service shutdown.

Metric types
------------

The following **counter** metrics are available by default:

- `request_count_total`
- `request_errors_total`

Each of them has at least two labels for sorting:

- `status_code` (or `error_code` for error metrics)
- `path` (which consists of the request method and the endpoint route)

.. note::

    To add extra labels, specify `label_name=label_value` pairs in the config file or
    `{"label_name": "label_value"}` in the configurator under `LUNA_SERVICE_METRICS.EXTRA_LABELS`.

    Note that every specified pair is added to each metric of the running app.

The manager distributes all requests passing through the service among the counters using these labels.
As a result, two successful requests sent to different endpoints, or to the same endpoint but completed with
different status codes, are counted in different metrics.

.. note::

    Failed requests are counted in both the `request_count_total` and the `request_errors_total` metrics.

The following **histogram** metric is available by default:

- `requests`, which tracks request durations

Request histogram buckets:

- 0.0001
- 0.00025
- 0.0005
- 0.001
- 0.0025
- 0.005
- 0.01
- 0.025
- 0.05
- 0.075
- 0.1
- 0.25
- 0.5
- 0.75
- 1.0
- 2.5
- 5.0
- 7.5
- 10.0
- Inf

The request histogram has two labels for sorting:

- `status_code`
- `route`

Metrics format
--------------

At the moment, the only available option for collecting and storing metrics (the
`LUNA_SERVICE_METRICS.METRICS_FORMAT` property) is the format of the Prometheus toolkit.
You might want to refer to the Prometheus `documentation `_, as it offers a wide range of core functions.

For example, the following query returns the average number of requests per second to `/healthcheck`
over the last 5 minutes:

.. code-block:: bash

    rate(request_count_total{path="GET:/healthcheck"}[5m])

Examples
--------

If you send one request to `/healthcheck`, followed by three requests to `/docs/spec`, one of which is redirected,
the `response.text` received from the `/metrics` route is as follows:

| ``# HELP request_count_total Counter of requests``
| ``# TYPE request_count_total counter``
| ``request_count_total{path="GET:/docs/spec",status_code="200"} 2.0``
| ``request_count_total{path="GET:/docs/spec",status_code="301"} 1.0``
| ``request_count_total{path="GET:/healthcheck",status_code="200"} 1.0``
| ``# HELP requests Histogram of request time metrics``
| ``# TYPE requests histogram``
| ``requests_sum{route="GET:/docs/spec",status_code="200"} 0.003174567842297907``
| ``requests_bucket{le="0.0001",route="GET:/docs/spec",status_code="200"} 0.0``
| ``requests_bucket{le="0.00025",route="GET:/docs/spec",status_code="200"} 0.0``
| ``requests_bucket{le="0.0005",route="GET:/docs/spec",status_code="200"} 0.0``
| ``requests_bucket{le="0.001",route="GET:/docs/spec",status_code="200"} 1.0``
| ``requests_bucket{le="0.0025",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.005",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.01",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.025",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.05",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.075",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.1",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.25",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.5",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="0.75",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="1.0",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="2.5",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="5.0",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="7.5",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="10.0",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_bucket{le="+Inf",route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_count{route="GET:/docs/spec",status_code="200"} 2.0``
| ``requests_sum{route="GET:/docs/spec",status_code="301"} 0.002381476051209132``
| ``requests_bucket{le="0.0001",route="GET:/docs/spec",status_code="301"} 0.0``
| ``requests_bucket{le="0.00025",route="GET:/docs/spec",status_code="301"} 0.0``
| ``requests_bucket{le="0.0005",route="GET:/docs/spec",status_code="301"} 0.0``
| ``requests_bucket{le="0.001",route="GET:/docs/spec",status_code="301"} 0.0``
| ``requests_bucket{le="0.0025",route="GET:/docs/spec",status_code="301"} 0.0``
| ``requests_bucket{le="0.005",route="GET:/docs/spec",status_code="301"} 0.0``
| ``requests_bucket{le="0.01",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.025",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.05",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.075",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.1",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.25",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.5",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="0.75",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="1.0",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="2.5",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="5.0",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="7.5",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="10.0",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_bucket{le="+Inf",route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_count{route="GET:/docs/spec",status_code="301"} 1.0``
| ``requests_sum{route="GET:/healthcheck",status_code="200"} 0.005269583023618907``
| ``requests_bucket{le="0.0001",route="GET:/healthcheck",status_code="200"} 0.0``
| ``requests_bucket{le="0.00025",route="GET:/healthcheck",status_code="200"} 0.0``
| ``requests_bucket{le="0.0005",route="GET:/healthcheck",status_code="200"} 0.0``
| ``requests_bucket{le="0.001",route="GET:/healthcheck",status_code="200"} 0.0``
| ``requests_bucket{le="0.0025",route="GET:/healthcheck",status_code="200"} 0.0``
| ``requests_bucket{le="0.005",route="GET:/healthcheck",status_code="200"} 0.0``
| ``requests_bucket{le="0.01",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.025",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.05",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.075",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.1",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.25",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.5",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="0.75",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="1.0",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="2.5",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="5.0",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="7.5",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="10.0",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_bucket{le="+Inf",route="GET:/healthcheck",status_code="200"} 1.0``
| ``requests_count{route="GET:/healthcheck",status_code="200"} 1.0``

If you send one invalid POST request to the `/handlers` route, the output is as follows:

| ``# HELP request_count_total Counter of requests``
| ``# TYPE request_count_total counter``
| ``request_count_total{path="POST:/handlers",status_code="401"} 1.0``
| ``# HELP request_errors_total Counter of request errors``
| ``# TYPE request_errors_total counter``
| ``request_errors_total{error_code="12010",path="POST:/handlers"} 1.0``
| ``# HELP requests Histogram of request time metrics``
| ``# TYPE requests histogram``
| ``requests_sum{route="POST:/handlers",status_code="401"} 0.004234567845298518``
| ``requests_bucket{le="0.0001",route="POST:/handlers",status_code="401"} 0.0``
| ``requests_bucket{le="0.00025",route="POST:/handlers",status_code="401"} 0.0``
| ``requests_bucket{le="0.0005",route="POST:/handlers",status_code="401"} 0.0``
| ``requests_bucket{le="0.001",route="POST:/handlers",status_code="401"} 0.0``
| ``requests_bucket{le="0.0025",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.005",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.01",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.025",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.05",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.075",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.1",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.25",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.5",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="0.75",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="1.0",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="2.5",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="5.0",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="7.5",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="10.0",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_bucket{le="+Inf",route="POST:/handlers",status_code="401"} 1.0``
| ``requests_count{route="POST:/handlers",status_code="401"} 1.0``
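
The text returned by the `/metrics` route can also be post-processed on the client side. The following is a
minimal sketch, using only the Python standard library, that sums `request_count_total` over all status codes
for each `path`. The helper names (`parse_samples`, `requests_per_path`) and the `SAMPLE` string are
illustrative and not part of the service. The parser is deliberately simplified: it assumes that label values
contain no escaped quotes.

.. code-block:: python

    import re
    from collections import defaultdict

    # One sample line of the text output, e.g.
    # request_count_total{path="GET:/healthcheck",status_code="200"} 1.0
    SAMPLE_RE = re.compile(
        r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
    )
    LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


    def parse_samples(text):
        """Yield (name, labels, value) for each sample line, skipping comments."""
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # blank line or "# HELP" / "# TYPE" comment
            match = SAMPLE_RE.match(line)
            if match is None:
                continue  # not a sample line this simplified parser understands
            labels = dict(LABEL_RE.findall(match.group("labels") or ""))
            yield match.group("name"), labels, float(match.group("value"))


    def requests_per_path(text):
        """Sum request_count_total over status codes for each path label."""
        totals = defaultdict(float)
        for name, labels, value in parse_samples(text):
            if name == "request_count_total":
                totals[labels["path"]] += value
        return dict(totals)


    # Hypothetical input, copied from the counters of the first example above.
    SAMPLE = """\
    # HELP request_count_total Counter of requests
    # TYPE request_count_total counter
    request_count_total{path="GET:/docs/spec",status_code="200"} 2.0
    request_count_total{path="GET:/docs/spec",status_code="301"} 1.0
    request_count_total{path="GET:/healthcheck",status_code="200"} 1.0
    """

    print(requests_per_path(SAMPLE))

In a real setup, the `SAMPLE` string would be replaced with the `response.text` received from the `/metrics`
route. For anything beyond a quick check, a maintained parser for the Prometheus text format is likely a
better choice than this regular expression.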