Lambda monitoring
Data for monitoring
Two types of events are currently monitored: request and error. The first type covers all requests; the second covers failed requests only.
Comparison of data formats for Clickhouse and InfluxDB:
InfluxDB: Each event is presented as a “point” in a time series. The structure of a point includes:
series name
event start time
tags: indexed data in storage; a dictionary whose keys are string tag names and whose values are strings, integers, or floats
fields: non-indexed data in storage; a dictionary whose keys are string field names and whose values are strings, integers, or floats
Clickhouse: In Clickhouse, the data structure resembles that of a traditional SQL table. Each event is represented as a record, where:
The `time` field contains the record’s creation timestamp;
The `data` field contains a JSON object with all the information that would otherwise be distributed across tags and fields in InfluxDB.
Important: In Clickhouse, there is no differentiation between “tags” and “fields”—all data is consolidated into a single JSON object within the data field.
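The relationship between the two formats can be sketched in a few lines of Python. This is only an illustration of the description above, not the actual storage code: the `InfluxPoint` class, the `to_clickhouse_record` helper, the series name `requests`, and the ISO 8601 rendering of `time` are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class InfluxPoint:
    """Hypothetical in-memory view of one InfluxDB point."""

    series: str
    time: datetime
    tags: dict = field(default_factory=dict)    # indexed in InfluxDB
    fields: dict = field(default_factory=dict)  # not indexed in InfluxDB


def to_clickhouse_record(point: InfluxPoint) -> dict:
    """Merge tags and fields into one JSON object stored under `data`."""
    return {
        "time": point.time.isoformat(),  # assumed timestamp format
        "data": json.dumps({**point.tags, **point.fields}),
    }


point = InfluxPoint(
    series="requests",  # hypothetical series name
    time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    tags={"service": "lambda-<lambda-id>", "route": "POST:/main", "status_code": 200},
    fields={"execution_time": 123.45},
)
record = to_clickhouse_record(point)
print(record["data"])
```

After the merge, nothing in the record marks which keys were tags and which were fields.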
Monitoring series
The structure and the meaning of each monitoring series remain consistent. However, for Clickhouse, data from tags and fields are merged into a single JSON object under the data field. Below are examples for each series:
Requests series.
Triggered on every request. Each point contains data about the corresponding request (execution time, etc.).
InfluxDB:
tags (tag name - description):
service - always “lambda-<lambda-id>”
route - concatenation of the request method and the request resource (e.g. POST:/main)
status_code - HTTP status code of the response
fields (field name - description):
request_id - request id
execution_time - request execution time
ClickHouse JSON `data` field Example:
{ "service": "lambda-<lambda-id>", "route": "POST:/main", "status_code": 200, "request_id": "1536751345,6a5c2191-3e9b-f5a4-fc45-3abf43625c5f", "execution_time": 123.45 }
Errors series.
Triggered on every failed request. Each point contains the error_code of the Luna error.
InfluxDB:
tags (tag name - description):
service - always “lambda-<lambda-id>”
route - concatenation of the request method and the request resource (e.g. POST:/main)
status_code - HTTP status code of the response
error_code - Luna error code
fields (field name - description):
request_id - request id
ClickHouse JSON `data` field Example:
{ "service": "lambda-<lambda-id>", "route": "POST:/main", "status_code": 400, "error_code": 13037, "request_id": "1536751345,6a5c2191-3e9b-f5a4-fc45-3abf43625c5f" }
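Because every ClickHouse record carries a self-describing JSON object, rows from both series can be post-processed with ordinary JSON tooling. The sketch below uses made-up sample rows shaped like the examples above to count failures per error code and derive a failure rate. The `failure_summary` helper and the sample values are illustrative assumptions, and fetching the rows from ClickHouse is out of scope here.

```python
import json
from collections import Counter

# Sample `data` values, shaped like the examples above (values are made up).
request_rows = [
    '{"service": "lambda-<lambda-id>", "route": "POST:/main", "status_code": 200, "request_id": "r1", "execution_time": 123.45}',
    '{"service": "lambda-<lambda-id>", "route": "POST:/main", "status_code": 400, "request_id": "r2", "execution_time": 10.0}',
]
error_rows = [
    '{"service": "lambda-<lambda-id>", "route": "POST:/main", "status_code": 400, "error_code": 13037, "request_id": "r2"}',
]


def failure_summary(request_rows: list[str], error_rows: list[str]) -> dict:
    """Count errors per error_code and compute the share of failed requests."""
    errors = [json.loads(row) for row in error_rows]
    total = len(request_rows)
    return {
        "total_requests": total,
        "errors_by_code": dict(Counter(error["error_code"] for error in errors)),
        # Assumes one error point per failed request, as described above.
        "failure_rate": len(errors) / total if total else 0.0,
    }


summary = failure_summary(request_rows, error_rows)
print(summary)
```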
Custom Monitoring
For basic monitoring usage information, see monitoring.
It is possible to create custom monitoring points, for example to record how long it took to download or process images.
You can specify your own series, tags and fields, but the mandatory “service” tag with the value “lambda-<lambda-id>” is always present.
To add a custom monitoring point, follow these steps:
Add a monitoring_points.py file to the lambda archive with the following content:
monitoring_points.py

    from luna_lambda_tools.public.monitoring import CustomMonitoringPoint


    class TestMonitoringPoint(CustomMonitoringPoint):
        """Test monitoring point"""

        series = "test_monitoring"

There are several rules for this file:
Any number of points may be defined.
Each point class may have any unique name.
Each point class must inherit from CustomMonitoringPoint.
Each point class must contain a series attribute that specifies the monitoring series.
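As an illustration of these rules, a file with two points for the image download and processing case mentioned earlier might look like the sketch below. The class names and series names are hypothetical. The `try`/`except` fallback is only there so the sketch can be run outside a lambda environment; in a real lambda archive the plain import is enough.

```python
try:
    from luna_lambda_tools.public.monitoring import CustomMonitoringPoint
except ImportError:

    class CustomMonitoringPoint:
        """Stand-in base class, used only when luna_lambda_tools is not installed."""

        series: str = ""

        def __init__(self, pointTags=None, pointFields=None):
            self.pointTags = pointTags or {}
            self.pointFields = pointFields or {}


class ImageDownloadPoint(CustomMonitoringPoint):
    """Time spent downloading images (hypothetical point)"""

    series = "image_download"


class ImageProcessingPoint(CustomMonitoringPoint):
    """Time spent processing images (hypothetical point)"""

    series = "image_processing"
```

Each class has a unique name, inherits from CustomMonitoringPoint, and declares its own series, so both points satisfy the rules listed above.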
Import the monitoring points from monitoring_points.py into lambda_main.py and specify tags and fields:
There are several general rules for this file:
Enable monitoring. See the INFLUX_MONITORING setting in basic principles of configuration.
If monitoring is unavailable, the points are not sent and no error is raised.
Specify tags by passing a dictionary of tags as the pointTags named argument.
Specify fields by passing a dictionary of fields as the pointFields named argument.
Warning
There are differences between the monitoring mechanisms of standalone, handlers and tasks lambdas. See the description below.
Send points using the request.sendToMonitoring function for standalone or handlers lambdas:
lambda_main.py

    import asyncio
    from time import time

    from luna_lambda_tools import StandaloneLambdaRequest
    from monitoring_points import TestMonitoringPoint


    async def main(request: StandaloneLambdaRequest) -> dict:
        # start execution time
        stt = time()
        # do some logic
        await asyncio.sleep(1)
        # send monitoring point with execution time
        request.sendToMonitoring(
            (TestMonitoringPoint(pointTags={"lambda_type": "standalone"}, pointFields={"execution_time": time() - stt}),)
        )
        return {"result": "lambda result"}

request example

    from luna3.luna_lambda.luna_lambda import LambdaApi

    SERVER_ORIGIN = "http://lambda_address:lambda_port"  # Replace by your values before start
    SERVER_API_VERSION = 1
    lambdaApi = LambdaApi(origin=SERVER_ORIGIN, api=SERVER_API_VERSION)
    lambdaId, accountId = "your_lambda_id", "your_account_id"  # Replace by your values before start


    def makeRequest():
        reply = lambdaApi.proxyLambdaPost(lambdaId=lambdaId, path="main", accountId=accountId)
        return reply


    if __name__ == "__main__":
        response = makeRequest()
        print(response.json)

Send points using self.sendToMonitoring function for tasks lambda:
To run the lambda tasks example, refer to the task processing description here.
lambda_main.py

    import asyncio
    from time import time

    from luna_lambda_tools.public.tasks import BaseLambdaTask
    from monitoring_points import TestMonitoringPoint


    class LambdaTask(BaseLambdaTask):
        """Lambda task"""

        async def splitTasksContent(self, content: dict) -> list[dict]:
            """Split task content to sub task contents"""
            stt = time()
            # do some logic
            await asyncio.sleep(0.5)
            self.sendToMonitoring(
                (
                    TestMonitoringPoint(
                        pointTags={"lambda_type": "tasks", "target": "split_content"},
                        pointFields={"execution_time": time() - stt},
                    ),
                )
            )
            return [content]

        async def executeSubtask(self, subtaskContent: dict) -> dict | list:
            """Execute current sub task processing"""
            stt = time()
            # do some logic
            await asyncio.sleep(0.5)
            self.sendToMonitoring(
                (
                    TestMonitoringPoint(
                        pointTags={"lambda_type": "tasks", "target": "execute_subtask"},
                        pointFields={"execution_time": time() - stt},
                    ),
                )
            )
            return {"result": "Some lambda-tasks result"}

request example

    from luna3.tasks.tasks import TasksApi

    TASKS_ORIGIN = "http://tasks_address:tasks_port"  # Replace by your values before start
    TASKS_API_VERSION = 2
    tasksApi = TasksApi(origin=TASKS_ORIGIN, api=TASKS_API_VERSION)
    lambdaId, accountId = "your_lambda_id", "your_account_id"  # Replace by your values before start


    def makeRequest():
        reply = tasksApi.taskLambda(content={"lambda_id": lambdaId}, accountId=accountId)
        return reply


    if __name__ == "__main__":
        response = makeRequest()
        print(response.json)
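Both examples above repeat the same `stt = time()` / `time() - stt` pattern. One way to factor it out is a small context manager that fills in the fields dictionary when the block exits. The helper name `timedFields` is hypothetical and not part of luna_lambda_tools, and the sketch uses `perf_counter` instead of `time` because it is monotonic.

```python
from contextlib import contextmanager
from time import perf_counter, sleep


@contextmanager
def timedFields(fieldName: str = "execution_time"):
    """Yield a dict that receives the elapsed seconds when the block exits."""
    fields: dict = {}
    start = perf_counter()
    try:
        yield fields
    finally:
        # Recorded even if the block raises, so a failed step is still timed.
        fields[fieldName] = perf_counter() - start


with timedFields() as fields:
    sleep(0.01)  # stands in for the real work

# `fields` now holds the elapsed time under "execution_time".
print(sorted(fields))
```

The resulting dictionary can be passed as the pointFields argument of a monitoring point. The same `with` block also works inside an `async def` function with `await` calls in its body.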
Database
You can refer to the documentation for the InfluxDB and ClickHouse databases to compare them and choose the one that better fits your needs. Note that ClickHouse might be the better choice for aggregation. You can set up your database credentials in the “monitoring” section of the configuration file.