Collections
To create a collection and assign it a maintainer that performs the initial filling and automatic synchronization, use the POST /collections entrypoint.
To get information about all built collections, use the GET /collections entrypoint.
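A minimal sketch of how the two entrypoints might be addressed from Python's standard library. The base URL, the helper names, and the request body layout (modelled on the `parameters` object of the collection metadata) are assumptions made for illustration; the API specification of your LVSM version is the authoritative source. The requests are only prepared here, not sent.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the address of your LVSM deployment.
BASE_URL = "http://localhost:8000"


def build_create_request(name: str, parameters: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST /collections request.

    The body layout is an assumption modelled on the collection metadata.
    """
    body = json.dumps({"collection_name": name, "parameters": parameters})
    return urllib.request.Request(
        f"{BASE_URL}/collections",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_list_request() -> urllib.request.Request:
    """Prepare (but do not send) a GET /collections request."""
    return urllib.request.Request(f"{BASE_URL}/collections", method="GET")


create_req = build_create_request(
    "my-shiny-collection",
    {
        "object_type": "events",
        "descriptors": [{"descriptor_type": "face", "descriptor_version": 65}],
        "payload": ["event_id", "handler_id"],
    },
)
list_req = build_list_request()
# To actually send a request: urllib.request.urlopen(create_req)
```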
Data
Each record in the collection is a triple of a Luna Platform entity ID, descriptors, and a payload.
LVSM sets up collections to store descriptors as vectors of the uint8 datatype. Such vectors are stored in a more compact format, which can save memory and improve search speed.
LVSM allows multiple descriptors per record as long as each descriptor has a different type or version. For example, one collection can store both face descriptors and body descriptors for a single Luna Platform entity. Multivector storage parameters are derived from the descriptor configs specified by the user on POST /collections:
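To give a rough sense of the memory saving, the sketch below compares the raw size of one vector stored as uint8 with the same vector stored as 32-bit floats. Both the vector length and the choice of 32-bit floats as the baseline are assumptions for illustration; the source does not state what uint8 storage is being compared against.

```python
from array import array

# Hypothetical descriptor length; the real length depends on the descriptor version.
DIMENSION = 512

as_uint8 = array("B", [0] * DIMENSION)      # one byte per component
as_float32 = array("f", [0.0] * DIMENSION)  # four bytes per component on common platforms

uint8_bytes = as_uint8.itemsize * len(as_uint8)
float32_bytes = as_float32.itemsize * len(as_float32)
print(uint8_bytes, float32_bytes)
```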
"descriptors": [
    {
        "descriptor_type": "face",
        "descriptor_version": 65
    },
    {
        "descriptor_type": "body",
        "descriptor_version": 105
    }
],
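The rule that every descriptor in a record must differ in type or version can be checked before a collection is created. The function below is a hypothetical client-side helper, not part of LVSM; it only illustrates the constraint.

```python
def validate_descriptors(descriptors: list[dict]) -> None:
    """Raise ValueError if two configs share both type and version."""
    seen = set()
    for config in descriptors:
        key = (config["descriptor_type"], config["descriptor_version"])
        if key in seen:
            raise ValueError(f"duplicate descriptor config: {key}")
        seen.add(key)


# Passes silently: the two configs differ in type.
validate_descriptors([
    {"descriptor_type": "face", "descriptor_version": 65},
    {"descriptor_type": "body", "descriptor_version": 105},
])
```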
The payload is a JSON object containing the entity’s common Luna Platform supported targets that were specified when the collection was created.
Metadata
In addition to the collections themselves, the database stores metadata about all built collections. The search plug-in uses collection metadata to decide which collection can be searched for the current request, that is, which one is suitable by candidate type, by candidate filters, and by relevance (how long ago it was synchronized). Metadata for a new collection is set after its initial filling, at which point the collection becomes visible to the search plug-in. When a collection synchronization is committed, the collection metadata is updated.
The metadata for each collection is structured as in the following example:
{
    "collection_name": "my-shiny-collection",
    "parameters": {
        "object_type": "events",
        "descriptors": [{"descriptor_type": "face", "descriptor_version": 65}],
        "payload": ["event_id", "handler_id"],
        "conditions": {"account_id": "00000000-0000-4000-a000-000000311001"}
    },
    "max_link_key": 123456789,
    "max_unlink_key": 123,
    "sync_time": "2025-09-15T00:00:00Z",
    "description": "optional collection description"
}
where
collection_name - collection name;
parameters - collection parameters, defined by the user on POST /collections;
max_link_key - maximum ID of entities added (source database sync technical details, provided for reference only);
max_unlink_key - maximum ID of entities removed (source database sync technical details, provided for reference only);
sync_time - last collection synchronization time;
description - optional collection description, may be undefined.
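How a consumer such as the search plug-in might use these fields can be sketched as follows. The `is_suitable` function and its arguments are hypothetical; the actual selection logic of the plug-in is not documented here, and matching of candidate filters is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def is_suitable(metadata: dict, object_type: str, descriptor_type: str,
                max_age: timedelta, now: datetime) -> bool:
    """Hypothetical check: right candidate type, right descriptor, fresh enough."""
    params = metadata["parameters"]
    if params["object_type"] != object_type:
        return False
    if not any(d["descriptor_type"] == descriptor_type for d in params["descriptors"]):
        return False
    # "Z" is replaced explicitly because datetime.fromisoformat
    # only accepts it starting with Python 3.11.
    synced = datetime.fromisoformat(metadata["sync_time"].replace("Z", "+00:00"))
    return now - synced <= max_age


metadata = {
    "collection_name": "my-shiny-collection",
    "parameters": {
        "object_type": "events",
        "descriptors": [{"descriptor_type": "face", "descriptor_version": 65}],
    },
    "sync_time": "2025-09-15T00:00:00Z",
}
now = datetime(2025, 9, 15, 0, 0, 30, tzinfo=timezone.utc)
fresh = is_suitable(metadata, "events", "face", timedelta(minutes=1), now)
```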
Synchronization
After the collection is initially filled, it starts synchronizing with a configured period (1 second by default; see the extended administrator configuration). For each collection, the synchronization cycle consists of the following steps:
fetching entities from the source database that match the collection parameters within the time range since the last synchronization (minus some gap), and adding them to the collection;
fetching the IDs of entities deleted from the source database within the time range since the last synchronization (minus some gap), and removing items with those IDs from the collection;
updating the collection metadata with the new IDs and the time of the last synchronization.
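The three steps above can be sketched with in-memory stand-ins for the source database and the collection. The `GAP` value, the tuple layouts, and the `sync_cycle` function are assumptions for illustration only; they do not describe the LVSM implementation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical safety overlap; the text only says "minus some gap".
GAP = timedelta(seconds=5)


def sync_cycle(collection: dict, metadata: dict, added: list, deleted: list,
               now: datetime) -> None:
    """One synchronization cycle over in-memory stand-ins for the source database.

    `added` holds (timestamp, key, entity_id, record) tuples and
    `deleted` holds (timestamp, key, entity_id) tuples.
    """
    since = metadata["sync_time"] - GAP

    # Step 1: add entities created since the last synchronization.
    for ts, key, entity_id, record in added:
        if ts >= since:
            collection[entity_id] = record
            metadata["max_link_key"] = max(metadata["max_link_key"], key)

    # Step 2: remove entities deleted since the last synchronization.
    for ts, key, entity_id in deleted:
        if ts >= since:
            collection.pop(entity_id, None)
            metadata["max_unlink_key"] = max(metadata["max_unlink_key"], key)

    # Step 3: record the new synchronization time.
    metadata["sync_time"] = now


t0 = datetime(2025, 9, 15, tzinfo=timezone.utc)
collection = {"a": {"event_id": "a"}}
metadata = {"max_link_key": 1, "max_unlink_key": 0, "sync_time": t0}
added = [(t0 + timedelta(seconds=1), 2, "b", {"event_id": "b"})]
deleted = [(t0 + timedelta(seconds=2), 1, "a")]
sync_cycle(collection, metadata, added, deleted, now=t0 + timedelta(seconds=3))
```

Because the gap makes consecutive time ranges overlap, the same entity can be seen twice. In this sketch both the insert and the removal are idempotent, so reprocessing an entity leaves the collection unchanged.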
The LVSM server tries to synchronize each collection after the specified period, but this is not always possible: the number of workers is limited (100 by default), so only a certain number of collections can be synchronized at the same time, and the others wait for a worker to be freed. If initial filling tasks are being processed at that moment, the wait can be quite long. If a collection synchronization task fails, it is queued again. After a failure, the synchronization process is reset and restarts from the previous sync state, since the commit is performed only when a synchronization cycle completes. This also means that if the initial filling is interrupted, it starts from the very beginning on the next run.
Information about the collection synchronization process can be found in the service logs. The collection synchronization status (timestamp of the last synchronization) can also be obtained from the GET /collections entrypoint.
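The combination of a bounded worker pool and re-queueing on failure can be illustrated with a small asyncio sketch. The names `run_all` and `flaky_sync` are hypothetical, and the pool is kept at two workers for readability; this is one possible way to model the behaviour described above, not the scheduler LVSM actually uses.

```python
import asyncio

MAX_WORKERS = 2  # the text cites 100 by default; kept small here for illustration


async def run_all(names, sync_once, max_workers=MAX_WORKERS):
    """Synchronize every collection, re-queueing any whose task fails."""
    queue = asyncio.Queue()
    for name in names:
        queue.put_nowait(name)

    async def worker():
        while True:
            name = await queue.get()
            try:
                await sync_once(name)
            except Exception:
                # Nothing was committed, so the retry starts from the previous state.
                queue.put_nowait(name)
            finally:
                queue.task_done()

    workers = [asyncio.create_task(worker()) for _ in range(max_workers)]
    await queue.join()  # returns once every queued item, retries included, is done
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)


attempts = {}


async def flaky_sync(name):
    """Hypothetical sync task that fails on its first attempt for collection 'b'."""
    attempts[name] = attempts.get(name, 0) + 1
    if name == "b" and attempts[name] == 1:
        raise RuntimeError("simulated failure")


asyncio.run(run_all(["a", "b", "c"], flaky_sync))
```

With three collections and two workers, one collection always waits for a free worker, and the failed task for `b` is picked up again after being re-queued.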
Note
The EVENT_DELETION_LOG option must be enabled in the Luna Events service for events deleted from the source database to be synced to the collection.