Introduction

Luna Vector Search Module (LVSM) creates vector database collections from a user-defined set of Luna Platform descriptors. The following vector databases are supported:

Qdrant

What is a descriptor?

A descriptor (or embedding) is a feature vector that a neural network extracts from media data containing a human face, a body, or a scene.

What is a collection?

A search collection is a set of Luna Platform entities (events, faces, etc.) bounded by filters that Luna Platform supports. Each record in the collection is a triple of entity ID, descriptor, and payload. The payload contains the entity's Luna Platform supported targets, that is, the fields the user wants returned with search results.

LVSM supports the following Luna Platform entities for collection and search:

events
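
The record layout described above (entity ID, descriptor, payload) can be pictured with a minimal sketch. This is an illustration only, not the LVSM data model: the class `CollectionRecord`, the function `build_collection`, and the field names `event_id`, `source`, and `age` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CollectionRecord:
    """One collection record: entity ID, descriptor, payload (hypothetical layout)."""
    entity_id: str
    descriptor: tuple[float, ...]
    payload: dict[str, Any] = field(default_factory=dict)


def build_collection(entities, entity_filter, targets):
    """Keep entities that pass the filter; copy only the requested targets into the payload."""
    return [
        CollectionRecord(
            entity_id=entity["event_id"],
            descriptor=tuple(entity["descriptor"]),
            payload={t: entity[t] for t in targets if t in entity},
        )
        for entity in entities
        if entity_filter(entity)
    ]


# Hypothetical events; real Luna Platform events carry different fields.
events = [
    {"event_id": "e1", "descriptor": [0.1, 0.9], "source": "cam-1", "age": 31},
    {"event_id": "e2", "descriptor": [0.8, 0.2], "source": "cam-2", "age": 45},
]
collection = build_collection(
    events,
    entity_filter=lambda e: e["source"] == "cam-1",  # the filter bounds the collection
    targets=["source", "age"],                       # the targets end up in the payload
)
```

Here the filter decides which entities enter the collection, and the targets decide which of their fields are stored alongside the descriptor.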

What is search?

Search compares a given descriptor with a batch of descriptors and produces a similarity score for each, in order to find the most similar ones within a user-provided set of descriptors.
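
As a rough illustration of this definition, the sketch below scores a query descriptor against a small batch and keeps the best matches. It assumes cosine similarity as the score, which this page does not specify, and it performs an exhaustive scan; a vector database would normally use an index instead. The names `cosine_similarity` and `search` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (assumed metric, not confirmed by LVSM)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, candidates, limit=3):
    """Score every candidate against the query and return the `limit` most similar."""
    scored = [
        (entity_id, cosine_similarity(query, descriptor))
        for entity_id, descriptor in candidates.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]


# Toy 2-dimensional descriptors; real descriptors have far more dimensions.
candidates = {"e1": [1.0, 0.0], "e2": [0.0, 1.0], "e3": [0.6, 0.8]}
top = search([1.0, 0.0], candidates, limit=2)
```

The result is a list of `(entity ID, score)` pairs ordered from most to least similar.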

Why LVSM?

Luna Platform has built-in search based on persistent storage (Postgres), which is flexible but not always fast enough.

Luna Platform also has built-in search based on the Cached Matcher (Matcherlib), which is fast but not flexible: it is suitable only for search within face lists and supports no additional filters.

Luna Platform also has built-in search based on the Python Matcher Proxy plug-in system and a set of built-in plug-ins. One example is the Luna Indexed Module plug-in, which implements highly efficient approximate search, but it is also not flexible enough and currently supports only face lists.

LVSM, in turn, strikes a balance between search performance and flexibility. Its underlying principle is the separation of storage and computation: Luna Platform keeps its persistent storage (Postgres), while a vector database serves as storage specialized for search. LVSM is the adapter between Luna Platform and these ready-made solutions for high-performance vector search. It lets users create synchronized collections in the vector database, customized with user-defined filters, for fast and efficient search.

LVSM key features:

separation of Luna Platform storage and computation, with support for linear scaling of computation
flexible, custom, auto-syncing collections bounded by Luna Platform supported filters and targets
vector database integration that takes advantage of its benefits: the ability to handle large-scale collections with billions of records, efficient storage and indexing of high-dimensional data, and high-performance vector search
integration with the Python Matcher Proxy plug-in system

How does it run?

LVSM consists of the following components:

server - processes collection creation tasks and synchronizes the built collections so that they stay up to date with Luna Platform;
search plug-in - performs search on the built collections and is integrated into Luna Platform via the Python Matcher Proxy plug-in system.
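
This page does not describe how the server keeps collections up to date. Purely as an illustration of what "synchronizing" a collection means, the sketch below diffs the entities that currently match a collection's filters against the records already stored, then applies the difference. The functions `plan_sync` and `apply_sync` and the in-memory dictionaries are hypothetical stand-ins, not LVSM internals.

```python
def plan_sync(source, collection):
    """Return (to_upsert, to_delete) that would bring `collection` in line with `source`.

    Both arguments map entity ID -> record, where a record is (descriptor, payload).
    """
    # New or changed records must be written to the collection.
    to_upsert = {
        entity_id: record
        for entity_id, record in source.items()
        if collection.get(entity_id) != record
    }
    # Records whose entity no longer matches the filters must be removed.
    to_delete = [entity_id for entity_id in collection if entity_id not in source]
    return to_upsert, to_delete


def apply_sync(collection, to_upsert, to_delete):
    """Apply a sync plan to the in-memory collection."""
    for entity_id in to_delete:
        del collection[entity_id]
    collection.update(to_upsert)


# Hypothetical state: "e2" left the source, "e3" is new, "e1" is unchanged.
source = {
    "e1": ([0.1, 0.9], {"source": "cam-1"}),
    "e3": ([0.5, 0.5], {"source": "cam-1"}),
}
stored = {
    "e1": ([0.1, 0.9], {"source": "cam-1"}),
    "e2": ([0.8, 0.2], {"source": "cam-1"}),
}
to_upsert, to_delete = plan_sync(source, stored)
apply_sync(stored, to_upsert, to_delete)
```

After the pass, `stored` holds exactly the records present in `source`; unchanged records are not rewritten.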

Here are the steps for a quick start:

create a collection (the server will take care of it; see collections for details)
enable the LVSM plug-in in the Python Matcher Proxy plug-in system (see search plug-in for details)
send a matching request; the LVSM plug-in will handle it if possible, i.e. perform the search on the vector database and return the result to Python Matcher Proxy (see search plug-in for details)
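
The "handle it if possible" behavior in the last step can be pictured as a simple dispatch. The sketch below is not the Python Matcher Proxy plug-in API: the class `CollectionSearchPlugin`, the function `route`, the request keys, and the fallback to a default matcher are all assumptions made for illustration.

```python
from typing import Optional


class CollectionSearchPlugin:
    """Hypothetical plug-in that answers requests only for collections it knows."""

    def __init__(self, collections):
        # Maps collection name -> search function taking (descriptor, limit).
        self._collections = collections

    def try_handle(self, request) -> Optional[list]:
        """Return search results, or None when this plug-in cannot serve the request."""
        search = self._collections.get(request.get("collection"))
        if search is None:
            return None
        return search(request["descriptor"], request.get("limit", 10))


def route(request, plugins, default_matcher):
    """Offer the request to each plug-in in turn; fall back to the default matcher."""
    for plugin in plugins:
        result = plugin.try_handle(request)
        if result is not None:
            return result
    return default_matcher(request)


# Hypothetical wiring: one known collection with a canned search result.
plugin = CollectionSearchPlugin(
    {"cam-1-events": lambda descriptor, limit: [("e1", 0.97)][:limit]}
)
handled = route(
    {"collection": "cam-1-events", "descriptor": [0.1, 0.9], "limit": 5},
    plugins=[plugin],
    default_matcher=lambda request: "fallback",
)
not_handled = route(
    {"collection": "unknown", "descriptor": [0.1, 0.9]},
    plugins=[plugin],
    default_matcher=lambda request: "fallback",
)
```

Returning `None` for "cannot handle" keeps the plug-in decision separate from the routing logic, so a request that matches no collection still gets an answer from another matcher.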