General information

Warning

Matching plugins apply only to the Luna-Python-Matcher-Proxy service.

The service supports a system of matching plugins.

By default, Luna-Python-Matcher-Proxy processes all matching requests by redirecting them to Luna-Python-Matcher. Matching request processing may be slower than necessary for several reasons, including:

  • a large amount of data that cannot be processed faster through database configuration changes, e.g. by creating an index in the database

  • the way data is stored: the descriptor and the entity ID (face_id/event_id) are kept in different database tables (due to Luna Platform restrictions), and the filters specified in the matching request may also refer to a separate database table, which slows down request processing

  • internal database-specific restrictions

Matching plugins make it possible to separate some groups of requests and improve their processing speed, including by transferring data to another storage whose layout allows faster matching than the default one (see plugin data source). For example:

  • matching requests where all candidates are faces linked to one list and no other filters are specified in the request.

    In this case, it is possible to duplicate those candidates to a data storage other than the default one and create a matching plugin that matches the specified references only with these candidates and not with any other entities.

    The matching request will be processed faster than in the default way, because the plugin does not spend time separating the faces linked to the list from all the faces stored in the database.

  • matching requests where all candidates are events, only one filter (event_ids) is specified, and matching is required only by bodies, not by faces.

    In this case, it is possible to duplicate all event_ids and their body descriptors to a data storage other than the default one and create a matching plugin that matches the specified reference only with these candidates and not with any other entities.

    The matching request will be processed faster than in the default way, because the plugin does not spend time separating events with bodies from all events or evaluating filters.

It is possible to use built-in matching plugins or create your own matching plugins.

Each matching request is represented as all possible combinations of candidates and references. Each combination of a reference and candidates (hereinafter a sub-request) is processed separately as follows:

  1. Get the sub-request matching cost (see matching cost for description).

  2. Choose the way to process the sub-request by the lowest estimated matching cost: a matching plugin or Luna-Python-Matcher.
    • If Luna-Python-Matcher was selected, it processes the sub-request and returns the response to Luna-Python-Matcher-Proxy.

    • If a matching plugin was selected, it processes the sub-request. If the sub-request is processed successfully, the response is returned to Luna-Python-Matcher-Proxy. Otherwise, the sub-request is processed by Luna-Python-Matcher.

  3. If the sub-request was successfully processed by a matching plugin and the plugin does not have access to all matching targets specified in the sub-request, Luna-Python-Matcher-Proxy enriches the data before the next step (see matching targets for details).

  4. Luna-Python-Matcher-Proxy collects the results from all sub-requests, sorts them in the right order, and replies to the user.
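The flow above can be sketched in plain Python. This is only an illustration under stated assumptions, not the proxy's actual code: default_matcher, flaky_plugin, PluginFailed, and the use_plugin callback are hypothetical stand-ins, and the cost comparison is reduced to a yes/no decision.

```python
from itertools import product


class PluginFailed(Exception):
    """Hypothetical stand-in for a plugin failure."""


def default_matcher(reference, candidates):
    # Stand-in for Luna-Python-Matcher: able to process any sub-request.
    return {"reference": reference, "candidates": candidates, "by": "matcher"}


def flaky_plugin(reference, candidates):
    # Stand-in for a matching plugin that fails for one particular reference.
    if reference == "ref-2":
        raise PluginFailed("plugin storage unavailable")
    return {"reference": reference, "candidates": candidates, "by": "plugin"}


def process_request(references, candidate_sets, use_plugin):
    results = []
    # One sub-request per (reference, candidates) combination.
    for reference, candidates in product(references, candidate_sets):
        if use_plugin(reference, candidates):
            try:
                result = flaky_plugin(reference, candidates)
            except PluginFailed:
                # A failed plugin sub-request is retried by the default matcher.
                result = default_matcher(reference, candidates)
        else:
            result = default_matcher(reference, candidates)
        results.append(result)
    # Results of all sub-requests are collected into one reply.
    return results


results = process_request(
    references=["ref-1", "ref-2"],
    candidate_sets=["list-A", "list-B"],
    # Assumption: the plugin is considered cheaper only for candidates from "list-A".
    use_plugin=lambda reference, candidates: candidates == "list-A",
)
for item in results:
    print(item["reference"], item["candidates"], item["by"])
```

Two references and two candidate sets give four sub-requests; only the sub-request that the plugin both accepts and completes is answered by the plugin.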

Matching cost

Matching cost is a floating-point estimate of how complex it is to process a matching request using a plugin. It is used to choose the best way to process a matching request: the Luna-Python-Matcher service or one or more plugins.

The matching cost value for the Luna-Python-Matcher service is 100. If there are several plugins, the matching cost is calculated for each plugin. The plugin with the lowest matching cost is used if its cost is lower than the Luna-Python-Matcher matching cost. All requests with matching costs greater than 100 are processed by the Luna-Python-Matcher service. If there are no plugins, Luna-Python-Matcher is used to process the request.

Each plugin should implement the getMatchingCost method, which takes a match request as an argument and returns a float if the request can be processed by the plugin or None if it cannot.

Example:

from typing import Optional
from luna_plugins.matcher_plugins.base_struct import IMatcher, MatchRequest

LIST_ID_1 = "..."
LIST_ID_2 = "..."

class BestMatcher(IMatcher):
    ...
    def isGoodFilters(self, filters: dict) -> bool:
        """
        Example of a function that checks whether the query filters are suitable for matching with this plugin.
        """
        return filters.get("list_id") in (LIST_ID_1, LIST_ID_2)

    def getMatchingCost(self, matchRequest: MatchRequest) -> Optional[float]:
        """
        Example of calculating the matching cost:
        if the request does not match the filters, None is returned;
        if the request passes the filters, the matching cost is <10 * number of targets in the request>
        """
        if not self.isGoodFilters(matchRequest.candidate.filters):
            return None
        # return a float, as declared in the signature
        return 10.0 * len(matchRequest.candidate.targets)
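The selection rule described in this section can be sketched as follows. This is a minimal illustration, not the proxy's implementation: choose_processor and the plugin names are hypothetical, and the only values taken from the text are the Luna-Python-Matcher cost of 100 and the None result for a plugin that cannot process the request.

```python
from typing import Optional

MATCHER_COST = 100.0  # matching cost of Luna-Python-Matcher, as stated in the text
DEFAULT_PROCESSOR = "luna-python-matcher"


def choose_processor(plugin_costs: dict[str, Optional[float]]) -> str:
    """Return the name of the cheapest usable plugin or the default matcher."""
    # Plugins that returned None cannot process the request at all.
    usable = {name: cost for name, cost in plugin_costs.items() if cost is not None}
    if not usable:
        return DEFAULT_PROCESSOR
    best = min(usable, key=usable.get)
    # A plugin is used only if it is strictly cheaper than the default matcher.
    return best if usable[best] < MATCHER_COST else DEFAULT_PROCESSOR


print(choose_processor({"list_plugin": 30.0, "body_plugin": None, "slow_plugin": 120.0}))
print(choose_processor({"slow_plugin": 120.0}))
print(choose_processor({}))
```

With these inputs the first call selects list_plugin, while the other two fall back to the default matcher.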

Matching targets

Luna-Python-Matcher has access to all data of the matching entities, so it can process matching requests with any targets. A matching plugin may not have access to some of the data specified in the request targets. In this case, Luna-Python-Matcher-Proxy enriches the plugin response with the missing target data, e.g.:

  • the matching response contains the following targets: face_id, user_data, and similarity, and the chosen matching plugin does not have access to the user_data field

    1. the matching plugin matches the reference with the specified face_ids and returns to Luna-Python-Matcher-Proxy a matching response that contains only pairs of face_id and similarity

    2. for every match candidate in the result, Luna-Python-Matcher-Proxy gets user_data from the main database by face_id and merges face_id and similarity with user_data

    3. the enriched response with the specified targets and face_id as a target is returned to the user

  • the matching response contains the following targets: age and gender (all candidates are events’ faces), and the chosen matching plugin has access only to the event_id, descriptor, and age fields

    1. the matching plugin matches the reference and returns to Luna-Python-Matcher-Proxy a matching response that contains only event_id, age, and similarity for each candidate

    2. for every match candidate in the result, Luna-Python-Matcher-Proxy gets gender from the main database by event_id and merges it into the result by event_id; after that, it drops the non-required event_id and similarity from the response

    3. the prepared response with the specified targets and event_id as a target is returned to the user

Warning

This mechanism requires that the plugin supports the corresponding entity ID as a target. If the plugin does not support the entity ID as a target, such a request will not be sent to this plugin.
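The first enrichment scenario can be sketched as follows. This is a simplified illustration under stated assumptions: MAIN_DB is a hypothetical in-memory stand-in for the main database, and enrich is a hypothetical helper, not a function of Luna-Python-Matcher-Proxy.

```python
# Hypothetical stand-in for the main database: face_id -> stored fields.
MAIN_DB = {
    "face-1": {"user_data": "Alice"},
    "face-2": {"user_data": "Bob"},
}


def enrich(plugin_matches, requested_targets, plugin_targets, entity_id="face_id"):
    """Add the targets the plugin could not return, looked up by entity ID."""
    if entity_id not in plugin_targets:
        # Without the entity ID there is no key for looking up the missing data.
        raise ValueError("plugin must support the entity ID as a target")
    missing = [target for target in requested_targets if target not in plugin_targets]
    enriched = []
    for match in plugin_matches:
        row = dict(match)
        stored = MAIN_DB[match[entity_id]]
        for target in missing:
            row[target] = stored[target]
        enriched.append(row)
    return enriched


# The plugin returns only pairs of face_id and similarity.
plugin_matches = [
    {"face_id": "face-2", "similarity": 0.97},
    {"face_id": "face-1", "similarity": 0.81},
]
response = enrich(
    plugin_matches,
    requested_targets=["face_id", "user_data", "similarity"],
    plugin_targets={"face_id", "similarity"},
)
print(response)
```

The order of matches returned by the plugin is preserved; only the missing user_data field is added to each of them.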

Built-in matching plugins

Luna-Python-Matcher-Proxy provides several built-in matching plugins; see their descriptions in the corresponding chapters.

Note

It is possible to use built-in matching plugins as examples for new user matching plugins.

User matching plugins

Matching plugins should be written in the Python programming language.

To create a user matching plugin, it is required to:

  • select the data source and, if needed, synchronize data from Luna Platform (see plugin data source for details)

  • write the code (see plugin code for details)

A matching plugin may redirect the matching request to a remote service, but there are only a few ways to get a correct result when matching two descriptors:

  • using the match function from the vlutils package, as presented in the example (for additional information about function requirements, see the DB matching section of the Luna-Faces and Luna-Events documentation)

  • using FSDK matching (FSDK documentation is provided separately)

Plugin data source

To speed up request processing, each matching plugin may use a separate data source instead of the default one (the luna-events, faces, or attributes database; see the Database chapter of the Luna-Faces/Luna-Events services documentation for more information), such as a separate database, a new table in the existing database, an in-memory cache, etc.

If the plugin uses an external database, it needs access to the source databases to fill it in. To get access to the source events, faces, or attributes databases, see the Luna Platform settings.

There are several ways to synchronize data in a custom data source with the default database, among them:

  • streaming replication (for more information see postgres streaming replication)

  • materialized views (for more information see postgres materialized views). Example:
    CREATE MATERIALIZED VIEW BEST_MATERIALIZE_VIEW AS SELECT event_id, user_data, age, gender, create_time FROM event;
    

    Note

    The materialized view also needs to be refreshed to keep the data up to date; see the documentation for more information

  • triggers (for more information see postgres triggers)
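The trigger-based approach can be illustrated with a small runnable sketch. It uses SQLite from the Python standard library purely as a stand-in, so the trigger syntax differs from PostgreSQL; the plugin_event table and the trigger names are hypothetical, and the event table is reduced to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-in for the source table.
    CREATE TABLE event (event_id TEXT PRIMARY KEY, age INTEGER, gender INTEGER);
    -- Hypothetical plugin storage that keeps only the fields the plugin needs.
    CREATE TABLE plugin_event (event_id TEXT PRIMARY KEY, age INTEGER);

    -- Copy every new event into the plugin storage.
    CREATE TRIGGER event_insert AFTER INSERT ON event
    BEGIN
        INSERT INTO plugin_event (event_id, age) VALUES (NEW.event_id, NEW.age);
    END;

    -- Remove deleted events from the plugin storage.
    CREATE TRIGGER event_delete AFTER DELETE ON event
    BEGIN
        DELETE FROM plugin_event WHERE event_id = OLD.event_id;
    END;
    """
)

conn.execute("INSERT INTO event VALUES ('e1', 30, 1)")
conn.execute("INSERT INTO event VALUES ('e2', 41, 0)")
conn.execute("DELETE FROM event WHERE event_id = 'e1'")

synced = conn.execute(
    "SELECT event_id, age FROM plugin_event ORDER BY event_id"
).fetchall()
print(synced)
```

After two inserts and one delete, the plugin storage holds only the remaining event, without any explicit synchronization call from the application.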

Plugin code

Requirements for matching plugins:

  • the plugin should be a class inherited from BaseMatcherPlugin that implements all its abstract methods

    availableReferenceTypes must list the reference types that the matcher can process

    availableCandidateTypes must list the candidate types that the matcher can process

    availableDescriptorTypes must list the descriptor types that the matcher can process

    availableSortOrder must list the sort orders that the matcher can process

    the getAvailableTargets function must return the targets that the matcher can process and return in reply to a user request for each match result

    from luna_plugins.matcher_plugins.base_plugin import BaseMatcherPlugin
    from luna_plugins.matcher_plugins.base_struct import IMatcher, MatchUnitType
    # NOTE: DescriptorType must be imported as well; its module is not shown in this example
    
    class BestMatcher(IMatcher):
    
        availableReferenceTypes = frozenset([MatchUnitType.face, MatchUnitType.descriptor])
        availableCandidateTypes = frozenset([MatchUnitType.face])
        availableDescriptorTypes = frozenset([DescriptorType.face])
        availableSortOrder = frozenset(["similarity"])
    
        def getAvailableTargets(self):
            return frozenset(["face_id", "user_data"])
    
        ...
    
    class BestPlugin(BaseMatcherPlugin):
        def getMatcher(self) -> IMatcher:
            return BestMatcher()
    
        async def initialize(self) -> None:
            await super().initialize()
            print('plugin initialization has been completed')
    
        async def close(self) -> None:
            print('plugin has been successfully stopped')
    
  • the plugin should implement the match method, which takes a MatchRequest as input and returns a MatchResult as a response
    from luna_plugins.matcher_plugins.base_struct import IMatcher, MatchRequest, MatchResult
    
    class BestMatcher(IMatcher):
        ...
    
        async def match(self, matchRequest: MatchRequest) -> MatchResult:
            """must return match result"""
    
  • if any error occurs during matching, the plugin should raise the Rematch exception
    from luna_plugins.matcher_plugins.base_struct import IMatcher, MatchRequest, MatchResult
    from luna_plugins.matcher_plugins.exceptions import Rematch
    
    class BestMatcher(IMatcher):
        ...
    
        async def _match(self, matchRequest: MatchRequest) -> MatchResult:
            """must return match result"""
    
        async def match(self, matchRequest: MatchRequest) -> MatchResult:
            try:
                result: MatchResult = await self._match(matchRequest)
            except Exception as exc:
                raise Rematch from exc
            return result
    

    Note

    The Rematch exception generalizes errors that occur during matching; raising it means that the matching request will be processed by Luna-Python-Matcher.

    The application is passed to the plugin during initialization so that the plugin can get the settings used by the service (taking config reload into account), as well as some other features (for example, a logger or an adapter for connecting to the database). See application for more information.

    An example of getting faces for the plugin using the adapter to the luna-faces database is given below.

    from luna_plugins.matcher_plugins.base_struct import IMatcher, MatchRequest, MatchResult
    
    class BestMatcher(IMatcher):
    
        def __init__(self, app: "LunaApplication") -> None:
            super().__init__(app)
            self.dbContext = app.ctx.facesDBContext
    
        async def match(self, matchRequest: MatchRequest) -> MatchResult:
            ...
            facesOfInterest: list[dict[str, str]] = await self.dbContext.getFaces(faceIds=matchRequest.candidate.filters["face_ids"])
            ...
    

Matching plugin implementation example:

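from luna_plugins.matcher_plugins.base_struct import IMatcher
from vlutils.descriptors.match import match

class BestMatcher(IMatcher):

    async def _match(self, descriptor_1, descriptor_2, descriptor_version: int = 59) -> float:
        """Match two descriptors and return similarity as result"""
        # use the descriptor version argument instead of a hard-coded value
        similarity = match(descriptor_1, descriptor_2, descriptor_version)
        return similarity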

Matching plugins can store their configuration in different ways, for example, in a configuration file or in the Luna-Configurator service. An example of creating and receiving a setting using the luna3 library is given below. For additional information, see the luna-configurator documentation.

from luna3.configurator.configurator import ConfiguratorApi

configuratorApi = ConfiguratorApi(origin="http://configurator_host:configurator_port", api=1)
configuratorApi.putLimitation(
    "LUNA_CACHED_LIST_PLUGIN", defaultValue={"cost": 10, "shards": ["http://127.0.0.1:5301"]}, services=["new_plugin"],
    validationSchema={
        "type": "object",
        "properties": {
            "cost": {"type": "number", "minimum": 0},
            "shards": {
                "type": "array",
                "items": {
                    "type": "string",
                    "format": "uri",
                },
                "minItems": 1
            }
        },
        "required": ["cost", "shards"]},
    description="description", raiseError=True)
setting = configuratorApi.pullConfig(serviceName="new_plugin")
print(setting)

The structures and classes that should be inherited when creating user plugins, as well as the data structures that particular methods should accept, are listed below. You can find the source files of the given examples in the “luna_python_matcher” directory.

"""Stub for legacy plugins which use import from this module"""
from luna_plugins.matcher_plugins.base_plugin import *

from .exceptions import logger

logger.warning(
    "Deprecated. Do not use import from this module. Use structures from luna_plugins.matcher_plugins.base_plugin"
)


class BaseMatcherPlugin(BaseMatcherPlugin):
    """Stub for autoflake (fix not used import)"""


"""Stub for legacy plugins which use import from this module"""
from luna_plugins.matcher_plugins.base_struct import *

from classes.candidate_batch import ORDER

from .exceptions import logger

logger.warning(
    "Deprecated. Do not use import from this module. Use structures from luna_plugins.matcher_plugins.base_struct"
)


class IMatcher(IMatcher):
    """Stub for autoflake (fix not used import)"""


ORDER = ORDER  # Stub for autoflake (fix not used import)