Clustering task

Clustering is the grouping of a particular set of objects based on their descriptors, aggregating them according to their similarities.

A clustering task works on objects that have attribute ids (attribute_ids). To cluster by face attributes, set the descriptor type to “face” and use faces from Luna Faces or events from Luna Events. To cluster by bodies, set the descriptor type to “body” and get the attributes from events in Luna Events. In both cases, only objects with descriptors are processed. An account_id is required to create the task. A clustering threshold can optionally be specified. A “save_images” flag is also available: existing images are placed in an images subfolder of the result archive. The optional clustering parameter “use_track_info” can also be specified; in that case, objects with the same “track_id” are put in the same cluster.
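The rule tying the descriptor type to the allowed object sources can be sketched as a small check. This is only an illustration: the function name and the source labels "faces" and "events" are hypothetical and are not part of the task schema.

```python
# Hypothetical labels: "faces" stands for Luna Faces, "events" for Luna Events.
ALLOWED_SOURCES = {
    "face": {"faces", "events"},  # face descriptors: faces or events
    "body": {"events"},           # body descriptors: events only
}


def is_valid_source(descriptor_type, source):
    """Return True if objects of this descriptor type can come from the source."""
    if descriptor_type not in ALLOWED_SOURCES:
        raise ValueError(f"unknown descriptor type: {descriptor_type!r}")
    return source in ALLOWED_SOURCES[descriptor_type]
```

For example, is_valid_source("body", "faces") is False, because bodies can only be taken from events.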

Clustering process

Clustering is done in several steps:

  • collect objects having attribute ids using provided filters

  • match every object with all other objects

  • download objects’ track ids if required

  • create clusters as groups of “connected components” from the similarity graph

    Here “connected” means that the similarity is greater than the provided threshold, or than the default “DEFAULT_CLUSTERING_THRESHOLD” from the config if no threshold is provided.
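The steps above can be sketched as follows. This is a minimal illustration of connected-component clustering using a union-find structure, not the service's actual implementation. The function name, the similarity callable and the track_ids mapping are assumptions made for the example.

```python
from itertools import combinations


def cluster(object_ids, similarity, threshold, track_ids=None):
    """Group objects into connected components of the similarity graph.

    similarity: hypothetical callable (id_a, id_b) -> float.
    track_ids: optional dict of object id -> track id; when given, it
    mimics "use_track_info" by forcing same-track objects together.
    """
    parent = {obj: obj for obj in object_ids}

    def find(obj):
        # Follow parent links to the root, shortening the path on the way.
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Match every object with all other objects; an edge exists only
    # when the similarity is strictly greater than the threshold.
    for a, b in combinations(object_ids, 2):
        if similarity(a, b) > threshold:
            union(a, b)

    # Optionally merge objects that share a track id.
    if track_ids:
        first_seen = {}
        for obj in object_ids:
            track = track_ids.get(obj)
            if track is None:
                continue
            if track in first_seen:
                union(first_seen[track], obj)
            else:
                first_seen[track] = obj

    # Collect members of each component, largest cluster first.
    clusters = {}
    for obj in object_ids:
        clusters.setdefault(find(obj), []).append(obj)
    return sorted(clusters.values(), key=len, reverse=True)
```

Note that clusters are transitive: two objects whose direct similarity is below the threshold still end up in the same cluster if a chain of sufficiently similar objects connects them. Comparing every pair is quadratic in the number of objects, which is acceptable for a sketch only.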

Luna Faces

There are three filters for getting faces from the service. They can be used separately or together, or omitted entirely.

Filters:

list_id: id of the list that the face(s) are linked to
create_time__gte: lower included bound for object create_time
create_time__lt: upper excluded bound for object create_time (current time by default)
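Because the lower bound is included and the upper bound is excluded, consecutive time windows do not overlap. The sketch below builds such windows as filter dicts. The function name is hypothetical, and how the datetime values are serialized for the service is not covered here.

```python
from datetime import datetime, timedelta


def time_windows(start, end, step, list_id=None):
    """Yield filter dicts covering [start, end) in consecutive windows."""
    lower = start
    while lower < end:
        upper = min(lower + step, end)
        filters = {"create_time__gte": lower, "create_time__lt": upper}
        if list_id is not None:
            filters["list_id"] = list_id
        yield filters
        # The next window starts exactly where this one ends, so an
        # object created at `upper` belongs to the next window only.
        lower = upper


windows = list(
    time_windows(datetime(2024, 1, 1), datetime(2024, 1, 3, 12), timedelta(days=1))
)
```

Here windows holds three filter dicts; the last one is shorter because it is cut at the end of the range.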

Luna Events

There are several filters for getting events from the service. They can be used separately or together, or omitted entirely.

Filters:

create_time__lt: upper excluded bound for object create_time (current time by default)
create_time__gte: lower included bound for object create_time
sources: source filter
event_ids: event ids
handler_ids: handler ids
top_matching_candidates_label: top matching candidate label
top_similar_object_ids: top similar object ids
top_similar_external_ids: top similar external ids
top_similar_object_similarity__gte: top similar object similarity lower included bound
top_similar_object_similarity__lt: top similar object similarity upper excluded bound
age__lt: age upper excluded bound
age__gte: age lower included bound
gender: gender filter
emotions: emotions filter
ethnic_groups: ethnic group filter
face_ids: face ids
user_data: user data filter
tags: tags filter
cities: city location filter
areas: area location filter
districts: district location filter
streets: street location filter
house_numbers: house number location filter
geo_position: geo position filter
masks: medical mask state filter
track_ids: track id filter
liveness: liveness state filter
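The “__gte” and “__lt” suffixes used by the range filters above follow one convention: “__gte” is a lower included bound and “__lt” is an upper excluded bound. The sketch below applies that convention to an in-memory record. It only illustrates the bound semantics; it is not how the service evaluates filters, and the function name and record layout are assumptions.

```python
def in_bounds(record, filters):
    """Check a record against "<field>__gte" / "<field>__lt" bounds."""
    for name, bound in filters.items():
        # Split "age__gte" into the field name "age" and the suffix "gte".
        field, _, suffix = name.rpartition("__")
        if suffix == "gte":
            if not record[field] >= bound:  # lower bound is included
                return False
        elif suffix == "lt":
            if not record[field] < bound:  # upper bound is excluded
                return False
        else:
            raise ValueError(f"not a range filter: {name!r}")
    return True
```

For example, with the filters {"age__gte": 30, "age__lt": 40}, a record with age 30 passes and a record with age 40 does not.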