Clustering task

Clustering groups a given set of objects by their descriptors, so that objects with similar descriptors end up in the same cluster.

A clustering task operates on attribute_ids. To cluster by face attributes, set the descriptor type to “face” and use faces from Luna Faces or events from Luna Events. To cluster by bodies, set the descriptor type to “body” and take the attributes from events in Luna Events. In both cases, only objects that have descriptors are processed. An account_id is also required to create the task.

The clustering threshold is optional. The “save_images” flag is available: when it is set, existing images are placed in an images subfolder of the result archive. The clustering parameter “use_track_info” is also optional: when it is set, objects with the same “track_id” are put in the same cluster.

Clustering process

Clustering is done in several steps:

  • collect the objects that have attribute ids, using the provided filters

  • match every object with all other objects

  • download objects’ track ids if required

  • create clusters as groups of “connected components” of the similarity graph

    Here “connected” means that the similarity is greater than the provided threshold or, if no threshold is provided, greater than the default “DEFAULT_CLUSTERING_THRESHOLD” from the config.
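
The last step can be illustrated with a minimal Python sketch. It is not the service's implementation: the function name build_clusters, the input shapes (a list of object ids, a dict of pairwise similarities, an optional dict of track ids) and the placeholder default threshold are all assumptions made for the example. It connects every pair whose similarity is greater than the threshold, optionally merges objects that share a track id, and returns the connected components.

```python
from collections import defaultdict

# Placeholder only: the real default comes from the service config.
DEFAULT_CLUSTERING_THRESHOLD = 0.9


def find(parent, x):
    # Walk up to the root of x's component, shortening the path on the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    # Merge the components that contain a and b.
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def build_clusters(object_ids, similarities,
                   threshold=DEFAULT_CLUSTERING_THRESHOLD, track_ids=None):
    """Hypothetical helper: group objects into connected components."""
    parent = {obj: obj for obj in object_ids}

    # Two objects are "connected" when their similarity exceeds the threshold.
    for (a, b), score in similarities.items():
        if score > threshold:
            union(parent, a, b)

    # Assumed handling of use_track_info: same track id means same cluster.
    if track_ids:
        first_in_track = {}
        for obj in object_ids:
            track = track_ids.get(obj)
            if track is None:
                continue
            if track in first_in_track:
                union(parent, first_in_track[track], obj)
            else:
                first_in_track[track] = obj

    clusters = defaultdict(list)
    for obj in object_ids:
        clusters[find(parent, obj)].append(obj)
    # Largest clusters first, with a stable order inside each cluster.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


object_ids = ["a", "b", "c", "d", "e"]
similarities = {("a", "b"): 0.95, ("b", "c"): 0.92,
                ("a", "c"): 0.40, ("d", "e"): 0.30}

print(build_clusters(object_ids, similarities))
print(build_clusters(object_ids, similarities,
                     track_ids={"d": "t1", "e": "t1"}))
```

In this example “a” and “c” land in one cluster even though their direct similarity is low, because both are connected to “b”. This is the usual consequence of clustering by connected components rather than by pairwise similarity alone.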

For details, see clustering task.