Garbage Collecting pipeline
Garbage Collecting (GC) module
According to LUNA Platform 3 logic, garbage consists of descriptors that are linked neither to a person nor to a list.
For normal system operation, garbage must be deleted from the databases regularly. To do this, run the system cleaning script remove_not_linked_descriptors.py from this folder.
In the Backport 3 architecture, this script removes from the Faces service the faces that have no links to any lists or persons of the Luna Backport 3 database.
Script execution pipeline
The script execution pipeline consists of several stages:
1. A temporary table is created in the Faces database. See more info about temporary tables for Oracle or PostgreSQL.
2. Ids of faces that are not linked to lists are obtained. The ids are stored in the temporary table.
3. While the temporary table is not empty, the following operations are performed:
   - The batch of ids from the temporary table is obtained. The first 10k (or fewer) face ids are received.
   - Filtered ids are obtained. Filtered ids are ids that do not exist in the person_face table of the Backport 3 database.
   - Filtered ids are removed from the Faces database. If some of the faces cannot be removed, the script stops.
   - Filtered ids are removed from the Backport 3 database (foolcheck). A warning will be printed.
   - Ids are removed from the temporary table.
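
The stages above can be illustrated with a minimal sketch. It is not the code of remove_not_linked_descriptors.py: it uses two in-memory SQLite databases as stand-ins for the Faces and Backport 3 databases, and the names collect_garbage, face, list_face and tmp_unlinked are hypothetical. Only the person_face table name comes from the description above. The sketch also assumes that the foolcheck deletes from person_face and prints a warning only if something was actually deleted there.

```python
import sqlite3

BATCH_SIZE = 10_000  # the pipeline processes 10k (or fewer) ids per batch


def collect_garbage(faces_db, backport_db, batch_size=BATCH_SIZE):
    """Delete faces linked neither to a list nor to a person; return how many were deleted."""
    # Stage 1: temporary table in the Faces database (hypothetical table name).
    faces_db.execute("CREATE TEMP TABLE tmp_unlinked (face_id TEXT PRIMARY KEY)")
    # Stage 2: store the ids of faces that are not linked to any list.
    faces_db.execute(
        "INSERT INTO tmp_unlinked SELECT f.face_id FROM face AS f "
        "WHERE NOT EXISTS (SELECT 1 FROM list_face AS lf WHERE lf.face_id = f.face_id)"
    )
    removed = 0
    while True:
        # Stage 3, step 1: take the next batch of ids from the temporary table.
        rows = faces_db.execute(
            "SELECT face_id FROM tmp_unlinked LIMIT ?", (batch_size,)
        ).fetchall()
        if not rows:
            break  # the temporary table is empty
        batch = [row[0] for row in rows]
        # Step 2: filtered ids are the ids absent from person_face in Backport 3.
        filtered = [
            face_id for face_id in batch
            if backport_db.execute(
                "SELECT 1 FROM person_face WHERE face_id = ?", (face_id,)
            ).fetchone() is None
        ]
        params = [(face_id,) for face_id in filtered]
        if filtered:
            # Step 3: remove filtered ids from the Faces database; stop on failure.
            cursor = faces_db.executemany("DELETE FROM face WHERE face_id = ?", params)
            if cursor.rowcount != len(filtered):
                raise RuntimeError("some faces could not be removed; stopping")
            # Step 4 (assumed foolcheck): nothing should be left to delete here.
            cursor = backport_db.executemany(
                "DELETE FROM person_face WHERE face_id = ?", params
            )
            if cursor.rowcount > 0:
                print(f"WARNING: removed {cursor.rowcount} unexpected person_face rows")
        # Step 5: remove the whole batch from the temporary table.
        faces_db.executemany(
            "DELETE FROM tmp_unlinked WHERE face_id = ?", [(i,) for i in batch]
        )
        removed += len(filtered)
    faces_db.execute("DROP TABLE tmp_unlinked")
    faces_db.commit()
    backport_db.commit()
    return removed


if __name__ == "__main__":
    # Toy data: f1 is linked to a list, f2 to a person, f3..f5 are garbage.
    faces = sqlite3.connect(":memory:")
    faces.executescript(
        "CREATE TABLE face (face_id TEXT PRIMARY KEY);"
        "CREATE TABLE list_face (list_id TEXT, face_id TEXT);"
    )
    faces.executemany("INSERT INTO face VALUES (?)", [(f"f{i}",) for i in range(1, 6)])
    faces.execute("INSERT INTO list_face VALUES ('l1', 'f1')")
    backport = sqlite3.connect(":memory:")
    backport.execute("CREATE TABLE person_face (person_id TEXT, face_id TEXT)")
    backport.execute("INSERT INTO person_face VALUES ('p1', 'f2')")
    print("removed faces:", collect_garbage(faces, backport, batch_size=2))
```

Processing the ids in batches keeps each delete operation bounded in size, and removing a batch from the temporary table only after it has been handled means the loop ends exactly when every candidate id has been examined.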
Script launching
Python 3.9 is required to launch the script.
The virtual environment for the Backport 3 service should be activated before launching the script. Read about the virtual environment activation and requirements installation in the install.html document.
Set the current configuration of the Faces database and the Backport 3 database in the ./config.conf file.
Launch the script:
python remove_not_linked_descriptors.py
The output will include information about the number of removed faces and the number of persons with faces.