Parameter Estimation Facility#
Overview#
The estimation facility is the only multi-purpose facility in FaceEngine. It is designed as a collection of tools that help to estimate various properties of images or of the objects depicted in them. These properties may be used to increase the precision of algorithms implemented by other FaceEngine facilities or to accomplish custom user tasks.
Best shot selection functionality#
Eyes Estimation#
Name: EyeEstimator
Algorithm description:
The estimator is trained to work with warped images (see chapter "Image warping" for details).
This estimator aims to determine:
- Eye state: Open, Closed, Occluded;
- Precise eye iris location as an array of landmarks;
- Precise eyelid location as an array of landmarks.
You can only pass a warped image with a detected face to the estimator interface. Better image quality leads to better results.
The eye state classifier supports three categories: "Open", "Closed", "Occluded". Poor-quality images or ones that depict obscured eyes (think eyewear, hair, gestures) fall into the "Occluded" category. It is always a good idea to check the eye state before using the segmentation result.
The precise location allows iris and eyelid segmentation. The estimator is capable of outputting iris and eyelid shapes as arrays of points together forming an ellipse. You should only use segmentation results if the state of that eye is "Open".
Implementation description:
The estimator:
- Implements the estimate() function that accepts a warped source image and warped landmarks, either of type `Landmarks5` or `Landmarks68`. The warped image and landmarks are received from the warper (see `IWarper::warp()`);
- Classifies the eye state and detects iris and eyelid landmarks;
- Outputs `EyesEstimation` structures.
Orientation terms 'left' and 'right' refer to the way you see the image as it is shown on the screen. It means that the left eye is not necessarily left from the person's point of view, but is on the left side of the screen. Consequently, the right eye is the one on the right side of the screen. More formally, the 'left' eye is the one with the smaller x coordinate, such that x_left < x_right.
`EyesEstimation::EyeAttributes` presents the eye state as the enum `EyeState` with possible values: `Open`, `Closed`, `Occluded`.
Iris landmarks are presented with a template structure `Landmarks` that is specialized for 32 points.
Eyelid landmarks are presented with a template structure `Landmarks` that is specialized for 6 points.
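A minimal usage sketch is shown below; it assumes the estimator, the warped image, and the warped landmarks were obtained earlier (see `IWarper::warp()`), and that member and enum names follow the structures described above. They may differ slightly between SDK versions.

```cpp
// Minimal sketch, assuming `estimator` (fsdk::IEyeEstimatorPtr), `warp`
// (warped fsdk::Image) and `warpedLandmarks5` (fsdk::Landmarks5) exist.
fsdk::EyesEstimation eyes;
const auto status = estimator->estimate(warp, warpedLandmarks5, eyes);
if (status.isError())
    return;

// Check the eye state before using the segmentation result.
if (eyes.leftEye.state == fsdk::EyesEstimation::EyeAttributes::EyeState::Open) {
    // 32 iris landmarks and 6 eyelid landmarks are available for an open eye.
    const auto& iris = eyes.leftEye.iris;
    const auto& eyelid = eyes.leftEye.eyelid;
    // ... use the iris/eyelid points ...
}
```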
API structure name:
IEyeEstimator
Plan files:
- eyes_estimation_flwr8_cpu.plan
- eyes_estimation_ir_cpu.plan
- eye_status_estimation_flwr_cpu.plan
- eyes_estimation_flwr8_cpu-avx2.plan
- eyes_estimation_ir_cpu-avx2.plan
- eyes_estimation_ir_gpu.plan
- eyes_estimation_flwr8_gpu.plan
- eye_status_estimation_flwr_cpu-avx2.plan
- eye_status_estimation_flwr_gpu.plan
BestShotQuality Estimation#
Name: BestShotQualityEstimator
Algorithm description:
The BestShotQuality estimator is designed to evaluate image quality to choose the best image before descriptor extraction. The BestShotQuality estimator consists of two components - AGS (garbage score) and Head Pose.
AGS aims to determine the source image score for further descriptor extraction and matching.
The estimation output is a float score normalized to the range [0..1]. The closer the score is to 1, the better the matching result that can be expected for the image.
When you have several images of a person, it is better to save the image with the highest AGS score.
The recommended threshold for the AGS score is 0.2, but it can be changed depending on the purpose of use. Consult VisionLabs about the recommended threshold value for this parameter.
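The selection itself can be as simple as the loop below. Here `estimateAgs()` is a hypothetical helper wrapping the estimator call described in the implementation section, not a real SDK function, and `frames` is assumed to be a `std::vector<fsdk::Image>`:

```cpp
// Keep the frame with the highest AGS score, then extract a descriptor
// from it only if it passes the recommended threshold.
float bestScore = -1.0f;
std::size_t bestIndex = 0;
for (std::size_t i = 0; i < frames.size(); ++i) {
    const float ags = estimateAgs(frames[i]); // hypothetical helper
    if (ags > bestScore) {
        bestScore = ags;
        bestIndex = i;
    }
}

constexpr float agsThreshold = 0.2f; // recommended default, see above
if (bestScore >= agsThreshold) {
    // ... run descriptor extraction on frames[bestIndex] ...
}
```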
Head Pose determines the person's head rotation angles in 3D space, namely pitch, yaw, and roll.
Since 3D head translation is hard to determine reliably without camera-specific calibration, only 3D rotation component is estimated.
Head pose estimation characteristics:
- Units (degrees);
- Notation (Euler angles);
- Precision (see table below).
Implementation description:
The estimator (see IBestShotQualityEstimator in IEstimator.h):
- Implements the estimate() function that needs an `fsdk::Image` in R8G8B8 format, an `fsdk::Detection` structure of the corresponding source image (see section "Detection structure" in chapter "Face detection facility"), an `fsdk::IBestShotQualityEstimator::EstimationRequest` structure and an `fsdk::IBestShotQualityEstimator::EstimationResult` to store the estimation result;
- Implements the estimate() function that needs a span of `fsdk::Image` in R8G8B8 format, a span of `fsdk::Detection` structures of the corresponding source images (see section "Detection structure" in chapter "Face detection facility"), an `fsdk::IBestShotQualityEstimator::EstimationRequest` structure and a span of `fsdk::IBestShotQualityEstimator::EstimationResult` to store the estimation results;
- Implements the estimateAsync() function that needs an `fsdk::Image` in R8G8B8 format, an `fsdk::Detection` structure of the corresponding source image (see section "Detection structure" in chapter "Face detection facility") and an `fsdk::IBestShotQualityEstimator::EstimationRequest` structure.

Note: The estimateAsync() method is experimental, and its interface may change in the future.
Note: The estimateAsync() method is not marked as noexcept and may throw an exception.
Before using this estimator, the user is free to decide which of the listed attributes to estimate. For this purpose, the estimate() method takes one of the estimation requests:
- `fsdk::IBestShotQualityEstimator::EstimationRequest::estimateAGS` to make only the AGS estimation;
- `fsdk::IBestShotQualityEstimator::EstimationRequest::estimateHeadPose` to make only the Head Pose estimation;
- `fsdk::IBestShotQualityEstimator::EstimationRequest::estimateAll` to make both the AGS and Head Pose estimations.
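A minimal sketch of a single-image call is shown below; the names of the result members (`ags`, `headPose`) are assumptions based on the description above and may differ in your SDK version.

```cpp
// Minimal sketch: request both AGS and Head Pose for one image.
// `estimator` is an fsdk::IBestShotQualityEstimatorPtr, `image` is R8G8B8,
// `detection` is an fsdk::Detection from the face detector.
fsdk::IBestShotQualityEstimator::EstimationResult result;
const auto status = estimator->estimate(
    image,
    detection,
    fsdk::IBestShotQualityEstimator::EstimationRequest::estimateAll,
    result);
if (status.isError())
    return;

// Assumed member names: compare the AGS score with your threshold and the
// head pose angles with your angle limits before keeping this image,
// e.g. result.ags and result.headPose.pitch / .yaw / .roll.
```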
Head Pose accuracy:
Prediction precision decreases as a rotation angle increases. We present typical average errors for different angle ranges in the table below.
"Head pose prediction precision"
Range | -45°...+45° | < -45° or > +45° | |
---|---|---|---|
Average prediction error (per axis) | Yaw | ±2.7° | ±4.6° |
Average prediction error (per axis) | Pitch | ±3.0° | ±4.8° |
Average prediction error (per axis) | Roll | ±3.0° | ±4.6° |
Zero position corresponds to a face placed orthogonally to camera direction, with the axis of symmetry parallel to the vertical camera axis.
API structure name:
IBestShotQualityEstimator
Plan files:
- ags_angle_estimation_flwr_cpu.plan
- ags_angle_estimation_flwr_cpu-avx2.plan
- ags_angle_estimation_flwr_gpu.plan
LivenessOneShotRGB Estimation#
Name: LivenessOneShotRGBEstimator
Algorithm description:
This estimator shows whether the person's face is real or fake (photo, printed image).
The requirements for the processed image and the face in the image are listed below.
This estimator supports images taken on mobile devices or webcams (PC or laptop). Minimum image resolution requirements:
- Mobile devices: 720 × 960 px;
- Webcam (PC or laptop): 1280 × 720 px.
There should be only one face in the image. An error occurs when there are two or more faces in the image.
The minimum face detection size must be 200 pixels.
Yaw, pitch, and roll angles should be no more than 25 degrees in either direction.
The minimum indent between the face and the image borders should be 10 pixels.
Implementation description:
The estimator (see ILivenessOneShotRGBEstimator in ILivenessOneShotRGBEstimator.h):
- Implements the estimate() function that needs an `fsdk::Image` and an `fsdk::Face` with a valid image in R8G8B8 format and the detection structure of the corresponding source image (see section "Detection structure" in chapter "Face detection facility"). The output estimation is an `fsdk::LivenessOneShotRGBEstimation` structure;
- Implements the estimate() function that needs a span of `fsdk::Image` and a span of `fsdk::Face` with valid images in R8G8B8 format and detection structures of the corresponding source images (see section "Detection structure" in chapter "Face detection facility"). The first output is a span of `fsdk::LivenessOneShotRGBEstimation` structures. The second output value (an `fsdk::LivenessOneShotRGBEstimation` structure) is the result of aggregation based on the span of estimations above. Note that the second output value (the aggregation) is optional, i.e. a default argument, which is `nullptr`.
The `LivenessOneShotRGBEstimation` structure contains the results of the estimation:

```cpp
struct LivenessOneShotRGBEstimation {
    enum class State {
        Alive = 0, //!< The person on the image is real
        Fake,      //!< The person on the image is fake (photo, printed image)
        Unknown    //!< The liveness status of the person on the image is unknown
    };

    float score;        //!< Estimation score
    State state;        //!< Liveness status
    float qualityScore; //!< Liveness quality score
};
```
The estimation score is normalized to the range [0..1], where 1 means a real person and 0 means a fake.
The liveness quality score is an image quality estimation for the liveness recognition. This parameter is used for filtering, when it is possible to take a best shot while checking for liveness. The reference score is 0.5.
The value of `State` depends on `score` and `qualityThreshold`. The `qualityThreshold` value can be given as an argument of the `estimate` method (see `ILivenessOneShotRGBEstimator`), or set in the configuration file faceengine.conf (see the "LivenessOneShotRGBEstimator" section in the Configuration Guide).
Recommended thresholds:
The table below contains thresholds from the faceengine configuration file (faceengine.conf), section `LivenessOneShotRGBEstimator::Settings`. By default, these threshold values are set to optimal.
"LivenessOneShotRGB estimator recommended thresholds"
Threshold | Recommended value |
---|---|
realThreshold | 0.5 |
qualityThreshold | 0.5 |
calibrationCoeff | 0.94 |
Configurations:
See the "LivenessOneShotRGBEstimator settings" section in the "ConfigurationGuide.pdf" document.
API structure name:
ILivenessOneShotRGBEstimator
Plan files:
- oneshot_rgb_liveness_model_1_cpu.plan
- oneshot_rgb_liveness_model_2_cpu.plan
- oneshot_rgb_liveness_model_3_cpu.plan
- oneshot_rgb_liveness_model_4_cpu.plan
- oneshot_rgb_liveness_model_5_cpu.plan
- oneshot_rgb_liveness_model_6_cpu.plan
- oneshot_rgb_liveness_model_7_cpu.plan
- oneshot_rgb_liveness_model_1_cpu-avx2.plan
- oneshot_rgb_liveness_model_2_cpu-avx2.plan
- oneshot_rgb_liveness_model_3_cpu-avx2.plan
- oneshot_rgb_liveness_model_4_cpu-avx2.plan
- oneshot_rgb_liveness_model_5_cpu-avx2.plan
- oneshot_rgb_liveness_model_6_cpu-avx2.plan
- oneshot_rgb_liveness_model_7_cpu-avx2.plan
- oneshot_rgb_liveness_model_1_gpu.plan
- oneshot_rgb_liveness_model_2_gpu.plan
- oneshot_rgb_liveness_model_3_gpu.plan
- oneshot_rgb_liveness_model_4_gpu.plan
- oneshot_rgb_liveness_model_5_gpu.plan
- oneshot_rgb_liveness_model_6_gpu.plan
- oneshot_rgb_liveness_model_7_gpu.plan
Usage example#
The face in the image and the image itself should meet the estimator requirements.
You can find additional information in the example (`examples/example_estimation/main.cpp`) or in the code below.
```cpp
// Minimum detection size in pixels.
constexpr int minDetSize = 200;
// Step back from the borders.
constexpr int borderDistance = 10;

if (std::min(detectionRect.width, detectionRect.height) < minDetSize) {
    std::cerr << "Bounding Box width and/or height is less than `minDetSize` - " << minDetSize << std::endl;
    return false;
}

if ((detectionRect.x + detectionRect.width) > (image.getWidth() - borderDistance) || detectionRect.x < borderDistance) {
    std::cerr << "Bounding Box width is out of border distance - " << borderDistance << std::endl;
    return false;
}

if ((detectionRect.y + detectionRect.height) > (image.getHeight() - borderDistance) || detectionRect.y < borderDistance) {
    std::cerr << "Bounding Box height is out of border distance - " << borderDistance << std::endl;
    return false;
}

// Maximum absolute yaw, pitch and roll angle, in degrees.
constexpr int principalAxes = 25;
if (std::abs(headPose.pitch) > principalAxes ||
    std::abs(headPose.yaw) > principalAxes ||
    std::abs(headPose.roll) > principalAxes) {
    std::cerr << "Can't estimate LivenessOneShotRGBEstimation. " <<
        "Yaw, pitch or roll absolute value is larger than expected value: " << principalAxes << "." <<
        "\nPitch angle estimation: " << headPose.pitch <<
        "\nYaw angle estimation: " << headPose.yaw <<
        "\nRoll angle estimation: " << headPose.roll << std::endl;
    return false;
}
```
We recommend using Detector type 3 (`fsdk::ObjectDetectorClassType::FACE_DET_V3`).
Image Quality Estimation#
Name: QualityEstimator
Algorithm description:
The estimator is trained to work with warped images (see chapter "Image warping" for details).
This estimator is designed to determine the image quality. You can estimate the image according to the following criteria:
- The image is blurred;
- The image is underexposed (i.e., too dark);
- The image is overexposed (i.e., too light);
- The face in the image is illuminated unevenly (there is a great difference between light and dark regions);
- The image contains flares on the face (too specular).
Examples are presented in the images below. Good quality images are shown on the right.
Implementation description:
The general rule of thumb for quality estimation:
- Detect a face, see if detection confidence is high enough. If not, reject the detection;
- Produce a warped face image (see chapter "Descriptor processing facility") using a face detection and its landmarks;
- Estimate visual quality using the estimator, finally reject low-quality images.
While the scheme above might seem complicated, it is the most efficient performance-wise, since possible rejections at each step reduce the workload for the next step.
At the moment, the estimator exposes two interface functions to predict image quality:
- `virtual Result estimate(const Image& warp, Quality& quality);`
- `virtual Result estimate(const Image& warp, SubjectiveQuality& quality);`

Each of these functions uses its own CNN internally and returns slightly different quality criteria.
The first CNN is trained specifically on pre-warped human face images and will produce lower score factors if one of the following conditions is satisfied:
- The image is blurred;
- The image is underexposed (i.e., too dark);
- The image is overexposed (i.e., too light);
- The image color variation is low (i.e., the image is monochrome or close to monochrome).

Each of these score factors is defined in the [0..1] range, where a higher value corresponds to better image quality and vice versa.
The second interface function will produce a lower factor if:
- The image is blurred;
- The image is underexposed (i.e., too dark);
- The image is overexposed (i.e., too light);
- The face in the image is illuminated unevenly (there is a great difference between light and dark regions);
- The image contains flares on the face (too specular).
The estimator determines the quality of the image based on each of the aforementioned parameters. For each parameter, the estimator function returns two values: the quality factor and the resulting verdict.
As with the first estimator function, the second one also returns the quality factors in the range [0..1], where 0 corresponds to low image quality and 1 to high image quality. For example, the estimator returns a low quality factor for the Blur parameter if the image is too blurry.
The resulting verdict is a quality conclusion based on the estimated parameter. For example, if the image is too blurry, the estimator returns "isBlurred = true".
The threshold (see below) can be specified for each of the estimated parameters. The resulting verdict and the quality factor are linked through this threshold: if the received quality factor is lower than the threshold, the image quality is low and the estimator returns "true". For example, if the image blur quality factor is higher than the threshold, the resulting verdict is "false".
If the estimated value for any of the parameters is lower than the corresponding threshold, the image is considered to be of bad quality. If the resulting verdicts for all parameters are "false", the image quality is considered good.
The quality factor is a value in the range [0..1] where 0 corresponds to low quality and 1 to high quality.
Illumination uniformity corresponds to the face illumination in the image. The lower the difference between light and dark zones of the face, the higher the estimated value. When the illumination is evenly distributed throughout the face, the value is close to "1".
Specularity describes the face's capacity to reflect light. The higher the estimated value, the lower the specularity and the better the image quality. If the estimated value is low, there are bright glares on the face.
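A minimal sketch is shown below, assuming the member names of `SubjectiveQuality` mirror the verdicts described above (only `isBlurred` is named explicitly in this document; the remaining members are assumptions):

```cpp
// Minimal sketch: subjective quality check on a warped image.
// `estimator` is an fsdk::IQualityEstimatorPtr, `warp` comes from the warper.
fsdk::SubjectiveQuality quality;
const auto status = estimator->estimate(warp, quality);
if (status.isError())
    return;

// Each parameter carries a [0..1] factor and a boolean verdict computed
// against the thresholds from faceengine.conf (see the table below).
if (quality.isBlurred) {
    // The blur factor fell below blurThreshold: reject this image.
    return;
}
// ... check the remaining verdicts (darkness, light, illumination
//     uniformity, specularity) in the same way ...
```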
Recommended thresholds:
The table below contains thresholds from the faceengine configuration file (faceengine.conf), section `QualityEstimator::Settings`. By default, these threshold values are set to optimal.
"Image quality estimator recommended thresholds"
Threshold | Recommended value |
---|---|
blurThreshold | 0.61 |
darknessThreshold | 0.50 |
lightThreshold | 0.57 |
illuminationThreshold | 0.1 |
specularityThreshold | 0.1 |
The most important parameters for face recognition are "blurThreshold", "darknessThreshold" and "lightThreshold", so you should select them carefully.
You can select images of better visual quality by setting higher values of the "illuminationThreshold" and "specularityThreshold". Face recognition is not greatly affected by uneven illumination or glares.
Configurations:
See the "Quality estimator settings" section in the "ConfigurationGuide.pdf" document.
API structure name:
IQualityEstimator
Plan files:
- model_subjective_quality_v2_cpu.plan
- model_subjective_quality_v2_cpu-avx2.plan
- model_subjective_quality_v2_gpu.plan
Medical Mask Estimation Functionality#
Name: MedicalMaskEstimator
This estimator aims to detect a medical mask on the face in the source image. For the interface with MedicalMaskEstimation it can return the following results:
- A medical mask is on the face (see MedicalMask::Mask field in the MedicalMask enum);
- There is no medical mask on the face (see MedicalMask::NoMask field in the MedicalMask enum);
- The face is occluded with something (see MedicalMask::OccludedFace field in the MedicalMask enum);
For the interface with MedicalMaskEstimationExtended it can return the following results:
- A medical mask is on the face (see MedicalMaskExtended::Mask field in the MedicalMaskExtended enum);
- There is no medical mask on the face (see MedicalMaskExtended::NoMask field in the MedicalMaskExtended enum);
- A medical mask is not on the right place (see MedicalMaskExtended::MaskNotInPlace field in the MedicalMaskExtended enum);
- The face is occluded with something (see MedicalMaskExtended::OccludedFace field in the MedicalMaskExtended enum);
The estimator (see IMedicalMaskEstimator in IEstimator.h):
- Implements the estimate() function that accepts source warped image in R8G8B8 format and medical mask estimation structure to return results of estimation;
- Implements the estimate() function that accepts source image in R8G8B8 format, face detection to estimate and medical mask estimation structure to return results of estimation;
- Implements the estimate() function that accepts fsdk::Span of the source warped images in R8G8B8 format and fsdk::Span of the medical mask estimation structures to return results of estimation;
- Implements the estimate() function that accepts fsdk::Span of the source images in R8G8B8 format, fsdk::Span of face detections and fsdk::Span of the medical mask estimation structures to return results of the estimation.
Every method can be used with MedicalMaskEstimation and MedicalMaskEstimationExtended.
The estimator was implemented for two use cases:
- When the user already has warped images. For example, when the medical mask estimation is performed right before (or after) the face recognition;
- When the user has face detections only.
Note: Calling the estimate() method with warped image and the estimate() method with image and detection for the same image and the same face could lead to different results.
MedicalMaskEstimator thresholds#
The estimator returns several scores, one for each possible result. The final result is based on these scores and the thresholds. If a score is above the corresponding threshold, that result is chosen as final. If none of the scores exceeds its threshold, the result with the maximum score is taken. If several scores exceed their thresholds, the results take precedence in the following order for the case with MedicalMaskEstimation:
Mask, NoMask, OccludedFace
and for the case with MedicalMaskEstimationExtended:
Mask, NoMask, MaskNotInPlace, OccludedFace
The default values for all thresholds are taken from the configuration file. See the Configuration Guide for details.
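The selection rule can be illustrated with plain C++; this is an illustration of the logic described above, not SDK code:

```cpp
#include <cstddef>

// Classes in precedence order: Mask, NoMask, OccludedFace.
enum class MedicalMask { Mask = 0, NoMask, OccludedFace };

MedicalMask resolve(const float scores[3], const float thresholds[3]) {
    // Pick the first class (in precedence order) whose score exceeds
    // its threshold.
    for (std::size_t i = 0; i < 3; ++i)
        if (scores[i] > thresholds[i])
            return static_cast<MedicalMask>(i);
    // No score exceeded its threshold: fall back to the maximum score.
    std::size_t best = 0;
    for (std::size_t i = 1; i < 3; ++i)
        if (scores[i] > scores[best])
            best = i;
    return static_cast<MedicalMask>(best);
}
```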
MedicalMask enumeration#
The MedicalMask enumeration contains all possible results of the MedicalMask estimation:
```cpp
enum class MedicalMask {
    Mask = 0,    //!< medical mask is on the face
    NoMask,      //!< no medical mask on the face
    OccludedFace //!< face is occluded by something
};
```

The DetailedMaskType enumeration contains the detailed face-covering types:

```cpp
enum class DetailedMaskType {
    CorrectMask = 0,            //!< correct mask on the face (mouth and nose are covered correctly)
    MouthCoveredWithMask,       //!< mask covers only the mouth
    ClearFace,                  //!< clear face - no mask on the face
    ClearFaceWithMaskUnderChin, //!< clear face with a mask around the chin; the mask does not cover anything in the face region (from mouth to eyes)
    PartlyCoveredFace,          //!< face is covered with something that is not a medical mask or a full mask
    FullMask,                   //!< face is covered with a full mask (such as a balaclava, ski mask, etc.)
    Count
};
```
- `Mask` corresponds to `CorrectMask` or `MouthCoveredWithMask`;
- `NoMask` corresponds to `ClearFace` or `ClearFaceWithMaskUnderChin`;
- `OccludedFace` corresponds to `PartlyCoveredFace` or `FullMask`.

Note: `NoMask` means the absence of a medical mask or of any occlusion in the face region (from mouth to eyes).
Note: `DetailedMaskType` is not supported for NPU-based platforms.
MedicalMaskEstimation structure#
The `MedicalMaskEstimation` structure contains the results of the estimation:

```cpp
struct MedicalMaskEstimation {
    MedicalMask result;        //!< estimation result (@see MedicalMask enum)
    DetailedMaskType maskType; //!< detailed type (@see DetailedMaskType enum)

    // scores
    float maskScore;         //!< medical mask is on the face score
    float noMaskScore;       //!< no medical mask on the face score
    float occludedFaceScore; //!< face is occluded by something score

    float scores[static_cast<int>(DetailedMaskType::Count)]{}; //!< detailed estimation scores

    inline float getScore(DetailedMaskType type) const;
};
```
There are two groups of fields:

- The first group contains the result and the detailed type:

```cpp
MedicalMask result;        //!< estimation result (@see MedicalMask enum)
DetailedMaskType maskType; //!< detailed type (@see DetailedMaskType enum)
```

The `result` enum field contains the target result of the estimation. The more detailed face-covering type is available in `maskType`.

- The second group contains scores:

```cpp
float maskScore;         //!< medical mask is on the face score
float noMaskScore;       //!< no medical mask on the face score
float occludedFaceScore; //!< face is occluded by something score
```

The score group contains the estimation scores for each possible result of the estimation. All scores are defined in the [0..1] range. They can be useful for users who want to change the default thresholds for this estimator. If the default thresholds are used, the score group can simply be ignored in the user code. More detailed scores for every detailed face-covering type are stored in:

```cpp
float scores[static_cast<int>(DetailedMaskType::Count)]{}; //!< detailed estimation scores
```
- `maskScore` is the sum of the scores for `CorrectMask` and `MouthCoveredWithMask`;
- `noMaskScore` is the sum of the scores for `ClearFace` and `ClearFaceWithMaskUnderChin`;
- `occludedFaceScore` is the sum of the scores for `PartlyCoveredFace` and `FullMask`.
Note: `DetailedMaskType`, `scores`, and `getScore` are not supported for NPU-based platforms, which means these fields and methods cannot be used there in user code.
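A short sketch of interpreting a filled `MedicalMaskEstimation` (fields and `getScore()` as defined above; the estimator call itself is omitted, and the enums are assumed to live in the `fsdk` namespace):

```cpp
// `estimation` was filled by one of the estimate() overloads above.
if (estimation.result == fsdk::MedicalMask::Mask) {
    // A medical mask is on the face; the detailed type refines the verdict.
    if (estimation.maskType == fsdk::DetailedMaskType::MouthCoveredWithMask) {
        // The mask covers only the mouth.
    }
}

// Per-class detailed scores are available for custom thresholding
// (not supported on NPU-based platforms, see the note above).
const float fullMaskScore = estimation.getScore(fsdk::DetailedMaskType::FullMask);
```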
MedicalMaskExtended enumeration#
The MedicalMaskExtended enumeration contains all possible results of the extended MedicalMask estimation:
```cpp
enum class MedicalMaskExtended {
    Mask = 0,       //!< medical mask is on the face
    NoMask,         //!< no medical mask on the face
    MaskNotInPlace, //!< mask is not on the right place
    OccludedFace    //!< face is occluded by something
};
```
MedicalMaskEstimationExtended structure#
The `MedicalMaskEstimationExtended` structure contains the results of the estimation:

```cpp
struct MedicalMaskEstimationExtended {
    MedicalMaskExtended result; //!< estimation result (@see MedicalMaskExtended enum)

    // scores
    float maskScore;         //!< medical mask is on the face score
    float noMaskScore;       //!< no medical mask on the face score
    float maskNotInPlace;    //!< mask is not on the right place
    float occludedFaceScore; //!< face is occluded by something score
};
```
There are two groups of fields:

- The first group contains only the result enum:

```cpp
MedicalMaskExtended result; //!< estimation result (@see MedicalMaskExtended enum)
```

The `result` enum field contains the target result of the estimation.

- The second group contains scores:

```cpp
float maskScore;         //!< medical mask is on the face score
float noMaskScore;       //!< no medical mask on the face score
float maskNotInPlace;    //!< mask is not on the right place
float occludedFaceScore; //!< face is occluded by something score
```

The score group contains the estimation scores for each possible result of the estimation. All scores are defined in the [0..1] range.
Filtration parameters#
The estimator is trained to work with face images that meet the following requirements:
"Requirements for fsdk::MedicalMaskEstimator::EstimationResult
"
Attribute | Acceptable values |
---|---|
headPose.pitch | [-40...40] |
headPose.yaw | [-40...40] |
headPose.roll | [-40...40] |
ags | [0.5...1.0] |
Configurations:
See the "Medical mask estimator settings" section in the "ConfigurationGuide.pdf" document.
API structure name:
IMedicalMaskEstimator
Plan files:
- mask_clf_v3_cpu.plan
- mask_clf_v3_cpu-avx2.plan
- mask_clf_v3_gpu.plan
Glasses Estimation#
Name: GlassesEstimator
Algorithm description:
The glasses estimator is designed to determine whether a person is currently wearing any glasses. There are three states the estimator is currently able to distinguish:
- NoGlasses: the person is not wearing glasses at all;
- EyeGlasses: the person is wearing eyeglasses;
- SunGlasses: the person is wearing sunglasses.

Note: The source input image must be warped for the estimator to work properly (see chapter "Image warping" for details). The quality of estimation depends on the threshold values located in the faceengine configuration file (see below).
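A minimal sketch, assuming an `estimate()` overload that classifies a warped image into the `GlassesEstimation` states listed above (the exact signature may differ in your SDK version):

```cpp
// Minimal sketch: glasses classification on a warped image.
// `estimator` is an fsdk::IGlassesEstimatorPtr, `warp` comes from the warper.
fsdk::GlassesEstimation glasses;
const auto status = estimator->estimate(warp, glasses);
if (status.isError())
    return;

switch (glasses) {
case fsdk::GlassesEstimation::NoGlasses:  /* no glasses at all  */ break;
case fsdk::GlassesEstimation::EyeGlasses: /* wearing eyeglasses */ break;
case fsdk::GlassesEstimation::SunGlasses: /* wearing sunglasses */ break;
default: /* estimation error or unknown state */ break;
}
```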
Recommended thresholds:
The table below contains thresholds from the faceengine configuration file (faceengine.conf), section `GlassesEstimator::Settings`. By default, these threshold values are set to optimal.
"Glasses estimator recommended thresholds"
Threshold | Recommended value |
---|---|
noGlassesThreshold | 0.986 |
eyeGlassesThreshold | 0.57 |
sunGlassesThreshold | 0.506 |
Configurations:
See the "GlassesEstimator settings" section in the "ConfigurationGuide.pdf" document.
Metrics:
The table below contains true positive rates corresponding to the selected false positive rates.
"Glasses estimator TPR/FPR rates"
State | TPR | FPR |
---|---|---|
NoGlasses | 0.997 | 0.00234 |
EyeGlasses | 0.9768 | 0.000783 |
SunGlasses | 0.9712 | 0.000383 |
API structure name:
IGlassesEstimator
Plan files:
- glasses_estimation_flwr_cpu.plan
- glasses_estimation_flwr_cpu-avx2.plan
- glasses_estimation_flwr_gpu.plan