
Parameter Estimation Facility#

Overview#

The estimation facility is the only multi-purpose facility in FaceEngine. It is designed as a collection of tools that help estimate various properties of images or of the objects depicted in them. These properties may be used to increase the precision of algorithms implemented by other FaceEngine facilities or to accomplish custom user tasks.

Eyes Estimation#

Note. The estimator is trained to work with warped images (see Chapter "Image warping" for details).

This estimator aims to determine:

  • Eye state: Open, Closed, Occluded;
  • Precise eye iris location as an array of landmarks;
  • Precise eyelid location as an array of landmarks.

You can only pass a warped image with a detected face to the estimator interface. Better image quality leads to better results.

Eye state classifier supports three categories: "Open", "Closed", "Occluded". Poor quality images or ones that depict obscured eyes (think eyewear, hair, gestures) fall into the "Occluded" category. It is always a good idea to check eye state before using the segmentation result.

The precise location enables iris and eyelid segmentation. The estimator is capable of outputting iris and eyelid shapes as arrays of points that together form an ellipse. You should only use segmentation results if the state of that eye is "Open".

The estimator:

  • Implements the estimate() function that accepts warped source image (see Chapter "Image warping") and warped landmarks, either of type Landmarks5 or Landmarks68. The warped image and landmarks are received from the warper (see IWarper::warp());
  • Classifies the state of each eye and detects its iris and eyelid landmarks;
  • Outputs EyesEstimation structures.

Note. Orientation terms 'left' and 'right' refer to the way you see the image as it is shown on the screen. It means that the left eye is not necessarily left from the person's point of view, but is on the left side of the screen. Consequently, the right eye is the one on the right side of the screen. More formally, the label 'left' refers to the eye with the smaller x coordinate (typically the subject's right eye, since the image is mirrored from the subject's point of view), such that xleft < xright.

EyesEstimation::EyeAttributes presents eye state as enum EyeState with possible values: Open, Closed, Occluded.

Iris landmarks are presented with a template structure Landmarks that is specialized for 32 points.

Eyelid landmarks are presented with a template structure Landmarks that is specialized for 6 points.

Head pose estimation#

This estimator is designed to determine the camera-space head pose. Since 3D head translation is hard to determine reliably without camera-specific calibration, only the 3D rotation component is estimated.

There are two head pose estimation methods available:

  • Estimation from 68 face alignment landmarks (you may get them from the Detector facility; see Chapter "Detection facility");
  • Estimation from the original input image in RGB format.

Estimation by image is more precise. If you have already extracted the 68 landmarks for other purposes, you may save time and use the faster estimation from 68 landmarks.

By default, both methods are enabled in the configuration file (faceengine.conf) in the "HeadPoseEstimator" section. You may disable unused methods to decrease RAM usage and initialization time.
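As a rough illustration, the relevant configuration section might look like the fragment below. The parameter names and value types here are assumptions for illustration; consult your faceengine.conf and the SDK reference for the exact keys:

```xml
<!-- Illustrative fragment of faceengine.conf; parameter names are
     assumptions, check your SDK version for the exact keys. -->
<section name="HeadPoseEstimator">
    <!-- 1 = enabled, 0 = disabled -->
    <param name="useEstimationByImage"       type="Value::Int1" x="1"/>
    <param name="useEstimationByLandmarks68" type="Value::Int1" x="1"/>
</section>
```

Disabling the image-based method, for example, avoids loading its model, reducing memory usage and startup time at the cost of the more precise estimation path.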

Estimation characteristics:

  • Units (degrees);
  • Notation (Euler angles);
  • Precision (see the table below).

Note. Prediction precision decreases as a rotation angle increases. We present typical average errors for different angle ranges in the table below.

"Head pose prediction precision"

| Average prediction error (per axis) | Range -45°...+45° | Range < -45° or > +45° |
|-------------------------------------|-------------------|------------------------|
| Yaw                                 | ±2.7°             | ±4.6°                  |
| Pitch                               | ±3.0°             | ±4.8°                  |
| Roll                                | ±3.0°             | ±4.6°                  |

Zero position corresponds to a face placed orthogonally to camera direction, with the axis of symmetry parallel to the vertical camera axis. See the image below for a reference.

[Figure: Head pose — zero position and rotation axes]

Note. In order to work, this estimator requires precise 68-point face alignment results, so familiarize yourself with the section "Face alignment" in the "Detection facility" chapter as well.

Approximate Garbage Score Estimation (AGS)#

This estimator aims to determine the source image score for further descriptor extraction and matching. The higher the score, the better the matching result that can be expected for the image.

When you have several images of a person, it is better to save the image with the highest AGS score.

Consult VisionLabs about the recommended threshold value for this parameter.

The estimator (see IAGSEstimator in IEstimator.h):

  • Implements the estimate() function that accepts source image in R8G8B8 format and fsdk::Detection structure of corresponding source image (see section "Detection structure" in chapter "Detection facility");
  • Estimates garbage score of input image;
  • Outputs garbage score value.

BestShotQuality Estimation#

The BestShotQuality estimator represents a collection of estimator functionalities unified for end-user convenience.

Estimation types that were merged into this estimator are described in the following list:

  • AGS estimation;
  • Head pose estimation.

Before using this estimator, the user is free to decide whether or not to estimate specific attributes listed above through the IBestShotQualityEstimator::EstimationRequests structure, which is later passed to the main estimate() method.

The estimator outputs an AQEEstimationResults structure, which consists of optional fields describing the results of the user-requested estimations.
