Appendix A. Specifications#

Classification performance#

Classification performance was measured on two datasets:

Cooperative dataset (containing 20K images from various sources obtained at several banks);
Non-cooperative dataset (containing 20K images).

The two tables below contain true positive rates corresponding to select false positive rates.

"Classification performance @ low FPR on cooperative dataset"

FPR	TPR CNN 54	TPR CNN 56	TPR CNN 57	TPR CNN 58	TPR CNN 59	TPR CNN 54m	TPR CNN 56m	TPR CNN 59m
10^-7^	0.9765	0.9907	0.9906	0.9910	0.9911	0.9699	0.9652	0.9876
10^-6^	0.9849	0.9914	0.9915	0.9916	0.9915	0.9829	0.9814	0.9904
10^-5^	0.9892	0.9916	0.9917	0.9918	0.9919	0.9887	0.9886	0.9915
10^-4^	0.9909	0.9917	0.9918	0.9919	0.9921	0.9910	0.9910	0.9919

"Classification performance @ low FPR on non-cooperative dataset"

FPR	TPR CNN 54	TPR CNN 56	TPR CNN 57	TPR CNN 58	TPR CNN 59	TPR CNN 54m	TPR CNN 56m	TPR CNN 59m
10^-7^	0.9638	0.9698	0.9723	0.9767	0.9832	0.8813	0.8844	0.9377
10^-6^	0.9773	0.9809	0.9817	0.9839	0.9880	0.9233	0.9229	0.9629
10^-5^	0.9852	0.9871	0.9873	0.9880	0.9908	0.9538	0.9561	0.9794
10^-4^	0.9896	0.9902	0.9905	0.9909	0.9924	0.9752	0.9757	0.9880
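To put the FPR column in operational terms, the sketch below estimates how many false matches to expect when a probe is compared against a gallery of a given size. It assumes that every comparison is an independent trial with the stated FPR, which is a simplification. The function names and the gallery size are illustrative and not part of the SDK.

```python
def expected_false_matches(fpr: float, comparisons: int) -> float:
    """Expected number of false matches over `comparisons` impostor comparisons."""
    return fpr * comparisons


def prob_any_false_match(fpr: float, comparisons: int) -> float:
    """Probability of at least one false match, assuming independent comparisons."""
    return 1.0 - (1.0 - fpr) ** comparisons


if __name__ == "__main__":
    gallery_size = 1_000_000  # hypothetical gallery size, not taken from the tables
    for exponent in (-7, -6, -5, -4):
        fpr = 10.0 ** exponent
        expected = expected_false_matches(fpr, gallery_size)
        p_any = prob_any_false_match(fpr, gallery_size)
        print(f"FPR=1e{exponent}: expected false matches={expected:.1f}, P(at least one)={p_any:.4f}")
```

Under these assumptions, a lower FPR operating point trades some TPR (see the tables) for fewer false matches on large galleries.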

Runtime performance#

Server environment#

Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.

Input data characteristics:

Image resolution: 1920x1080 px;
Image format: 24 BPP RGB.

Performance measurements are presented for CPU, GPU and NPU execution modes in the tables below. Measured values are averages of at least 100 experiments.

The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.

Face detections are performed using FaceDetV3 NN.

All types of face detection and redetection were performed with capture of bounding boxes and 5 facial landmarks.

CPU performance#

Benchmarking for CPU was performed on a server with the following hardware configuration:

CPU:

Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz;
CPU(s): 40
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
A CPU with the AVX2 instruction set was used

OS: CentOS Linux release 8.3.2011

RAM: 128 GB DDR4 (Clock Speed: 2133 MHz)

In the experiments listed in the tables below, the face detection and descriptor extraction algorithms used all available CPU cores, whereas matching performance is specified per core.

Descriptor matching is only implemented on CPU.

"CPU. Detector performance"

Measurement	CPU threads	BatchSize	Average (ms)
Detector (minFaceSize=20)	1	1	358.3
Detector (minFaceSize=20)	8	1	169.6
Detector (minFaceSize=20)	8	4	166.4
Detector (minFaceSize=20)	8	8	169.2
Detector (minFaceSize=50)	1	1	55.8
Detector (minFaceSize=50)	8	1	27.1
Detector (minFaceSize=50)	8	4	25.1
Detector (minFaceSize=50)	8	8	26.5
Detector (minFaceSize=90)	1	1	18.9
Detector (minFaceSize=90)	8	1	12.3
Detector (minFaceSize=90)	8	4	8.5
Detector (minFaceSize=90)	8	8	9.2
Redetect	1	1	4.05
Redetect	8	1	2.99
Redetect	8	4	1.5
Redetect	8	8	1.34
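The "Average (ms)" values can be converted to throughput for capacity planning. The sketch below assumes that the column reports the average time per image (not per batch); this interpretation is an assumption, and the helper name is illustrative. The sample values are copied from the table above.

```python
def throughput_per_second(average_ms: float) -> float:
    """Convert an average per-image time in milliseconds to images per second."""
    if average_ms <= 0:
        raise ValueError("average_ms must be positive")
    return 1000.0 / average_ms


# (minFaceSize, CPU threads, batch size) -> Average (ms), from "CPU. Detector performance"
cpu_detector_ms = {
    (20, 8, 4): 166.4,
    (50, 8, 4): 25.1,
    (90, 8, 4): 8.5,
}

if __name__ == "__main__":
    for (min_face_size, threads, batch), ms in cpu_detector_ms.items():
        rate = throughput_per_second(ms)
        print(f"minFaceSize={min_face_size}, threads={threads}, batch={batch}: {rate:.1f} images/s")
```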

"CPU. HumanDetector performance"

Measurement	CPU threads	BatchSize	Average (ms)
HumanDetector (imageSize=320)	1	1	12.5
HumanDetector (imageSize=320)	8	1	7.0
HumanDetector (imageSize=320)	8	4	4.4
HumanDetector (imageSize=320)	8	8	4.4
HumanDetector (imageSize=640)	1	1	39.8
HumanDetector (imageSize=640)	8	1	19.2
HumanDetector (imageSize=640)	8	4	16.4
HumanDetector (imageSize=640)	8	8	16.8
HumanLandmarksDetector (imageSize=320)	1	1	44.5
HumanLandmarksDetector (imageSize=320)	8	1	20.6
HumanLandmarksDetector (imageSize=320)	8	4	13.0
HumanLandmarksDetector (imageSize=320)	8	8	13.3
HumanLandmarksDetector (imageSize=640)	1	1	72.6
HumanLandmarksDetector (imageSize=640)	8	1	32.5
HumanLandmarksDetector (imageSize=640)	8	4	24.8
HumanLandmarksDetector (imageSize=640)	8	8	25.7
HumanRedetect	1	1	2.42
HumanRedetect	8	1	2.47
HumanRedetect	8	4	1.13
HumanRedetect	8	8	1.11

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"CPU. Estimation performance with batch interface"

Measurement	CPU threads	BatchSize	Average (ms)
Eyes (INFRA_RED, useStatusPlan=0)	1	1	0.6
Eyes (INFRA_RED, useStatusPlan=0)	8	1	0.4
Eyes (INFRA_RED, useStatusPlan=0)	8	8	0.3
Eyes (RGB, useStatusPlan=0)	1	1	1.2
Eyes (RGB, useStatusPlan=0)	8	1	0.8
Eyes (RGB, useStatusPlan=0)	8	8	0.5
Eyes (INFRA_RED, useStatusPlan=1)	1	1	0.6
Eyes (INFRA_RED, useStatusPlan=1)	8	1	0.4
Eyes (INFRA_RED, useStatusPlan=1)	8	8	0.3
Eyes (RGB, useStatusPlan=1)	1	1	1.1
Eyes (RGB, useStatusPlan=1)	8	1	0.8
Eyes (RGB, useStatusPlan=1)	8	8	0.5
Infra-Red	1	1	2
Infra-Red	8	1	1.0
Infra-Red	8	8	0.7
AGS	1	1	0.3
AGS	8	1	0.2
AGS	8	8	0.07
HeadPoseByImage	1	1	0.3
HeadPoseByImage	8	1	0.3
HeadPoseByImage	8	8	0.09
Child	1	1	18.7
Child	8	1	6.3
Child	8	8	5.2
BlackWhite	1	1	1.3
BlackWhite	8	1	0.7
BlackWhite	8	8	1.2
BestShotQuality	1	1	0.3
BestShotQuality	8	1	0.2
BestShotQuality	8	8	0.08
MedicalMask	1	1	5.6
MedicalMask	8	1	3.2
MedicalMask	8	8	1.5
LivenessOneShotRGBEstimator	1	1	214.6
LivenessOneShotRGBEstimator	8	1	58.7
LivenessOneShotRGBEstimator	8	8	78.8
Orientation	1	1	20.8
Orientation	8	1	10.1
Orientation	8	8	8.9
CredibilityCheck	1	1	120.3
CredibilityCheck	8	1	35.1
CredibilityCheck	8	8	34.1
FacialHair	1	1	2.7
FacialHair	8	1	1.9
FacialHair	8	8	0.99
PortraitStyle	1	1	1.0
PortraitStyle	8	1	1.2
PortraitStyle	8	8	1.7
Background	1	1	1.1
Background	8	1	1.2
Background	8	8	1.7
NaturalLight	1	1	2.37
NaturalLight	8	1	1.49
NaturalLight	8	8	1.97
FishEye	1	1	2.77
FishEye	8	1	2.08
FishEye	8	8	5.86
RedEye	1	1	5.7
RedEye	8	1	1.9
RedEye	8	8	1.6
HeadWear	1	1	2.22
HeadWear	8	1	1.51
HeadWear	8	8	1.96
EyeBrowEstimator	1	1	13.82
EyeBrowEstimator	8	1	4.77
EyeBrowEstimator	8	8	3.05

Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.

"CPU. Estimation performance without batch interface"

Measurement	CPU threads	Average (ms)
EyesGaze	1	2.2
EyesGaze	8	1.4
Emotions	1	13.6
Emotions	8	4.9
Attributes	1	63.3
Attributes	8	19.8
Quality	1	1.2
Quality	8	0.6
Warper	1	2.2
Warper	8	2.3
Overlap	1	4.5
Overlap	8	1.3
Glasses	1	1.8
Glasses	8	0.8
Mouth	1	6.9
Mouth	8	2.69
PPE	1	8.9
PPE	8	4.9
LivenessFlyingFaces	1	9.2
LivenessFlyingFaces	8	5.0
LivenessRGBMEstimator	1	30.6
LivenessRGBMEstimator	8	9.7
LivenessFPR	1	44.2
LivenessFPR	8	19.9
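The 1-thread and 8-thread rows above make it possible to estimate how well each estimator scales with CPU threads. The following sketch computes speedup and parallel efficiency from a few rows of the table; the function names are illustrative, and the result only describes this benchmark configuration.

```python
def speedup(single_thread_ms: float, multi_thread_ms: float) -> float:
    """Ratio of single-thread time to multi-thread time (values above 1 mean faster)."""
    return single_thread_ms / multi_thread_ms


def parallel_efficiency(single_thread_ms: float, multi_thread_ms: float, threads: int) -> float:
    """Speedup divided by thread count; 1.0 would be ideal linear scaling."""
    return speedup(single_thread_ms, multi_thread_ms) / threads


# name -> (Average ms with 1 thread, Average ms with 8 threads), from the table above
cpu_estimators = {
    "Emotions": (13.6, 4.9),
    "Attributes": (63.3, 19.8),
    "Warper": (2.2, 2.3),
}

if __name__ == "__main__":
    for name, (t1, t8) in cpu_estimators.items():
        s = speedup(t1, t8)
        e = parallel_efficiency(t1, t8, 8)
        print(f"{name}: speedup={s:.2f}x, efficiency={e:.0%}")
```

Estimators with a speedup near or below 1, such as Warper here, gain little from additional threads.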

"CPU. Extractor performance"

Type	Model	CPU threads	Average (ms)
Extractor	57	1	221.2
Extractor	57	8	58.3
Extractor	58	1	219.3
Extractor	58	8	58.0
Extractor	59	1	219.7
Extractor	59	8	58.2
Extractor	102	1	1.8
Extractor	102	8	2.1
Extractor	103	1	142.2
Extractor	103	8	50.6
Extractor	104	1	12.6
Extractor	104	8	6.2

The following table shows the average number of matches per second for descriptors obtained with the following CNN model versions:

face descriptors: 57, 58, 59
human body descriptors: 102, 103, 104

"CPU. Matcher performance"

Type	Model	CPU threads	Batch Size	Average (matches/sec)
Matcher	57, 58, 59	1	1000	42.2 M
Matcher	102, 103, 104	1	1000	10.17 M

Note: The values above are the maximum matcher performance on this particular hardware. In general, performance does not depend on the batch size, but at large batch sizes it may be limited by memory performance.
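As a rough illustration, the matcher rates can be turned into an estimated time for matching one probe against a gallery. The sketch assumes a single core running at the maximum rates listed in the table; the gallery size and function name are hypothetical, and real timings may be lower because of memory limits.

```python
# Average matches per second on one CPU thread, from "CPU. Matcher performance"
MATCHES_PER_SECOND = {
    "face (57, 58, 59)": 42.2e6,
    "human body (102, 103, 104)": 10.17e6,
}


def search_time_ms(gallery_size: int, matches_per_second: float) -> float:
    """Estimated time in milliseconds to match one probe against `gallery_size` descriptors."""
    return gallery_size / matches_per_second * 1000.0


if __name__ == "__main__":
    gallery_size = 1_000_000  # hypothetical gallery size
    for kind, rate in MATCHES_PER_SECOND.items():
        print(f"{kind}: about {search_time_ms(gallery_size, rate):.1f} ms per probe")
```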

GPU performance#

Benchmarking for GPU was performed on the following hardware configuration:

GPU: NVIDIA Tesla T4.

OS: CentOS Linux release 8.3.2011

"GPU. Detector performance"

Measurement	Batch Size	Average (ms)
Detector (minFaceSize=20)	1	31.8
Detector (minFaceSize=20)	4	35.0
Detector (minFaceSize=20)	8	38.9
Detector (minFaceSize=50)	1	7.9
Detector (minFaceSize=50)	4	6.9
Detector (minFaceSize=50)	8	6.6
Detector (minFaceSize=90)	1	5.2
Detector (minFaceSize=90)	4	3.8
Detector (minFaceSize=90)	8	3.4
Redetect	1	3.45
Redetect	4	1.91
Redetect	8	1.64
Redetect	16	1.51

"GPU. HumanDetector performance"

Measurement	Batch Size	Average (ms)
HumanDetector (imageSize=320)	1	4.7
HumanDetector (imageSize=320)	4	2.7
HumanDetector (imageSize=320)	8	2.5
HumanDetector (imageSize=640)	1	6.1
HumanDetector (imageSize=640)	4	5.5
HumanDetector (imageSize=640)	8	5.3
HumanLandmarksDetector (imageSize=320)	1	15.33
HumanLandmarksDetector (imageSize=320)	4	6.57
HumanLandmarksDetector (imageSize=320)	8	5.32
HumanLandmarksDetector (imageSize=640)	1	16.8
HumanLandmarksDetector (imageSize=640)	4	8.94
HumanLandmarksDetector (imageSize=640)	8	7.72
HumanRedetect	1	2.87
HumanRedetect	4	1.72
HumanRedetect	8	1.5
HumanRedetect	16	1.4

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"GPU. Estimation performance with batch interface"

Measurement	Batch Size	Average (ms)
HeadPoseByImage	1	2.32
HeadPoseByImage	32	1.43
Eyes (INFRA_RED, useStatusPlan=0)	1	0.65
Eyes (INFRA_RED, useStatusPlan=0)	16	0.23
Eyes (INFRA_RED, useStatusPlan=0)	32	0.2
Eyes (RGB, useStatusPlan=0)	1	1.19
Eyes (RGB, useStatusPlan=0)	16	0.44
Eyes (RGB, useStatusPlan=0)	32	0.43
Eyes (INFRA_RED, useStatusPlan=1)	1	0.64
Eyes (INFRA_RED, useStatusPlan=1)	16	0.23
Eyes (INFRA_RED, useStatusPlan=1)	32	0.2
Eyes (RGB, useStatusPlan=1)	1	0.66
Eyes (RGB, useStatusPlan=1)	16	0.24
Eyes (RGB, useStatusPlan=1)	32	0.23
Infra-Red	1	1.11
Infra-Red	32	0.54
AGS	1	2.2
AGS	16	1.46
Child	1	2.66
Child	16	1.11
BlackWhite	1	1.05
BlackWhite	16	0.4
BestShotQuality	1	2.31
BestShotQuality	16	1.45
MedicalMask	1	5.01
MedicalMask	16	1.69
LivenessOneShotRGBEstimator	1	20.41
LivenessOneShotRGBEstimator	16	17.48
Orientation	1	3.56
Orientation	16	2.92
CredibilityCheck	1	5.54
CredibilityCheck	16	3.72
FacialHair	1	1.59
FacialHair	16	0.33
PortraitStyle	1	2.5
PortraitStyle	16	1.5
Background	1	2.6
Background	16	1.5
NaturalLight	1	3.61
NaturalLight	16	0.27
FishEye	1	2.91
FishEye	16	1.51
RedEye	1	1.1
RedEye	16	0.15
HeadWear	1	3.65
HeadWear	16	0.26
EyeBrowEstimator	1	1.8
EyeBrowEstimator	8	0.95
EyeBrowEstimator	16	0.88
EyeBrowEstimator	32	0.84

Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.

"GPU. Estimation performance without batch interface"

Measurement	Average (ms)
EyesGaze	1.65
Emotions	1.99
Attributes	4.95
Quality	0.98
Warper	2.26
Overlap	1.23
PPE	2.62
Glasses	1.01
Mouth	3.92
LivenessFlyingFaces	5.78
LivenessRGBMEstimator	6.96
LivenessFPR	12.56

"GPU. Extractor performance"

Type	Model	Batch Size	Average (ms)
Extractor	57	1	10.2
Extractor	57	16	6.5
Extractor	58	1	10.2
Extractor	58	16	6.4
Extractor	59	1	10.2
Extractor	59	16	6.4
Extractor	102	1	3.7
Extractor	102	16	0.3
Extractor	103	1	7.2
Extractor	103	16	3.7
Extractor	104	1	4.5
Extractor	104	16	0.6
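The per-stage GPU figures can be combined into a rough per-face latency estimate for a detect, warp and extract pipeline. This is a minimal sketch that assumes one face per frame, strictly sequential stages, batch size 1, and no data transfer or scheduling overhead; none of these assumptions are stated in the tables, and the stage selection is illustrative.

```python
# Average (ms) at batch size 1, copied from the GPU tables above
GPU_STAGE_MS = {
    "Detector (minFaceSize=50)": 7.9,
    "Warper": 2.26,
    "Extractor (model 59)": 10.2,
}


def pipeline_latency_ms(stage_ms: dict) -> float:
    """Sum of stage timings, assuming the stages run one after another."""
    return sum(stage_ms.values())


if __name__ == "__main__":
    total = pipeline_latency_ms(GPU_STAGE_MS)
    print(f"Estimated latency per face: {total:.2f} ms")
```

Treat the result as a lower bound for planning, since real pipelines add overhead that is not measured here.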

NPU performance#

Benchmarking for NPU was performed on a server with the following hardware configuration:

NPU: Huawei Atlas 300I (inference card).

OS: Ubuntu 18.04

CPU: Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz x 48

RAM: 64GB

"NPU. Detector performance"

Measurement	BatchSize	Average (ms)
Detector (minFaceSize=20)	1	25.7
Detector (minFaceSize=20)	4	18.7
Detector (minFaceSize=20)	8	17.3
Detector (minFaceSize=50)	1	25.7
Detector (minFaceSize=50)	4	18.0
Detector (minFaceSize=50)	8	17.3
Detector (minFaceSize=90)	1	25.5
Detector (minFaceSize=90)	4	18.0
Detector (minFaceSize=90)	8	17.1
Redetect	1	12.7
Redetect	4	6.0
Redetect	8	5.1

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"NPU. Estimation performance with batch interface"

Measurement	BatchSize	Average (ms)
HeadPoseByImage	1	8.0
HeadPoseByImage	16	4.2
HeadPoseByImage	32	3.9
AGS	1	6.6
AGS	16	3.7
AGS	32	3.7
BestShotQuality	1	15.6
BestShotQuality	16	7.8
BestShotQuality	32	7.6
MedicalMask	1	6.1
MedicalMask	16	3.8
MedicalMask	32	3.7

Below is the measurement for Warper that does not have a batch interface. This measurement is performed with minFaceSize=50.

"NPU. Estimation performance without batch interface"

Measurement	Average (ms)
Warper	2.1

"NPU. Extractor performance"

Type	Model	Batch Size	Average (ms)
Extractor	57	1	10.9
Extractor	57	16	7.4

Embedded environment#

Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.

Input data characteristics:

Image resolution: 640x480 px;
Image format: 24 BPP RGB.

The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.

Face detections are performed using FaceDetV3 NN.

Jetson#

Performance measurements are presented for Jetson devices. Measured values are averages of at least 100 experiments. Mobilenet is not used by default on Jetson.

Jetson TX#

"Jetson TX GPU. Detector performance"

Type	Batch Size	Average (ms)
Detector (minFaceSize=20)	1	499.59
Detector (minFaceSize=20)	4	470.32
Detector (minFaceSize=50)	1	88.97
Detector (minFaceSize=50)	4	80.13
Detector (minFaceSize=50)	8	79.67
Detector (minFaceSize=90)	1	35.66
Detector (minFaceSize=90)	4	30.14
Detector (minFaceSize=90)	8	29.48
Redetect	1	9.5
Redetect	4	5.2
Redetect	8	4.5

"Jetson TX GPU. HumanDetector performance"

Type	Batch Size	Average (ms)
HumanDetector (imageSize=320)	1	16.28
HumanDetector (imageSize=320)	4	14.81
HumanDetector (imageSize=320)	8	14.27
HumanDetector (imageSize=640)	1	47.7
HumanDetector (imageSize=640)	4	44.3
HumanDetector (imageSize=640)	8	42.0
HumanLandmarksDetector (imageSize=320)	1	67.3
HumanLandmarksDetector (imageSize=320)	4	35.15
HumanLandmarksDetector (imageSize=320)	8	32.94
HumanLandmarksDetector (imageSize=640)	1	99.05
HumanLandmarksDetector (imageSize=640)	4	64.64
HumanLandmarksDetector (imageSize=640)	8	61.68
HumanRedetect	1	6.08
HumanRedetect	4	3.71
HumanRedetect	8	3.46

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson TX GPU. Estimation performance with batch interface"

Type	Batch Size	Average (ms)
HeadPoseByImage	1	8.85
HeadPoseByImage	32	2.82
Eyes (INFRA_RED, useStatusPlan=0)	1	1.53
Eyes (INFRA_RED, useStatusPlan=0)	16	1.02
Eyes (INFRA_RED, useStatusPlan=0)	32	0.93
Eyes (RGB, useStatusPlan=0)	1	2.83
Eyes (RGB, useStatusPlan=0)	16	1.68
Eyes (RGB, useStatusPlan=0)	32	1.65
Eyes (INFRA_RED, useStatusPlan=1)	1	1.49
Eyes (INFRA_RED, useStatusPlan=1)	16	1.17
Eyes (INFRA_RED, useStatusPlan=1)	32	1.1
Eyes (RGB, useStatusPlan=1)	1	2.82
Eyes (RGB, useStatusPlan=1)	16	1.68
Eyes (RGB, useStatusPlan=1)	32	1.6
Infra-Red	1	3.29
AGS	1	5.02
AGS	16	2.57
Child	1	15.23
Child	16	8.95
BlackWhite	1	3.0
BlackWhite	16	1.1
BestShotQuality	1	5.41
BestShotQuality	16	2.59
MedicalMask	1	13.4
MedicalMask	32	4.98
LivenessOneShotRGBEstimator	1	188.8
Orientation	1	26.3
CredibilityCheck	1	44.5
CredibilityCheck	8	35.7
CredibilityCheck	16	34.4
CredibilityCheck	32	34.1
FacialHair	1	3.6
FacialHair	16	2.7
PortraitStyle	1	7.1
PortraitStyle	16	4.0
Background	1	7.2
Background	16	3.9
NaturalLight	1	13.8
NaturalLight	16	1.5
FishEye	1	8.24
FishEye	16	5.41
RedEye	1	2.1
RedEye	16	0.8
HeadWear	1	14.1
HeadWear	16	1.58
EyeBrowEstimator	1	11.81
EyeBrowEstimator	8	10.32
EyeBrowEstimator	16	9.81
EyeBrowEstimator	32	9.57

Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson TX GPU. Estimation performance without batch interface"

Type	Average (ms)
EyesGaze	4.29
Emotions	11.96
Attributes	27.24
Quality	2.17
Warper	8.08
Overlap	3.98
Glasses	3.63
PPE	9.96
Mouth	15.32
LivenessFlyingFaces	19.68
LivenessRGBMEstimator	64.42
LivenessFPR	62.67

"Jetson TX GPU. Extractor performance"

Type	Model	Batch Size	Average (ms)
Extractor	57	1	76.07
Extractor	57	8	62.03
Extractor	58	1	76.15
Extractor	58	8	61.63
Extractor	59	1	76.15
Extractor	59	8	61.64
Extractor	102	1	17.31
Extractor	102	8	2.61
Extractor	103	1	45.64
Extractor	103	8	32.34
Extractor	104	1	15.23
Extractor	104	8	5.41

Jetson Xavier#

"Jetson Xavier GPU. Detector performance"

Type	Batch Size	Average (ms)
Detector (minFaceSize=20)	1	89.56
Detector (minFaceSize=20)	4	102.86
Detector (minFaceSize=20)	8	153.48
Detector (minFaceSize=50)	1	19.27
Detector (minFaceSize=50)	4	16.73
Detector (minFaceSize=50)	8	16.24
Detector (minFaceSize=90)	1	10.38
Detector (minFaceSize=90)	4	7.41
Detector (minFaceSize=90)	8	6.87
Redetect	1	6.4
Redetect	4	2.9
Redetect	8	2.3
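A larger batch size does not always reduce the average time: for Detector (minFaceSize=20) in the table above, batch size 1 is the fastest. The following sketch picks the batch size with the lowest average time per measurement from rows copied from that table; the helper name and data layout are illustrative.

```python
# (measurement, batch size, Average ms) from "Jetson Xavier GPU. Detector performance"
ROWS = [
    ("Detector (minFaceSize=20)", 1, 89.56),
    ("Detector (minFaceSize=20)", 4, 102.86),
    ("Detector (minFaceSize=20)", 8, 153.48),
    ("Detector (minFaceSize=50)", 1, 19.27),
    ("Detector (minFaceSize=50)", 4, 16.73),
    ("Detector (minFaceSize=50)", 8, 16.24),
    ("Redetect", 1, 6.4),
    ("Redetect", 4, 2.9),
    ("Redetect", 8, 2.3),
]


def best_batch_sizes(rows):
    """Return {measurement: (batch size, average ms)} with the lowest average per measurement."""
    best = {}
    for name, batch, average_ms in rows:
        if name not in best or average_ms < best[name][1]:
            best[name] = (batch, average_ms)
    return best


if __name__ == "__main__":
    for name, (batch, average_ms) in best_batch_sizes(ROWS).items():
        print(f"{name}: best batch size {batch} ({average_ms} ms)")
```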

"Jetson Xavier GPU. HumanDetector performance"

Type	Batch Size	Average (ms)
HumanDetector (imageSize=320)	1	10.41
HumanDetector (imageSize=320)	4	7.53
HumanDetector (imageSize=320)	8	6.75
HumanDetector (imageSize=640)	1	22.33
HumanDetector (imageSize=640)	4	19.81
HumanDetector (imageSize=640)	8	19.05
HumanLandmarksDetector (imageSize=320)	1	38.99
HumanLandmarksDetector (imageSize=320)	4	22.14
HumanLandmarksDetector (imageSize=320)	8	18.58
HumanLandmarksDetector (imageSize=640)	1	51.76
HumanLandmarksDetector (imageSize=640)	4	34.93
HumanLandmarksDetector (imageSize=640)	8	31.14
HumanRedetect	1	3.6
HumanRedetect	4	1.95
HumanRedetect	8	1.68

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson Xavier GPU. Estimation performance with batch interface"

Type	Batch Size	Average (ms)
HeadPoseByImage	1	4.38
HeadPoseByImage	32	0.89
Eyes (INFRA_RED, useStatusPlan=0)	1	1.12
Eyes (INFRA_RED, useStatusPlan=0)	16	0.53
Eyes (INFRA_RED, useStatusPlan=0)	32	0.48
Eyes (RGB, useStatusPlan=0)	1	2.17
Eyes (RGB, useStatusPlan=0)	16	1.0
Eyes (RGB, useStatusPlan=0)	32	0.99
Eyes (INFRA_RED, useStatusPlan=1)	1	1.12
Eyes (INFRA_RED, useStatusPlan=1)	16	0.51
Eyes (INFRA_RED, useStatusPlan=1)	32	0.5
Eyes (RGB, useStatusPlan=1)	1	2.16
Eyes (RGB, useStatusPlan=1)	16	1.1
Eyes (RGB, useStatusPlan=1)	32	0.99
Infra-Red	1	2.3
Infra-Red	32	1.25
AGS	1	2.83
AGS	32	0.86
Child	1	8.37
Child	8	5.88
BlackWhite	1	2.2
BlackWhite	16	0.6
BestShotQuality	1	3.04
BestShotQuality	32	0.88
MedicalMask	1	6.59
MedicalMask	32	3.45
LivenessOneShotRGBEstimator	1	97.95
LivenessOneShotRGBEstimator	8	81.8
Orientation	1	11.6
Orientation	32	9.75
CredibilityCheck	1	35.2
CredibilityCheck	8	25.09
CredibilityCheck	16	24.64
CredibilityCheck	32	24.22
FacialHair	1	3.35
FacialHair	16	1.84
PortraitStyle	1	3.6
PortraitStyle	16	1.8
Background	1	3.8
Background	16	1.8
NaturalLight	1	3.6
NaturalLight	16	1.5
FishEye	1	4.75
FishEye	16	2.36
RedEye	1	2.0
RedEye	16	0.5
HeadWear	1	4.34
HeadWear	16	1.49
EyeBrowEstimator	1	7.21
EyeBrowEstimator	8	5.32
EyeBrowEstimator	16	5.16
EyeBrowEstimator	32	5.02

Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson Xavier GPU. Estimation performance without batch interface"

Type	Average (ms)
EyesGaze	2.99
Emotions	7.48
Attributes	20.3
Quality	1.64
Warper	6.63
Overlap	3.03
PPE	6.43
Glasses	2.14
Mouth	5.86
LivenessFlyingFaces	6.98
LivenessRGBMEstimator	27.14
LivenessFPR	39.41

"Jetson Xavier GPU. Extractor performance"

Type	Model	Batch Size	Average (ms)
Extractor	57	1	66.4
Extractor	57	8	44.1
Extractor	58	1	66.2
Extractor	58	8	44.1
Extractor	59	1	66.3
Extractor	59	8	44.1
Extractor	102	1	8.3
Extractor	102	8	0.98
Extractor	103	1	18.3
Extractor	103	8	19.4
Extractor	104	1	6.6
Extractor	104	8	2.4

Jetson Xavier NX#

"Jetson Xavier NX GPU. Detector performance"

Type	Batch Size	Average (ms)
Detector (minFaceSize=20)	1	172.28
Detector (minFaceSize=20)	4	171.78
Detector (minFaceSize=20)	8	238.0
Detector (minFaceSize=50)	1	32.12
Detector (minFaceSize=50)	4	32.21
Detector (minFaceSize=50)	8	29.32
Detector (minFaceSize=90)	1	15.57
Detector (minFaceSize=90)	4	12.19
Detector (minFaceSize=90)	8	11.57
Redetect	1	6.9
Redetect	4	2.8
Redetect	8	2.3

"Jetson Xavier NX GPU. HumanDetector performance"

Type	Batch Size	Average (ms)
HumanDetector (imageSize=320)	1	9.49
HumanDetector (imageSize=320)	4	7.86
HumanDetector (imageSize=320)	8	7.26
HumanDetector (imageSize=640)	1	24.39
HumanDetector (imageSize=640)	4	23.12
HumanDetector (imageSize=640)	8	22.51
HumanLandmarksDetector (imageSize=320)	1	40.7
HumanLandmarksDetector (imageSize=320)	4	20.4
HumanLandmarksDetector (imageSize=320)	8	17.9
HumanLandmarksDetector (imageSize=640)	1	59.7
HumanLandmarksDetector (imageSize=640)	4	33.1
HumanLandmarksDetector (imageSize=640)	8	30.5
HumanRedetect	1	4.45
HumanRedetect	4	2.0
HumanRedetect	8	1.75

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson Xavier NX GPU. Estimation performance with batch interface"

Type	Batch Size	Average (ms)
HeadPoseByImage	1	5.6
HeadPoseByImage	32	1.3
Eyes (INFRA_RED, useStatusPlan=0)	1	1.36
Eyes (INFRA_RED, useStatusPlan=0)	16	0.65
Eyes (INFRA_RED, useStatusPlan=0)	32	0.6
Eyes (RGB, useStatusPlan=0)	1	2.21
Eyes (RGB, useStatusPlan=0)	16	1.09
Eyes (RGB, useStatusPlan=0)	32	1.01
Eyes (INFRA_RED, useStatusPlan=1)	1	1.37
Eyes (INFRA_RED, useStatusPlan=1)	16	0.71
Eyes (INFRA_RED, useStatusPlan=1)	32	0.65
Eyes (RGB, useStatusPlan=1)	1	2.48
Eyes (RGB, useStatusPlan=1)	16	1.31
Eyes (RGB, useStatusPlan=1)	32	1.21
Infra-Red	1	2.32
Infra-Red	32	1.49
AGS	1	3.41
AGS	32	1.25
Child	1	7.85
Child	8	5.49
BlackWhite	1	2.4
BlackWhite	16	0.7
BestShotQuality	1	3.59
BestShotQuality	32	1.27
MedicalMask	1	7.01
MedicalMask	32	3.41
LivenessOneShotRGBEstimator	1	112.7
LivenessOneShotRGBEstimator	16	81.81
Orientation	1	11.57
Orientation	32	10.17
CredibilityCheck	1	31.05
CredibilityCheck	8	22.59
CredibilityCheck	16	21.91
CredibilityCheck	32	21.5
FacialHair	1	2.97
FacialHair	16	1.63
PortraitStyle	1	4.2
PortraitStyle	16	2.0
Background	1	4.0
Background	16	2.1
NaturalLight	1	4.48
NaturalLight	16	1.26
FishEye	1	5.01
FishEye	16	2.42
RedEye	1	2.1
RedEye	16	0.5
HeadWear	1	4.96
HeadWear	16	1.27
EyeBrowEstimator	1	6.27
EyeBrowEstimator	8	5.14
EyeBrowEstimator	16	4.89
EyeBrowEstimator	32	4.79

Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson Xavier NX GPU. Estimation performance without batch interface"

Type	Average (ms)
EyesGaze	3.7
Emotions	6.8
Attributes	17.36
Quality	1.59
Warper	9.82
Overlap	3.56
PPE	6.31
Glasses	2.05
Mouth	6.61
LivenessFlyingFaces	8.46
LivenessRGBMEstimator	28.1
LivenessFPR	40.5

"Jetson Xavier NX GPU. Extractor performance"

Type	Model	Batch Size	Average (ms)
Extractor	57	1	58.2
Extractor	57	16	38.1
Extractor	58	1	58.0
Extractor	58	16	38.1
Extractor	59	1	58.0
Extractor	59	16	38.0
Extractor	102	1	10.7
Extractor	102	16	1.0
Extractor	103	1	28.4
Extractor	103	16	41.3
Extractor	104	1	9.8
Extractor	104	16	3.6

Jetson Nano#

"Jetson Nano GPU. Detector performance"

Type	Batch Size	Average (ms)
Detector (minFaceSize=20)	1	1749.35
Detector (minFaceSize=50)	1	321.64
Detector (minFaceSize=90)	1	117.22
Redetect	1	18.2

"Jetson Nano GPU. HumanDetector performance"

Type	Batch Size	Average (ms)
HumanDetector (imageSize=320)	1	60.89
HumanDetector (imageSize=320)	4	58.35
HumanDetector (imageSize=640)	1	188.86
HumanDetector (imageSize=640)	4	189.72
HumanLandmarksDetector (imageSize=320)	1	174.27
HumanLandmarksDetector (imageSize=320)	4	148.13
HumanLandmarksDetector (imageSize=640)	1	341.1
HumanLandmarksDetector (imageSize=640)	4	252.7
HumanRedetect	1	10.63
HumanRedetect	4	7.5

Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson Nano GPU. Estimation performance with batch interface"

Type	Batch Size	Average (ms)
HeadPoseByImage	1	7.45
HeadPoseByImage	4	4.08
Eyes (INFRA_RED, useStatusPlan=0)	1	3.37
Eyes (INFRA_RED, useStatusPlan=0)	4	2.46
Eyes (RGB, useStatusPlan=0)	1	6.85
Eyes (RGB, useStatusPlan=0)	4	5.52
Eyes (INFRA_RED, useStatusPlan=1)	1	3.07
Eyes (INFRA_RED, useStatusPlan=1)	4	2.42
Eyes (RGB, useStatusPlan=1)	1	7.02
Eyes (RGB, useStatusPlan=1)	4	5.53
Infra-Red	1	10.1
Infra-Red	4	8.89
AGS	1	6.3
AGS	4	3.87
Child	1	59.89
Child	4	48.3
BlackWhite	1	5.8
BlackWhite	4	3.1
BestShotQuality	1	6.55
BestShotQuality	4	4.05
MedicalMask	1	26.38
MedicalMask	4	19.45
LivenessOneShotRGBEstimator	1	1120.7
LivenessOneShotRGBEstimator	4	1110.2
Orientation	1	113.0
Orientation	4	106.1
CredibilityCheck	1	271.18
CredibilityCheck	4	226.63
FacialHair	1	14.58
FacialHair	4	14.37
PortraitStyle	1	12.1
PortraitStyle	4	10.0
Background	1	12.2
Background	4	9.7
NaturalLight	1	28.45
NaturalLight	4	10.86
FishEye	1	19.17
FishEye	16	17.55
RedEye	1	6.7
RedEye	16	4.4
HeadWear	1	28.2
HeadWear	16	11.5
EyeBrowEstimator	1	46.87
EyeBrowEstimator	4	46.53

Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.

"Jetson Nano GPU. Estimation performance without batch interface"

Type	Average (ms)
EyesGaze	12.8
Emotions	48.9
Attributes	129.57
Quality	3.98
Warper	10.54
Overlap	9.76
PPE	30.43
Glasses	10.71
Mouth	38.38
LivenessFlyingFaces	41.59
LivenessRGBMEstimator	169.7
LivenessFPR	201.0

"Jetson Nano GPU. Extractor performance"

Type	Model	Batch Size	Average (ms)
Extractor	58	1	442.35
Extractor	58	4	403.95
Extractor	59	1	428.35
Extractor	59	4	411.47
Extractor	102	1	26.17
Extractor	102	4	9.9
Extractor	103	1	254.11
Extractor	103	4	215.07
Extractor	104	1	39.97
Extractor	104	4	31.63
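To compare the Jetson devices against a frame-rate requirement, the sketch below checks whether one detection fits into the per-frame time budget. It uses the Detector (minFaceSize=50), batch size 1 rows from the Jetson tables above and considers detection only, one frame at a time; the target frame rates and function names are hypothetical.

```python
# Average (ms) for Detector (minFaceSize=50), batch size 1, from the Jetson tables above
DETECTOR_MS = {
    "Jetson TX": 88.97,
    "Jetson Xavier": 19.27,
    "Jetson Xavier NX": 32.12,
    "Jetson Nano": 321.64,
}


def fits_frame_budget(average_ms: float, target_fps: float) -> bool:
    """True if one detection fits into the per-frame time budget at `target_fps`."""
    return average_ms <= 1000.0 / target_fps


def devices_meeting_target(timings: dict, target_fps: float) -> list:
    """Names of devices whose detection time fits the frame budget."""
    return [name for name, ms in timings.items() if fits_frame_budget(ms, target_fps)]


if __name__ == "__main__":
    for fps in (10, 30):  # hypothetical target frame rates
        print(f"{fps} FPS: {devices_meeting_target(DETECTOR_MS, fps)}")
```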

Descriptor size#

The table below shows the size of serialized face descriptors, which can be used to estimate memory requirements.

"Descriptor size"

Face descriptor version	Data size (bytes)	Metadata size (bytes)	Total size (bytes)
CNN 54	512	8	520
CNN 56	512	8	520
CNN 57	512	8	520
CNN 58	512	8	520
CNN 59	512	8	520

The table below shows the size of serialized human descriptors, which can be used to estimate memory requirements. Human descriptors are used only for reidentification tasks.

"Human descriptor size (used only for reidentification tasks)"

Human descriptor version	Data size (bytes)	Metadata size (bytes)	Total size (bytes)
CNN 102	2048	8	2056
CNN 103	2048	8	2056
CNN 104	2048	8	2056

Metadata includes signature and version information that may be omitted during serialization if the NoSignature flag is specified.

When estimating individual descriptor size in memory or serialization storage requirements with default options, consider using values from the "Total size" column.

When estimating memory requirements for descriptor batches, use values from the "Data size" column instead, since a descriptor batch does not duplicate metadata per descriptor and thus is more memory-efficient.

These numbers are for approximate computation only, since they do not include overhead like memory alignment for accelerated SIMD processing and the like.
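The sizing rules above can be sketched as simple arithmetic. The example uses the data and metadata sizes from the tables; the descriptor count and function names are hypothetical, and the results exclude alignment and other overhead, as noted above.

```python
FACE_DATA_BYTES = 512      # CNN 54, 56, 57, 58, 59
HUMAN_DATA_BYTES = 2048    # CNN 102, 103, 104
METADATA_BYTES = 8         # signature and version information


def individual_descriptors_bytes(count: int, data_bytes: int, metadata_bytes: int = METADATA_BYTES) -> int:
    """Approximate size of `count` separately stored descriptors ("Total size" column)."""
    return count * (data_bytes + metadata_bytes)


def descriptor_batch_bytes(count: int, data_bytes: int) -> int:
    """Approximate size of a batch of `count` descriptors ("Data size" column)."""
    return count * data_bytes


if __name__ == "__main__":
    count = 1_000_000  # hypothetical number of descriptors
    print("Face, individual:", individual_descriptors_bytes(count, FACE_DATA_BYTES), "bytes")
    print("Face, batch:     ", descriptor_batch_bytes(count, FACE_DATA_BYTES), "bytes")
    print("Human, individual:", individual_descriptors_bytes(count, HUMAN_DATA_BYTES), "bytes")
    print("Human, batch:     ", descriptor_batch_bytes(count, HUMAN_DATA_BYTES), "bytes")
```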