Appendix A. Specifications#
Classification performance#
Classification performance was measured on two datasets:
- Cooperative dataset (containing 20K images from various sources obtained at several banks);
- Non-cooperative dataset (containing 20K images).
The two tables below contain true positive rates (TPR) corresponding to selected false positive rates (FPR).
"Classification performance @ low FPR on cooperative dataset"
FPR | TPR CNN 54 | TPR CNN 56 | TPR CNN 57 | TPR CNN 58 | TPR CNN 59 | TPR CNN 54m | TPR CNN 56m | TPR CNN 59m |
---|---|---|---|---|---|---|---|---|
10^-7^ | 0.9765 | 0.9907 | 0.9906 | 0.9910 | 0.9911 | 0.9699 | 0.9652 | 0.9876 |
10^-6^ | 0.9849 | 0.9914 | 0.9915 | 0.9916 | 0.9915 | 0.9829 | 0.9814 | 0.9904 |
10^-5^ | 0.9892 | 0.9916 | 0.9917 | 0.9918 | 0.9919 | 0.9887 | 0.9886 | 0.9915 |
10^-4^ | 0.9909 | 0.9917 | 0.9918 | 0.9919 | 0.9921 | 0.9910 | 0.9910 | 0.9919 |
"Classification performance @ low FPR on non-cooperative dataset"
FPR | TPR CNN 54 | TPR CNN 56 | TPR CNN 57 | TPR CNN 58 | TPR CNN 59 | TPR CNN 54m | TPR CNN 56m | TPR CNN 59m |
---|---|---|---|---|---|---|---|---|
10^-7^ | 0.9638 | 0.9698 | 0.9723 | 0.9767 | 0.9832 | 0.8813 | 0.8844 | 0.9377 |
10^-6^ | 0.9773 | 0.9809 | 0.9817 | 0.9839 | 0.9880 | 0.9233 | 0.9229 | 0.9629 |
10^-5^ | 0.9852 | 0.9871 | 0.9873 | 0.9880 | 0.9908 | 0.9538 | 0.9561 | 0.9794 |
10^-4^ | 0.9896 | 0.9902 | 0.9905 | 0.9909 | 0.9924 | 0.9752 | 0.9757 | 0.9880 |
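Figures of this kind can be reproduced on other data by fixing a decision threshold from the impostor score distribution and counting the accepted genuine pairs. The Python sketch below is illustrative only and is not the procedure used to produce the tables above: the function name `tpr_at_fpr`, the use of NumPy, and the convention that a pair is accepted when its score is strictly above the threshold are all assumptions.

```python
import numpy as np

def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr):
    """Return (threshold, TPR) for the lowest threshold whose FPR stays at or below target_fpr.

    Assumption: higher score means more similar, and a pair is accepted
    when its score is strictly greater than the threshold.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.sort(np.asarray(impostor_scores, dtype=float))[::-1]  # descending
    if impostor.size == 0:
        raise ValueError("at least one impostor score is required")
    # Number of impostor pairs that may be (wrongly) accepted.
    allowed = int(np.floor(target_fpr * impostor.size))
    if allowed >= impostor.size:
        threshold = -np.inf  # every pair is accepted
    else:
        # Only scores strictly above this value pass, so at most `allowed` impostors do.
        threshold = float(impostor[allowed])
    tpr = float(np.mean(genuine > threshold))
    return threshold, tpr

# Toy data, invented for illustration.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5, 0.6, 0.05, 0.15, 0.25, 0.35, 0.45]
threshold, tpr = tpr_at_fpr(genuine, impostor, target_fpr=0.1)
print(threshold, tpr)
```

Resolving an FPR of 10^-7^ requires at least 10^7 impostor comparisons. With fewer, this sketch returns the TPR at zero observed false positives.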
Runtime performance#
Server environment#
Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.
Input data characteristics:
- Image resolution: 1920x1080px;
- Image format: 24 BPP RGB.
Performance measurements are presented for CPU, GPU and NPU execution modes in the tables below. Measured values are averages of at least 100 experiments.
The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.
Face detection is performed using the FaceDetV3 NN.
All types of face detection and redetection were performed with capture of bounding boxes and 5 facial landmarks.
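The averaging described above can be approximated with a small timing harness. The following Python sketch is an assumption about how such a measurement might be organized, not the benchmark actually used: the function name `average_latency_ms`, the warm-up phase, and the placeholder workload are all hypothetical.

```python
import time
from statistics import mean

def average_latency_ms(fn, runs=100, warmup=10):
    """Call fn() `warmup` times untimed, then `runs` times timed.

    Returns the mean wall-clock duration of the timed calls in milliseconds.
    """
    for _ in range(warmup):
        fn()  # untimed calls, so one-off initialization does not skew the mean
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)

# Placeholder workload standing in for a detector or estimator call.
print(f"{average_latency_ms(lambda: sum(range(10_000))):.3f} ms")
```

Replace the lambda with the call being measured. Results depend on the hardware, thread count and batch size, as the tables below show.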
CPU performance#
Benchmarking for CPU was performed on a server with the following hardware configuration:
CPU:
- Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz;
- CPU(s): 40
- Thread(s) per core: 2
- Core(s) per socket: 10
- Socket(s): 2
- NUMA node(s): 2
- CPU with AVX2 instruction set was used
OS: CentOS Linux release 8.3.2011
RAM: 128 GB DDR4 (Clock Speed: 2133 MHz)
In the experiments listed in the tables below, face detection and descriptor extraction algorithms used all available CPU cores, whereas matching performance is specified per core.
Descriptor matching is only implemented on CPU.
"CPU. Detector performance"
Measurement | CPU threads | Batch Size | Average (ms) |
---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 358.3 |
Detector (minFaceSize=20) | 8 | 1 | 169.6 |
Detector (minFaceSize=20) | 8 | 4 | 166.4 |
Detector (minFaceSize=20) | 8 | 8 | 169.2 |
Detector (minFaceSize=50) | 1 | 1 | 55.8 |
Detector (minFaceSize=50) | 8 | 1 | 27.1 |
Detector (minFaceSize=50) | 8 | 4 | 25.1 |
Detector (minFaceSize=50) | 8 | 8 | 26.5 |
Detector (minFaceSize=90) | 1 | 1 | 18.9 |
Detector (minFaceSize=90) | 8 | 1 | 12.3 |
Detector (minFaceSize=90) | 8 | 4 | 8.5 |
Detector (minFaceSize=90) | 8 | 8 | 9.2 |
Redetect | 1 | 1 | 4.05 |
Redetect | 8 | 1 | 2.99 |
Redetect | 8 | 4 | 1.5 |
Redetect | 8 | 8 | 1.34 |
"CPU. HumanDetector performance"
Measurement | CPU threads | Batch Size | Average (ms) |
---|---|---|---|
HumanDetector (imageSize=320) | 1 | 1 | 12.5 |
HumanDetector (imageSize=320) | 8 | 1 | 7.0 |
HumanDetector (imageSize=320) | 8 | 4 | 4.4 |
HumanDetector (imageSize=320) | 8 | 8 | 4.4 |
HumanDetector (imageSize=640) | 1 | 1 | 39.8 |
HumanDetector (imageSize=640) | 8 | 1 | 19.2 |
HumanDetector (imageSize=640) | 8 | 4 | 16.4 |
HumanDetector (imageSize=640) | 8 | 8 | 16.8 |
HumanLandmarksDetector (imageSize=320) | 1 | 1 | 44.5 |
HumanLandmarksDetector (imageSize=320) | 8 | 1 | 20.6 |
HumanLandmarksDetector (imageSize=320) | 8 | 4 | 13.0 |
HumanLandmarksDetector (imageSize=320) | 8 | 8 | 13.3 |
HumanLandmarksDetector (imageSize=640) | 1 | 1 | 72.6 |
HumanLandmarksDetector (imageSize=640) | 8 | 1 | 32.5 |
HumanLandmarksDetector (imageSize=640) | 8 | 4 | 24.8 |
HumanLandmarksDetector (imageSize=640) | 8 | 8 | 25.7 |
HumanRedetect | 1 | 1 | 2.42 |
HumanRedetect | 8 | 1 | 2.47 |
HumanRedetect | 8 | 4 | 1.13 |
HumanRedetect | 8 | 8 | 1.11 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"CPU. Estimation performance with batch interface"
Measurement | CPU threads | Batch Size | Average (ms) |
---|---|---|---|
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1 | 0.6 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 1 | 0.4 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 8 | 0.3 |
Eyes (RGB, useStatusPlan=0) | 1 | 1 | 1.2 |
Eyes (RGB, useStatusPlan=0) | 8 | 1 | 0.8 |
Eyes (RGB, useStatusPlan=0) | 8 | 8 | 0.5 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1 | 0.6 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 1 | 0.4 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 8 | 0.3 |
Eyes (RGB, useStatusPlan=1) | 1 | 1 | 1.1 |
Eyes (RGB, useStatusPlan=1) | 8 | 1 | 0.8 |
Eyes (RGB, useStatusPlan=1) | 8 | 8 | 0.5 |
Infra-Red | 1 | 1 | 2 |
Infra-Red | 8 | 1 | 1.0 |
Infra-Red | 8 | 8 | 0.7 |
AGS | 1 | 1 | 0.3 |
AGS | 8 | 1 | 0.2 |
AGS | 8 | 8 | 0.07 |
HeadPoseByImage | 1 | 1 | 0.3 |
HeadPoseByImage | 8 | 1 | 0.3 |
HeadPoseByImage | 8 | 8 | 0.09 |
Child | 1 | 1 | 18.7 |
Child | 8 | 1 | 6.3 |
Child | 8 | 8 | 5.2 |
BlackWhite | 1 | 1 | 1.3 |
BlackWhite | 8 | 1 | 0.7 |
BlackWhite | 8 | 8 | 1.2 |
BestShotQuality | 1 | 1 | 0.3 |
BestShotQuality | 8 | 1 | 0.2 |
BestShotQuality | 8 | 8 | 0.08 |
MedicalMask | 1 | 1 | 5.6 |
MedicalMask | 8 | 1 | 3.2 |
MedicalMask | 8 | 8 | 1.5 |
LivenessOneShotRGBEstimator | 1 | 1 | 214.6 |
LivenessOneShotRGBEstimator | 8 | 1 | 58.7 |
LivenessOneShotRGBEstimator | 8 | 8 | 78.8 |
Orientation | 1 | 1 | 20.8 |
Orientation | 8 | 1 | 10.1 |
Orientation | 8 | 8 | 8.9 |
CredibilityCheck | 1 | 1 | 120.3 |
CredibilityCheck | 8 | 1 | 35.1 |
CredibilityCheck | 8 | 8 | 34.1 |
FacialHair | 1 | 1 | 2.7 |
FacialHair | 8 | 1 | 1.9 |
FacialHair | 8 | 8 | 0.99 |
PortraitStyle | 1 | 1 | 1.0 |
PortraitStyle | 8 | 1 | 1.2 |
PortraitStyle | 8 | 8 | 1.7 |
Background | 1 | 1 | 1.1 |
Background | 8 | 1 | 1.2 |
Background | 8 | 8 | 1.7 |
NaturalLight | 1 | 1 | 2.37 |
NaturalLight | 8 | 1 | 1.49 |
NaturalLight | 8 | 8 | 1.97 |
FishEye | 1 | 1 | 2.77 |
FishEye | 8 | 1 | 2.08 |
FishEye | 8 | 8 | 5.86 |
RedEye | 1 | 1 | 5.7 |
RedEye | 8 | 1 | 1.9 |
RedEye | 8 | 8 | 1.6 |
HeadWear | 1 | 1 | 2.22 |
HeadWear | 8 | 1 | 1.51 |
HeadWear | 8 | 8 | 1.96 |
EyeBrowEstimator | 1 | 1 | 13.82 |
EyeBrowEstimator | 8 | 1 | 4.77 |
EyeBrowEstimator | 8 | 8 | 3.05 |
Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
"CPU. Estimation performance without batch interface"
Measurement | CPU threads | Average (ms) |
---|---|---|
EyesGaze | 1 | 2.2 |
EyesGaze | 8 | 1.4 |
Emotions | 1 | 13.6 |
Emotions | 8 | 4.9 |
Attributes | 1 | 63.3 |
Attributes | 8 | 19.8 |
Quality | 1 | 1.2 |
Quality | 8 | 0.6 |
Warper | 1 | 2.2 |
Warper | 8 | 2.3 |
Overlap | 1 | 4.5 |
Overlap | 8 | 1.3 |
Glasses | 1 | 1.8 |
Glasses | 8 | 0.8 |
Mouth | 1 | 6.9 |
Mouth | 8 | 2.69 |
PPE | 1 | 8.9 |
PPE | 8 | 4.9 |
LivenessFlyingFaces | 1 | 9.2 |
LivenessFlyingFaces | 8 | 5.0 |
LivenessRGBMEstimator | 1 | 30.6 |
LivenessRGBMEstimator | 8 | 9.7 |
LivenessFPR | 1 | 44.2 |
LivenessFPR | 8 | 19.9 |
"CPU. Extractor performance"
Type | Model | CPU threads | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 221.2 |
Extractor | 57 | 8 | 58.3 |
Extractor | 58 | 1 | 219.3 |
Extractor | 58 | 8 | 58.0 |
Extractor | 59 | 1 | 219.7 |
Extractor | 59 | 8 | 58.2 |
Extractor | 102 | 1 | 1.8 |
Extractor | 102 | 8 | 2.1 |
Extractor | 103 | 1 | 142.2 |
Extractor | 103 | 8 | 50.6 |
Extractor | 104 | 1 | 12.6 |
Extractor | 104 | 8 | 6.2 |
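The 1-thread and 8-thread rows can be compared directly to see how well a model scales across threads. The sketch below is plain arithmetic on the table values. The helper name `thread_scaling` is hypothetical and not part of any SDK.

```python
def thread_scaling(time_1_thread_ms, time_n_threads_ms, threads):
    """Return (speedup, parallel efficiency) for two latency measurements."""
    speedup = time_1_thread_ms / time_n_threads_ms
    return speedup, speedup / threads

# Extractor model 57 from the table above: 221.2 ms on 1 thread, 58.3 ms on 8 threads.
speedup, efficiency = thread_scaling(221.2, 58.3, 8)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")  # speedup 3.79x, efficiency 47%
```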
The following table shows the average number of matches per second for descriptors extracted using the following CNN model versions:
- face descriptors: 57, 58, 59
- human body descriptors: 102, 103, 104
"CPU. Matcher performance"
Type | Model | CPU threads | Batch Size | Average (matches/sec) |
---|---|---|---|---|
Matcher | 57, 58, 59 | 1 | 1000 | 42.2 M |
Matcher | 102, 103, 104 | 1 | 1000 | 10.17 M |
Note: The values above are the maximum matcher performance on a particular piece of hardware. In general, performance does not depend on the batch size, but it may be limited by memory performance at large batch sizes.
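The throughput figures can be turned into a rough search-time estimate for a gallery of a given size. The Python sketch below is a back-of-the-envelope calculation under stated assumptions: the function name is hypothetical, and linear scaling with the number of cores is an idealization that ignores the memory limits mentioned in the note above.

```python
def estimated_match_time_ms(gallery_size, matches_per_sec, cores=1):
    """Rough time to compare one descriptor against `gallery_size` descriptors.

    Assumes throughput scales linearly with `cores`, which is an idealization.
    """
    if gallery_size < 0 or matches_per_sec <= 0 or cores < 1:
        raise ValueError("gallery_size must be >= 0, matches_per_sec > 0, cores >= 1")
    return gallery_size / (matches_per_sec * cores) * 1000.0

# One face descriptor against 1,000,000 gallery descriptors on a single core,
# using the 42.2 M matches/sec figure from the table above (about 23.7 ms).
print(f"{estimated_match_time_ms(1_000_000, 42.2e6):.1f} ms")
```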
GPU performance#
Benchmarking for GPU was performed on the following hardware configuration:
GPU: NVIDIA Tesla T4.
OS: CentOS Linux release 8.3.2011
"GPU. Detector performance"
Measurement | Batch Size | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 31.8 |
Detector (minFaceSize=20) | 4 | 35.0 |
Detector (minFaceSize=20) | 8 | 38.9 |
Detector (minFaceSize=50) | 1 | 7.9 |
Detector (minFaceSize=50) | 4 | 6.9 |
Detector (minFaceSize=50) | 8 | 6.6 |
Detector (minFaceSize=90) | 1 | 5.2 |
Detector (minFaceSize=90) | 4 | 3.8 |
Detector (minFaceSize=90) | 8 | 3.4 |
Redetect | 1 | 3.45 |
Redetect | 4 | 1.91 |
Redetect | 8 | 1.64 |
Redetect | 16 | 1.51 |
"GPU. HumanDetector performance"
Measurement | Batch Size | Average (ms) |
---|---|---|
HumanDetector (imageSize=320) | 1 | 4.7 |
HumanDetector (imageSize=320) | 4 | 2.7 |
HumanDetector (imageSize=320) | 8 | 2.5 |
HumanDetector (imageSize=640) | 1 | 6.1 |
HumanDetector (imageSize=640) | 4 | 5.5 |
HumanDetector (imageSize=640) | 8 | 5.3 |
HumanLandmarksDetector (imageSize=320) | 1 | 15.33 |
HumanLandmarksDetector (imageSize=320) | 4 | 6.57 |
HumanLandmarksDetector (imageSize=320) | 8 | 5.32 |
HumanLandmarksDetector (imageSize=640) | 1 | 16.8 |
HumanLandmarksDetector (imageSize=640) | 4 | 8.94 |
HumanLandmarksDetector (imageSize=640) | 8 | 7.72 |
HumanRedetect | 1 | 2.87 |
HumanRedetect | 4 | 1.72 |
HumanRedetect | 8 | 1.5 |
HumanRedetect | 16 | 1.4 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"GPU. Estimation performance with batch interface"
Measurement | Batch Size | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 2.32 |
HeadPoseByImage | 32 | 1.43 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 0.65 |
Eyes (INFRA_RED, useStatusPlan=0) | 16 | 0.23 |
Eyes (INFRA_RED, useStatusPlan=0) | 32 | 0.2 |
Eyes (RGB, useStatusPlan=0) | 1 | 1.19 |
Eyes (RGB, useStatusPlan=0) | 16 | 0.44 |
Eyes (RGB, useStatusPlan=0) | 32 | 0.43 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 0.64 |
Eyes (INFRA_RED, useStatusPlan=1) | 16 | 0.23 |
Eyes (INFRA_RED, useStatusPlan=1) | 32 | 0.2 |
Eyes (RGB, useStatusPlan=1) | 1 | 0.66 |
Eyes (RGB, useStatusPlan=1) | 16 | 0.24 |
Eyes (RGB, useStatusPlan=1) | 32 | 0.23 |
Infra-Red | 1 | 1.11 |
Infra-Red | 32 | 0.54 |
AGS | 1 | 2.2 |
AGS | 16 | 1.46 |
Child | 1 | 2.66 |
Child | 16 | 1.11 |
BlackWhite | 1 | 1.05 |
BlackWhite | 16 | 0.4 |
BestShotQuality | 1 | 2.31 |
BestShotQuality | 16 | 1.45 |
MedicalMask | 1 | 5.01 |
MedicalMask | 16 | 1.69 |
LivenessOneShotRGBEstimator | 1 | 20.41 |
LivenessOneShotRGBEstimator | 16 | 17.48 |
Orientation | 1 | 3.56 |
Orientation | 16 | 2.92 |
CredibilityCheck | 1 | 5.54 |
CredibilityCheck | 16 | 3.72 |
FacialHair | 1 | 1.59 |
FacialHair | 16 | 0.33 |
PortraitStyle | 1 | 2.5 |
PortraitStyle | 16 | 1.5 |
Background | 1 | 2.6 |
Background | 16 | 1.5 |
NaturalLight | 1 | 3.61 |
NaturalLight | 16 | 0.27 |
FishEye | 1 | 2.91 |
FishEye | 16 | 1.51 |
RedEye | 1 | 1.1 |
RedEye | 16 | 0.15 |
HeadWear | 1 | 3.65 |
HeadWear | 16 | 0.26 |
EyeBrowEstimator | 1 | 1.8 |
EyeBrowEstimator | 8 | 0.95 |
EyeBrowEstimator | 16 | 0.88 |
EyeBrowEstimator | 32 | 0.84 |
Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
"GPU. Estimation performance without batch interface"
Measurement | Average (ms) |
---|---|
EyesGaze | 1.65 |
Emotions | 1.99 |
Attributes | 4.95 |
Quality | 0.98 |
Warper | 2.26 |
Overlap | 1.23 |
PPE | 2.62 |
Glasses | 1.01 |
Mouth | 3.92 |
LivenessFlyingFaces | 5.78 |
LivenessRGBMEstimator | 6.96 |
LivenessFPR | 12.56 |
"GPU. Extractor performance"
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 10.2 |
Extractor | 57 | 16 | 6.5 |
Extractor | 58 | 1 | 10.2 |
Extractor | 58 | 16 | 6.4 |
Extractor | 59 | 1 | 10.2 |
Extractor | 59 | 16 | 6.4 |
Extractor | 102 | 1 | 3.7 |
Extractor | 102 | 16 | 0.3 |
Extractor | 103 | 1 | 7.2 |
Extractor | 103 | 16 | 3.7 |
Extractor | 104 | 1 | 4.5 |
Extractor | 104 | 16 | 0.6 |
NPU performance#
Benchmarking for NPU was performed on a server with the following hardware configuration:
NPU: Huawei Atlas 300I (inference card).
OS: Ubuntu 18.04
CPU: Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz x 48
RAM: 64GB
"NPU. Detector performance"
Measurement | Batch Size | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 25.7 |
Detector (minFaceSize=20) | 4 | 18.7 |
Detector (minFaceSize=20) | 8 | 17.3 |
Detector (minFaceSize=50) | 1 | 25.7 |
Detector (minFaceSize=50) | 4 | 18.0 |
Detector (minFaceSize=50) | 8 | 17.3 |
Detector (minFaceSize=90) | 1 | 25.5 |
Detector (minFaceSize=90) | 4 | 18.0 |
Detector (minFaceSize=90) | 8 | 17.1 |
Redetect | 1 | 12.7 |
Redetect | 4 | 6.0 |
Redetect | 8 | 5.1 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"NPU. Estimation performance with batch interface"
Measurement | Batch Size | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 8.0 |
HeadPoseByImage | 16 | 4.2 |
HeadPoseByImage | 32 | 3.9 |
AGS | 1 | 6.6 |
AGS | 16 | 3.7 |
AGS | 32 | 3.7 |
BestShotQuality | 1 | 15.6 |
BestShotQuality | 16 | 7.8 |
BestShotQuality | 32 | 7.6 |
MedicalMask | 1 | 6.1 |
MedicalMask | 16 | 3.8 |
MedicalMask | 32 | 3.7 |
Below is the measurement for Warper, which does not have a batch interface. This measurement is performed with minFaceSize=50.
"NPU. Estimation performance without batch interface"
Measurement | Average (ms) |
---|---|
Warper | 2.1 |
"NPU. Extractor performance"
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 10.9 |
Extractor | 57 | 16 | 7.4 |
Embedded environment#
Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.
Input data characteristics:
- Image resolution: 640x480px;
- Image format: 24 BPP RGB.
The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.
Face detection is performed using the FaceDetV3 NN.
Jetson#
Performance measurements are presented for Jetson devices. Measured values are averages of at least 100 experiments. MobileNet is not used on Jetson by default.
Jetson TX#
"Jetson TX GPU. Detector performance"
Type | Batch Size | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 499.59 |
Detector (minFaceSize=20) | 4 | 470.32 |
Detector (minFaceSize=50) | 1 | 88.97 |
Detector (minFaceSize=50) | 4 | 80.13 |
Detector (minFaceSize=50) | 8 | 79.67 |
Detector (minFaceSize=90) | 1 | 35.66 |
Detector (minFaceSize=90) | 4 | 30.14 |
Detector (minFaceSize=90) | 8 | 29.48 |
Redetect | 1 | 9.5 |
Redetect | 4 | 5.2 |
Redetect | 8 | 4.5 |
"Jetson TX GPU. HumanDetector performance"
Type | Batch Size | Average (ms) |
---|---|---|
HumanDetector (imageSize=320) | 1 | 16.28 |
HumanDetector (imageSize=320) | 4 | 14.81 |
HumanDetector (imageSize=320) | 8 | 14.27 |
HumanDetector (imageSize=640) | 1 | 47.7 |
HumanDetector (imageSize=640) | 4 | 44.3 |
HumanDetector (imageSize=640) | 8 | 42.0 |
HumanLandmarksDetector (imageSize=320) | 1 | 67.3 |
HumanLandmarksDetector (imageSize=320) | 4 | 35.15 |
HumanLandmarksDetector (imageSize=320) | 8 | 32.94 |
HumanLandmarksDetector (imageSize=640) | 1 | 99.05 |
HumanLandmarksDetector (imageSize=640) | 4 | 64.64 |
HumanLandmarksDetector (imageSize=640) | 8 | 61.68 |
HumanRedetect | 1 | 6.08 |
HumanRedetect | 4 | 3.71 |
HumanRedetect | 8 | 3.46 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson TX GPU. Estimation performance with batch interface"
Type | Batch Size | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 8.85 |
HeadPoseByImage | 32 | 2.82 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1.53 |
Eyes (INFRA_RED, useStatusPlan=0) | 16 | 1.02 |
Eyes (INFRA_RED, useStatusPlan=0) | 32 | 0.93 |
Eyes (RGB, useStatusPlan=0) | 1 | 2.83 |
Eyes (RGB, useStatusPlan=0) | 16 | 1.68 |
Eyes (RGB, useStatusPlan=0) | 32 | 1.65 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1.49 |
Eyes (INFRA_RED, useStatusPlan=1) | 16 | 1.17 |
Eyes (INFRA_RED, useStatusPlan=1) | 32 | 1.1 |
Eyes (RGB, useStatusPlan=1) | 1 | 2.82 |
Eyes (RGB, useStatusPlan=1) | 16 | 1.68 |
Eyes (RGB, useStatusPlan=1) | 32 | 1.6 |
Infra-Red | 1 | 3.29 |
AGS | 1 | 5.02 |
AGS | 16 | 2.57 |
Child | 1 | 15.23 |
Child | 16 | 8.95 |
BlackWhite | 1 | 3.0 |
BlackWhite | 16 | 1.1 |
BestShotQuality | 1 | 5.41 |
BestShotQuality | 16 | 2.59 |
MedicalMask | 1 | 13.4 |
MedicalMask | 32 | 4.98 |
LivenessOneShotRGBEstimator | 1 | 188.8 |
Orientation | 1 | 26.3 |
CredibilityCheck | 1 | 44.5 |
CredibilityCheck | 8 | 35.7 |
CredibilityCheck | 16 | 34.4 |
CredibilityCheck | 32 | 34.1 |
FacialHair | 1 | 3.6 |
FacialHair | 16 | 2.7 |
PortraitStyle | 1 | 7.1 |
PortraitStyle | 16 | 4.0 |
Background | 1 | 7.2 |
Background | 16 | 3.9 |
NaturalLight | 1 | 13.8 |
NaturalLight | 16 | 1.5 |
FishEye | 1 | 8.24 |
FishEye | 16 | 5.41 |
RedEye | 1 | 2.1 |
RedEye | 16 | 0.8 |
HeadWear | 1 | 14.1 |
HeadWear | 16 | 1.58 |
EyeBrowEstimator | 1 | 11.81 |
EyeBrowEstimator | 8 | 10.32 |
EyeBrowEstimator | 16 | 9.81 |
EyeBrowEstimator | 32 | 9.57 |
Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson TX GPU. Estimation performance without batch interface"
Type | Average (ms) |
---|---|
EyesGaze | 4.29 |
Emotions | 11.96 |
Attributes | 27.24 |
Quality | 2.17 |
Warper | 8.08 |
Overlap | 3.98 |
Glasses | 3.63 |
PPE | 9.96 |
Mouth | 15.32 |
LivenessFlyingFaces | 19.68 |
LivenessRGBMEstimator | 64.42 |
LivenessFPR | 62.67 |
"Jetson TX GPU. Extractor performance"
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 76.07 |
Extractor | 57 | 8 | 62.03 |
Extractor | 58 | 1 | 76.15 |
Extractor | 58 | 8 | 61.63 |
Extractor | 59 | 1 | 76.15 |
Extractor | 59 | 8 | 61.64 |
Extractor | 102 | 1 | 17.31 |
Extractor | 102 | 8 | 2.61 |
Extractor | 103 | 1 | 45.64 |
Extractor | 103 | 8 | 32.34 |
Extractor | 104 | 1 | 15.23 |
Extractor | 104 | 8 | 5.41 |
Jetson Xavier#
"Jetson Xavier GPU. Detector performance"
Type | Batch Size | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 89.56 |
Detector (minFaceSize=20) | 4 | 102.86 |
Detector (minFaceSize=20) | 8 | 153.48 |
Detector (minFaceSize=50) | 1 | 19.27 |
Detector (minFaceSize=50) | 4 | 16.73 |
Detector (minFaceSize=50) | 8 | 16.24 |
Detector (minFaceSize=90) | 1 | 10.38 |
Detector (minFaceSize=90) | 4 | 7.41 |
Detector (minFaceSize=90) | 8 | 6.87 |
Redetect | 1 | 6.4 |
Redetect | 4 | 2.9 |
Redetect | 8 | 2.3 |
"Jetson Xavier GPU. HumanDetector performance"
Type | Batch Size | Average (ms) |
---|---|---|
HumanDetector (imageSize=320) | 1 | 10.41 |
HumanDetector (imageSize=320) | 4 | 7.53 |
HumanDetector (imageSize=320) | 8 | 6.75 |
HumanDetector (imageSize=640) | 1 | 22.33 |
HumanDetector (imageSize=640) | 4 | 19.81 |
HumanDetector (imageSize=640) | 8 | 19.05 |
HumanLandmarksDetector (imageSize=320) | 1 | 38.99 |
HumanLandmarksDetector (imageSize=320) | 4 | 22.14 |
HumanLandmarksDetector (imageSize=320) | 8 | 18.58 |
HumanLandmarksDetector (imageSize=640) | 1 | 51.76 |
HumanLandmarksDetector (imageSize=640) | 4 | 34.93 |
HumanLandmarksDetector (imageSize=640) | 8 | 31.14 |
HumanRedetect | 1 | 3.6 |
HumanRedetect | 4 | 1.95 |
HumanRedetect | 8 | 1.68 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson Xavier GPU. Estimation performance with batch interface"
Type | Batch Size | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 4.38 |
HeadPoseByImage | 32 | 0.89 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1.12 |
Eyes (INFRA_RED, useStatusPlan=0) | 16 | 0.53 |
Eyes (INFRA_RED, useStatusPlan=0) | 32 | 0.48 |
Eyes (RGB, useStatusPlan=0) | 1 | 2.17 |
Eyes (RGB, useStatusPlan=0) | 16 | 1.0 |
Eyes (RGB, useStatusPlan=0) | 32 | 0.99 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1.12 |
Eyes (INFRA_RED, useStatusPlan=1) | 16 | 0.51 |
Eyes (INFRA_RED, useStatusPlan=1) | 32 | 0.5 |
Eyes (RGB, useStatusPlan=1) | 1 | 2.16 |
Eyes (RGB, useStatusPlan=1) | 16 | 1.1 |
Eyes (RGB, useStatusPlan=1) | 32 | 0.99 |
Infra-Red | 1 | 2.3 |
Infra-Red | 32 | 1.25 |
AGS | 1 | 2.83 |
AGS | 32 | 0.86 |
Child | 1 | 8.37 |
Child | 8 | 5.88 |
BlackWhite | 1 | 2.2 |
BlackWhite | 16 | 0.6 |
BestShotQuality | 1 | 3.04 |
BestShotQuality | 32 | 0.88 |
MedicalMask | 1 | 6.59 |
MedicalMask | 32 | 3.45 |
LivenessOneShotRGBEstimator | 1 | 97.95 |
LivenessOneShotRGBEstimator | 8 | 81.8 |
Orientation | 1 | 11.6 |
Orientation | 32 | 9.75 |
CredibilityCheck | 1 | 35.2 |
CredibilityCheck | 8 | 25.09 |
CredibilityCheck | 16 | 24.64 |
CredibilityCheck | 32 | 24.22 |
FacialHair | 1 | 3.35 |
FacialHair | 16 | 1.84 |
PortraitStyle | 1 | 3.6 |
PortraitStyle | 16 | 1.8 |
Background | 1 | 3.8 |
Background | 16 | 1.8 |
NaturalLight | 1 | 3.6 |
NaturalLight | 16 | 1.5 |
FishEye | 1 | 4.75 |
FishEye | 16 | 2.36 |
RedEye | 1 | 2.0 |
RedEye | 16 | 0.5 |
HeadWear | 1 | 4.34 |
HeadWear | 16 | 1.49 |
EyeBrowEstimator | 1 | 7.21 |
EyeBrowEstimator | 8 | 5.32 |
EyeBrowEstimator | 16 | 5.16 |
EyeBrowEstimator | 32 | 5.02 |
Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson Xavier GPU. Estimation performance without batch interface"
Type | Average (ms) |
---|---|
EyesGaze | 2.99 |
Emotions | 7.48 |
Attributes | 20.3 |
Quality | 1.64 |
Warper | 6.63 |
Overlap | 3.03 |
PPE | 6.43 |
Glasses | 2.14 |
Mouth | 5.86 |
LivenessFlyingFaces | 6.98 |
LivenessRGBMEstimator | 27.14 |
LivenessFPR | 39.41 |
"Jetson Xavier GPU. Extractor performance"
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 66.4 |
Extractor | 57 | 8 | 44.1 |
Extractor | 58 | 1 | 66.2 |
Extractor | 58 | 8 | 44.1 |
Extractor | 59 | 1 | 66.3 |
Extractor | 59 | 8 | 44.1 |
Extractor | 102 | 1 | 8.3 |
Extractor | 102 | 8 | 0.98 |
Extractor | 103 | 1 | 18.3 |
Extractor | 103 | 8 | 19.4 |
Extractor | 104 | 1 | 6.6 |
Extractor | 104 | 8 | 2.4 |
Jetson Xavier NX#
"Jetson Xavier NX GPU. Detector performance"
Type | Batch Size | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 172.28 |
Detector (minFaceSize=20) | 4 | 171.78 |
Detector (minFaceSize=20) | 8 | 238.0 |
Detector (minFaceSize=50) | 1 | 32.12 |
Detector (minFaceSize=50) | 4 | 32.21 |
Detector (minFaceSize=50) | 8 | 29.32 |
Detector (minFaceSize=90) | 1 | 15.57 |
Detector (minFaceSize=90) | 4 | 12.19 |
Detector (minFaceSize=90) | 8 | 11.57 |
Redetect | 1 | 6.9 |
Redetect | 4 | 2.8 |
Redetect | 8 | 2.3 |
"Jetson Xavier NX GPU. HumanDetector performance"
Type | Batch Size | Average (ms) |
---|---|---|
HumanDetector (imageSize=320) | 1 | 9.49 |
HumanDetector (imageSize=320) | 4 | 7.86 |
HumanDetector (imageSize=320) | 8 | 7.26 |
HumanDetector (imageSize=640) | 1 | 24.39 |
HumanDetector (imageSize=640) | 4 | 23.12 |
HumanDetector (imageSize=640) | 8 | 22.51 |
HumanLandmarksDetector (imageSize=320) | 1 | 40.7 |
HumanLandmarksDetector (imageSize=320) | 4 | 20.4 |
HumanLandmarksDetector (imageSize=320) | 8 | 17.9 |
HumanLandmarksDetector (imageSize=640) | 1 | 59.7 |
HumanLandmarksDetector (imageSize=640) | 4 | 33.1 |
HumanLandmarksDetector (imageSize=640) | 8 | 30.5 |
HumanRedetect | 1 | 4.45 |
HumanRedetect | 4 | 2.0 |
HumanRedetect | 8 | 1.75 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson Xavier NX GPU. Estimation performance with batch interface"
Type | Batch Size | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 5.6 |
HeadPoseByImage | 32 | 1.3 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1.36 |
Eyes (INFRA_RED, useStatusPlan=0) | 16 | 0.65 |
Eyes (INFRA_RED, useStatusPlan=0) | 32 | 0.6 |
Eyes (RGB, useStatusPlan=0) | 1 | 2.21 |
Eyes (RGB, useStatusPlan=0) | 16 | 1.09 |
Eyes (RGB, useStatusPlan=0) | 32 | 1.01 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1.37 |
Eyes (INFRA_RED, useStatusPlan=1) | 16 | 0.71 |
Eyes (INFRA_RED, useStatusPlan=1) | 32 | 0.65 |
Eyes (RGB, useStatusPlan=1) | 1 | 2.48 |
Eyes (RGB, useStatusPlan=1) | 16 | 1.31 |
Eyes (RGB, useStatusPlan=1) | 32 | 1.21 |
Infra-Red | 1 | 2.32 |
Infra-Red | 32 | 1.49 |
AGS | 1 | 3.41 |
AGS | 32 | 1.25 |
Child | 1 | 7.85 |
Child | 8 | 5.49 |
BlackWhite | 1 | 2.4 |
BlackWhite | 16 | 0.7 |
BestShotQuality | 1 | 3.59 |
BestShotQuality | 32 | 1.27 |
MedicalMask | 1 | 7.01 |
MedicalMask | 32 | 3.41 |
LivenessOneShotRGBEstimator | 1 | 112.7 |
LivenessOneShotRGBEstimator | 16 | 81.81 |
Orientation | 1 | 11.57 |
Orientation | 32 | 10.17 |
CredibilityCheck | 1 | 31.05 |
CredibilityCheck | 8 | 22.59 |
CredibilityCheck | 16 | 21.91 |
CredibilityCheck | 32 | 21.5 |
FacialHair | 1 | 2.97 |
FacialHair | 16 | 1.63 |
PortraitStyle | 1 | 4.2 |
PortraitStyle | 16 | 2.0 |
Background | 1 | 4.0 |
Background | 16 | 2.1 |
NaturalLight | 1 | 4.48 |
NaturalLight | 16 | 1.26 |
FishEye | 1 | 5.01 |
FishEye | 16 | 2.42 |
RedEye | 1 | 2.1 |
RedEye | 16 | 0.5 |
HeadWear | 1 | 4.96 |
HeadWear | 16 | 1.27 |
EyeBrowEstimator | 1 | 6.27 |
EyeBrowEstimator | 8 | 5.14 |
EyeBrowEstimator | 16 | 4.89 |
EyeBrowEstimator | 32 | 4.79 |
Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson Xavier NX GPU. Estimation performance without batch interface"
Type | Average (ms) |
---|---|
EyesGaze | 3.7 |
Emotions | 6.8 |
Attributes | 17.36 |
Quality | 1.59 |
Warper | 9.82 |
Overlap | 3.56 |
PPE | 6.31 |
Glasses | 2.05 |
Mouth | 6.61 |
LivenessFlyingFaces | 8.46 |
LivenessRGBMEstimator | 28.1 |
LivenessFPR | 40.5 |
"Jetson Xavier NX GPU. Extractor performance"
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 58.2 |
Extractor | 57 | 16 | 38.1 |
Extractor | 58 | 1 | 58.0 |
Extractor | 58 | 16 | 38.1 |
Extractor | 59 | 1 | 58.0 |
Extractor | 59 | 16 | 38.0 |
Extractor | 102 | 1 | 10.7 |
Extractor | 102 | 16 | 1.0 |
Extractor | 103 | 1 | 28.4 |
Extractor | 103 | 16 | 41.3 |
Extractor | 104 | 1 | 9.8 |
Extractor | 104 | 16 | 3.6 |
Jetson Nano#
"Jetson Nano GPU. Detector performance"
Type | Batch Size | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 1749.35 |
Detector (minFaceSize=50) | 1 | 321.64 |
Detector (minFaceSize=90) | 1 | 117.22 |
Redetect | 1 | 18.2 |
"Jetson Nano GPU. HumanDetector performance"
Type | Batch Size | Average (ms) |
---|---|---|
HumanDetector (imageSize=320) | 1 | 60.89 |
HumanDetector (imageSize=320) | 4 | 58.35 |
HumanDetector (imageSize=640) | 1 | 188.86 |
HumanDetector (imageSize=640) | 4 | 189.72 |
HumanLandmarksDetector (imageSize=320) | 1 | 174.27 |
HumanLandmarksDetector (imageSize=320) | 4 | 148.13 |
HumanLandmarksDetector (imageSize=640) | 1 | 341.1 |
HumanLandmarksDetector (imageSize=640) | 4 | 252.7 |
HumanRedetect | 1 | 10.63 |
HumanRedetect | 4 | 7.5 |
Below are the measurements for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson Nano GPU. Estimation performance with batch interface"
Type | Batch Size | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 7.45 |
HeadPoseByImage | 4 | 4.08 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 3.37 |
Eyes (INFRA_RED, useStatusPlan=0) | 4 | 2.46 |
Eyes (RGB, useStatusPlan=0) | 1 | 6.85 |
Eyes (RGB, useStatusPlan=0) | 4 | 5.52 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 3.07 |
Eyes (INFRA_RED, useStatusPlan=1) | 4 | 2.42 |
Eyes (RGB, useStatusPlan=1) | 1 | 7.02 |
Eyes (RGB, useStatusPlan=1) | 4 | 5.53 |
Infra-Red | 1 | 10.1 |
Infra-Red | 4 | 8.89 |
AGS | 1 | 6.3 |
AGS | 4 | 3.87 |
Child | 1 | 59.89 |
Child | 4 | 48.3 |
BlackWhite | 1 | 5.8 |
BlackWhite | 4 | 3.1 |
BestShotQuality | 1 | 6.55 |
BestShotQuality | 4 | 4.05 |
MedicalMask | 1 | 26.38 |
MedicalMask | 4 | 19.45 |
LivenessOneShotRGBEstimator | 1 | 1120.7 |
LivenessOneShotRGBEstimator | 4 | 1110.2 |
Orientation | 1 | 113.0 |
Orientation | 4 | 106.1 |
CredibilityCheck | 1 | 271.18 |
CredibilityCheck | 4 | 226.63 |
FacialHair | 1 | 14.58 |
FacialHair | 4 | 14.37 |
PortraitStyle | 1 | 12.1 |
PortraitStyle | 4 | 10.0 |
Background | 1 | 12.2 |
Background | 4 | 9.7 |
NaturalLight | 1 | 28.45 |
NaturalLight | 4 | 10.86 |
FishEye | 1 | 19.17 |
FishEye | 16 | 17.55 |
RedEye | 1 | 6.7 |
RedEye | 16 | 4.4 |
HeadWear | 1 | 28.2 |
HeadWear | 16 | 11.5 |
EyeBrowEstimator | 1 | 46.87 |
EyeBrowEstimator | 4 | 46.53 |
Below are the measurements for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
"Jetson Nano GPU. Estimation performance without batch interface"
Type | Average (ms) |
---|---|
EyesGaze | 12.8 |
Emotions | 48.9 |
Attributes | 129.57 |
Quality | 3.98 |
Warper | 10.54 |
Overlap | 9.76 |
PPE | 30.43 |
Glasses | 10.71 |
Mouth | 38.38 |
LivenessFlyingFaces | 41.59 |
LivenessRGBMEstimator | 169.7 |
LivenessFPR | 201.0 |
"Jetson Nano GPU. Extractor performance"
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 58 | 1 | 442.35 |
Extractor | 58 | 4 | 403.95 |
Extractor | 59 | 1 | 428.35 |
Extractor | 59 | 4 | 411.47 |
Extractor | 102 | 1 | 26.17 |
Extractor | 102 | 4 | 9.9 |
Extractor | 103 | 1 | 254.11 |
Extractor | 103 | 4 | 215.07 |
Extractor | 104 | 1 | 39.97 |
Extractor | 104 | 4 | 31.63 |
Descriptor size#
The table below shows the size of serialized face descriptors, which can be used to estimate memory requirements.
"Descriptor size"
Face descriptor version | Data size (bytes) | Metadata size (bytes) | Total size (bytes) |
---|---|---|---|
CNN 54 | 512 | 8 | 520 |
CNN 56 | 512 | 8 | 520 |
CNN 57 | 512 | 8 | 520 |
CNN 58 | 512 | 8 | 520 |
CNN 59 | 512 | 8 | 520 |
The table below shows the size of serialized human descriptors, which can be used to estimate memory requirements. Human descriptors are used only for reidentification tasks.
"Human descriptor size (used only for reidentification tasks)"
Human descriptor version | Data size (bytes) | Metadata size (bytes) | Total size (bytes) |
---|---|---|---|
CNN 102 | 2048 | 8 | 2056 |
CNN 103 | 2048 | 8 | 2056 |
CNN 104 | 2048 | 8 | 2056 |
Metadata includes signature and version information that may be omitted during serialization if the NoSignature flag is specified.
When estimating individual descriptor size in memory or serialization storage requirements with default options, consider using values from the "Total size" column.
When estimating memory requirements for descriptor batches, use values from the "Data size" column instead, since a descriptor batch does not duplicate metadata per descriptor and thus is more memory-efficient.
These numbers are for approximate computation only, since they do not include overhead such as memory alignment for accelerated SIMD processing.
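The sizing guidance above amounts to simple multiplication. The Python sketch below uses the values from the two tables. The function and constant names are hypothetical, and the results are lower bounds because alignment and other overhead are not included.

```python
# Per-descriptor sizes in bytes, taken from the "Data size" and "Metadata size" columns.
DATA_BYTES = {"face": 512, "human": 2048}
METADATA_BYTES = 8

def individual_storage_bytes(count, kind="face", with_metadata=True):
    """Size of `count` descriptors stored one by one.

    Set with_metadata=False to model serialization with the NoSignature flag.
    """
    per_item = DATA_BYTES[kind] + (METADATA_BYTES if with_metadata else 0)
    return count * per_item

def batch_memory_bytes(count, kind="face"):
    """Size of a descriptor batch, which does not duplicate metadata per descriptor."""
    return count * DATA_BYTES[kind]

# One million face descriptors.
print(individual_storage_bytes(1_000_000))  # 520000000
print(batch_memory_bytes(1_000_000))        # 512000000
```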