Appendix A. Specifications#
Classification performance#
Classification performance was measured on two datasets:
- Cooperative dataset (containing 20K images from various sources obtained at several banks);
- Non cooperative dataset (containing 20K images).
The two tables below contain true positive rates corresponding to select false positive rates.
"Classification performance @ low FPR on cooperative dataset"
FPR | TPR CNN 58 | TPR CNN 59 | TPR CNN 59m | TPR CNN 60 | TPR CNN 60m | TPR CNN 62 |
---|---|---|---|---|---|---|
10^-7^ | 0.9910 | 0.9911 | 0.9876 | 0.9917 | 0.9660 | 0.9909 |
10^-6^ | 0.9916 | 0.9915 | 0.9904 | 0.9917 | 0.9824 | 0.9950 |
10^-5^ | 0.9918 | 0.9919 | 0.9915 | 0.9919 | 0.9889 | 0.9976 |
10^-4^ | 0.9919 | 0.9921 | 0.9919 | 0.9921 | 0.9909 | 0.9988 |
"Classification performance @ low FPR on non cooperative dataset"
FPR | TPR CNN 58 | TPR CNN 59 | TPR CNN 59m | TPR CNN 60 | TPR CNN 60m | TPR CNN 62 |
---|---|---|---|---|---|---|
10^-7^ | 0.9767 | 0.9832 | 0.9377 | 0.9893 | 0.8797 | 0.9916 |
10^-6^ | 0.9839 | 0.9880 | 0.9629 | 0.9914 | 0.9246 | 0.9917 |
10^-5^ | 0.9880 | 0.9908 | 0.9794 | 0.9914 | 0.9595 | 0.9918 |
10^-4^ | 0.9909 | 0.9924 | 0.9880 | 0.9925 | 0.9821 | 0.9920 |
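As an illustration of how a "TPR at a fixed FPR" figure of this kind is typically obtained, the sketch below picks the strictest threshold whose false positive rate does not exceed the target and then counts the genuine comparisons accepted at that threshold. This is a minimal sketch and not the evaluation procedure behind the tables above. The function name `tpr_at_fpr`, the "accept when score > threshold" rule and the toy scores are assumptions made for the example.

```python
import math


def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr):
    """Return (threshold, tpr) such that the measured FPR is <= target_fpr.

    A comparison is accepted when its score is strictly greater than the
    threshold (an assumption of this sketch).
    """
    impostor = sorted(impostor_scores, reverse=True)
    # Number of impostor comparisons that may be accepted at this FPR.
    allowed = math.floor(target_fpr * len(impostor))
    # Using the (allowed + 1)-th highest impostor score as the threshold
    # accepts at most `allowed` impostor comparisons.
    threshold = impostor[allowed] if allowed < len(impostor) else float("-inf")
    accepted = sum(1 for score in genuine_scores if score > threshold)
    return threshold, accepted / len(genuine_scores)


# Toy data: 4 genuine and 10 impostor similarity scores.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5, 0.6, 0.05, 0.15, 0.25, 0.35, 0.45]
print(tpr_at_fpr(genuine, impostor, 0.1))  # (0.5, 0.75)
```

Resolving a false positive rate as low as 10^-7^ this way requires on the order of 10^7^ impostor comparisons or more, since with fewer comparisons no impostor may be accepted at all.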
Runtime performance for CentOS Linux environment#
Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.
Input data characteristics:
- Image resolution: 1920x1080px;
- Image format: 24 BPP RGB;
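For orientation, the raw size of one input frame with these characteristics can be worked out directly. The sketch below assumes tightly packed pixels with no row padding, and the helper name `raw_image_bytes` is hypothetical.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed, tightly packed image in bytes."""
    return width * height * bits_per_pixel // 8


frame = raw_image_bytes(1920, 1080, 24)
print(frame)                         # 6220800
print(round(frame / 2**20, 2))       # 5.93 (MiB per frame)
print(round(8 * frame / 2**20, 2))   # 47.46 (MiB for a batch of 8 frames)
```

This covers only the input pixel data. The memory figures in the tables include model weights and intermediate buffers, so they are much larger.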
Performance measurements are presented for CPU, GPU and NPU execution modes in the tables below. Measured values are averages of at least 100 experiments.
Estimated memory consumption values are also presented for CPU and GPU. These values depend heavily on the input data and the experimental conditions.
The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.
Face detections are performed using FaceDetV3 NN.
All face detection and redetect measurements were performed with capture of bounding boxes and 5 facial landmarks.
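To reproduce averaged timings of this kind on other hardware, a simple harness along the following lines may be used. It is a minimal sketch, not the benchmark tool used for this appendix. `run_once` is a hypothetical callable that wraps a single SDK call, such as one detection on a prepared image, and the warm-up count is an assumption.

```python
import statistics
import time


def average_latency_ms(run_once, repeats=100, warmup=10):
    """Average wall-clock time of run_once() in milliseconds.

    Warm-up calls are executed first and excluded from the average so that
    one-time initialisation costs do not distort the result.
    """
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


# Usage with a stand-in workload instead of a real SDK call.
print(average_latency_ms(lambda: sum(range(10_000)), repeats=100))
```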
CPU performance#
Benchmarking for CPU was performed on the server with the following hardware configuration:
CPU:
- Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz;
- CPU(s): 40
- Thread(s) per core: 2
- Core(s) per socket: 10
- Socket(s): 2
- NUMA node(s): 2
- CPU with AVX2 instruction set was used
OS: CentOS Linux release 8.3.2011
RAM: 128 GB DDR4 (Clock Speed: 2133 MHz)
In the experiments listed in the tables below, face detection and descriptor extraction algorithms used all available CPU cores, whereas matching performance is specified per core.
Descriptor matching is only implemented on CPU.
CPU. Detector performance#
The table below shows the performance of FaceDetV3 Detector on the CPU.
Measurement | CPU threads | BatchSize | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 373.92 | 1889.0 |
Detector (minFaceSize=20) | 8 | 1 | 152.73 | 2076.0 |
Detector (minFaceSize=20) | 8 | 4 | 147.26 | 4411.0 |
Detector (minFaceSize=20) | 8 | 8 | 148.32 | 7329.0 |
Detector (minFaceSize=50) | 1 | 1 | 63.23 | 1261.0 |
Detector (minFaceSize=50) | 8 | 1 | 27.52 | 1482.0 |
Detector (minFaceSize=50) | 8 | 4 | 23.43 | 1810.0 |
Detector (minFaceSize=50) | 8 | 8 | 24.61 | 2358.0 |
Detector (minFaceSize=90) | 1 | 1 | 23.11 | 1184.0 |
Detector (minFaceSize=90) | 8 | 1 | 11.62 | 1364.0 |
Detector (minFaceSize=90) | 8 | 4 | 8.03 | 1470.0 |
Detector (minFaceSize=90) | 8 | 8 | 8.23 | 1748.0 |
Redetect | 1 | 1 | 0.63 | 1252.0 |
Redetect | 8 | 1 | 0.83 | 1284.0 |
Redetect | 8 | 4 | 0.32 | 1673.0 |
Redetect | 8 | 8 | 0.25 | 2153.0 |
FaceLandmarks5Detector | 1 | 1 | 0.22 | 1225.0 |
FaceLandmarks5Detector | 8 | 1 | 0.37 | 1225.0 |
FaceLandmarks5Detector | 8 | 8 | 0.09 | 1226.0 |
FaceLandmarks68Detector | 1 | 1 | 3.2 | 1226.0 |
FaceLandmarks68Detector | 8 | 1 | 2.0 | 1230.0 |
FaceLandmarks68Detector | 8 | 8 | 1.0 | 1237.0 |
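The effect of the thread count can be summarised as a speedup and a parallel efficiency. The sketch below uses the two `Detector (minFaceSize=20)` rows with batch size 1 from the FaceDetV3 table above. The helper names are hypothetical, and the calculation assumes that the two averages are directly comparable.

```python
def speedup(single_thread_ms, multi_thread_ms):
    """How many times faster the multi-threaded run is."""
    return single_thread_ms / multi_thread_ms


def parallel_efficiency(single_thread_ms, multi_thread_ms, threads):
    """Speedup divided by the number of threads (1.0 means ideal scaling)."""
    return speedup(single_thread_ms, multi_thread_ms) / threads


# 1 thread: 373.92 ms, 8 threads: 152.73 ms (batch size 1).
s = speedup(373.92, 152.73)
e = parallel_efficiency(373.92, 152.73, 8)
print(f"speedup: {s:.2f}x, efficiency: {e:.0%}")  # speedup: 2.45x, efficiency: 31%
```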
The table below shows the performance of FaceDetV3m Detector on the CPU.
Type | CPU threads | Batch Size | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 193.29 | 2137.0 |
Detector (minFaceSize=20) | 8 | 1 | 93.75 | 2389.0 |
Detector (minFaceSize=20) | 8 | 8 | 85.35 | 5839.0 |
Detector (minFaceSize=50) | 1 | 1 | 32.78 | 1695.0 |
Detector (minFaceSize=50) | 8 | 1 | 17.18 | 1862.0 |
Detector (minFaceSize=50) | 8 | 8 | 14.58 | 2593.0 |
Detector (minFaceSize=90) | 1 | 1 | 12.45 | 1666.0 |
Detector (minFaceSize=90) | 8 | 1 | 7.96 | 1864.0 |
Detector (minFaceSize=90) | 8 | 8 | 4.98 | 2098.0 |
Redetect | 1 | 1 | 0.59 | 1690.0 |
Redetect | 8 | 1 | 0.78 | 1739.0 |
Redetect | 8 | 8 | 0.25 | 2325.0 |
Landmarks5Detector | 1 | 1 | 0.18 | 1702.0 |
Landmarks5Detector | 8 | 1 | 0.31 | 1722.0 |
Landmarks5Detector | 8 | 8 | 0.08 | 1724.0 |
Landmarks68Detector | 1 | 1 | 3.18 | 1703.0 |
Landmarks68Detector | 8 | 1 | 2.02 | 1749.0 |
Landmarks68Detector | 8 | 8 | 1.0 | 1763.0 |
CPU. HumanDetector performance#
The table below shows the performance of HumanDetector on the CPU.
Measurement | CPU threads | BatchSize | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
HumanDetector (resize to 320) | 1 | 1 | 10.05 | 1740.0 |
HumanDetector (resize to 320) | 8 | 1 | 6.18 | 1813.0 |
HumanDetector (resize to 320) | 8 | 8 | 3.53 | 1978.0 |
HumanDetector (resize to 640) | 1 | 1 | 35.03 | 1776.0 |
HumanDetector (resize to 640) | 8 | 1 | 14.71 | 1865.0 |
HumanDetector (resize to 640) | 8 | 8 | 11.55 | 2234.0 |
HumanRedetect | 1 | 1 | 2.61 | 1239.0 |
HumanRedetect | 8 | 1 | 2.76 | 1545.0 |
HumanRedetect | 8 | 4 | 1.24 | 1770.0 |
HumanRedetect | 8 | 8 | 1.26 | 1987.0 |
CPU. HumanFaceDetector performance#
The table below shows the performance of HumanFaceDetector on the CPU.
Measurement | CPU threads | BatchSize | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
HumanFaceDetector (minFaceSize=20) | 1 | 1 | 425.37 | 2558 |
HumanFaceDetector (minFaceSize=20) | 8 | 1 | 183.5 | 2600 |
HumanFaceDetector (minFaceSize=20) | 8 | 8 | 182.35 | 9340 |
HumanFaceDetector (minFaceSize=50) | 1 | 1 | 66.97 | 1783 |
HumanFaceDetector (minFaceSize=50) | 8 | 1 | 28.9 | 1812 |
HumanFaceDetector (minFaceSize=50) | 8 | 8 | 29.17 | 2900 |
HumanFaceDetector (minFaceSize=90) | 1 | 1 | 22.6 | 1734 |
HumanFaceDetector (minFaceSize=90) | 8 | 1 | 10.71 | 1758 |
HumanFaceDetector (minFaceSize=90) | 8 | 8 | 9.17 | 2072 |
CPU. HeadDetector performance#
The table below shows the performance of HeadDetector on the CPU.
Measurement | CPU threads | BatchSize | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
HeadDetector (minHeadSize=20) | 1 | 1 | 361.9 | 2374 |
HeadDetector (minHeadSize=20) | 8 | 1 | 158.72 | 2423 |
HeadDetector (minHeadSize=20) | 8 | 8 | 153.9 | 7710 |
HeadDetector (minHeadSize=50) | 1 | 1 | 59.02 | 1754 |
HeadDetector (minHeadSize=50) | 8 | 1 | 26.18 | 1801 |
HeadDetector (minHeadSize=50) | 8 | 8 | 25.43 | 2797 |
HeadDetector (minHeadSize=90) | 1 | 1 | 21.08 | 1731 |
HeadDetector (minHeadSize=90) | 8 | 1 | 10.9 | 1778 |
HeadDetector (minHeadSize=90) | 8 | 8 | 8.34 | 2168 |
CPU. Estimations performance with batch interface#
The table below shows the performance of Estimations on the CPU for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | CPU threads | BatchSize | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1 | 0.6 | 1184.0 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 1 | 0.4 | 1204.0 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 8 | 0.3 | 1202.0 |
Eyes (RGB, useStatusPlan=0) | 1 | 1 | 1.2 | 1237.0 |
Eyes (RGB, useStatusPlan=0) | 8 | 1 | 0.8 | 1259.0 |
Eyes (RGB, useStatusPlan=0) | 8 | 8 | 0.5 | 1258.0 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1 | 0.6 | 1187.0 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 1 | 0.4 | 1207.0 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 8 | 0.3 | 1205.0 |
Eyes (RGB, useStatusPlan=1) | 1 | 1 | 1.1 | 1241.0 |
Eyes (RGB, useStatusPlan=1) | 8 | 1 | 0.8 | 1257.0 |
Eyes (RGB, useStatusPlan=1) | 8 | 8 | 0.5 | 1255.0 |
Infra-Red | 1 | 1 | 2 | 1191.0 |
Infra-Red | 8 | 1 | 1.0 | 1209.0 |
Infra-Red | 8 | 8 | 0.7 | 1218.0 |
AGS | 1 | 1 | 0.24 | 1735.0 |
AGS | 8 | 1 | 0.15 | 1763.0 |
AGS | 8 | 8 | 0.08 | 1804.0 |
HeadPoseByImage | 1 | 1 | 0.24 | 1648.0 |
HeadPoseByImage | 8 | 1 | 0.15 | 1672.0 |
HeadPoseByImage | 8 | 8 | 0.06 | 1712.0 |
Warper | 1 | 1 | 2.1 | 1180.0 |
Warper | 8 | 1 | 2.2 | 1219.0 |
Warper | 8 | 8 | 0.9 | 1230.0 |
BlackWhite | 1 | 1 | 1.3 | 1249.0 |
BlackWhite | 8 | 1 | 0.7 | 1265.0 |
BlackWhite | 8 | 8 | 1.2 | 1263.0 |
BestShotQuality | 1 | 1 | 0.3 | 1238.0 |
BestShotQuality | 8 | 1 | 0.2 | 1259.0 |
BestShotQuality | 8 | 8 | 0.08 | 1299.0 |
MedicalMask | 1 | 1 | 5.6 | 1258.0 |
MedicalMask | 8 | 1 | 3.2 | 1287.0 |
MedicalMask | 8 | 8 | 2.8 | 1318.0 |
LivenessOneShotRGBEstimator | 1 | 1 | 174.91 | 1925.0 |
LivenessOneShotRGBEstimator | 8 | 1 | 51.27 | 1964.0 |
LivenessOneShotRGBEstimator | 8 | 8 | 54.2 | 2471.0 |
Orientation | 1 | 1 | 5.06 | 1609.0 |
Orientation | 8 | 1 | 3.33 | 1682.0 |
Orientation | 8 | 8 | 1.86 | 1875.0 |
CredibilityCheck | 1 | 1 | 120.3 | 1332.0 |
CredibilityCheck | 8 | 1 | 35.1 | 1351.0 |
CredibilityCheck | 8 | 8 | 34.1 | 1558.0 |
FacialHair | 1 | 1 | 12.86 | 1751.0 |
FacialHair | 8 | 1 | 4.84 | 1770.0 |
FacialHair | 8 | 8 | 4.24 | 1794.0 |
PortraitStyle | 1 | 1 | 1.54 | 1738.0 |
PortraitStyle | 8 | 1 | 2.2 | 1846.0 |
PortraitStyle | 8 | 8 | 0.95 | 1915.0 |
Background | 1 | 1 | 1.1 | 1239.0 |
Background | 8 | 1 | 1.2 | 1258.0 |
Background | 8 | 8 | 1.7 | 1305.0 |
NaturalLight | 1 | 1 | 2.37 | 1250.0 |
NaturalLight | 8 | 1 | 1.49 | 1267.0 |
NaturalLight | 8 | 8 | 1.97 | 1276.0 |
FishEye | 1 | 1 | 12.8 | 1747.0 |
FishEye | 8 | 1 | 4.8 | 1747.0 |
FishEye | 8 | 8 | 0.6 | 1771.0 |
RedEye | 1 | 1 | 5.7 | 1241.0 |
RedEye | 8 | 1 | 1.9 | 1260.0 |
RedEye | 8 | 8 | 1.6 | 1264.0 |
HeadWear | 1 | 1 | 4.09 | 1742.0 |
HeadWear | 8 | 1 | 2.63 | 1769.0 |
HeadWear | 8 | 8 | 1.2 | 1773.0 |
EyeBrowEstimator | 1 | 1 | 13.06 | 1751.0 |
EyeBrowEstimator | 8 | 1 | 4.82 | 1769.0 |
EyeBrowEstimator | 8 | 8 | 4.27 | 1781.0 |
HumanAttributeEstimator | 1 | 1 | 11.93 | 1624.0 |
HumanAttributeEstimator | 8 | 1 | 5.83 | 1651.0 |
HumanAttributeEstimator | 8 | 8 | 3.78 | 1699.0 |
Mouth | 1 | 1 | 6.64 | 1252.0 |
Mouth | 8 | 1 | 2.64 | 1271.0 |
Mouth | 8 | 8 | 2.12 | 1290.0 |
CrowdEstimator (Single, minHeadSize=6) | 1 | 1 | 2869.16 | 2383.0 |
CrowdEstimator (Single, minHeadSize=6) | 8 | 1 | 821.13 | 2409.0 |
CrowdEstimator (Single, minHeadSize=6) | 8 | 8 | 599.11 | 7156.0 |
CrowdEstimator (Single, minHeadSize=12) | 1 | 1 | 708.91 | 1902.0 |
CrowdEstimator (Single, minHeadSize=12) | 8 | 1 | 211.97 | 1920.0 |
CrowdEstimator (Single, minHeadSize=12) | 8 | 8 | 148.91 | 3156.0 |
CrowdEstimator (TwoNets, minHeadSize=6) | 1 | 1 | 2945.1 | 2606.0 |
CrowdEstimator (TwoNets, minHeadSize=6) | 8 | 1 | 832.58 | 2641.0 |
CrowdEstimator (TwoNets, minHeadSize=6) | 8 | 8 | 626.34 | 8823.0 |
CrowdEstimator (TwoNets, minHeadSize=12) | 1 | 1 | 735.29 | 2059.0 |
CrowdEstimator (TwoNets, minHeadSize=12) | 8 | 1 | 214.3 | 2101.0 |
CrowdEstimator (TwoNets, minHeadSize=12) | 8 | 8 | 160.87 | 3812.0 |
DynamicRange | 1 | 1 | 1.49 | 1721.0 |
DynamicRange | 8 | 1 | 1.61 | 1749.0 |
DynamicRange | 8 | 8 | 0.81 | 1793.0 |
LivenessDepthRGB | 1 | 1 | 8.06 | 1757.0 |
LivenessDepthRGB | 8 | 1 | 4.13 | 1796.0 |
LivenessDepthRGB | 8 | 8 | 2.96 | 1839.0 |
Glasses | 1 | 1 | 0.86 | 1743.0 |
Glasses | 8 | 1 | 1.01 | 1768.0 |
Glasses | 8 | 8 | 0.42 | 1768.0 |
DeepFake | 1 | 1 | 235.39 | 1993.0 |
DeepFake | 8 | 1 | 78.87 | 2123.0 |
DeepFake | 8 | 8 | 81.86 | 2457.0 |
NIRLivenessEstimator | 1 | 1 | 15.49 | 1625.0 |
NIRLivenessEstimator | 8 | 1 | 10.05 | 1639.0 |
NIRLivenessEstimator | 8 | 8 | 9.47 | 1747.0 |
LivenessRGBMEstimator | 1 | 1 | 27.89 | 1749.0 |
LivenessRGBMEstimator | 8 | 1 | 9.8 | 1766.0 |
LivenessRGBMEstimator | 8 | 8 | 10.94 | 2175.0 |
CPU. Estimations performance without batch interface#
The table below shows the performance of Estimations on the CPU for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | CPU threads | Average (ms) | RAM Memory (Mb) |
---|---|---|---|
EyesGaze | 1 | 2.2 | 1250 |
EyesGaze | 8 | 1.4 | 1270 |
Emotions | 1 | 13.6 | 1262 |
Emotions | 8 | 4.9 | 1275 |
Attributes | 1 | 64.18 | 1789.0 |
Attributes | 8 | 23.29 | 1809.0 |
Attributes | 8 | 18.19 | 2086.0 |
Quality | 1 | 1.2 | 1178 |
Quality | 8 | 0.6 | 1220 |
Overlap | 1 | 4.5 | 1248 |
Overlap | 8 | 1.3 | 1267 |
PPE | 1 | 10.04 | 1695 |
PPE | 8 | 4.71 | 1718 |
LivenessFlyingFaces | 1 | 15.07 | 1804 |
LivenessFlyingFaces | 8 | 7.21 | 1913 |
LivenessFPR | 1 | 44.2 | 1263 |
LivenessFPR | 8 | 19.9 | 1293 |
Fights | 1 | 250.26 | 1876 |
Fights | 8 | 63.9 | 1895 |
CPU. Extractor performance#
The table below shows the performance of Extractor on the CPU.
Model | CPU threads | Batch Size | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
58 | 1 | 1 | 219.3 | 1470 |
58 | 8 | 8 | 58.0 | 1543 |
59 | 1 | 1 | 219.7 | 1473 |
59 | 8 | 8 | 58.2 | 1550 |
60 | 1 | 1 | 258.0 | 1473 |
60 | 8 | 8 | 51.1 | 1550 |
62 | 1 | 1 | 254.36 | 2007 |
62 | 8 | 1 | 67.54 | 2008 |
62 | 8 | 8 | 71.48 | 2025 |
105 | 1 | 1 | 1.66 | 1604 |
105 | 8 | 8 | 0.71 | 1657 |
106 | 1 | 1 | 140.76 | 1892 |
106 | 8 | 8 | 39.01 | 1954 |
107 | 1 | 1 | 12.0 | 1637 |
107 | 8 | 8 | 3.7 | 1723 |
108 | 1 | 1 | 1.69 | 1606 |
108 | 8 | 8 | 0.72 | 1671 |
109 | 1 | 1 | 133.7 | 1822 |
109 | 8 | 8 | 37.33 | 1889 |
110 | 1 | 1 | 15.53 | 1640 |
110 | 8 | 8 | 5.39 | 1733 |
112 | 1 | 1 | 112.33 | 1823.0 |
112 | 8 | 1 | 39.73 | 1839.0 |
112 | 8 | 8 | 32.95 | 1884.0 |
113 | 1 | 1 | 15.17 | 1640.0 |
113 | 8 | 1 | 6.57 | 1656.0 |
113 | 8 | 8 | 4.7 | 1727.0 |
CPU. Matcher performance#
The table below shows the performance of Matcher on the CPU. The table lists the average number of matches per second for descriptors obtained using the following CNN model versions:
- face descriptors: 59, 60, 62
- human body descriptors: 105, 106, 107, 108, 109, 110, 112, 113
Model | CPU threads | Batch Size | Average (matches/sec) | RAM Memory (Mb) |
---|---|---|---|---|
58 | 1 | 1000 | 28 M | 15.0 |
59 | 1 | 1000 | 28 M | 15.0 |
60 | 1 | 1000 | 28 M | 15.0 |
62 | 1 | 1000 | 28 M | 15.0 |
105 | 1 | 1000 | 27.78 M | 113 |
106 | 1 | 1000 | 28.67 M | 112 |
107 | 1 | 1000 | 27.34 M | 113 |
108 | 1 | 1000 | 31.89 M | 117 |
109 | 1 | 1000 | 29.23 M | 114 |
110 | 1 | 1000 | 27.41 M | 112 |
112 | 1 | 1000 | 30 M | 109.0 |
113 | 1 | 1000 | 28.32 M | 112.0 |
Note: The values above are the maximum matcher performance on this particular hardware. In general, performance does not depend on the batch size, but it may be limited by memory performance at large batch sizes.
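A throughput figure in matches per second can be turned into a rough estimate of the time needed to compare one descriptor against a whole gallery. The sketch below is a back-of-the-envelope calculation using the 28 M matches/sec figure from the table. The function name is hypothetical, and linear scaling across threads is an assumption that memory bandwidth may not allow in practice.

```python
def exhaustive_search_ms(gallery_size, matches_per_sec=28e6, threads=1):
    """Estimated time in milliseconds to match one descriptor against
    every descriptor in a gallery, assuming ideal scaling across threads."""
    return gallery_size / (matches_per_sec * threads) * 1000.0


# One probe against a gallery of one million descriptors on a single core.
print(round(exhaustive_search_ms(1_000_000), 1))  # 35.7
```

The estimate is a lower bound on search time, since it ignores result sorting and any memory-bound slowdown.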
GPU performance#
Benchmarking for GPU was performed on the following hardware configuration:
GPU: NVIDIA Tesla T4.
OS: CentOS Linux release 8.3.2011
GPU. Detector performance#
The table below shows the performance of FaceDetV3 Detector on the GPU.
Measurement | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
Detector (minFaceSize=20) | 1 | 29.02 | 1485.0 | 1663.0 |
Detector (minFaceSize=20) | 4 | 34.37 | 3611.0 | 1691.0 |
Detector (minFaceSize=20) | 8 | 38.09 | 6539.0 | 1741.0 |
Detector (minFaceSize=50) | 1 | 7.46 | 847.0 | 1653.0 |
Detector (minFaceSize=50) | 4 | 6.56 | 1207.0 | 1682.0 |
Detector (minFaceSize=50) | 8 | 6.24 | 1779.0 | 1702.0 |
Detector (minFaceSize=90) | 1 | 4.95 | 835.0 | 1655.0 |
Detector (minFaceSize=90) | 4 | 3.44 | 907.0 | 1669.0 |
Detector (minFaceSize=90) | 8 | 3.17 | 1381.0 | 1694.0 |
Redetect | 1 | 2.52 | 847.0 | 1657.0 |
Redetect | 4 | 1.64 | 1207.0 | 1660.0 |
Redetect | 8 | 1.47 | 1779.0 | 1663.0 |
Redetect | 16 | 1.38 | 2781.0 | 1667.0 |
FaceLandmarks5Detector | 1 | 2.33 | 821.0 | 1651.0 |
FaceLandmarks5Detector | 8 | 0.32 | 821.0 | 1651.0 |
FaceLandmarks5Detector | 16 | 0.17 | 821.0 | 1657.0 |
FaceLandmarks68Detector | 1 | 2.6 | 821.0 | 1669.0 |
FaceLandmarks68Detector | 8 | 1.5 | 821.0 | 1668.3 |
FaceLandmarks68Detector | 16 | 1.4 | 949.0 | 1663.0 |
The table below shows the performance of FaceDetV3m Detector on the GPU.
Type | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
Detector (minFaceSize=20) | 1 | 16.06 | 1245.0 | 1675.0 |
Detector (minFaceSize=20) | 4 | 16.15 | 2633.0 | 1712.0 |
Detector (minFaceSize=20) | 8 | 20.64 | 4573.0 | 1768.0 |
Detector (minFaceSize=50) | 1 | 5.52 | 857.0 | 1670.0 |
Detector (minFaceSize=50) | 4 | 4.02 | 1097.0 | 1697.0 |
Detector (minFaceSize=50) | 8 | 3.79 | 1527.0 | 1738.0 |
Detector (minFaceSize=90) | 1 | 4.41 | 809.0 | 1675.0 |
Detector (minFaceSize=90) | 4 | 2.6 | 905.0 | 1704.0 |
Detector (minFaceSize=90) | 8 | 2.32 | 1025.0 | 1724.0 |
Redetect | 1 | 2.69 | 857.0 | 1683.0 |
Redetect | 8 | 1.48 | 1527.0 | 1694.0 |
Redetect | 16 | 1.35 | 2199.0 | 1697.0 |
Landmarks5Detector | 1 | 2.39 | 857.0 | 1677.0 |
Landmarks5Detector | 8 | 0.33 | 857.0 | 1684.0 |
Landmarks5Detector | 16 | 0.18 | 857.0 | 1685.0 |
Landmarks68Detector | 1 | 2.85 | 857.0 | 1684.0 |
Landmarks68Detector | 8 | 0.44 | 857.0 | 1673.0 |
Landmarks68Detector | 16 | 0.26 | 889.0 | 1692.0 |
GPU. HumanDetector performance#
The table below shows the performance of HumanDetector on the GPU.
Measurement | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
HumanDetector (resize to 320) | 1 | 4.17 | 779.0 | 1778.0 |
HumanDetector (resize to 320) | 4 | 2.46 | 819.0 | 1792.0 |
HumanDetector (resize to 320) | 8 | 2.17 | 909.0 | 1815.0 |
HumanDetector (resize to 640) | 1 | 5.42 | 827.0 | 1784.0 |
HumanDetector (resize to 640) | 4 | 4.14 | 1013.0 | 1796.0 |
HumanDetector (resize to 640) | 8 | 3.92 | 1371.0 | 1824.0 |
HumanRedetect | 1 | 2.74 | 789.0 | 1696.0 |
HumanRedetect | 4 | 1.67 | 1013.0 | 1695.0 |
HumanRedetect | 8 | 1.47 | 1251.0 | 1689.0 |
HumanRedetect | 16 | 1.4 | 1867.0 | 1709.0 |
GPU. HeadDetector performance#
The table below shows the performance of HeadDetector on the GPU.
Measurement | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
HeadDetector (minHeadSize=20) | 1 | 28.29 | 1533 | 1679 |
HeadDetector (minHeadSize=20) | 4 | 33.81 | 3691 | 1698 |
HeadDetector (minHeadSize=20) | 8 | 37.59 | 6623 | 1755 |
HeadDetector (minHeadSize=50) | 1 | 7.07 | 915 | 1668 |
HeadDetector (minHeadSize=50) | 4 | 6.42 | 1255 | 1687 |
HeadDetector (minHeadSize=50) | 8 | 6.11 | 1827 | 1713 |
HeadDetector (minHeadSize=90) | 1 | 4.75 | 915 | 1674 |
HeadDetector (minHeadSize=90) | 4 | 3.27 | 955 | 1684 |
HeadDetector (minHeadSize=90) | 8 | 3.03 | 1129 | 1714 |
GPU. HumanFace detector performance#
The table below shows the performance of HumanFaceDetector on the GPU.
Measurement | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
HumanFaceDetector (minFaceSize=20) | 1 | 34.1 | 1675.0 | 1703.0 |
HumanFaceDetector (minFaceSize=20) | 4 | 42.6 | 4415.0 | 1774.0 |
HumanFaceDetector (minFaceSize=20) | 8 | 50.32 | 8041.0 | 1889.0 |
HumanFaceDetector (minFaceSize=50) | 1 | 7.99 | 903.0 | 1674.0 |
HumanFaceDetector (minFaceSize=50) | 4 | 7.15 | 1487.0 | 1706.0 |
HumanFaceDetector (minFaceSize=50) | 8 | 6.83 | 2067.0 | 1764.0 |
HumanFaceDetector (minFaceSize=90) | 1 | 5.3 | 903.0 | 1672.0 |
HumanFaceDetector (minFaceSize=90) | 4 | 3.52 | 929.0 | 1685.0 |
HumanFaceDetector (minFaceSize=90) | 8 | 3.24 | 1125.0 | 1719.0 |
GPU. Estimations performance with batch interface#
The table below shows the performance of Estimations on the GPU for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
HeadPoseByImage | 1 | 2.26 | 785.0 | 1692.0 |
HeadPoseByImage | 16 | 1.45 | 881.0 | 1775.0 |
HeadPoseByImage | 32 | 1.42 | 975.0 | 1873.0 |
Warper | 1 | 0.11 | 739.0 | 1672.0 |
Warper | 32 | 0.03 | 931.0 | 1672.0 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 0.65 | 811.0 | 1668.0 |
Eyes (INFRA_RED, useStatusPlan=0) | 16 | 0.23 | 811.0 | 1667.0 |
Eyes (INFRA_RED, useStatusPlan=0) | 32 | 0.2 | 811.0 | 1674.0 |
Eyes (RGB, useStatusPlan=0) | 1 | 1.19 | 821.0 | 1681.0 |
Eyes (RGB, useStatusPlan=0) | 16 | 0.44 | 821.0 | 1669.0 |
Eyes (RGB, useStatusPlan=0) | 32 | 0.43 | 853.0 | 1683.0 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 0.64 | 811.0 | 1666.0 |
Eyes (INFRA_RED, useStatusPlan=1) | 16 | 0.23 | 811.0 | 1678.0 |
Eyes (INFRA_RED, useStatusPlan=1) | 32 | 0.2 | 811.0 | 1672.0 |
Eyes (RGB, useStatusPlan=1) | 1 | 0.66 | 821.0 | 1671.0 |
Eyes (RGB, useStatusPlan=1) | 16 | 0.24 | 821.0 | 1673.0 |
Eyes (RGB, useStatusPlan=1) | 32 | 0.23 | 853.0 | 1680.0 |
Infra-Red | 1 | 1.11 | 811.0 | 1666.0 |
Infra-Red | 32 | 0.54 | 811.0 | 1679.0 |
AGS | 1 | 2.28 | 899.0 | 1689.0 |
AGS | 16 | 1.42 | 899.0 | 1777.0 |
AGS | 32 | 1.39 | 1089.0 | 1874.0 |
BlackWhite | 1 | 1.05 | 821.0 | 1676.0 |
BlackWhite | 16 | 0.4 | 853.0 | 1677.0 |
BestShotQuality | 1 | 2.31 | 821.0 | 1677.0 |
BestShotQuality | 16 | 1.45 | 917.0 | 1765.0 |
MedicalMask | 1 | 5.01 | 821.0 | 1702.0 |
MedicalMask | 16 | 1.69 | 917.0 | 1791.0 |
LivenessOneShotRGBEstimator | 1 | 14.48 | 1111.0 | 1858.0 |
LivenessOneShotRGBEstimator | 8 | 11.89 | 1747.0 | 1860.0 |
LivenessOneShotRGBEstimator | 16 | 11.69 | 2423.0 | 1860.0 |
Orientation | 1 | 3.12 | 799.0 | 1670.0 |
Orientation | 16 | 1.73 | 963.0 | 1664.0 |
Orientation | 32 | 1.69 | 1141.0 | 1669.0 |
CredibilityCheck | 1 | 5.54 | 947.0 | 1774.0 |
CredibilityCheck | 16 | 3.72 | 1339.0 | 1771.0 |
FacialHair | 1 | 1.86 | 853.0 | 1687.0 |
FacialHair | 16 | 0.32 | 853.0 | 1683.0 |
FacialHair | 32 | 0.28 | 853.0 | 1685.0 |
PortraitStyle | 1 | 2.84 | 895.0 | 1671.0 |
PortraitStyle | 16 | 1.51 | 915.0 | 1770.0 |
PortraitStyle | 32 | 1.48 | 1085.0 | 1861.0 |
Background | 1 | 2.6 | 821.0 | 1679.0 |
Background | 16 | 1.5 | 917.0 | 1770.0 |
NaturalLight | 1 | 3.61 | 853.0 | 1692.0 |
NaturalLight | 16 | 0.27 | 853.0 | 1695.0 |
FishEye | 1 | 2.37 | 895.0 | 1692.0 |
FishEye | 16 | 0.14 | 895.0 | 1694.0 |
RedEye | 1 | 1.1 | 821.0 | 1675.0 |
RedEye | 16 | 0.15 | 821.0 | 1675.0 |
HeadWear | 1 | 4.14 | 853.0 | 1696.0 |
HeadWear | 16 | 0.36 | 853.0 | 1699.0 |
HeadWear | 32 | 0.27 | 853.0 | 1697.0 |
EyeBrowEstimator | 1 | 2.56 | 895.0 | 1694.0 |
EyeBrowEstimator | 16 | 0.8 | 895.0 | 1693.0 |
EyeBrowEstimator | 32 | 0.76 | 803.0 | 1079.0 |
HumanAttributeEstimator | 1 | 5.53 | 853.0 | 1691.0 |
HumanAttributeEstimator | 16 | 0.57 | 853.0 | 1722.0 |
Mouth | 1 | 4.03 | 853.0 | 1690.0 |
Mouth | 16 | 0.42 | 949.0 | 1691.0 |
Mouth | 32 | 0.37 | 1043.0 | 1690.0 |
Glasses | 1 | 1.41 | 901.0 | 1695.0 |
Glasses | 16 | 0.2 | 901.0 | 1689.0 |
Glasses | 32 | 0.16 | 901.0 | 1686.0 |
CrowdEstimator (Single, minHeadSize=6) | 1 | 72.73 | 1617.0 | 1807.0 |
CrowdEstimator (Single, minHeadSize=6) | 4 | 74.16 | 3169.0 | 1816.0 |
CrowdEstimator (Single, minHeadSize=6) | 8 | 74.98 | 3317.0 | 1847.0 |
CrowdEstimator (Single, minHeadSize=12) | 1 | 24.79 | 1493.0 | 1804.0 |
CrowdEstimator (Single, minHeadSize=12) | 4 | 25.02 | 2123.0 | 1807.0 |
CrowdEstimator (Single, minHeadSize=12) | 8 | 24.75 | 2857.0 | 1847.0 |
CrowdEstimator (TwoNets, minHeadSize=6) | 1 | 82.54 | 1831.0 | 1816.0 |
CrowdEstimator (TwoNets, minHeadSize=6) | 4 | 83.21 | 4123.0 | 1838.0 |
CrowdEstimator (TwoNets, minHeadSize=6) | 8 | 84.62 | 5056.0 | 1901.0 |
CrowdEstimator (TwoNets, minHeadSize=12) | 1 | 29.07 | 1659.0 | 1812.0 |
CrowdEstimator (TwoNets, minHeadSize=12) | 4 | 25.8 | 2653.0 | 1827.0 |
CrowdEstimator (TwoNets, minHeadSize=12) | 8 | 26.5 | 3480.0 | 1854.0 |
DeepFake | 1 | 17.3 | 1005.0 | 1955.0 |
DeepFake | 16 | 14.49 | 1889.0 | 2046.0 |
DeepFake | 32 | 14.46 | 2951.0 | 2140.0 |
LivenessDepthRGB | 1 | 4.79 | 931.0 | 1717.0 |
LivenessDepthRGB | 16 | 3.91 | 975.0 | 1809.0 |
LivenessDepthRGB | 32 | 3.9 | 1127.0 | 1914.0 |
NIRLivenessEstimator | 1 | 8.36 | 817.0 | 1677.0 |
NIRLivenessEstimator | 16 | 7.74 | 915.0 | 1775.0 |
NIRLivenessEstimator | 32 | 7.65 | 1043.0 | 1878.0 |
LivenessRGBMEstimator | 1 | 6.51 | 899.0 | 1684.0 |
LivenessRGBMEstimator | 16 | 4.11 | 1609.0 | 1828.0 |
LivenessRGBMEstimator | 32 | 4.3 | 2365.0 | 1967.0 |
Attributes | 1 | 4.13 | 901.0 | 1744.0 |
Attributes | 16 | 2.05 | 1403.0 | 1753.0 |
Attributes | 32 | 1.97 | 1951.0 | 1753.0 |
GPU. Estimations performance without batch interface#
The table below shows the performance of Estimations on the GPU for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|
EyesGaze | 1.65 | 821 | 1675 |
Emotions | 1.99 | 821 | 1689 |
Quality | 0.98 | 731 | 1665 |
Overlap | 1.23 | 821 | 1688 |
PPE | 4.66 | 869 | 1696 |
LivenessFlyingFaces | 6.39 | 927 | 1694 |
LivenessFPR | 12.56 | 885 | 1697 |
Fights | 14.56 | 1093 | 1874 |
GPU. Extractor performance#
The table below shows the performance of Extractor on the GPU.
Model | Batch Size | Average (ms) | GPU Memory (Mb) | RAM Memory (Mb) |
---|---|---|---|---|
58 | 1 | 10.2 | 989.0 | 1835 |
58 | 16 | 6.4 | 1781.0 | 1825 |
59 | 1 | 10.2 | 929.0 | 1833 |
59 | 16 | 6.4 | 1341.0 | 1837 |
60 | 1 | 16.0 | 931.0 | 1840 |
60 | 16 | 8.9 | 1343.0 | 1845 |
62 | 1 | 11.23 | 1043.0 | 2009.0 |
62 | 8 | 7.81 | 1227.0 | 2006.0 |
62 | 16 | 7.75 | 1437.0 | 2016.0 |
105 | 1 | 3.48 | 785 | 1664 |
105 | 16 | 0.3 | 815 | 1673 |
106 | 1 | 6.28 | 973 | 1893 |
106 | 16 | 9.38 | 1371 | 1894 |
107 | 1 | 3.41 | 807 | 1698 |
107 | 16 | 0.59 | 911 | 1696 |
108 | 1 | 3.47 | 785 | 1654 |
108 | 16 | 0.3 | 815 | 1672 |
109 | 1 | 6.22 | 933 | 1833 |
109 | 16 | 7.83 | 1261 | 1833 |
110 | 1 | 3.38 | 809 | 1693 |
110 | 16 | 0.76 | 939 | 1693 |
112 | 1 | 6.52 | 901.0 | 1836.0 |
112 | 8 | 3.71 | 1029.0 | 1834.0 |
112 | 16 | 3.57 | 1209.0 | 1835.0 |
113 | 1 | 3.13 | 809.0 | 1696.0 |
113 | 8 | 0.82 | 873.0 | 1697.0 |
113 | 16 | 0.68 | 937.0 | 1703.0 |
NPU Performance#
Benchmarking for NPU was performed on the server with the following hardware configuration:
NPU: Huawei Atlas 300I (inference card).
OS: Ubuntu 18.04
CPU: Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz x 48
RAM: 64GB
NPU. Detector performance#
The table below shows the performance of Detector on the NPU.
Measurement | BatchSize | Average (ms) |
---|---|---|
Detector (minFaceSize=20) | 1 | 24.4 |
Detector (minFaceSize=20) | 4 | 18.01 |
Detector (minFaceSize=20) | 8 | 17.73 |
Detector (minFaceSize=50) | 1 | 24.53 |
Detector (minFaceSize=50) | 4 | 18.0 |
Detector (minFaceSize=50) | 8 | 17.74 |
Detector (minFaceSize=90) | 1 | 24.44 |
Detector (minFaceSize=90) | 4 | 17.91 |
Detector (minFaceSize=90) | 8 | 17.44 |
Redetect | 1 | 7.56 |
Redetect | 8 | 4.31 |
Redetect | 16 | 4.08 |
NPU. Estimations performance with batch interface#
The table below shows the performance of Estimations on the NPU for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | BatchSize | Average (ms) |
---|---|---|
HeadPoseByImage | 1 | 8.0 |
HeadPoseByImage | 16 | 4.2 |
HeadPoseByImage | 32 | 3.9 |
AGS | 1 | 6.6 |
AGS | 16 | 3.7 |
AGS | 32 | 3.7 |
BestShotQuality | 1 | 15.6 |
BestShotQuality | 16 | 7.8 |
BestShotQuality | 32 | 7.6 |
MedicalMask | 1 | 6.1 |
MedicalMask | 16 | 3.8 |
MedicalMask | 32 | 3.7 |
NPU. Estimations performance without batch interface#
The table below shows the performance of Estimations on the NPU for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | Average (ms) |
---|---|
Warper | 2.1 |
NPU. Extractor performance#
The table below shows the performance of Extractor on the NPU.
Type | Model | Batch Size | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 10.9 |
Extractor | 57 | 16 | 7.4 |
Runtime performance for macOS environment#
Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.
Input data characteristics:
- Image resolution: 1920x1080px;
- Image format: 24 BPP RGB;
Performance measurements are presented for the CPU execution mode in the tables below. Measured values are averages of at least 1000 experiments.
The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.
Face detections are performed using FaceDetV3 NN.
Intel-based processor performance (x86_64)#
Benchmarking for CPU was performed on the device with the following configuration:
Hardware Overview:
- Model Name: Mac mini
- Processor Name: 6-Core Intel Core i5
- Processor Speed: 3 GHz
- Number of Processors: 1
- Total Number of Cores: 6
- Memory: 16 GB
- CPU with AVX2 instruction set was used
OS: macOS 11.2.1
In the experiments listed in the tables below, face detection and descriptor extraction algorithms used all available CPU cores, whereas matching performance is specified per core.
Intel-based processor. Detector performance#
The table below shows the performance of FaceDetV3 Detector on the Intel-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 284.79 |
Detector (minFaceSize=20) | 8 | 1 | 159.0 |
Detector (minFaceSize=20) | 8 | 4 | 168.68 |
Detector (minFaceSize=20) | 8 | 8 | 171.88 |
Detector (minFaceSize=50) | 1 | 1 | 45.3 |
Detector (minFaceSize=50) | 8 | 1 | 27.3 |
Detector (minFaceSize=50) | 8 | 4 | 28.58 |
Detector (minFaceSize=50) | 8 | 8 | 29.06 |
Detector (minFaceSize=90) | 1 | 1 | 18.91 |
Detector (minFaceSize=90) | 8 | 1 | 9.7 |
Detector (minFaceSize=90) | 8 | 4 | 9.6 |
Detector (minFaceSize=90) | 8 | 8 | 10.02 |
Redetect | 1 | 1 | 0.75 |
Redetect | 8 | 1 | 0.87 |
Redetect | 8 | 4 | 0.28 |
Redetect | 8 | 8 | 0.27 |
FaceLandmarks5Detector | 1 | 1 | 0.2 |
FaceLandmarks5Detector | 8 | 1 | 0.3 |
FaceLandmarks5Detector | 8 | 8 | 0.1 |
FaceLandmarks68Detector | 1 | 1 | 2.3 |
FaceLandmarks68Detector | 8 | 1 | 1.7 |
FaceLandmarks68Detector | 8 | 8 | 1.1 |
The table below shows the performance of FaceDetV3m Detector on the Intel-based processor.
Type | CPU threads | Batch Size | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 141.82 | 512.0 |
Detector (minFaceSize=20) | 8 | 1 | 89.47 | 944.0 |
Detector (minFaceSize=20) | 8 | 8 | 91.88 | 3882.0 |
Detector (minFaceSize=50) | 1 | 1 | 21.84 | 115.0 |
Detector (minFaceSize=50) | 8 | 1 | 13.11 | 120.0 |
Detector (minFaceSize=50) | 8 | 8 | 15.6 | 732.0 |
Detector (minFaceSize=90) | 1 | 1 | 8.13 | 64.0 |
Detector (minFaceSize=90) | 8 | 1 | 5.67 | 66.0 |
Detector (minFaceSize=90) | 8 | 8 | 5.56 | 343.0 |
Redetect | 1 | 1 | 0.38 | 116.0 |
Redetect | 8 | 1 | 0.49 | 117.0 |
Redetect | 8 | 8 | 0.2 | 649.0 |
Landmarks5Detector | 1 | 1 | 0.14 | 121.0 |
Landmarks5Detector | 8 | 1 | 0.21 | 121.0 |
Landmarks5Detector | 8 | 8 | 0.06 | 121.0 |
Landmarks68Detector | 1 | 1 | 2.02 | 121.0 |
Landmarks68Detector | 8 | 1 | 1.33 | 122.0 |
Landmarks68Detector | 8 | 8 | 0.67 | 130.0 |
Intel-based processor. HumanDetector performance#
The table below shows the performance of HumanDetector on the Intel-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HumanDetector (resize to 320) | 1 | 1 | 7.64 |
HumanDetector (resize to 320) | 8 | 1 | 6.36 |
HumanDetector (resize to 320) | 8 | 8 | 4.09 |
HumanDetector (resize to 640) | 1 | 1 | 23.19 |
HumanDetector (resize to 640) | 8 | 1 | 9.62 |
HumanDetector (resize to 640) | 8 | 8 | 9.87 |
HumanRedetect | 1 | 1 | 3.69 |
HumanRedetect | 8 | 1 | 1.96 |
HumanRedetect | 8 | 4 | 1.21 |
HumanRedetect | 8 | 8 | 1.33 |
Intel-based processor. HumanFaceDetector performance#
The table below shows the performance of HumanFaceDetector on the Intel-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HumanFaceDetector (minFaceSize=20) | 1 | 1 | 331.58 |
HumanFaceDetector (minFaceSize=20) | 4 | 1 | 192.95 |
HumanFaceDetector (minFaceSize=20) | 4 | 4 | 200.72 |
HumanFaceDetector (minFaceSize=20) | 4 | 8 | 203.85 |
HumanFaceDetector (minFaceSize=50) | 1 | 1 | 52.02 |
HumanFaceDetector (minFaceSize=50) | 4 | 1 | 28.25 |
HumanFaceDetector (minFaceSize=50) | 4 | 4 | 31.48 |
HumanFaceDetector (minFaceSize=50) | 4 | 8 | 32.59 |
HumanFaceDetector (minFaceSize=90) | 1 | 1 | 20.5 |
HumanFaceDetector (minFaceSize=90) | 4 | 1 | 9.8 |
HumanFaceDetector (minFaceSize=90) | 4 | 4 | 9.55 |
HumanFaceDetector (minFaceSize=90) | 4 | 8 | 9.91 |
Intel-based processor. HeadDetector performance#
The table below shows the performance of HeadDetector on the Intel-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HeadDetector (minHeadSize=20) | 1 | 1 | 281.76 |
HeadDetector (minHeadSize=20) | 8 | 1 | 150.13 |
HeadDetector (minHeadSize=20) | 8 | 8 | 167.13 |
HeadDetector (minHeadSize=50) | 1 | 1 | 44.18 |
HeadDetector (minHeadSize=50) | 8 | 1 | 24.76 |
HeadDetector (minHeadSize=50) | 8 | 8 | 28.49 |
HeadDetector (minHeadSize=90) | 1 | 1 | 21.22 |
HeadDetector (minHeadSize=90) | 8 | 1 | 7.92 |
HeadDetector (minHeadSize=90) | 8 | 8 | 10.1 |
Intel-based processor. Estimations performance with batch interface#
The table below shows the performance of Estimations on the Intel-based processor for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HeadPoseByImage | 1 | 1 | 0.16 |
HeadPoseByImage | 8 | 1 | 0.49 |
HeadPoseByImage | 8 | 8 | 0.12 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1 | 0.45 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 1 | 0.29 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 8 | 0.23 |
Eyes (RGB, useStatusPlan=0) | 1 | 1 | 0.85 |
Eyes (RGB, useStatusPlan=0) | 8 | 1 | 0.58 |
Eyes (RGB, useStatusPlan=0) | 8 | 8 | 0.48 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1 | 0.85 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 1 | 0.57 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 8 | 0.47 |
Eyes (RGB, useStatusPlan=1) | 1 | 1 | 0.85 |
Eyes (RGB, useStatusPlan=1) | 8 | 1 | 0.58 |
Eyes (RGB, useStatusPlan=1) | 8 | 8 | 0.47 |
Infra-Red | 1 | 1 | 1.51 |
Infra-Red | 8 | 1 | 0.71 |
Infra-Red | 8 | 8 | 0.81 |
AGS | 1 | 1 | 0.29 |
AGS | 8 | 1 | 0.21 |
AGS | 8 | 8 | 0.09 |
BlackWhite | 1 | 1 | 1.0 |
BlackWhite | 8 | 1 | 0.4 |
BlackWhite | 8 | 8 | 1.0 |
BestShotQuality | 1 | 1 | 0.18 |
BestShotQuality | 8 | 1 | 0.14 |
BestShotQuality | 8 | 8 | 0.06 |
MedicalMask | 1 | 1 | 4.15 |
MedicalMask | 8 | 1 | 2.2 |
MedicalMask | 8 | 8 | 2.2 |
LivenessOneShotRGBEstimator | 1 | 1 | 126.6 |
LivenessOneShotRGBEstimator | 8 | 1 | 53.94 |
LivenessOneShotRGBEstimator | 8 | 8 | 57.83 |
Orientation | 1 | 1 | 5.58 |
Orientation | 8 | 1 | 3.07 |
Orientation | 8 | 8 | 2.37 |
CredibilityCheck | 1 | 1 | 2.37 |
CredibilityCheck | 8 | 1 | 36.0 |
CredibilityCheck | 8 | 8 | 37.9 |
PortraitStyle | 1 | 1 | 1.97 |
PortraitStyle | 8 | 1 | 2.0 |
PortraitStyle | 8 | 8 | 1.09 |
Background | 1 | 1 | 0.7 |
Background | 8 | 1 | 0.7 |
Background | 8 | 8 | 1.4 |
NaturalLight | 1 | 1 | 2.9 |
NaturalLight | 8 | 1 | 1.7 |
NaturalLight | 8 | 8 | 1.8 |
FishEye | 1 | 1 | 16.7 |
FishEye | 8 | 1 | 4.0 |
FishEye | 8 | 8 | 0.7 |
RedEye | 1 | 1 | 7.5 |
RedEye | 8 | 1 | 1.4 |
RedEye | 8 | 8 | 1.5 |
HeadWear | 1 | 1 | 3.07 |
HeadWear | 8 | 1 | 2.07 |
HeadWear | 8 | 8 | 1.07 |
EyeBrowEstimator | 1 | 1 | 14.04 |
EyeBrowEstimator | 8 | 1 | 6.06 |
EyeBrowEstimator | 8 | 8 | 4.84 |
HumanAttributeEstimator | 1 | 1 | 12.13 |
HumanAttributeEstimator | 8 | 1 | 5.78 |
HumanAttributeEstimator | 8 | 8 | 4.22 |
Mouth | 1 | 1 | 5.24 |
Mouth | 8 | 1 | 2.16 |
Mouth | 8 | 8 | 2.36 |
DynamicRange | 1 | 1 | 0.29 |
DynamicRange | 8 | 1 | 0.3 |
DynamicRange | 8 | 8 | 0.09 |
Glasses | 1 | 1 | 0.62 |
Glasses | 8 | 1 | 0.65 |
Glasses | 8 | 8 | 0.31 |
NIRLivenessEstimator | 1 | 1 | 10.66 |
NIRLivenessEstimator | 8 | 1 | 7.38 |
NIRLivenessEstimator | 8 | 8 | 8.46 |
LivenessRGBMEstimator | 1 | 1 | 18.86 |
LivenessRGBMEstimator | 8 | 1 | 9.17 |
LivenessRGBMEstimator | 8 | 8 | 13.34 |
Attributes | 1 | 1 | 42.74 |
Attributes | 8 | 1 | 14.92 |
Attributes | 8 | 8 | 13.5 |
Intel-based processor. Estimations performance without batch interface#
The table below shows the performance of Estimations on the Intel-based processor for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | CPU threads | Average (ms) |
---|---|---|
EyesGaze | 1 | 1.7 |
EyesGaze | 8 | 0.9 |
Emotions | 1 | 10.5 |
Emotions | 8 | 4.1 |
Quality | 1 | 1.4 |
Quality | 8 | 0.6 |
Warper | 1 | 1.3 |
Warper | 8 | 1.0 |
Overlap | 1 | 3.4 |
Overlap | 8 | 1.1 |
PPE | 1 | 8.83 |
PPE | 8 | 4.83 |
LivenessFlyingFaces | 1 | 17.86 |
LivenessFlyingFaces | 8 | 5.96 |
LivenessFPR | 1 | 31.2 |
LivenessFPR | 8 | 18.0 |
Intel-based processor. Extractor performance#
The table below shows the performance of Extractor on the Intel-based processor.
Model | CPU threads | Batch Size | Average (ms) |
---|---|---|---|
59 | 1 | 1 | 159.34 |
59 | 8 | 1 | 49.84 |
59 | 8 | 8 | 46.81 |
60 | 1 | 1 | 151.54 |
60 | 8 | 1 | 49.41 |
60 | 8 | 8 | 46.59 |
62 | 1 | 1 | 178.57 |
62 | 8 | 1 | 59.47 |
62 | 8 | 8 | 54.32 |
105 | 1 | 1 | 1.37 |
105 | 8 | 8 | 0.63 |
106 | 1 | 1 | 107.78 |
106 | 8 | 8 | 39.31 |
107 | 1 | 1 | 13.12 |
107 | 8 | 8 | 3.78 |
108 | 1 | 1 | 1.39 |
108 | 8 | 8 | 0.78 |
109 | 1 | 1 | 94.9 |
109 | 8 | 8 | 37.09 |
110 | 1 | 1 | 15.57 |
110 | 8 | 8 | 5.16 |
112 | 1 | 1 | 80.38 |
112 | 8 | 1 | 27.2 |
112 | 8 | 8 | 24.62 |
113 | 1 | 1 | 11.33 |
113 | 8 | 1 | 4.44 |
113 | 8 | 8 | 3.5 |
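One way to read the thread-scaling rows above is to turn a pair of latencies into a speedup and a parallel efficiency. The sketch below does this for model 59 at batch size 1, using the values from the table. The helper name `thread_speedup` is illustrative only and is not part of the SDK.

```python
from typing import Tuple


def thread_speedup(single_thread_ms: float, multi_thread_ms: float, threads: int) -> Tuple[float, float]:
    """Return (speedup, parallel efficiency) for two average latencies."""
    if single_thread_ms <= 0 or multi_thread_ms <= 0 or threads < 1:
        raise ValueError("latencies must be positive and threads must be >= 1")
    speedup = single_thread_ms / multi_thread_ms
    # Efficiency of 1.0 would mean perfect linear scaling with thread count.
    return speedup, speedup / threads


# Model 59, batch size 1: 159.34 ms on 1 thread, 49.84 ms on 8 threads (table above).
speedup, efficiency = thread_speedup(159.34, 49.84, 8)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

The same calculation applies to any row pair in the detector and estimator tables that differs only in the number of CPU threads.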
Intel-based processor. Matcher performance#
The table below shows the performance of Matcher on the Intel-based processor. It lists the average number of matches per second for descriptors extracted with CNN 59, 60, 62, 105, 106, 107, 108, 109, 110, 112 and 113.
Model | CPU threads | Batch Size | Average (matches/sec) |
---|---|---|---|
59, 60, 62 | 1 | 10000 | 28 M |
105 | 1 | 10000 | 44.36 M |
106, 107 | 1 | 10000 | 45.28 M |
108, 109 | 1 | 10000 | 45.1 M |
110 | 1 | 10000 | 45.85 M |
112 | 1 | 1000 | 51.68 M |
113 | 1 | 1000 | 50.91 M |
Note: The above value is the maximum performance of the matcher on a particular piece of hardware. Performance in general does not depend on the size of the batch, but may be limited by memory performance at large values of the batch size.
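The matcher rates can be turned into a rough latency estimate for a one-to-many search. The sketch below assumes the rate from the table holds for the whole search and that matching runs on a single core. The gallery size of 1,000,000 descriptors is a hypothetical example, and `match_time_ms` is an illustrative helper, not an SDK function.

```python
def match_time_ms(gallery_size: int, matches_per_sec: float) -> float:
    """Estimate the time in ms to match one descriptor against a gallery on one core."""
    if gallery_size < 0 or matches_per_sec <= 0:
        raise ValueError("gallery_size must be >= 0 and matches_per_sec must be > 0")
    return gallery_size / matches_per_sec * 1000.0


# 28 M matches/sec is the table value for models 59, 60 and 62 on one CPU thread.
# The gallery size is a hypothetical example, not a measured configuration.
estimate = match_time_ms(1_000_000, 28e6)
print(f"{estimate:.1f} ms")
```

Because the table values are maximums, real searches may take longer, especially when memory bandwidth becomes the limit at large batch sizes.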
ARM-based processor performance (aarch64)#
CPU benchmarking was performed on a device with the following configuration:
Hardware Overview:
- Model Name: Mac mini
- Chip: Apple M1
- Total Number of Cores: 8 (4 performance and 4 efficiency)
- Memory: 16 GB
- AVX2 instructions are not available
OS: macOS 11.2
In the experiments listed in the tables below, the face detection and descriptor extraction algorithms used all available CPU cores, whereas matching performance is specified per core.
ARM-based processor. Detector performance#
The table below shows the performance of FaceDetV3 Detector on the ARM-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 627.78 |
Detector (minFaceSize=20) | 8 | 1 | 201.45 |
Detector (minFaceSize=20) | 8 | 4 | 202.34 |
Detector (minFaceSize=20) | 8 | 8 | 205.36 |
Detector (minFaceSize=50) | 1 | 1 | 107.57 |
Detector (minFaceSize=50) | 8 | 1 | 40.73 |
Detector (minFaceSize=50) | 8 | 4 | 34.36 |
Detector (minFaceSize=50) | 8 | 8 | 33.91 |
Detector (minFaceSize=90) | 1 | 1 | 37.9 |
Detector (minFaceSize=90) | 8 | 1 | 17.99 |
Detector (minFaceSize=90) | 8 | 4 | 13.28 |
Detector (minFaceSize=90) | 8 | 8 | 12.35 |
Redetect | 1 | 1 | 1.01 |
Redetect | 8 | 1 | 1.13 |
Redetect | 8 | 4 | 0.55 |
Redetect | 8 | 8 | 0.49 |
FaceLandmarks5Detector | 1 | 1 | 0.4 |
FaceLandmarks5Detector | 8 | 1 | 0.6 |
FaceLandmarks5Detector | 8 | 8 | 0.2 |
FaceLandmarks68Detector | 1 | 1 | 3.4 |
FaceLandmarks68Detector | 8 | 1 | 2.3 |
FaceLandmarks68Detector | 8 | 8 | 1.6 |
The table below shows the performance of FaceDetV3m Detector on the ARM-based processor.
Type | CPU threads | Batch Size | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
Detector (minFaceSize=20) | 1 | 1 | 357.11 | 412.0 |
Detector (minFaceSize=20) | 8 | 1 | 116.87 | 765.0 |
Detector (minFaceSize=20) | 8 | 8 | 112.21 | 2875.0 |
Detector (minFaceSize=50) | 1 | 1 | 60.07 | 107.0 |
Detector (minFaceSize=50) | 8 | 1 | 25.43 | 156.0 |
Detector (minFaceSize=50) | 8 | 8 | 19.52 | 1052.0 |
Detector (minFaceSize=90) | 1 | 1 | 21.17 | 63.0 |
Detector (minFaceSize=90) | 8 | 1 | 12.23 | 117.0 |
Detector (minFaceSize=90) | 8 | 8 | 7.53 | 510.0 |
Redetect | 1 | 1 | 1.16 | 92.0 |
Redetect | 8 | 1 | 1.14 | 103.0 |
Redetect | 8 | 8 | 0.58 | 494.0 |
Landmarks5Detector | 1 | 1 | 0.45 | 102.0 |
Landmarks5Detector | 8 | 1 | 0.57 | 103.0 |
Landmarks5Detector | 8 | 8 | 0.26 | 103.0 |
Landmarks68Detector | 1 | 1 | 3.11 | 102.0 |
Landmarks68Detector | 8 | 1 | 2.37 | 104.0 |
Landmarks68Detector | 8 | 8 | 1.6 | 105.0 |
ARM-based processor. HumanDetector performance#
The table below shows the performance of HumanDetector on the ARM-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HumanDetector (resize to 320) | 1 | 1 | 25.02 |
HumanDetector (resize to 320) | 8 | 1 | 21.45 |
HumanDetector (resize to 320) | 8 | 8 | 9.34 |
HumanDetector (resize to 640) | 1 | 1 | 93.53 |
HumanDetector (resize to 640) | 8 | 1 | 38.99 |
HumanDetector (resize to 640) | 8 | 8 | 32.29 |
HumanRedetect | 1 | 1 | 3.51 |
HumanRedetect | 8 | 1 | 2.31 |
HumanRedetect | 8 | 4 | 1.67 |
HumanRedetect | 8 | 8 | 1.51 |
ARM-based processor. HumanFaceDetector performance#
The table below shows the performance of HumanFaceDetector on the ARM-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HumanFaceDetector (minFaceSize=20) | 1 | 1 | 715.22 |
HumanFaceDetector (minFaceSize=20) | 8 | 1 | 237.4 |
HumanFaceDetector (minFaceSize=20) | 8 | 8 | 239.92 |
HumanFaceDetector (minFaceSize=50) | 1 | 1 | 121.26 |
HumanFaceDetector (minFaceSize=50) | 8 | 1 | 49.4 |
HumanFaceDetector (minFaceSize=50) | 8 | 8 | 39.39 |
HumanFaceDetector (minFaceSize=90) | 1 | 1 | 41.92 |
HumanFaceDetector (minFaceSize=90) | 8 | 1 | 22.01 |
HumanFaceDetector (minFaceSize=90) | 8 | 8 | 14.1 |
ARM-based processor. HeadDetector performance#
The table below shows the performance of HeadDetector on the ARM-based processor.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HeadDetector (minHeadSize=20) | 1 | 1 | 631.68 |
HeadDetector (minHeadSize=20) | 8 | 1 | 207.81 |
HeadDetector (minHeadSize=20) | 8 | 8 | 213.2 |
HeadDetector (minHeadSize=50) | 1 | 1 | 108.83 |
HeadDetector (minHeadSize=50) | 8 | 1 | 42.13 |
HeadDetector (minHeadSize=50) | 8 | 8 | 34.64 |
HeadDetector (minHeadSize=90) | 1 | 1 | 37.87 |
HeadDetector (minHeadSize=90) | 8 | 1 | 18.67 |
HeadDetector (minHeadSize=90) | 8 | 8 | 12.46 |
ARM-based processor. Estimations performance with batch interface#
The table below shows the performance of Estimations on the ARM-based processor for estimators that have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | CPU threads | BatchSize | Average (ms) |
---|---|---|---|
HeadPoseByImage | 1 | 1 | 0.42 |
HeadPoseByImage | 8 | 1 | 0.61 |
HeadPoseByImage | 8 | 8 | 0.27 |
Eyes (INFRA_RED, useStatusPlan=0) | 1 | 1 | 0.92 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 1 | 0.59 |
Eyes (INFRA_RED, useStatusPlan=0) | 8 | 8 | 0.4 |
Eyes (RGB, useStatusPlan=0) | 1 | 1 | 1.07 |
Eyes (RGB, useStatusPlan=0) | 8 | 1 | 0.63 |
Eyes (RGB, useStatusPlan=0) | 8 | 8 | 0.48 |
Eyes (INFRA_RED, useStatusPlan=1) | 1 | 1 | 1.07 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 1 | 0.63 |
Eyes (INFRA_RED, useStatusPlan=1) | 8 | 8 | 0.48 |
Eyes (RGB, useStatusPlan=1) | 1 | 1 | 2.06 |
Eyes (RGB, useStatusPlan=1) | 8 | 1 | 1.24 |
Eyes (RGB, useStatusPlan=1) | 8 | 8 | 0.95 |
Infra-Red | 1 | 1 | 4.72 |
Infra-Red | 8 | 1 | 2.4 |
Infra-Red | 8 | 8 | 1.82 |
AGS | 1 | 1 | 0.43 |
AGS | 8 | 1 | 0.33 |
AGS | 8 | 8 | 0.16 |
BlackWhite | 1 | 1 | 2.9 |
BlackWhite | 8 | 1 | 1.3 |
BlackWhite | 8 | 8 | 1.3 |
BestShotQuality | 1 | 1 | 0.39 |
BestShotQuality | 8 | 1 | 0.31 |
BestShotQuality | 8 | 8 | 0.18 |
MedicalMask | 1 | 1 | 12.5 |
MedicalMask | 8 | 1 | 6.28 |
MedicalMask | 8 | 8 | 4.53 |
LivenessOneShotRGBEstimator | 1 | 1 | 497.7 |
LivenessOneShotRGBEstimator | 8 | 1 | 121.08 |
LivenessOneShotRGBEstimator | 8 | 8 | 115.44 |
Orientation | 1 | 1 | 10.28 |
Orientation | 8 | 1 | 5.93 |
Orientation | 8 | 8 | 3.51 |
CredibilityCheck | 1 | 1 | 296.2 |
CredibilityCheck | 8 | 1 | 103.2 |
CredibilityCheck | 8 | 8 | 105.9 |
PortraitStyle | 1 | 1 | 2.66 |
PortraitStyle | 8 | 1 | 3.32 |
PortraitStyle | 8 | 8 | 1.86 |
Background | 1 | 1 | 2.4 |
Background | 8 | 1 | 2.3 |
Background | 8 | 8 | 2.1 |
NaturalLight | 1 | 1 | 5.2 |
NaturalLight | 8 | 1 | 2.9 |
NaturalLight | 8 | 8 | 2.6 |
FishEye | 1 | 1 | 35.7 |
FishEye | 8 | 1 | 13.2 |
FishEye | 8 | 8 | 1.7 |
RedEye | 1 | 1 | 10.5 |
RedEye | 8 | 1 | 3.3 |
RedEye | 8 | 8 | 3.0 |
HeadWear | 1 | 1 | 8.07 |
HeadWear | 8 | 1 | 6.45 |
HeadWear | 8 | 8 | 2.76 |
EyeBrowEstimator | 1 | 1 | 35.77 |
EyeBrowEstimator | 8 | 1 | 13.34 |
EyeBrowEstimator | 8 | 8 | 10.5 |
HumanAttributeEstimator | 1 | 1 | 28.09 |
HumanAttributeEstimator | 8 | 1 | 14.91 |
HumanAttributeEstimator | 8 | 8 | 9.02 |
Mouth | 1 | 1 | 16.26 |
Mouth | 8 | 1 | 7.86 |
Mouth | 8 | 8 | 5.36 |
DynamicRange | 1 | 1 | 0.28 |
DynamicRange | 8 | 1 | 0.29 |
DynamicRange | 8 | 8 | 0.16 |
Glasses | 1 | 1 | 2.29 |
Glasses | 8 | 1 | 1.71 |
Glasses | 8 | 8 | 0.96 |
DeepFake | 1 | 1 | 356.17 |
DeepFake | 8 | 1 | 107.95 |
DeepFake | 8 | 8 | 97.42 |
NIRLivenessEstimator | 1 | 1 | 27.63 |
NIRLivenessEstimator | 8 | 1 | 12.29 |
NIRLivenessEstimator | 8 | 8 | 9.32 |
LivenessRGBMEstimator | 1 | 1 | 59.32 |
LivenessRGBMEstimator | 8 | 1 | 31.18 |
LivenessRGBMEstimator | 8 | 8 | 20.48 |
Attributes | 1 | 1 | 171.32 |
Attributes | 8 | 1 | 63.07 |
Attributes | 8 | 8 | 57.87 |
ARM-based processor. Estimations performance without batch interface#
The table below shows the performance of Estimations on the ARM-based processor for estimators that do not have a batch interface. All these measurements are performed with minFaceSize=50.
Measurement | CPU threads | Average (ms) |
---|---|---|
EyesGaze | 1 | 5.3 |
EyesGaze | 8 | 3.0 |
Emotions | 1 | 36.9 |
Emotions | 8 | 15.8 |
Quality | 1 | 2.1 |
Quality | 8 | 1.2 |
Warper | 1 | 0.9 |
Warper | 8 | 1.9 |
Overlap | 1 | 9.0 |
Overlap | 8 | 3.6 |
PPE | 1 | 22.33 |
PPE | 8 | 13.12 |
LivenessFlyingFaces | 1 | 24.55 |
LivenessFlyingFaces | 8 | 10.85 |
LivenessFPR | 1 | 103.7 |
LivenessFPR | 8 | 56.3 |
ARM-based processor. Extractor performance#
The table below shows the performance of Extractor on the ARM-based processor.
Type | Model | CPU threads | Average (ms) |
---|---|---|---|
Extractor | 57 | 1 | 547.2 |
Extractor | 57 | 8 | 189.3 |
Extractor | 58 | 1 | 547.2 |
Extractor | 58 | 8 | 189.2 |
Extractor | 59 | 1 | 535.6 |
Extractor | 59 | 8 | 188.6 |
The table below shows the performance of Extractor on the ARM-based processor by batch size, together with estimated RAM consumption.
Model | CPU threads | Batch Size | Average (ms) | RAM Memory (Mb) |
---|---|---|---|---|
59 | 1 | 1 | 597.78 | 348.0 |
59 | 8 | 1 | 169.58 | 346.0 |
59 | 8 | 8 | 165.92 | 346.0 |
60 | 1 | 1 | 603.45 | 344.0 |
60 | 8 | 1 | 170.02 | 345.0 |
60 | 8 | 8 | 165.97 | 345.0 |
62 | 1 | 1 | 749.42 | 430.0 |
62 | 8 | 1 | 208.62 | 431.0 |
62 | 8 | 8 | 206.9 | 431.0 |
ARM-based processor. Matcher performance#
The table below shows the performance of Matcher on the ARM-based processor. It lists the average number of matches per second for descriptors extracted with CNN 57, 58, 59, 60 and 62.
Type | Model | CPU threads | Batch Size | Average (matches/sec) |
---|---|---|---|---|
Matcher | 57, 58 | 1 | 10000 | 2.08 M |
Matcher | 59, 60, 62 | 1 | 10000 | 2.08 M |
Note: The above value is the maximum performance of the matcher on a particular piece of hardware. Performance in general does not depend on the size of the batch, but may be limited by memory performance at large values of the batch size.
Runtime performance for embedded environment#
Face detection performance depends on input image parameters such as resolution and bit depth as well as the size of the detected face.
Input data characteristics:
- Image resolution: 640x480px;
- Image format: 24 BPP RGB;
The results for minimum batch size and optimal batch size are shown in the tables below. All the intermediate and non-optimal values are omitted.
Face detections are performed using FaceDetV3 NN.
Descriptor size#
The table below shows the size of serialized face descriptors, which can be used to estimate memory requirements.
"Descriptor size"
Face descriptor version | Data size (bytes) | Metadata size (bytes) | Total size |
---|---|---|---|
CNN 54 | 512 | 8 | 520 |
CNN 56 | 512 | 8 | 520 |
CNN 57 | 512 | 8 | 520 |
CNN 58 | 512 | 8 | 520 |
CNN 59 | 512 | 8 | 520 |
CNN 60 | 512 | 8 | 520 |
CNN 62 | 512 | 8 | 520 |
The table below shows the size of serialized human descriptors, which can be used to estimate memory requirements. Human descriptors are used only for reidentification tasks.
"Human descriptor size (used only for reidentification tasks)"
Human descriptor version | Data size (bytes) | Metadata size (bytes) | Total size |
---|---|---|---|
CNN 102 (deprecated) | 2048 | 8 | 2056 |
CNN 103 (deprecated) | 2048 | 8 | 2056 |
CNN 104 (deprecated) | 2048 | 8 | 2056 |
CNN 105 | 512 | 8 | 520 |
CNN 106 | 512 | 8 | 520 |
CNN 107 | 512 | 8 | 520 |
CNN 108 | 512 | 8 | 520 |
CNN 109 | 512 | 8 | 520 |
CNN 110 | 512 | 8 | 520 |
CNN 112 | 512 | 8 | 520 |
CNN 113 | 512 | 8 | 520 |
Metadata includes signature and version information that may be omitted during serialization if the NoSignature flag is specified.
When estimating individual descriptor size in memory or serialization storage requirements with default options, consider using values from the "Total size" column.
When estimating memory requirements for descriptor batches, use values from the "Data size" column instead, since a descriptor batch does not duplicate metadata per descriptor and thus is more memory-efficient.
These numbers are for approximate computation only, since they do not include overhead like memory alignment for accelerated SIMD processing and the like.
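The sizing guidance above can be sketched as a short calculation. The example uses the 512-byte data size and 8-byte metadata size from the tables. The descriptor count is a hypothetical example and the function names are illustrative, not SDK calls. As noted above, alignment and similar overhead are not included.

```python
DATA_BYTES = 512      # "Data size" column for current face and human descriptors
METADATA_BYTES = 8    # "Metadata size" column


def individual_storage_bytes(count: int, data_bytes: int = DATA_BYTES,
                             metadata_bytes: int = METADATA_BYTES) -> int:
    """Approximate size of `count` separately stored descriptors ("Total size" each)."""
    return count * (data_bytes + metadata_bytes)


def batch_memory_bytes(count: int, data_bytes: int = DATA_BYTES) -> int:
    """Approximate size of a descriptor batch, which does not repeat metadata per descriptor."""
    return count * data_bytes


count = 1_000_000  # hypothetical number of descriptors
print(f"individual: {individual_storage_bytes(count) / 2**20:.1f} MiB")
print(f"batch:      {batch_memory_bytes(count) / 2**20:.1f} MiB")
```

For the deprecated human descriptors (CNN 102 to 104), pass `data_bytes=2048` instead of the default.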