
Hardware requirements#

Server / PC installations#

See "Appendix A. Specifications" for information about hardware used for performance measurements.

General considerations#

Be warned that not all algorithms in the SDK have GPU implementations. If the desired algorithm does not have a GPU implementation, a fallback to the CPU implementation is made. In this case, take into account the memory transfers between host and device and the latency they cause. Please see the algorithm implementation matrix for details.

Some neural networks are unavailable for particular architectures.

Neural networks and available architectures

| Neural network | CPU | CPU AVX2 | GPU | ARM |
|----------------|-----|----------|-----|-----|
| FaceDet_v1_first.plan | yes | yes | | yes |
| FaceDet_v1_second.plan | yes | yes | yes | yes |
| FaceDet_v1_third.plan | yes | yes | yes | yes |
| FaceDet_v2_first.plan | yes | yes | | yes |
| FaceDet_v2_second.plan | yes | yes | yes | yes |
| FaceDet_v2_third.plan | yes | yes | yes | yes |
| FaceDet_v3__.plan | yes | yes | yes | yes |
| FaceDet_v3_redetect__.plan | yes | yes | yes | yes |
| model_subjective_quality__.plan | yes | yes | yes | yes |
| ags_angle_estimation_flwr_.plan | yes | yes | yes | yes |
| angle_estimation_flwr_.plan | yes | yes | yes | yes |
| ags_estimation_flwr_.plan | yes | yes | yes | yes |
| attributes_estimation_.plan | yes | yes | yes | yes |
| childnet_estimation_flwr_.plan | yes | yes | yes | yes |
| emotion_recognition__.plan | yes | yes | yes | yes |
| glasses_estimation_flwr_.plan | yes | yes | yes | yes |
| eyes_estimation_flwr8_.plan | yes | yes | yes | yes |
| eye_status_estimation_flwr_.plan | yes | yes | yes | yes |
| eyes_estimation_ir_.plan | yes | yes | yes | yes |
| gaze__.plan | yes | yes | yes | yes |
| gaze_ir__.plan | yes | yes | yes | yes |
| overlap_estimation_flwr_.plan | yes | yes | yes | yes |
| mouth_estimation_.plan | yes | yes | yes | yes |
| mask_clf__.plan | yes | yes | yes | yes |
| ppe_estimation__.plan | yes | yes | yes | yes |
| orientation_.plan | yes | yes | yes | yes |
| LNet_precise__.plan | yes | yes | yes | yes |
| LNet_ir_precise__.plan | yes | yes | yes | yes |
| slnet__.plan | yes | yes | yes | yes |
| liveness_model__.plan | yes | yes | yes | yes |
| depth_estimation_.plan | yes | yes | yes | yes |
| faceflow_model_1_.plan | yes | yes | yes | yes |
| faceflow_model_2_.plan | yes | yes | yes | yes |
| ir_liveness_universal_.plan | yes | yes | yes | yes |
| ir_liveness_ambarella_.plan | yes | yes | yes | yes |
| hs_shoulders_liveness_estimation_flwr_.plan | yes | yes | yes | yes |
| hs_head_liveness_estimation_flwr_.plan | yes | yes | yes | yes |
| flying_faces_liveness_v2_.plan | yes | yes | yes | yes |
| rgbm_liveness_.plan | yes | yes | yes | yes |
| rgbm_liveness_pp_hand_frg_.plan | yes | yes | yes | yes |
| human_keypoints__.plan | yes | yes | yes | yes |
| human__.plan | yes | yes | yes | yes |
| human_redetect_.plan | yes | yes | yes | yes |
| reid_.plan | yes | yes | yes | |
| credibility_Check_.plan | yes | yes | yes | yes |
| cnn54b_.plan | yes | yes | yes | yes |
| cnn54m_.plan | yes | yes | yes | yes |
| cnn56b_.plan | yes | yes | yes | yes |
| cnn56m_.plan | yes | yes | yes | yes |
| cnn57b_.plan | yes | yes | yes | yes |
| cnn58b_.plan | yes | yes | yes | yes |
| cnn59b_.plan | yes | yes | yes | yes |

CPU requirements#

For neural networks with "*_cpu.plan" in their names, the CPU must support at least the SSE4.2 instruction set.

For neural networks with "*_cpu-avx2.plan" in their names, AVX2 instruction set support is required for the best performance.

Only 64-bit CPUs are supported.

If in doubt, check your CPU specifications on the manufacturer's website.
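
For a quick programmatic check, the sketch below reports whether the host CPU exposes these instruction sets. It assumes a GCC or Clang build on x86_64 and uses compiler builtins; it is not part of the SDK API.

```cpp
// Minimal sketch: report SSE4.2 / AVX2 support of the host CPU.
// Assumes GCC or Clang on x86_64; __builtin_cpu_supports is a compiler builtin,
// not an SDK call.
#include <cstdio>

int main() {
    const bool sse42 = __builtin_cpu_supports("sse4.2"); // required for "*_cpu.plan"
    const bool avx2  = __builtin_cpu_supports("avx2");   // best performance for "*_cpu-avx2.plan"

    std::printf("SSE4.2: %s\n", sse42 ? "yes" : "no");
    std::printf("AVX2:   %s\n", avx2  ? "yes" : "no");
    return sse42 ? 0 : 1;
}
```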

GPU requirements#

For GPU acceleration, an NVIDIA GPU is required. The following architectures are supported:

  • Pascal or newer.

A minimum of 6 GB of dedicated video RAM is required; 8 GB or more of VRAM is recommended.

Number of threads created when using GPU#

The total number of threads can be calculated with the following expression:

totalNumberOfThreads = numThreads + 2 * numGpuDevices + 1 (+ 1 optional),

where

  • numThreads is the value of the setting <param name="numThreads" type="Value::Int1" x="12" />. Its description can be found in "Configuration Guide - Runtime settings";
  • numGpuDevices is the number of GPU devices;
  • one additional thread is created for the CUDA runtime;
  • one optional thread may be created depending on internal LUNA SDK API settings.

Example: if numThreads == 4 and there are 2 GPU devices in the system, the total number of threads is 9: 4 worker threads (numThreads), 2 threads for each of the 2 GPUs, and 1 thread for the CUDA runtime.
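
The same arithmetic can be written as a small helper for sanity-checking the thread count observed at runtime. The function below is illustrative only and is not part of the LUNA SDK API.

```cpp
// Minimal sketch of the thread-count formula above; names are hypothetical.
#include <cstdio>

int expectedThreadCount(int numThreads, int numGpuDevices, bool optionalThread = false) {
    // worker threads + 2 threads per GPU device + 1 CUDA runtime thread
    // (+ 1 optional thread depending on internal SDK settings)
    return numThreads + 2 * numGpuDevices + 1 + (optionalThread ? 1 : 0);
}

int main() {
    std::printf("%d\n", expectedThreadCount(4, 2)); // prints 9, matching the example above
    return 0;
}
```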

To decrease the number of threads, the environment variable CUDA_VISIBLE_DEVICES=-1 can be set.

RAM requirements#

System memory consumption differs depending on the usage scenario and is proportional to the number of worker threads. This applies to both CPU (system RAM) and GPU (VRAM) execution modes.

For example, in CPU execution mode 1 GB of RAM is enough for a typical pipeline consisting of a face detector and a face descriptor extractor running on a single core (one worker thread) and processing 1080p input images with 10-12 faces on average. If this setup is scaled up to 8 worker threads, overall memory consumption grows to 8 GB.

It is recommended to assume at least 1 GB of free RAM per worker thread.

Storage requirements#

FaceEngine requires 1 GB of free space to install. This includes the model data for both CPU and GPU execution modes, which should be redistributed with your application. If only one execution mode is planned, the space requirement can be reduced by half.

Requirements for GPU acceleration#

Recommended versions of CUDA

The most current CUDA release notes can be found online at http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html.

The CUDA version on Linux can be found using the command below:

$nvidia-smi

The CUDA version on Windows can be found in Control Panel\Programs\Programs and Features, as shown in the figure below.

CUDA version on Windows

We recommend using the suggested CUDA version for your operating system. If your version is older than required, we cannot guarantee that it will work successfully. More details about CUDA compatibility can be found online at https://docs.nvidia.com/deploy/cuda-compatibility/index.html.
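
If the CUDA toolkit is installed, the driver and runtime versions can also be queried programmatically. The sketch below uses the standard CUDA runtime API and is independent of the SDK.

```cpp
// Minimal sketch: print CUDA driver and runtime versions.
// Assumes the CUDA toolkit is installed; build with nvcc or link against cudart.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVersion = 0, runtimeVersion = 0;
    cudaDriverGetVersion(&driverVersion);    // encoded as 1000 * major + 10 * minor
    cudaRuntimeGetVersion(&runtimeVersion);
    std::printf("CUDA driver:  %d.%d\n", driverVersion / 1000, (driverVersion % 1000) / 10);
    std::printf("CUDA runtime: %d.%d\n", runtimeVersion / 1000, (runtimeVersion % 1000) / 10);
    return 0;
}
```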

Embedded installations#

CPU requirements#

Supported CPU architectures:

  • ARMv7-A;
  • ARMv8-A (ARM64).