
FaceStream Workflow Description#

FaceStream modes#

FaceStream has two modes: normal and server. The main differences between the modes are shown in table below.

Table: Normal and server modes differences

|  | Normal mode | Server mode |
|---|-------------|-------------|
| Sources | Video streams from IP and USB cameras, video files, photos | Video streams from IP cameras |
| Requests to API for processing of video streams (see FaceStreamApi.html) | No | Yes |
| LUNA Configurator usage | No | Yes |
| Operating systems | Windows, CentOS | CentOS |
| Dynamic creation, editing, and deletion of video stream sources via API requests | No | Yes |
| Real-time video stream preview in a browser for streams with specified parameters | No | Yes |
| Video stream metrics (number of streams, number of errors, number of faces, number of skipped frames, FPS) | No | Yes |

See "docs/FaceStreamApi.html" for more information about the FaceStream API and returned data.

FaceStream normal mode#

FaceStream workflow

FaceStream application workflow:

  1. FaceStream receives video from a source (IP camera, web camera, video file). FaceStream can work with several video sources at once. Sources for the normal mode are set in the "input.json" file.

  2. FaceStream decodes video frames.

  3. The ROI area is cut out from the frame if the "roi" parameter is specified.

  4. The received image is scaled to the "scale-result-size" value if "detector-scaling" is enabled in the "trackengine.conf" configuration file.

  5. Faces are detected in the frame.

  6. If the "detector-step" parameter ("trackengine.conf" file) is set, the face is redetected in the frame instead of being detected anew.

  7. A track is created for each new face in the stream; it is then reinforced with new detections of this face from subsequent frames.

The track is interrupted if the face disappears from the frame. You can set the "skip-frames" parameter ("trackengine.conf") so that the track is not interrupted immediately, and the system waits several frames for the face to reappear in the area.

  8. FaceStream filters out low-quality frames and selects bestshots. There are several algorithms for choosing the best detection(s) in a track. See the "Filtering" section.

  9. If a frame is a bestshot, it is added to the collection of bestshots.

Depending on the "number-of-bestshots-to-send" setting, one or several best detections are collected from each track.

  10. Optional. If the "warp" type is set in the "portrait-type" parameter, the bestshots are normalized to the LUNA PLATFORM standard and normalized images are created. A normalized image is better suited for processing by LUNA PLATFORM.

  11. The bestshots are sent to an external service via an HTTP request. The image may be sent as is or transformed into the normalized image.

    The frequency of image sending is specified in the "sending" section (input.json).

    The sending parameters and the external service address are specified in the "output" (input.json) and "sending" (fs3Config.conf) sections.
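The steps above map onto a single normal-mode configuration file. The fragment below is an illustrative sketch only: the parameter and section names ("roi", "portrait-type", "sending", "output", "number-of-bestshots-to-send") come from this document, but the exact nesting, the URLs, and all values shown are assumptions; check the reference "input.json" shipped with your FaceStream distribution for the actual schema.

```json
{
  "input": {
    "url": "rtsp://192.168.0.10/stream1",
    "roi": [0, 0, 1280, 720]
  },
  "portrait-type": "warp",
  "sending": {
    "time-period-of-searching": 3,
    "silent-period": 0,
    "number-of-bestshots-to-send": 2
  },
  "output": {
    "url": "http://luna-api.example:5000/handlers"
  }
}
```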

FaceStream Server Mode#

The general principle of FaceStream server mode operation corresponds to that shown in the figure above. Only video streams are processed by FaceStream in the server mode.

In server mode FaceStream receives requests, processes them, and returns responses. More details about FaceStream server mode, requests to FaceStream, and camera management for developers can be found in the "./docs/FaceStreamApi.html" document.

Server mode works only with video streams from cameras, but it has the following advantages:

  • Dynamic creation, deletion, and modification of processed video streams. The FaceStream API lets you configure the transfer and processing of video streams as needed;

  • View video streams in real time;

  • Obtaining global metrics for all video streams or for each stream separately;

  • The ability to use the LUNA Configurator service, which stores FaceStream startup parameters and allows you to continue processing the current video even after restarting FaceStream in the case of an emergency shutdown (see "LUNA Configurator Usage").


General recommendations for FaceStream configuration#

This section provides general guidelines for setting up FaceStream.

This section mentions the names of the configuration files in which the described parameters are set. When configuring FaceStream in server mode using the Configurator service, specify these parameters in the service and not in the configuration files.

Before starting configuration#

To start the configuration process, run FaceStream in normal mode with the "show-window" option enabled (the fs3Config.conf configuration file). This way you can visually check which faces are detected.

You should perform the FaceStream configuration for each camera separately. FaceStream should work with the video stream of the camera located in its standard operating conditions. These requirements follow from the reasons below:

  • Frames from different cameras may differ in:
    • noise level,
    • frame size,
    • lighting,
    • blurring,
    • etc.;
  • FaceStream settings depend on the lighting conditions and will therefore differ for cameras placed in a dark room and in a light one;
  • FaceStream performance depends on the number of faces in the frame. The settings for a camera that detects one face every 10 seconds will differ from the settings for a camera detecting 10 faces per second;
  • The number of detected faces and the quality of the detections depend on the correct positioning of the camera. When the camera is at a wrong angle, faces are not detected in frames. Moreover, head angles can exceed the acceptable degree, so a frame with a detected face cannot be used for further processing;
  • Faces in the camera's field of view can be partially or completely blocked by objects. Background objects can also prevent the proper functioning of recognition algorithms.

The camera can be positioned so that lighting or shooting conditions change throughout the day. It is recommended to test FaceStream under different conditions and choose the mode that provides reliable operation in all of them.

Sometimes it is impossible to configure FaceStream directly on the client video camera during its operation. In this case, it is necessary to record a video using this camera. Use this video to configure FaceStream.

If there are few faces in the zone of view of the video camera, you should record a short video on your own simulating the expected movement of people in front of the camera. Use this video to configure FaceStream.

You can specify the FPS for video processing using the "Real_time_mode_fps" parameter.

To use a video file as a video source for FaceStream, specify it as a source in "video-sources" and set the "url" of the video in the "input.json" configuration file.

The video cameras tested with FaceStream are listed in section "Appendix A: Cameras Compatibility".

FaceStream performance configuration#

The parameters described below have the greatest impact on FaceStream performance.

Reduction of search area#

Not all the areas of the frame contain faces. Besides, not all the faces in the frame have the required size and quality. For example, the sizes of faces in the background may be too small, and the faces near the edge of the frame may have unacceptable pitch, roll, or yaw angles.

The "roi" parameter (the "input" section of the input.json configuration file) lets you specify a rectangular area in which faces are searched for.

Source frame with DROI area specified

The specified rectangular area is cut out from the frame and FaceStream performs further processing using this image.

Cropped image processed by FaceStream

The smaller the search area, the less resources are required for processing each frame.

Correct use of the "roi" parameter significantly improves FaceStream performance.
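As an illustrative sketch (the rectangle format and the surrounding structure are assumptions; see the input.json reference for the actual schema), a "roi" restricting the search to the central part of a 1920x1080 frame might look like this:

```json
{
  "input": {
    "url": "rtsp://192.168.0.10/stream1",
    "roi": [480, 270, 960, 540]
  }
}
```

Here only a 960x540 region starting at point (480, 270) would be processed, which reduces the per-frame workload roughly in proportion to the discarded area.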

Frame scaling#

The "detector-scaling" option (the "trackengine.conf" configuration file) enables you to scale the frame before processing.

The appropriate frame size should be selected using the "scale-result-size" parameter (the trackengine.conf configuration file). This parameter sets the maximum frame size after scaling the largest side of the frame. If the source frame had a size of 1920x1080 and the value of "scale-result-size" is equal to 640, then FaceStream will process the frame of 640x360 size.

If the frame was cut out using the "roi" parameter, the scaling will be applied to this cropped frame. In this case, you should specify the "scale-result-size" parameter value according to the greater ROI side.

To select the optimal "scale-result-size" value, scale the frame down gradually and check whether faces are still detected. Set the minimum image size at which all faces in the area of interest are detected.

Further extending the example, the images below depict a video frame without resizing (at the original 1920x1080 resolution) and after resizing to 960x640, with face detections visualized as bounding boxes.

Six faces can be detected when the source image resolution is 1920x1080.

Detections in image 1920x1080

Three faces are detected after the image is scaled to the 960x640 resolution. The faces in the background are smaller in size and are of poor quality.

Detections in image 960X640

The smaller the frame resolution, the less resources are consumed.
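For example, scaling a 1920x1080 stream down by its largest side could be configured with the two "trackengine.conf" parameters below. The key-value notation is used for illustration only; the actual syntax of the file may differ.

```
detector-scaling = 1     # enable frame scaling before detection
scale-result-size = 640  # a 1920x1080 source frame is then processed at 640x360
```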

Defining area with movement#

|  | frg-subtractor | frg-regions-alignment | frg-regions-square-alignment |
|---|----------------|-----------------------|------------------------------|
| Recommended value when utilizing CPU | 1 | 0 | 0 |
| Recommended value when utilizing GPU | 1 | 360 | 0 |

When the "frg-subtractor" parameter (trackengine.conf) is enabled, motion in the frame is taken into account. Face detection is then performed only in the areas with motion, not in the entire frame.

The areas with motion are determined after the frame is scaled.

When the "frg-subtractor" is enabled, the performance of FaceStream is increased.

The "frg-regions-alignment" parameter (trackengine.conf) enables you to set the alignment for the area with motion.

When the "frg-regions-square-alignment" parameter (trackengine.conf) is enabled, the width and height of the area with motion will always be equal.
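Put together, the motion-detection settings from the table above could look as follows for a GPU installation (key-value notation for illustration only; the actual "trackengine.conf" syntax may differ):

```
frg-subtractor = 1                # detect faces only in areas with motion
frg-regions-alignment = 360       # recommended for GPU; use 0 on CPU
frg-regions-square-alignment = 0  # do not force square motion areas
```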

Batch processing of frames#

The following parameters configure batch processing of frames. The parameters are set in trackengine.conf.

The "batched-processing" parameter enables batch processing of frames.

When working with several video cameras, a frame is collected from each camera into a batch. The batch of frames is then processed as a whole.

When the parameter is disabled, the frames are processed one by one.

When using batch processing mode, the delay before processing increases, but the processing itself is faster.

It is recommended to enable the parameter both when using the GPU and when using the CPU.

The "min-frames-batch-size" parameter sets the minimal number of frames collected from all the cameras before processing.

It is recommended to set the "min-frames-batch-size" parameter value equal to the number of streams when using the GPU.

It is recommended to set the "min-frames-batch-size" parameter value equal to "2" when using the CPU.

The "max-frames-batch-gather-timeout" parameter specifies the time between processing of the batches.

If processing finishes within the specified time and there is a time margin left, FaceStream waits for additional frames to increase GPU utilization.

If the "max-frames-batch-gather-timeout" parameter is set to "20", this time is used to process the previous batch and collect a new one. After 20 seconds, processing begins even if fewer frames than "min-frames-batch-size" were collected. Processing of the next batch cannot begin before processing of the previous one has finished.

If the parameter is set to "0", there is no timeout for collecting frames into a batch, and the "min-frames-batch-size" value is ignored.

It is recommended to set the "max-frames-batch-gather-timeout" parameter value equal to "0" both when using the GPU and when using the CPU.
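For example, a GPU installation processing eight streams could follow the recommendations above like this (key-value notation for illustration only; the actual "trackengine.conf" syntax may differ):

```
batched-processing = 1               # process frames from all cameras in batches
min-frames-batch-size = 8            # equal to the number of streams on GPU; "2" on CPU
max-frames-batch-gather-timeout = 0  # no gathering timeout, as recommended
```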

Minimal face size#

You should configure the "minFaceSize" parameter in the "faceengine.conf" file to specify the minimal face size for detection.

You should set the largest minimum face size acceptable for your use case: the larger the minimum face size, the fewer resources are required to perform detection.

Note that the face size depends on the actual frame size set by the "scale-result-size" parameter (the trackengine.conf configuration file). For example, a face 100 pixels in size in a 1280-pixel-wide frame is only about 50 pixels after the frame is scaled to a width of 640.
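For instance, if the faces of interest are about 100 pixels in the source frame and "scale-result-size" halves the frame, the detector should be configured for 50-pixel faces (key-value notation for illustration only; the actual "faceengine.conf" syntax may differ):

```
minFaceSize = 50  # minimal detectable face size, in pixels of the scaled frame
```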

General configuration information#

Working with track#

A new track is created for each detected face. Bestshots are defined and sent for each track.

In general, the track is interrupted when the face can no longer be found in the frame.

If a track was interrupted and the same person appears in the frame, a new track is created.

There can be a situation when two faces interact in a frame (one person behind the other). In this case, the tracks for both persons are interrupted, and new tracks are created.

There can be a situation when a person turns away, or a face is temporarily blocked. In this case, you can specify the "skip-frames" parameter ("trackengine.conf") instead of interrupting the track immediately. The parameter sets the number of frames during which the system will wait for the face to reappear in the area where it disappeared.

The "detector-step" parameter in "trackengine.conf" enables you to specify the number of frames on which face redetection will be performed in the specified area before face detection is performed. Redetection requires fewer resources, but the face may be lost if you set a large number of frames for redetection.
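A sketch of the two track-related parameters discussed above (key-value notation for illustration only; the values are arbitrary examples, not recommendations):

```
skip-frames = 36   # wait up to 36 frames for a disappeared face to reappear
detector-step = 7  # run redetection on 7 frames between full detections
```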

Bestshot sending#

The "sending" parameter group (input.json) sets the parameters for bestshot sending. FaceStream sends the received bestshots to LUNA PLATFORM (see "Image Sending to LUNA PLATFORM").

You can send several bestshots for the same face to increase recognition accuracy. In this case, set the "number-of-bestshots-to-send" (input.json) parameter.

LUNA PLATFORM can aggregate the bestshots and create a single descriptor of better quality from them.

If the required number of bestshots has not been collected by the end of the specified period or when the track is interrupted, the bestshots collected so far are sent.

The "time-period-of-searching" and "silent-period" parameters can be specified in seconds or in frames. Use the "type" parameter to choose the type.

The general options for configuring the "time-period-of-searching" and "silent-period" parameters of the "sending" group are listed below.

  1. The bestshot is sent after the track is interrupted and the person has left the camera's zone of view.

All the frames with the person's face are processed and the bestshot is selected.

time-period-of-searching = -1
silent-period = 0

  2. It is required to quickly receive the bestshot and then send bestshots with the specified frequency.

For example, it is required to send a bestshot soon after an intruder enters the shop, so that the intruder can be identified against the blacklist.

This mode is also used to demonstrate FaceStream capabilities in real time.

The bestshot is sent after the track is interrupted even if the specified period has not elapsed.

time-period-of-searching = 3
silent-period = 0

  3. It is required to quickly send the bestshot and then send a bestshot only if the person stays in the frame for a long time.

time-period-of-searching = 3
silent-period = 20

  4. It is required to quickly send the bestshot and never send a bestshot from this track again.

time-period-of-searching = 3
silent-period = -1
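For instance, option 2 above could be written into the "sending" section as follows. This is a sketch only: the surrounding structure and the "type" value (standing in here for the seconds-based mode) are assumptions; see the input.json reference for the accepted values.

```json
{
  "sending": {
    "type": "sec",
    "time-period-of-searching": 3,
    "silent-period": 0
  }
}
```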

Frames filtration#

The filtration of frames is performed by three main criteria, which are all set in the "input.json" file.

The "yaw-number" and "yaw-collection-mode" parameters are additionally set for the yaw angle. The parameters reduce the possibility of the error occurrence when the "0" angle is returned instead of a large angle.

If a frame did not pass at least one of the specified filters, it cannot be selected as a bestshot.

If the "number-of-bestshots-to-send" parameter is set, the frame is added to the array of bestshots to send. If the required number of bestshots to send was already collected, the one with the lowest frame quality score is replaced with the new bestshot if its quality is higher.

Working with ACMS#

Use the "primary-track-policy" settings when working with ACMS. These settings activate a mode for working with a single face, the one with the largest size in the frame. It is assumed that the face of interest is close to the camera.

The track of the largest face in the frame becomes primary. Other faces in the frame are detected but they are not processed. Bestshots are not sent for these faces.

As soon as another face reaches a larger size than the face from the primary track, this face track becomes primary and the processing is performed for it.

The mode is enabled using the "use_primary_track_policy" parameter.

The definition of the bestshots is performed only after the size (vertical) of the face reaches the value specified in the "best_shot_min_size" parameter. Frames with smaller faces can't be the bestshots.

When the vertical size of the face detection reaches the value set in the "best_shot_proper_size" parameter, the bestshot is sent immediately.

The "best_shot_min_size" and "best_shot_proper_size" are set depending on the video camera used and its location.

The examples below show configurations of the "sending" group parameters for working with ACMS.

  1. The turnstile opens only once. To re-open the turnstile, the track must be interrupted (the person moves away from the camera's zone of view).

time-period-of-searching = -1
silent-period = 0

  2. The turnstile opens at certain intervals (in this case, every three seconds) if a person stands directly in front of it.

time-period-of-searching = 3
silent-period = 0
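Combining the primary track policy with example 1 above might look like the sketch below. The nesting and the size values are assumptions for illustration; only the parameter names are taken from this document.

```json
{
  "primary-track-policy": {
    "use_primary_track_policy": 1,
    "best_shot_min_size": 70,
    "best_shot_proper_size": 140
  },
  "sending": {
    "time-period-of-searching": -1,
    "silent-period": 0
  }
}
```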

If the "use_primary_track_policy" parameter is enabled, the bestshot is never sent when the track is interrupted.

Additional information#

Image Sending to LUNA PLATFORM#

The settings for sending images to LUNA PLATFORM 5 using Luna 3 and 4 are slightly different, see tables below.

Table: Settings for sending the image to Luna 3

| File | Parameter | Luna 3 |
|------|-----------|--------|
| fs3Config.conf | luna-api | 4 |
| fs3Config.conf | luna-account-id | Login, Password or Token |
| input.json, "Output" section | Login, Password or Token | It is required to specify the Login and Password of a user account or a Token for a device |
| input.json, "Output" section | Url | Specify the URL that forms the "search" request (see the requests description in the Luna 3 API documentation of the LP5 distribution package) |

Table: Settings for sending the image to Luna 4

| File | Parameter | Luna 4 |
|------|-----------|--------|
| fs3Config.conf | luna-api | 5 |
| fs3Config.conf | luna-account-id | It is required to specify the "Luna-account-id" header in the request |
| input.json, "Output" section | Login, Password or Token | Not used |
| input.json, "Output" section | Url | Specify the URL that forms the "generate events" request (see the requests description in the Luna 4 API documentation of the LP5 distribution package) |

Formats, video compression standards, and protocols#

FaceStream utilizes the FFMPEG library to convert videos and get a video stream using various protocols. All the main formats, video compression standards, and protocols that were tested when working with FaceStream are listed in this section.

The tests were performed for the CentOS build of FaceStream.

FFMPEG supports more formats and video compression standards. They are not listed in this section, because they are rarely used when working with FaceStream.

Video formats#

Video formats that are processed using FaceStream:

  • AVI,
  • MP4,
  • MOV,
  • MKV,
  • FLV.

Encodings#

Basic video compression standards that FaceStream works with:

  • MPEG4,
  • MS MPEG4,
  • MS MPEG4v2,
  • MJPEG,
  • H.264,
  • H.265.

Protocols#

Basic protocols used by FaceStream for data receiving:

  • HTTP,
  • RTP,
  • RTSP,
  • TCP,
  • UDP.

Memory consumption when running FaceStream#

This section lists the reasons for increasing RAM consumption when running FaceStream.

  1. Each stream increases memory consumption. The amount of consumed memory depends on the FaceStream settings:

    • the number of FFmpeg threads set in the "numberOfFfmpegThreads" parameter of the "input" section (input.json file),

    • the image cache size set in the "stream-images-buffer-max-size" parameter of the "performance" section (fs3config.conf file),

    • the buffer sizes set in the "other" section: "frame-buffer-max-size", "fragment-buffer-size", "callback-buffer-size" (trackengine.conf file).

  2. If the number of threads specified in the "numberOfFfmpegThreads" parameter (input.json file) is greater than "1", memory consumption increases significantly. The growth is extremely slow and may become noticeable only after several hours of operation.

For RTSP streams, you can set the "numberOfFfmpegThreads" parameter to "0" or "1" (input.json file). In this case, no memory growth is observed.

  3. Memory consumption grows within 1-2 hours after FaceStream starts. This is related to cache filling (see point 1). If no new streams are created and point 2 does not apply, memory consumption stops growing.

  4. Memory consumption increases when the settings in the "Debug" section are enabled (fs3config.conf and trackengine.conf files).
