Human tracking algorithm#

The human tracking algorithm is derived from the face and body tracking algorithms and based on HumanFaceDetector.

ReIdentification#

ReIdentification is an optional feature that improves tracking accuracy (config parameter use-body-reid). It is intended to solve the problem described in the Body tracking algorithm section. When needed, it matches two different tracks and merges them into one track with the ID of the older one. The feature's behavior is controlled by the config parameters reid-matching-threshold and reid-matching-detections-count. Two tracks are matched only if the similarity between them is higher than reid-matching-threshold.

Note: The current version of TrackEngine supports the ReIdentification feature only for human/body tracking.
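
For illustration, the parameters from this section might appear together in the TrackEngine settings file as follows. This is a minimal sketch assuming the XML .conf format used by the SDK's configuration files; the values and value types are illustrative, not documented defaults:

    <!-- Sketch of a trackengine.conf fragment (format and values assumed) -->
    <param name="use-body-reid" type="Value::Int1" x="1" />
    <!-- Merge two tracks only if their similarity exceeds this threshold -->
    <param name="reid-matching-threshold" type="Value::Float1" x="0.85" />
    <!-- Upper bound on the number of frames used to match tracks -->
    <param name="reid-matching-detections-count" type="Value::Int1" x="7" />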

Receiving tracking results#

Tracking results for a specific frame can be calculated only after several more frames have been submitted to TrackEngine; this applies to both callback-mode types. The reason for the delay is that tracking may require several frames to produce results.

This logic implies an internal buffer of tracking results for several frames. The config parameter tracking-results-buffer-max-size sets the maximum size of this buffer and guarantees that at most this many frames of results are stored internally. It caps all other config parameters that can define how many frames are required to calculate tracking results, such as reid-matching-detections-count.
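
For example, if tracking-results-buffer-max-size is 20 (an illustrative value), then reid-matching-detections-count cannot effectively exceed 20: only 20 frames of results are kept internally, so no other parameter can demand more buffered frames than that.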

The number of frames required to return tracking results for one frame:

  • 1 for face tracking, or for body/human tracking with the ReId feature disabled.

  • At most reid-matching-detections-count for body/human tracking with the ReId feature enabled. This is an upper bound; results can actually be ready earlier.

If callback-mode = 1, the buffering of tracking results adds a small latency between the moment a frame is pushed and the moment its tracking results are delivered through a callback.

If callback-mode = 0, you should expect that track may not return any results for a Stream until the required number of frames has been passed to that Stream.
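
As a worked example, suppose body/human tracking with ReId is enabled and reid-matching-detections-count is 7 (an illustrative value, not a default). After you push frame N, its tracking results are guaranteed to be ready once at most 7 frames (frame N itself plus up to 6 more) have been pushed, and they may be ready earlier. With ReId disabled, the results for frame N are available as soon as frame N itself has been processed.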

Memory consumption#

TrackEngine does not allocate much memory for internal calculations. However, it holds images in the current track data (one image per stream) and, for callback-mode = 1, in the frame and callback queues.

Tip: To reduce memory consumption, set frames-buffer-size, callback-buffer-size, and skip-frames low enough.

To minimize memory consumption:

  • Use the estimator API ITrackEngine::track.
  • Do not keep images in any queues, or keep those queues as small as possible.
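
As an illustration, a memory-lean setup could keep all queues short. The sketch below uses the same assumed XML .conf format as above; the numbers are placeholders to show the idea, not recommended defaults:

    <!-- Sketch: small buffers mean fewer images held in memory at once -->
    <param name="frames-buffer-size" type="Value::Int1" x="10" />
    <param name="callback-buffer-size" type="Value::Int1" x="5" />
    <param name="skip-frames" type="Value::Int1" x="6" />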

Threading#

TrackEngine is multi-threaded. The number of threads is configurable and also depends on the currently bound FaceEngine settings and on the observer type (batched or single) being used.

TrackEngine calls the observer functions in separate threads. If you use batched observers, TrackEngine creates only one additional thread and uses it for all batched callbacks across all streams.

If you use per-stream single observers, TrackEngine creates a separate callback thread for each stream and uses it for that stream's callback invocations. In this case, all callbacks of a given stream are invoked from a single thread.

Note: Regardless of the callback type you use, we recommend that you avoid long-running tasks in these functions. The reason is that pushing to a callback buffer blocks the main processing thread: it always waits until there is a free slot in the buffer to push a callback. You can set the buffer size with the callback-buffer-size parameter.
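
A common pattern is to copy the callback payload into your own queue and return immediately, doing the heavy work on a separate worker thread. The C++ sketch below shows only this pattern; CallbackData and the post entry point are hypothetical stand-ins, since the actual observer interfaces and payload types come from the TrackEngine headers:

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <utility>

    // Stand-in for whatever payload your observer callback receives
    // (detections, frame ids, best shots, ...).
    struct CallbackData {
        int frameId = 0;
    };

    // Drains a queue on its own thread, so the TrackEngine callback
    // can return immediately and never blocks the processing thread.
    class AsyncHandler {
    public:
        AsyncHandler() : worker([this] { run(); }) {}

        ~AsyncHandler() {
            {
                std::lock_guard<std::mutex> lock(mtx);
                done = true;
            }
            cv.notify_one();
            worker.join();
        }

        // Call this from the TrackEngine callback thread:
        // copy the data, enqueue it, return. No heavy work here.
        void post(CallbackData data) {
            {
                std::lock_guard<std::mutex> lock(mtx);
                tasks.push(std::move(data));
            }
            cv.notify_one();
        }

    private:
        void run() {
            for (;;) {
                std::unique_lock<std::mutex> lock(mtx);
                cv.wait(lock, [this] { return done || !tasks.empty(); });
                if (tasks.empty())
                    return; // done and fully drained
                CallbackData data = std::move(tasks.front());
                tasks.pop();
                lock.unlock();
                process(data); // expensive work runs off the callback thread
            }
        }

        void process(const CallbackData& /*data*/) {
            // I/O, database writes, extra inference, etc.
        }

        std::mutex mtx;
        std::condition_variable cv;
        std::queue<CallbackData> tasks;
        bool done = false;
        std::thread worker; // declared last: starts after the members above exist
    };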

The checkBestShot and needRGBImage functions are called in the main frame processing thread. We also recommend that you avoid expensive computations in these functions.

Ideally, these predicates should incur zero performance cost.

Excluding the SDK's computation threads, the thread count guarantees are the following:

  • Using batched observers guarantees that TrackEngine uses only 2-3 threads itself.
  • Using per-stream single observers guarantees that TrackEngine itself uses only 1-2 threads plus one thread per created stream.
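
For example, an application with 10 streams needs only 2-3 TrackEngine threads with batched observers, but 11-12 threads (1-2 plus one per stream) with per-stream single observers. This is one reason batched observers may scale better as the number of streams grows.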

Tracker#

TrackEngine uses a tracker to update current detections when detection or redetection fails. TrackEngine supports the following trackers:

  • vlTracker - Neural network based tracker. You can use this tracker for:
    • GPU/NPU processing.
    • Processing multiple concurrently running streams. It has a batching implementation, so it provides better CPU utilization.
  • kcf and opencv - Simple CPU trackers. Use them only when there are few tracks in total across all streams at any given moment.
  • none - Disables the tracking feature. This leads to better performance, but also to degraded tracking quality. Supports GPU/NPU.

Some platforms don't support all trackers.

To specify the tracker to be used, set the tracker-type parameter.
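
For example, selecting a tracker could look like the following sketch (same assumed .conf format as above; any of the values listed above can be substituted):

    <!-- Sketch: one of vlTracker, kcf, opencv, none -->
    <param name="tracker-type" type="Value::String" x="vlTracker" />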

ROI#

TrackEngine supports a region of interest (ROI) for tracking. You can set it only with the per-stream humanRelativeROI parameter from tsdk::StreamParams/tsdk::StreamParamsOpt.

Note that this is a relative ROI: it defines a rectangle relative to the frame size. If the ROI is set, the detector finds faces or humans only in that area, while the tracker can move tracks outside the ROI for several frames. You can specify the maximum number of such frames with detector-step.
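
Because the ROI is relative, absolute pixel coordinates must first be normalized by the frame size. A minimal C++ sketch follows; RelativeRect is a hypothetical stand-in type, and the final assignment is left as a comment because the exact field type of humanRelativeROI is defined in the TrackEngine headers:

    // Convert an absolute pixel ROI into the relative form described above.
    struct RelativeRect {
        float x, y, width, height; // all in [0, 1], relative to frame size
    };

    RelativeRect makeRelativeROI(int roiX, int roiY, int roiW, int roiH,
                                 int frameW, int frameH) {
        return RelativeRect{
            static_cast<float>(roiX) / frameW,
            static_cast<float>(roiY) / frameH,
            static_cast<float>(roiW) / frameW,
            static_cast<float>(roiH) / frameH};
    }

    // Example: track only the right half of a 1920x1080 frame.
    // makeRelativeROI(960, 0, 960, 1080, 1920, 1080) -> {0.5, 0.0, 0.5, 1.0}
    // streamParams.humanRelativeROI = ...; // assign via tsdk::StreamParamsOpt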

If detector-scaling is 1, TrackEngine combines ROI extraction and scaling into a single operation: the ROI is extracted and scaled in one pass, so extracting the ROI adds no overhead.

A common rule to achieve better performance is:

  1. Find the frame size ratio (width / height) shared by most streams without an ROI.
  2. Set the ROIs for the different streams so that they have an equal width/height ratio.

The optimal case is when all streams within one application instance have an equal aspect ratio of the effective ROI (humanRelativeROI, or the original frame size if no ROI is set) and detector-scaling is 1.
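
For example, if most streams are 1920x1080 (a 16:9 ratio), a stream that needs an ROI can use a relative rectangle such as 0.5 x 0.5 of the frame: the extracted 960x540 region keeps the 16:9 ratio, so all streams present the detector with inputs of the same aspect ratio.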

See the TrackEngine code example for working with ROI.

Note: We recommend using this feature with detector-scaling set to 1 instead of extracting the ROI from the original frame before pushFrame*.