Human tracking algorithm#

The human tracking algorithm is derived from the face and body tracking algorithms and is based on HumanFaceDetector.

ReIdentification#

ReIdentification is an optional feature that improves tracking accuracy (config parameter use-body-reid). It is intended to solve the problem described in the section Body tracking algorithm: when needed, it matches two different tracks and merges them into a single track that keeps the id of the older one. The feature's behavior is controlled by the config parameters "reid-matching-threshold" and "reid-matching-detections-count". Two tracks are merged only if the similarity between them is higher than "reid-matching-threshold".
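
The merging rule can be sketched as follows. This is an illustrative sketch only: the types and function names below are hypothetical and not part of the TrackEngine API; it merely mirrors the documented rule that two tracks merge only when their similarity exceeds reid-matching-threshold, and that the merged track keeps the id of the older one.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical track representation (not a tsdk:: type).
struct Track {
    int id;
    std::size_t firstFrame;             // lower value => older track
    std::vector<float> reidDescriptor;  // ReId embedding (assumed)
};

// Cosine similarity of two equally sized descriptors.
float similarity(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0.f, na = 0.f, nb = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

// Returns the id the merged track would keep, or -1 if no merge happens.
int tryMerge(const Track& t1, const Track& t2, float reidMatchingThreshold) {
    if (similarity(t1.reidDescriptor, t2.reidDescriptor) <= reidMatchingThreshold)
        return -1;
    return t1.firstFrame <= t2.firstFrame ? t1.id : t2.id;  // older track wins
}
```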

Note: the current version of TrackEngine supports the ReIdentification feature only for human/body tracking.

Receiving tracking results#

As mentioned earlier, tracking results for a specific frame may become available only after several more frames have been submitted to TrackEngine (this holds for both callback-mode types). The reason for this delay is that, in general, tracking may require several frames to produce results. This logic implies an internal buffer of tracking results spanning several frames. The config parameter tracking-results-buffer-max-size regulates the maximum size of this buffer, so users have a guarantee that at most this number of frames is stored internally. It bounds all other config parameters that affect how many frames are required to calculate tracking results (e.g. reid-matching-detections-count). The number of frames required to return tracking results for one frame is:

1. 1 for Face tracking, or if the ReId feature is disabled for Body/Human tracking.
2. At most reid-matching-detections-count if the ReId feature is enabled for Body/Human tracking. This is the maximum value; in practice, results may be ready earlier.
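
The rule above can be written as a small helper. This is a hypothetical function (not part of the TrackEngine API) that computes the worst-case number of frames that must be submitted before results for a frame are guaranteed to be available.

```cpp
#include <algorithm>

// Worst-case number of frames until tracking results for a frame are ready.
// faceTracking:                  true for Face tracking
// reidEnabled:                   use-body-reid for Body/Human tracking
// reidMatchingDetectionsCount:   value of reid-matching-detections-count
int maxFramesUntilResults(bool faceTracking,
                          bool reidEnabled,
                          int reidMatchingDetectionsCount) {
    if (faceTracking || !reidEnabled)
        return 1;  // case 1: results after a single frame
    // case 2: up to reid-matching-detections-count frames
    return std::max(1, reidMatchingDetectionsCount);
}
```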

When callback-mode = 1, the buffered-results logic only adds a small latency between the moment a frame is pushed and the moment its tracking results are available from a callback. For callback-mode = 0, users should expect that, in general, a track may not return any results for the Stream until the required number of frames has been passed to the Stream.

Memory consumption#

TrackEngine itself doesn't allocate much memory for internal calculations, but it permanently holds images in the current tracks' data (in fact, one image per Stream) and, additionally, in the frame/callback queues when callback-mode = 1. The main way to reduce memory consumption is to set frames-buffer-size, callback-buffer-size and skip-frames low enough. To minimize memory consumption, users should use the estimator API ITrackEngine::track and avoid keeping images in any queues, or keep as few of them as possible.

Threading#

TrackEngine is multi-threaded. The number of threads is configurable and depends on the currently bound FaceEngine settings and on the type of observers used (batched or single). TrackEngine calls observer functions in separate threads. If batched observers are used, only one additional thread is created and used for all batched callbacks across all streams. If per-stream single observers are used, a separate callback thread is created for each stream and used for that stream's callback invocations; in this case, all callbacks of a stream are invoked from that one thread. Whichever callback type is used, it is recommended to avoid long-running tasks in these functions, because pushing to the callback buffer blocks the main processing thread: the main processing thread always waits until there is a free slot in that buffer to push a callback (the buffer's size is set by the parameter callback-buffer-size, see below). The checkBestShot and needRGBImage functions are called in the main frame processing thread. It is also recommended to avoid expensive computations in these functions; ideally, these predicates should cost nothing.
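
The blocking behavior described above can be illustrated with a bounded buffer. This is a simplified sketch, not the actual TrackEngine implementation: a fixed-capacity queue whose push() blocks the producer (standing in for the main processing thread) while the buffer is full, which is exactly how a slow observer callback stalls frame processing.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>

// Hypothetical bounded callback buffer (capacity plays the role of
// callback-buffer-size in the sketch).
class BoundedCallbackBuffer {
public:
    explicit BoundedCallbackBuffer(std::size_t maxSize) : m_maxSize(maxSize) {}

    // Producer side: blocks while the buffer is full.
    void push(std::function<void()> callback) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notFull.wait(lock, [this] { return m_queue.size() < m_maxSize; });
        m_queue.push(std::move(callback));
        m_notEmpty.notify_one();
    }

    // Consumer side (the callback thread): runs one queued callback.
    void popAndRun() {
        std::function<void()> callback;
        {
            std::unique_lock<std::mutex> lock(m_mutex);
            m_notEmpty.wait(lock, [this] { return !m_queue.empty(); });
            callback = std::move(m_queue.front());
            m_queue.pop();
            m_notFull.notify_one();
        }
        callback();  // run outside the lock so a slow callback doesn't hold it
    }

private:
    std::size_t m_maxSize;
    std::mutex m_mutex;
    std::condition_variable m_notFull, m_notEmpty;
    std::queue<std::function<void()>> m_queue;
};
```

The design point the sketch makes: a larger capacity absorbs bursts at the cost of memory (each queued callback may reference frame data), while keeping callbacks short is what actually prevents the producer from blocking.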

Thread count guarantees (excluding the calculating threads of the SDK):

- If batched observers are used, users have a guarantee that TrackEngine itself uses only 2-3 threads.
- If per-stream single observers are used, users have a guarantee that TrackEngine itself uses only 1-2 + number of created streams threads.

Tracker#

TrackEngine uses a tracker to update the current detections when detect/redetect fails. TrackEngine supports several trackers (see the tracker-type parameter in the config, section Settings); some platforms don't support all of them. vlTracker is a tracker based on neural networks. It's the only tracker that can be used for GPU/NPU processing (the other trackers, except none, don't support GPU/NPU) and for processing multiple concurrently running streams (it has a batching implementation, so it provides better CPU utilization). The KCF/opencv trackers are simple CPU trackers that should be used only when there are few tracks in total across all streams at any moment. Choosing the none tracker disables the tracking feature entirely, which leads to better performance but degrades tracking quality.

ROI#

TrackEngine supports an ROI for tracking: it can be set only via the per-Stream parameter humanRelativeROI from tsdk::StreamParams/tsdk::StreamParamsOpt. Note that it's a relative ROI, so it sets the rect relative to the frame size. If an ROI is set, the detector finds faces/humans only in that area, while the tracker can move tracks outside of the ROI for several frames (at most detector-step frames). If detector-scaling is 1, TE extracts the ROI and scales it as one operation (first extracting the ROI, then scaling), so there is no overhead for extracting the ROI. A common rule to achieve better performance: find the frame size ratio (width / height) of most Streams without an ROI, and set ROIs for the other streams with an equal width / height ratio. The optimal case is when all Streams within one application instance have an equal ratio of the effective ROI (humanRelativeROI, or the original frame size if no ROI is set) and detector-scaling is 1. See the TE code example for working with ROI at the end of the doc.
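
To make the "relative ROI" notion concrete, here is a minimal sketch. The types below are hypothetical (not the tsdk:: ones); it shows how an ROI given as fractions of the frame size maps to a pixel rect, and how its width / height ratio is computed for the rule above.

```cpp
// Relative ROI: all fields are fractions of the frame size, in [0, 1].
struct RelativeRect { float x, y, width, height; };
struct PixelRect    { int x, y, width, height; };

// Convert a relative ROI to an absolute pixel rect for a given frame size.
PixelRect toPixelRect(const RelativeRect& roi, int frameWidth, int frameHeight) {
    return PixelRect{
        static_cast<int>(roi.x * frameWidth),
        static_cast<int>(roi.y * frameHeight),
        static_cast<int>(roi.width * frameWidth),
        static_cast<int>(roi.height * frameHeight)};
}

// Width / height ratio of the effective ROI in pixels; streams should aim
// for equal ratios for best performance, per the rule above.
float roiAspectRatio(const RelativeRect& roi, int frameWidth, int frameHeight) {
    const PixelRect r = toPixelRect(roi, frameWidth, frameHeight);
    return static_cast<float>(r.width) / static_cast<float>(r.height);
}
```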

Note: it's recommended to use this feature with detector-scaling set to 1, instead of extracting the ROI from the original frame before pushFrame*.