2D ML-powered Visual Object Tracking

Overview: Given the initial state (bounding box) of a target in the first frame of a video sequence, the aim of Visual Object Tracking (VOT) is to automatically obtain the states of the object in the subsequent video frames.

How does this make labeling easier?

2D VOT uses computer vision algorithms to predict an object's bounding box in all frames, given a box drawn manually in one particular frame. The following workflow applies this capability to 2D labeling.

  1. Draw a bounding box on the required target object.

  2. Apply 2D VOT to predict the bounding box in the subsequent and previous frames and draw the predictions on the canvas.

  3. The user steps through each frame to verify the predictions and adjusts them as necessary.
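
The three steps above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the tool's implementation: `Box`, `predict_all_frames`, and the placeholder tracker `drift_right` are hypothetical names, and a real tracker would look at the image content rather than shift the box by a fixed amount.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Box:
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels


# Hypothetical tracker signature:
# (box in the source frame, source frame index, target frame index) -> predicted box
Tracker = Callable[[Box, int, int], Box]


def predict_all_frames(initial: Box, start: int, num_frames: int,
                       tracker: Tracker) -> Dict[int, Box]:
    """Step 2: predict a box for every frame after and before `start`."""
    boxes = {start: initial}
    for i in range(start + 1, num_frames):   # subsequent frames
        boxes[i] = tracker(boxes[i - 1], i - 1, i)
    for i in range(start - 1, -1, -1):       # previous frames
        boxes[i] = tracker(boxes[i + 1], i + 1, i)
    return boxes


def drift_right(box: Box, src: int, dst: int) -> Box:
    # Placeholder tracker (assumption): the object moves 2 px right per frame.
    return Box(box.x + 2 * (dst - src), box.y, box.w, box.h)


# Step 1: the user draws a box on frame 2 of a 5-frame clip.
boxes = predict_all_frames(Box(100, 50, 40, 30), start=2, num_frames=5,
                           tracker=drift_right)

# Step 3: the user reviews the predictions and corrects frame 4 by hand.
boxes[4] = Box(105, 52, 40, 30)
```

After this runs, `boxes` holds one box per frame: the manual box on frame 2, predictions on frames 0, 1, and 3, and the user's correction on frame 4.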

Steps to track an object:

  1. Draw a bounding box on the required target object.

  2. Uncheck the “Fill new labels” option and click the “Track label” button.

  3. Once tracking is complete, the labels appear automatically on the timeline.

  4. If tracking does not complete, a warning message says so. Click “Track label” again to resume progress.

Tips on using Track label

  • Each “Track label” click tracks the selected label from the active frame in both directions.

  • Predicted labels do not overwrite manual labels. Tracking stops updating frames, in both directions, once it encounters a manually labeled frame on the timeline.

  • Recommended for objects that are visible across many frames, where tracking avoids the overhead of adjusting the box manually in each of those frames.

  • Predicted labels account for changes in the object's scale but not for changes in its orientation. If an object's orientation changes drastically from its initial state, re-adjust the label in the first frame of the new orientation and apply “Track label” again from that frame.
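
The first two tips can be read as a simple propagation rule. The sketch below shows one possible reading, under the assumption that each direction stops independently at the first manually labeled frame it meets. The function name `frames_to_update` and its parameters are hypothetical and are not part of the tool.

```python
from typing import List, Set


def frames_to_update(active: int, num_frames: int,
                     manual_frames: Set[int]) -> List[int]:
    """Return the frames one "Track label" click would fill with predictions.

    Assumption: tracking walks outward from the active frame in both
    directions and, in each direction, stops at the first frame that
    already holds a manual label, so manual labels are never overwritten.
    """
    updated = []
    for step in (1, -1):              # forward, then backward
        i = active + step
        while 0 <= i < num_frames and i not in manual_frames:
            updated.append(i)
            i += step
    return sorted(updated)


# Active frame 5 in a 10-frame clip, with manual labels already on frames 2 and 8.
print(frames_to_update(5, 10, {2, 8}))  # [3, 4, 6, 7]
```

Under this reading, frames 0, 1, and 9 stay untouched until the user tracks again from frame 2 or frame 8.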
