
How to convert ROS bag into JSON data for 3D annotation


Annotation data is often collected by ROS and stored as ROS bags. To use deepen.ai’s system for annotation, the data must first be converted into the JSON format described at https://help.deepen.ai/documentation/3d-input-output/3d-upload-json-format.

This JSON file specifies the exact data for annotation. It includes point clouds, images, timestamps, intrinsic and extrinsic calibrations, as well as localization information.

Usually, each user writes a script to convert their bag data into the JSON format. We cannot use a common script for conversion for several reasons:

  1. Each ROS bag may have many topics. The user needs to specify the exact topics for annotation.

  2. Some information such as intrinsic and extrinsic calibrations may not be in the ROS bags.

  3. If a special camera model is used, the user may wish to only send the rectified images for annotation.

  4. The JSON format explicitly links each point cloud to the images shown to the annotators; it does not sync by timestamps at display time the way RVIZ does.

  5. If accurate localization is available, the point cloud is transformed into a fixed world coordinate.

  6. Advanced processing such as LiDAR ego-motion compensation is typically performed in the script.

Users not familiar with the JSON format may find the conversion script difficult to develop. At deepen, we have worked with many clients on their data conversion scripts. Although each script is different, we have found many common themes in these scripts. If you are developing a new script, this tutorial will help you get started and get your data ready for annotation.

ROS has good Python support, and most conversion scripts are written in Python, so this tutorial assumes Python as well. We will walk you through the various steps of the conversion script and describe the simplest processing at each step. There are many advanced techniques that can improve the data, but we will not cover them here.

Reading the Bag file(s)

The first step is straightforward: specify the paths of the bag files and the ROS topics to annotate. The topics should include point clouds and images. Images are very important for annotation speed and accuracy.
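A minimal sketch of this step using the ROS 1 rosbag Python API is shown below. The bag path and topic names are placeholders for your own setup.

```python
import rosbag

BAG_PATH = "recording.bag"                   # placeholder: path to your bag file
LIDAR_TOPIC = "/lidar/points"                # placeholder: point cloud topic
CAMERA_TOPICS = ["/camera_front/image_raw"]  # placeholder: image topics to annotate

# Read only the topics needed for annotation; each message arrives with its ROS timestamp.
with rosbag.Bag(BAG_PATH) as bag:
    for topic, msg, t in bag.read_messages(topics=[LIDAR_TOPIC] + CAMERA_TOPICS):
        print(topic, t.to_sec())
```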

Synchronizing the point clouds and images

In the 3D annotation tool, the annotator is shown a point cloud and one image from each camera. Thus, the JSON format specifies the connection between the point cloud and images. To calculate this, our script needs to explicitly synchronize the data. Often, the synchronization is done by ROS timestamps. Let us assume that there is only one LiDAR. Thus, there is only a single point cloud sequence. In this stage, we make a single pass through the point cloud data and collect its timestamps. This is our primary timestamp sequence.

We then make a pass through each image topic that we can show to the annotators. As we go through the images, we find the closest timestamp to each LiDAR timestamp. This will be the image attached to each point cloud. Thus, there is one image from each camera for each point cloud. The image sequence is synchronized to the LiDAR sequence.
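A sketch of this nearest-timestamp matching is shown below, assuming the timestamps collected above are stored as sorted Python lists of seconds.

```python
import bisect

def closest_index(sorted_times, target):
    """Return the index of the value in sorted_times closest to target."""
    i = bisect.bisect_left(sorted_times, target)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    return i if sorted_times[i] - target < target - sorted_times[i - 1] else i - 1

def synchronize(lidar_times, image_times_by_topic):
    """For each LiDAR timestamp, pick the index of the closest image from every camera topic."""
    return {
        topic: [closest_index(times, t) for t in lidar_times]
        for topic, times in image_times_by_topic.items()
    }
```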

Temporal Calibration (Advanced Topic)

Technically, each timestamp occurs when the LiDAR finishes a spin or a camera closes its shutter and ROS records its data. The sensors, network, and ROS itself all add latency to this process. Thus, the timestamps usually occur a few milliseconds after the actual events. This latency is typically different for each sensor and introduces inconsistency in timestamps. Temporal calibration is the method to adjust the timestamps, so all sensors have consistent timestamps. We will not cover it here, and you may skip it for your initial conversion script.

Multi-LiDAR

When we have multiple LiDARs to annotate, we can pick one LiDAR as the primary one and synchronize all other LiDARs to it. Note that each point in deepen’s JSON format has a “d” field, to which we can assign the LiDAR ID. If you use the “d” field, our annotation UI allows you to distinguish between point clouds from different LiDARs.
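As a sketch, the points of one synchronized frame could be tagged like this. Apart from “d”, the field names are illustrative placeholders; the exact per-point layout should follow the JSON input format page.

```python
def tag_points(points_by_lidar):
    """points_by_lidar: {lidar_id: iterable of (x, y, z) points for one synchronized frame}.
    Tags every point with its LiDAR ID in the "d" field so the annotation UI can
    distinguish the sensors. The other field names are placeholders."""
    tagged = []
    for lidar_id, points in points_by_lidar.items():
        for x, y, z in points:
            tagged.append({"x": x, "y": y, "z": z, "d": lidar_id})
    return tagged
```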

Localization

The JSON format specifies all points in a single world coordinate because this makes annotation much easier and more accurate: static objects stay in a fixed location, and dynamic objects follow their own simple motions, without complications from the motion of the ego vehicle itself. To achieve this, we need accurate localization data. The best localization usually comes from an expensive INS/GNSS. Otherwise, LiDAR SLAM combined with other sensors such as an IMU and odometers can also give accurate localization in most situations.

If an accurate localization is available as a ROS topic, we just need to find the LiDAR pose with a timestamp closest to that of the LiDAR point cloud. We can use the pose to transform the point cloud into the world coordinate. Note that accurate LiDAR poses require LiDAR to INS calibration, which we will not cover here.
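A minimal sketch of this transform is below, assuming the pose is given as a translation plus an (x, y, z, w) quaternion; in a ROS environment, tf.transformations can build the same matrix for you.

```python
import numpy as np

def pose_to_matrix(translation, quaternion):
    """Build a 4x4 world-from-LiDAR transform from a translation and an (x, y, z, w) quaternion."""
    x, y, z, w = quaternion
    rotation = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])
    world_from_lidar = np.eye(4)
    world_from_lidar[:3, :3] = rotation
    world_from_lidar[:3, 3] = translation
    return world_from_lidar

def to_world(points_xyz, world_from_lidar):
    """Transform an (N, 3) array of LiDAR points into the world coordinate frame."""
    homogeneous = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    return (world_from_lidar @ homogeneous.T).T[:, :3]
```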

If localization is unavailable, we suggest that you try one of the LiDAR SLAM algorithms. If you skip this step and use the LiDAR point cloud as-is, it is still possible to annotate the data, but the cost and accuracy would both suffer.

Ego Motion Compensation (Advanced topic)

Most LiDARs have a slow “rolling shutter” where each scan or spin can take tens or hundreds of milliseconds. During this time interval, the LiDAR itself may be going through a complicated motion. If we treat the LiDAR as stationary for the entire time interval, the point cloud would be inaccurate. Ego motion compensation is the technique to solve this problem, but we will not cover it here. You can ignore this issue unless your vehicle was moving at a high speed such as on a highway.

Intrinsic and Extrinsic Calibrations

Accurate intrinsic calibration is required for each camera. We support the common plumb-bob (Brown & Conrady) and equidistant fisheye (Kannala & Brandt) camera models. If you choose a model we don’t currently support, you can submit rectified images instead.

Extrinsic calibration is for specifying the camera pose for each image. If localization and ROS tf transforms are available, you just need to obtain the camera pose at the timestamp of the image. Note that we are using the camera coordinate system in OpenCV. It is identical to the ROS optical coordinate system, not the standard ROS coordinate system.

If the tf transforms cannot give you correct camera poses but you have the extrinsic LiDAR-camera calibration, you can apply the extrinsic calibration to obtain the camera pose from the LiDAR pose. It is just a simple matrix multiplication. Since the point cloud and image usually have different timestamps, it would be more accurate to interpolate the LiDAR or camera poses, but we will skip that for this tutorial.
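As a sketch, with 4x4 homogeneous matrices and the naming convention a_from_b (the transform that maps frame b into frame a), the camera pose in the world frame is a single matrix product. The matrix names below are assumptions about how you store your calibration.

```python
import numpy as np

def camera_pose_in_world(world_from_lidar, camera_from_lidar):
    """world_from_lidar: LiDAR pose at the image timestamp (from localization).
    camera_from_lidar: extrinsic LiDAR-to-camera calibration (OpenCV/optical convention).
    Returns the camera pose in the world frame (world_from_camera)."""
    lidar_from_camera = np.linalg.inv(camera_from_lidar)
    return world_from_lidar @ lidar_from_camera
```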

Output

After obtaining the information above, we just need to make a pass through the point clouds and images to output the JSON and image files. Note that the JSON format supports BASE64 encoding, which can make your JSON files much smaller and the script run much faster.
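The sketch below illustrates the BASE64 idea only; the actual field names, point layout, and dtype must follow the JSON input format page linked at the top of this article.

```python
import base64
import json
import numpy as np

def encode_points(points):
    """Pack a float array as little-endian float32 and BASE64-encode it.
    The exact layout expected by deepen's JSON format should be taken from the
    JSON input format documentation; this function only illustrates the encoding."""
    return base64.b64encode(np.asarray(points, dtype="<f4").tobytes()).decode("ascii")

# Illustrative frame dictionary with placeholder field names.
frame = {
    "timestamp": 1690000000.0,
    "points": encode_points(np.zeros((4, 3))),
}
with open("frame_000.json", "w") as out:
    json.dump(frame, out)
```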

Debugging

Debugging your first conversion script often takes a long time. We have some tips to find the most common bugs.

The first tip is to output only a small number of frames. Your script will finish quickly, and you can upload this subset into deepen’s tool and visualize the data.

Localization

To verify your localization accuracy, the easiest way is to load a moving point cloud sequence into deepen’s annotation tool. You can use the “fuse” option to accumulate multiple point clouds together. If the static objects are sharp in the fused point cloud, you are likely to have the correct localization. If not, there are many common mistakes:

  1. Low quality or misconfigured INS/GNSS

  2. Wrong usage of INS output

  3. Wrong coordinate frame used

  4. Bad output from LiDAR SLAM algorithm

Calibration

To validate your camera calibrations, you can load a sequence into the annotation tool. Add a 3D box to an object. If the projection of the box on the image looks correct, you probably have the correct calibrations. Otherwise, there are many common mistakes:

  1. Providing rectified images together with non-zero distortion coefficients

  2. Using the ROS coordinate system instead of the optical coordinate system

  3. Scaling the images without updating the intrinsic parameters

  4. The pose of the camera may be off due to localization error.

  5. Bad LiDAR-camera calibration. You can use deepen’s calibration tool (https://help.deepen.ai/documentation/calibration/camera-lidar) to redo or correct the calibration.

Synchronization

To validate the synchronization of your data, you can load a sequence into the annotation tool. Add a box to a moving object. If the projections on all camera images look correct, you should have proper synchronization between the point clouds and cameras. For a high-precision check, the object should have a high angular velocity relative to the sensors, so use an object moving quickly from left to right, or a sequence where the ego vehicle is turning rapidly. If you find a synchronization error, there are several common mistakes:

  1. For big errors, there are likely bugs in the synchronization logic of the script.

  2. For small errors, temporal calibration may be needed.

  3. For very small errors, we may need to consider the exact time the LiDAR scans the object, which is different from the timestamp assigned to the entire point cloud. This is an advanced topic.

Visualization code

We have developed an offline visualization script to visualize the JSON files. Please find the tool and documentation for it here.

Sample Code

We will release a sample conversion script soon.

Please contact us if you run into problems with your conversion script. We will gladly help you debug your script.
