Introduction and Quickstart


To add a dataset (that is, submit a new labelling job), you first need to obtain a client id and auth token from the Deepen team. A dataset represents a logical labelling job and encapsulates all of its labelling requirements. At the dataset level, you specify the label types, that is, what you want labelled (for example 2D bounding boxes, lanes, polygons, 3D bounding boxes, 3D segmentation). For each label type, you then specify the categories of labels/objects (for example bus, car, truck, pedestrian) and the attributes (for example occlusion, direction) to be labelled. You can create one or more datasets inside the client; each one is labelled according to the requirements specified at the dataset level.


Obtain a client id and auth token from Deepen AI. If you are a client admin, you can also create additional Access Tokens in the UI and use those instead. Both values are required in all API calls. The client id is a path parameter in most API calls, and the auth token must be prefixed with "Bearer " (including the trailing space) and passed as the value of the 'Authorization' header in every API request.
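As a small illustration of the header format described above, the following sketch builds the header string from a token. The function name auth_header and the variable DEEPEN_AUTH_TOKEN are hypothetical names chosen for this example; they are not part of the API.

```shell
# Hypothetical helper: prints the Authorization header for a given token.
# $1 is the auth token from Deepen AI (or an Access Token created in the UI).
auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Possible usage with curl (DEEPEN_AUTH_TOKEN is an assumed variable name):
#   curl -H "$(auth_header "$DEEPEN_AUTH_TOKEN")" ...
```

Keeping the token in a variable rather than typing it into each command also keeps it out of copied examples.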

Create a zip file of the images/JSON files to be uploaded as part of the first dataset. The exact size of the zip file is needed to create a new dataset. On Linux, the following command prints the size of a zip file in bytes.

ls -l <path to zip file> | awk '{print $5}'
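Where the column layout of ls differs between systems, counting bytes with wc is one alternative. This is only a sketch; zip_size_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: prints the size of a file in bytes.
# Reading from stdin stops wc from printing the file name, and tr removes
# the padding that some wc implementations add around the number.
zip_size_bytes() {
  wc -c < "$1" | tr -d '[:space:]'
}

# Possible usage:
#   zip_size_bytes '<path to zip file>'
```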

Run the following command to create a new dataset.

$ curl -X POST{client_id}/datasets -H 'Authorization: Bearer <auth token>' -H 'Content-Type: application/json' -d '{"dataset_name" : "test_api_dataset", "dataset_type":"3d", "labelling_mode":"frame_by_frame", "dataset_format":"default", "files": [{"file_size" : 14309374, "file_type":"application/zip"}] }'
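To avoid typing the file size into the request body by hand, the JSON could be assembled by a small shell function. This is a sketch under assumptions: create_dataset_body is a hypothetical name, the fixed field values simply mirror the example request above, and the dataset name is inserted without JSON escaping, so it must not contain quotes or backslashes.

```shell
# Hypothetical helper: prints a request body like the one in the example above.
# $1 = dataset name (no quotes or backslashes, since nothing is escaped here)
# $2 = exact size of the zip file in bytes
create_dataset_body() {
  printf '{"dataset_name": "%s", "dataset_type": "3d", "labelling_mode": "frame_by_frame", "dataset_format": "default", "files": [{"file_size": %s, "file_type": "application/zip"}]}' "$1" "$2"
}

# Possible usage: pass the output to curl's -d option in the command above.
#   -d "$(create_dataset_body test_api_dataset 14309374)"
```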

Note that "dataset_format": "json" together with "dataset_type": "3d" is used to upload 3D LiDAR data in JSON format. For image datasets, "dataset_type" must be "images", and "dataset_format" is not needed. The response contains a dataset id and a resumable upload URL to which the zip file must be uploaded.

Run the following command to upload the zip file.

$ curl -v -X PUT -H 'Expect:' --upload-file '<path to zip file>' '<resumable upload url>'

You can also follow the Google Cloud Storage documentation on resumable uploads to upload the zip file in multiple chunks.
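For a chunked upload, the Google Cloud Storage resumable upload protocol expects each PUT request to carry a Content-Range header of the form "bytes start-end/total", with every chunk except the last sized as a multiple of 256 KiB (262144 bytes). The sketch below only computes those header values and sends nothing; content_ranges is a hypothetical helper name.

```shell
# Hypothetical helper: prints one Content-Range value per chunk.
# $1 = total file size in bytes
# $2 = chunk size in bytes (defaults to 262144, i.e. 256 KiB)
content_ranges() {
  total=$1
  chunk=${2:-262144}
  start=0
  while [ "$start" -lt "$total" ]; do
    end=$((start + chunk - 1))
    # The last chunk may be shorter than the chunk size.
    if [ "$end" -ge "$total" ]; then
      end=$((total - 1))
    fi
    printf 'bytes %s-%s/%s\n' "$start" "$end" "$total"
    start=$((end + 1))
  done
}

# Possible usage:
#   content_ranges 14309374
```

Each printed value would go into a 'Content-Range' header of a separate PUT request to the resumable upload URL, with the matching byte range of the zip file as the request body.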

Once the dataset zip file is uploaded, call the process uploaded data API to process the zip file and make it ready for labelling. Here “file_name” will be “” for 3d with dataset_format as json and “” for 2d dataset_type.

$ curl -X POST{datasetId}/process_uploaded_data -H 'Authorization: Bearer <auth token>' -H 'Content-Type: application/json' -d '{"file_name": ""}'

Run the following command to check the status of the dataset.

$ curl -X GET{datasetId} -H 'Authorization: Bearer <auth token>'

{
  "dataset_id": "{dataset_id}",
  "client_id": "{client_id}",
  "dataset_format": "default",
  "dataset_name": "test_api_dataset_2",
  "dataset_type": "3d",
  "labelling_mode": "frame_by_frame",
  "processing_status": "files_getting_labelled",
  "pipeline_stage_status": {
    "Labelling": {"done": 0, "failed": 1, "in_progress": 0, "ready": 9, "waiting": 0},
    "QA": {"done": 0, "failed": 0, "in_progress": 0, "ready": 0, "waiting": 10}
  },
  "pipeline_status": {"Labelling": 0, "QA": 0, "__ALL_STAGES_DONE": 0, "__TOTAL": 10}
}
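In a script, the processing_status field of a response like the one above could be pulled out with sed. This is a rough sketch that assumes the field and its value sit on a single line; a real JSON parser would be more robust. The function name processing_status is hypothetical.

```shell
# Hypothetical helper: reads a dataset response on stdin and prints the
# value of its "processing_status" field.
processing_status() {
  sed -n 's/.*"processing_status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Possible usage: pipe the output of the status command above into it.
#   curl ... | processing_status
```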

If the dataset is successfully preprocessed, you should see the processing_status as files_getting_labelled. Once the dataset is successfully preprocessed, you can see the files in the dataset with the following command.

$ curl -X GET{datasetId}/files -H 'Authorization: Bearer <auth token>'

{
  "files": [
    { "client_id": "{clientId}", "dataset_id": "{datasetId}", "file_id": "000009.png", "status": "labelling_in_progress" },
    { "client_id": "{clientId}", "dataset_id": "{datasetId}", "file_id": "000019.png", "status": "labelling_in_progress" },
    { "client_id": "{clientId}", "dataset_id": "{datasetId}", "file_id": "inner_kitty/000059.png", "status": "labelling_in_progress" },
    { "client_id": "{clientId}", "dataset_id": "{datasetId}", "file_id": "inner_kitty/000069.png", "status": "labelling_in_progress" }
  ]
}

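The file ids in a listing like the one above are what the per-file commands take in their path. As a sketch, they could be extracted with grep and sed; this assumes a grep that supports -o (GNU and BSD grep do), and list_file_ids is a hypothetical helper name.

```shell
# Hypothetical helper: reads a files response on stdin and prints one
# file_id per line.
list_file_ids() {
  grep -o '"file_id"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/'
}

# Possible usage: pipe the output of the files command above into it.
#   curl ... | list_file_ids
```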
Get the labelling status of a particular file using the following command.

$ curl -X GET{datasetId}/files/000019.png -H 'Authorization: Bearer <auth token>'

{ "client_id": "{clientId}", "dataset_id": "{datasetId}", "file_id": "000019.png", "status": "labelling_in_progress" }

Get the current labels of a file with the following command.

$ curl -X GET{datasetId}/files/000019.png/labels -H 'Authorization: Bearer <auth token>'

{
  "labels": [
    {
      "file_id": "000019.png",
      "label_category_id": "bus",
      "label_id": "bus:1",
      "label_type": "box",
      "attributes": { "attribute2": "attribute2_value1", "attribute1": "attribute_value1" },
      "box": [366, 87, 174, 116]
    },
    {
      "file_id": "000019.png",
      "label_category_id": "white double solid",
      "label_id": "white double solid:1",
      "label_type": "lane",
      "attributes": {},
      "labeller_email": "{clientId}",
      "polygons": [[[338, 330], [460, 253], [567, 179], [773, 91], [862, 59]]]
    }
  ]
}
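For a quick look at what a labels response contains, the labels could be tallied by label type. The sketch below assumes label types contain no spaces (as with "box" and "lane" above) and a grep that supports -o; count_label_types is a hypothetical helper name.

```shell
# Hypothetical helper: reads a labels response on stdin and prints
# "<label_type>: <count>" for each label type, sorted by type.
count_label_types() {
  grep -o '"label_type"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/' \
    | sort | uniq -c | awk '{print $2 ": " $1}'
}

# Possible usage: pipe the output of the labels command above into it.
#   curl ... | count_label_types
```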

Get all the labels of a dataset using the following command.

$ curl -X GET{datasetId}/labels -H 'Authorization: Bearer <auth token>'