Formats

1: ADE20k (v2017)
2: ADE20k (v2020)
3: CIFAR
4: Cityscapes
5: COCO
6: Image zip
7: Velodyne Points / KITTI Raw 3D
8: KITTI
9: MNIST
10: Open Images
11: Pascal VOC
12: Supervisely Point Cloud
13: YOLO

1 - ADE20k (v2017)

Format specification

The original ADE20K 2017 dataset is available here.
The consistency set (for checking annotation consistency) is also available here.

Supported annotation types:

Masks

Supported annotation attributes:

occluded (boolean): whether the object is occluded by another object
other arbitrary boolean attributes, which can be specified in the annotation file <image_name>_atr.txt

Load ADE20K 2017 dataset

There are two ways to create a Datumaro project and add ADE20K to it:

datum import --format ade20k2017 --input-path <path/to/dataset>
# or
datum create
datum add path -f ade20k2017 <path/to/dataset>

It is also possible to load the dataset using the Python API:

from datumaro.components.dataset import Dataset

ade20k_dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2017')

ADE20K dataset directory should have the following structure:

dataset/
├── subset1/
│   └── super_label_1/
│       ├── img1.jpg
│       ├── img1_atr.txt
│       ├── img1_parts_1.png
│       ├── img1_seg.png
│       ├── img2.jpg
│       ├── img2_atr.txt
│       └── ...
└── subset2/
    ├── img3.jpg
    ├── img3_atr.txt
    ├── img3_parts_1.png
    ├── img3_parts_2.png
    ├── img4.jpg
    ├── img4_atr.txt
    ├── img4_seg.png
    └── ...

The mask images <image_name>_seg.png contain the object class segmentation masks and also separate each class into instances. The R and G channels encode the object class masks. The B channel encodes the object instance masks.

The mask images <image_name>_parts_N.png contain segmentation masks for parts of objects, where N is a number indicating the level in the part hierarchy.
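
As a rough sketch of how such a mask could be decoded with NumPy: the formula below (class index = R / 10 * 256 + G, instances taken from the distinct values of B) follows the convention of the ADE20K toolkit and is an assumption here, since the text only names the channels. The function name `decode_seg_mask` and the tiny synthetic array are hypothetical.

```python
import numpy as np

def decode_seg_mask(seg):
    """Split an H x W x 3 RGB array into class and instance index maps.

    Assumes the ADE20K toolkit convention (not stated in this document):
    class = R // 10 * 256 + G, instance = rank of the B value.
    """
    r = seg[:, :, 0].astype(np.int32)
    g = seg[:, :, 1].astype(np.int32)
    b = seg[:, :, 2]
    class_mask = r // 10 * 256 + g
    # Map each distinct B value to a consecutive index (0 = smallest value).
    _, inverse = np.unique(b, return_inverse=True)
    instance_mask = inverse.reshape(b.shape)
    return class_mask, instance_mask

# Synthetic 1 x 3 "image": two pixels of class 5 (different instances)
# followed by one background pixel.
seg = np.array([[[0, 5, 10], [0, 5, 20], [0, 0, 0]]], dtype=np.uint8)
class_mask, instance_mask = decode_seg_mask(seg)
```

In a real pipeline the array would come from reading <image_name>_seg.png with an image library; that step is left out to keep the sketch self-contained.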

The annotation files <image_name>_atr.txt describe the content of each image. Each line in the text file contains:

column 1: instance number,
column 2: part level (0 for objects),
column 3: occluded (1 for true),
column 4: original raw name (might provide a more detailed categorization),
column 5: class name (parsed using wordnet),
column 6: double-quoted list of attributes, separated by commas.

Columns are separated by a #. See an example of the dataset here.
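
The column layout above can be read with a few lines of plain Python. This is a minimal sketch, not Datumaro's own parser: the function name `parse_atr_line` and the sample line are made up for illustration, and it assumes that whitespace around the # separators is not significant.

```python
def parse_atr_line(line):
    """Parse one line of an <image_name>_atr.txt file into a dict."""
    # Columns are separated by '#'; surrounding whitespace is dropped.
    columns = [c.strip() for c in line.split('#')]
    if len(columns) != 6:
        raise ValueError('expected 6 columns, got %d' % len(columns))
    instance, part_level, occluded, raw_name, class_name, attrs = columns
    # Column 6 is a double-quoted, comma-separated list (possibly empty).
    attrs = attrs.strip('"')
    attributes = [a.strip() for a in attrs.split(',') if a.strip()]
    return {
        'instance': int(instance),
        'part_level': int(part_level),
        'occluded': occluded == '1',
        'raw_name': raw_name,
        'class_name': class_name,
        'attributes': attributes,
    }

# A made-up line in the layout described above.
item = parse_atr_line('001 # 0 # 1 # red car # car # "parked, dirty"')
```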

Export to other formats

Datumaro can convert ADE20K into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports segmentation masks.

There are a few ways to convert ADE20k 2017 to another dataset format using the CLI:

datum import -f ade20k2017 -i <path/to/dataset>
datum export -f coco -o ./save_dir -- --save-images
# or
datum convert -if ade20k2017 -i <path/to/dataset> -f coco -o ./save_dir \
    -- --save-images

Or using the Python API:

from datumaro.components.dataset import Dataset

dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2017')
dataset.export('save_dir', 'coco')

Examples

Examples of using this format from the code can be found in the format tests

2 - ADE20k (v2020)

Format specification

The original ADE20K 2020 dataset is available here.

The consistency set (for checking annotation consistency) is also available here.

Supported annotation types:

Masks

Supported annotation attributes:

occluded (boolean): whether the object is occluded by another object
other arbitrary boolean attributes, which can be specified in the annotation file <image_name>.json

Load ADE20K dataset

There are two ways to create a Datumaro project and add ADE20K to it:

datum import --format ade20k2020 --input-path <path/to/dataset>
# or
datum create
datum add path -f ade20k2020 <path/to/dataset>

It is also possible to load the dataset using the Python API:

from datumaro.components.dataset import Dataset

ade20k_dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2020')

ADE20K dataset directory should have the following structure:

dataset/
├── subset1/
│   ├── img1/  # directory with instance masks for img1
│   |    ├── instance_001_img1.png
│   |    ├── instance_002_img1.png
│   |    └── ...
│   ├── img1.jpg
│   ├── img1.json
│   ├── img1_seg.png
│   ├── img1_parts_1.png
│   |
│   ├── img2/  # directory with instance masks for img2
│   |    ├── instance_001_img2.png
│   |    ├── instance_002_img2.png
│   |    └── ...
│   ├── img2.jpg
│   ├── img2.json
│   └── ...
│
└── subset2/
    ├── super_label_1/
    |   ├── img3/  # directory with instance masks for img3
    |   |    ├── instance_001_img3.png
    |   |    ├── instance_002_img3.png
    |   |    └── ...
    |   ├── img3.jpg
    |   ├── img3.json
    |   ├── img3_seg.png
    |   ├── img3_parts_1.png
    |   └── ...
    |
    ├── img4/  # directory with instance masks for img4
    |   ├── instance_001_img4.png
    |   ├── instance_002_img4.png
    |   └── ...
    ├── img4.jpg
    ├── img4.json
    ├── img4_seg.png
    └── ...

The mask images <image_name>_parts_N.png contain segmentation masks for parts of objects, where N is a number indicating the level in the part hierarchy.

The <image_name> directory contains instance masks for each object in the image. These masks are one-channel images in which each pixel indicates affinity to a specific object.

The annotation files <image_name>.json describe the content of each image. See our test asset for an example of this file, or check the ADE20K toolkit.

Export to other formats

Datumaro can convert ADE20K into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports segmentation masks.

There are a few ways to convert ADE20k to another dataset format using the CLI:

datum import -f ade20k2020 -i <path/to/dataset>
datum export -f coco -o ./save_dir -- --save-images
# or
datum convert -if ade20k2020 -i <path/to/dataset> -f coco -o ./save_dir \
    -- --save-images

Or using the Python API:

from datumaro.components.dataset import Dataset

dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2020')
dataset.export('save_dir', 'voc')

Examples

Examples of using this format from the code can be found in the format tests

3 - CIFAR

Format specification

CIFAR format specification is available here.

Supported annotation types:

Label

Datumaro supports the Python version of CIFAR-10/100. The difference between CIFAR-10 and CIFAR-100 is how labels are stored in the meta files (batches.meta or meta) and in the annotation files. The 100 classes in CIFAR-100 are grouped into 20 superclasses. Each image comes with a “fine” label (the class to which it belongs) and a “coarse” label (the superclass to which it belongs). In CIFAR-10 there are no superclasses.

The CIFAR format contains 32 x 32 images. As an extension, Datumaro supports reading and writing arbitrary-sized images.

Load CIFAR dataset

The CIFAR dataset is available for free download:

cifar-10-python.tar.gz: CIFAR-10 python version
cifar-100-python.tar.gz: CIFAR-100 python version

There are two ways to create a Datumaro project and add a CIFAR dataset to it:

datum import --format cifar --input-path <path/to/dataset>
# or
datum create
datum add path -f cifar <path/to/dataset>

It is possible to specify the project name and project directory. Run datum create --help for more information.

CIFAR-10 dataset directory should have the following structure:

└─ Dataset/
    ├── batches.meta
    ├── <subset_name1>
    ├── <subset_name2>
    └── ...

CIFAR-100 dataset directory should have the following structure:

└─ Dataset/
    ├── meta
    ├── <subset_name1>
    ├── <subset_name2>
    └── ...

Dataset files use Pickle data format.

Meta files:

CIFAR-10:
    num_cases_per_batch: 1000
    label_names: list of strings (['airplane', 'automobile', 'bird', ...])
    num_vis: 3072

CIFAR-100:
    fine_label_names: list of strings (['apple', 'aquarium_fish', ...])
    coarse_label_names: list of strings (['aquatic_mammals', 'fish', ...])

Annotation files:

Common:
    'batch_label': 'training batch 1 of <N>'
    'data': numpy.ndarray of uint8, layout N x C x H x W
    'filenames': list of strings

    If images have non-default size (32x32) (Datumaro extension):
        'image_sizes': list of (H, W) tuples

CIFAR-10:
    'labels': list of integers

CIFAR-100:
    'fine_labels': list of integers
    'coarse_labels': list of integers
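
To make the layout above concrete, the following sketch builds a CIFAR-100-style annotation dictionary, pickles it to an in-memory buffer and reads it back. It only illustrates the keys listed above; the file names and label indices are hypothetical, and writing to a buffer instead of a real batch file is a simplification.

```python
import io
import pickle

import numpy as np

# Two 32x32 RGB images in N x C x H x W layout, as described above.
images = np.zeros((2, 3, 32, 32), dtype=np.uint8)

batch = {
    'batch_label': 'training batch 1 of 1',
    'data': images,
    'filenames': ['img_0.png', 'img_1.png'],  # hypothetical names
    'fine_labels': [0, 1],    # hypothetical class indices
    'coarse_labels': [4, 1],  # hypothetical superclass indices
}

# Serialize to an in-memory buffer instead of a real file.
buffer = io.BytesIO()
pickle.dump(batch, buffer)
buffer.seek(0)

loaded = pickle.load(buffer)
# Convert the first image to H x W x C for display or further processing.
first_image = loaded['data'][0].transpose(1, 2, 0)
```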

Export to other formats

Datumaro can convert a CIFAR dataset into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports the classification task (e.g. MNIST, ImageNet, PascalVOC, etc.). There are a few ways to convert a CIFAR dataset to another dataset format:

datum project import -f cifar -i <path/to/cifar>
datum export -f imagenet -o <path/to/output/dir>
# or
datum convert -if cifar -i <path/to/cifar> -f imagenet -o <path/to/output/dir>

Export to CIFAR

There are a few ways to convert a dataset to CIFAR format:

# export dataset into CIFAR format from existing project
datum export -p <path/to/project> -f cifar -o <path/to/export/dir> \
    -- --save-images
# converting to CIFAR format from other format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
    -f cifar -o <path/to/export/dir> -- --save-images

Extra options for export to CIFAR format:

--save-images allows exporting the dataset along with its images (by default False);
--image-ext <IMAGE_EXT> allows specifying the image extension for the exported dataset (by default .png).

The format (CIFAR-10 or CIFAR-100) in which the dataset will be exported depends on the presence of superclasses in the LabelCategories.

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the CIFAR format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with a CIFAR dataset:

Example 1. How to create custom CIFAR-like dataset

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Label, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id=0, image=np.ones((32, 32, 3)),
        annotations=[Label(3)]
    ),
    DatasetItem(id=1, image=np.ones((32, 32, 3)),
        annotations=[Label(8)]
    )
], categories=['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck'])

dataset.export('./dataset', format='cifar')

Example 2. How to filter and convert CIFAR dataset to ImageNet

Convert a CIFAR dataset to ImageNet format, keeping only images labeled with the dog class:

# Download CIFAR-10 dataset:
# https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
datum convert --input-format cifar --input-path <path/to/cifar> \
              --output-format imagenet \
              --filter '/item[annotation/label="dog"]'

Examples of using this format from the code can be found in the format tests

4 - Cityscapes

Format specification

Cityscapes format overview is available here. Cityscapes format specification is available here.

Supported annotation types:

Masks

Supported annotation attributes:

is_crowd (boolean). Specifies if the annotation label can distinguish between different instances. If False, the annotation id field encodes the instance id.

Load Cityscapes dataset

The Cityscapes dataset is available for free download.

There are two ways to create a Datumaro project and add a Cityscapes dataset to it:

datum import --format cityscapes --input-path <path/to/dataset>
# or
datum create
datum add path -f cityscapes <path/to/dataset>

It is possible to specify the project name and project directory. Run datum create --help for more information.

Cityscapes dataset directory should have the following structure:

└─ Dataset/
    ├── imgsFine/
    │   ├── leftImg8bit
    │   │   ├── <split: train,val, ...>
    │   │   |   ├── {city1}
    │   │   │   |   ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_leftImg8bit.png
    │   │   │   │   └── ...
    │   │   |   ├── {city2}
    │   │   │   └── ...
    │   │   └── ...
    └── gtFine/
        ├── <split: train,val, ...>
        │   ├── {city1}
        │   |   ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_gtFine_color.png
        │   |   ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_gtFine_instanceIds.png
        │   |   ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_gtFine_labelIds.png
        │   │   └── ...
        │   ├── {city2}
        │   └── ...
        └── ...

Annotated files description:

*_leftImg8bit.png - left images in 8-bit LDR format
*_color.png - class labels encoded by their color
*_labelIds.png - class labels encoded by their index
*_instanceIds.png - class and instance labels encoded by an instance ID. The pixel values encode both the class and the individual instance: the integer part of dividing each ID by 1000 gives the class ID, and the remainder is the instance ID. If an annotation describes multiple instances, the pixels have the regular ID of that class
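
The instance ID arithmetic described above can be sketched as follows. The helper name `decode_instance_id` is hypothetical, and treating every value below 1000 as a plain class ID (no individual instance) is an assumption based on the description, not a documented Datumaro function.

```python
def decode_instance_id(pixel_value):
    """Return (class_id, instance_id) for one *_instanceIds.png pixel.

    instance_id is None when the pixel carries only a class ID,
    i.e. the annotation covers multiple instances.
    """
    if pixel_value < 1000:
        return pixel_value, None
    # Integer part of the division is the class, remainder is the instance.
    class_id, instance_id = divmod(pixel_value, 1000)
    return class_id, instance_id

single = decode_instance_id(26001)  # one instance of class 26
group = decode_instance_id(24)      # a region of class 24 without instance IDs
```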

To make sure that the selected dataset has been added to the project, you can run datum info, which will display the project and dataset information.

Export to other formats

Datumaro can convert a Cityscapes dataset into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports the segmentation task (e.g. PascalVOC, CamVID, etc.). There are a few ways to convert a Cityscapes dataset to another dataset format:

datum project import -f cityscapes -i <path/to/cityscapes>
datum export -f voc -o <path/to/output/dir>
# or
datum convert -if cityscapes -i <path/to/cityscapes> -f voc -o <path/to/output/dir>

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to Cityscapes

There are a few ways to convert a dataset to Cityscapes format:

# export dataset into Cityscapes format from existing project
datum export -p <path/to/project> -f cityscapes -o <path/to/export/dir> \
    -- --save-images
# converting to Cityscapes format from other format
datum convert -if voc -i <path/to/voc/dataset> \
    -f cityscapes -o <path/to/export/dir> -- --save-images

Extra options for export to Cityscapes format:

--save-images allows exporting the dataset along with its images (by default False);
--image-ext IMAGE_EXT allows specifying the image extension for the exported dataset (by default, keep the original or use .png if there is none);
--label-map allows defining a custom colormap. Example:

# mycolormap.txt :
# 0 0 255 sky
# 255 0 0 person
#...
datum export -f cityscapes -- --label-map mycolormap.txt

# or you can use the original Cityscapes colormap:
datum export -f cityscapes -- --label-map cityscapes
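
For illustration, a colormap file in the "R G B label" layout shown in the comments above could be read like this. This is a minimal sketch and not Datumaro's own parser; the function name `parse_colormap` and the handling of blank and # lines are assumptions.

```python
def parse_colormap(text):
    """Parse lines of the form 'R G B label' into {label: (R, G, B)}."""
    label_map = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments (assumed convention)
        # Split off the three color components; the rest is the label name.
        r, g, b, name = line.split(maxsplit=3)
        label_map[name] = (int(r), int(g), int(b))
    return label_map

label_map = parse_colormap("0 0 255 sky\n255 0 0 person\n")
```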

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the Cityscapes format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with a Cityscapes dataset:

Example 1. Load the original Cityscapes dataset and convert to Pascal VOC

datum create -o project
datum add path -p project -f cityscapes ./Cityscapes/
datum stats -p project
datum export -p project -o dataset -f voc -- --save-images

Example 2. Create a custom Cityscapes-like dataset

from collections import OrderedDict

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Mask, DatasetItem

import datumaro.plugins.cityscapes_format as Cityscapes

label_map = OrderedDict()
label_map['background'] = (0, 0, 0)
label_map['label_1'] = (1, 2, 3)
label_map['label_2'] = (3, 2, 1)
categories = Cityscapes.make_cityscapes_categories(label_map)

dataset = Dataset.from_iterable([
    DatasetItem(id=1,
        image=np.ones((1, 5, 3)),
        annotations=[
            Mask(image=np.array([[1, 0, 0, 1, 1]]), label=1),
            Mask(image=np.array([[0, 1, 1, 0, 0]]), label=2, id=2,
                attributes={'is_crowd': False}),
        ]
    ),
], categories=categories)

dataset.export('./dataset', format='cityscapes')

Examples of using this format from the code can be found in the format tests

5 - COCO

Format specification

COCO format specification is available here.

The dataset has annotations for multiple tasks. Each task has its own format in Datumaro, and there is also a combined coco format, which includes all the available tasks. The sub-formats have the same options as the “main” format and only limit the set of annotation files they work with. To work with multiple formats, use the corresponding option of the coco format.

Supported tasks / formats:

The combined format - coco
Image Captioning - coco_caption
Object Detection - coco_instances
Panoptic Segmentation - coco_panoptic
Keypoint Detection - coco_person_keypoints
Stuff Segmentation - coco_stuff
Image Info - coco_image_info
Image classification (Datumaro extension) - coco_labels. The format is like Object Detection, but uses only category_id and score annotation fields.

Supported annotation types (depending on the task):

Caption (captions)
Label (label, Datumaro extension)
Bbox (instances, person keypoints)
Polygon (instances, person keypoints)
Mask (instances, person keypoints, panoptic, stuff)
Points (person keypoints)

Supported annotation attributes:

is_crowd (boolean; on bbox, polygon and mask annotations) - Indicates that the annotation covers multiple instances of the same class.
score (number; range [0; 1]) - Indicates the confidence in this annotation. Ground truth annotations always have 1.
arbitrary attributes (string/number) - A Datumaro extension. Stored in the attributes section of the annotation descriptor.

Load COCO dataset

The COCO dataset is available for free download:

Images:

Annotations:

There are two ways to create a Datumaro project and add a COCO dataset to it:

datum import --format coco --input-path <path/to/dataset>
# or
datum create
datum add path -f coco <path/to/dataset>

It is possible to specify the project name and project directory. Run datum create --help for more information.

A COCO dataset directory should have the following layout:

└─ Dataset/
    ├── images/
    │   ├── train<year>/
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    │   └── val<year>/
    │       ├── <image_name1.ext>
    │       ├── <image_name2.ext>
    │       └── ...
    └── annotations/
        ├── <task>_<subset_name><year>.json
        └── ...

For the panoptic task, a dataset directory should have the following layout:

└─ Dataset/
    ├── images/
    │   ├── train<year>
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    │   ├── val<year>
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    └── annotations/
        ├── panoptic_train<year>/
        │   ├── <image_name1.ext>
        │   ├── <image_name2.ext>
        │   └── ...
        ├── panoptic_train<year>.json
        ├── panoptic_val<year>/
        │   ├── <image_name1.ext>
        │   ├── <image_name2.ext>
        │   └── ...
        └── panoptic_val<year>.json

Annotation files must be named like <task>_<subset_name><year>.json.

You can import a dataset for one or a few tasks instead of the whole dataset. This option also allows importing annotation files with non-default names. For example:
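
As a sketch of this naming rule, the snippet below splits an annotation file name into its task and subset parts. The list of task prefixes is an assumption taken from the names used in the official COCO annotation files (plus labels for the Datumaro extension); Datumaro's own matching logic may differ, and `split_annotation_name` is a hypothetical helper.

```python
import re

# Task prefixes as used in COCO annotation file names (an assumption;
# Datumaro's own list may differ).
KNOWN_TASKS = ('person_keypoints', 'image_info', 'instances', 'captions',
               'panoptic', 'stuff', 'labels')

def split_annotation_name(filename):
    """Return (task, subset) for '<task>_<subset_name><year>.json', or None."""
    pattern = r'^(%s)_(.+)\.json$' % '|'.join(KNOWN_TASKS)
    match = re.match(pattern, filename)
    if match is None:
        return None
    return match.group(1), match.group(2)

default_name = split_annotation_name('person_keypoints_val2017.json')
custom_name = split_annotation_name('stuff.json')  # non-default name
```

A file that does not follow the pattern, such as stuff.json here, yields None, which is the case where a task-specific format has to be named explicitly on import.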

datum import --format coco_stuff --input-path <path/to/stuff.json>

To make sure that the selected dataset has been added to the project, you can run datum info, which will display the project and dataset information.

Notes:

COCO categories can have any integer ids, however, Datumaro will count annotation category id 0 as “not specified”. This does not contradict the original annotations, because they have category indices starting from 1.

Export to other formats

Datumaro can convert a COCO dataset into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports the specified task (e.g. for panoptic segmentation: VOC, CamVID). There are a few ways to convert a COCO dataset to another dataset format:

datum project import -f coco -i <path/to/coco>
datum export -f voc -o <path/to/output/dir>
# or
datum convert -if coco -i <path/to/coco> -f voc -o <path/to/output/dir>

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to COCO

There are a few ways to convert a dataset to COCO format:

# export dataset into COCO format from existing project
datum export -p <path/to/project> -f coco -o <path/to/export/dir> \
    -- --save-images
# converting to COCO format from other format
datum convert -if voc -i <path/to/voc/dataset> \
    -f coco -o <path/to/export/dir> -- --save-images

Extra options for export to COCO format:

--save-images allows exporting the dataset along with its images (by default False);
--image-ext IMAGE_EXT allows specifying the image extension for the exported dataset (by default, keep the original or use .jpg if there is none);
--segmentation-mode MODE allows specifying the save mode for instance segmentation (by default guess):
- ‘guess’: guess the mode for each instance (using the ‘is_crowd’ attribute as a hint)
- ‘polygons’: save polygons (merge and convert masks, prefer polygons)
- ‘mask’: save masks (merge and convert polygons, prefer masks);
--crop-covered allows cropping covered segments so that the segmentation of background objects is more accurate (by default False);
--allow-attributes ALLOW_ATTRIBUTES allows exporting attributes (by default True);
--reindex REINDEX allows assigning new indices to images and annotations, which is useful to avoid merge conflicts (by default False);
--merge-images allows saving all images into a single directory (by default False);
--tasks TASKS allows specifying the tasks to export; by default Datumaro uses all tasks. Example:

datum import -o project -f coco -i <dataset>
datum export -p project -f coco -- --tasks instances,stuff

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the COCO format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with a COCO dataset:

Example 1. How to load an original panoptic COCO dataset and convert to Pascal VOC

datum create -o project
datum add path -p project -f coco_panoptic ./COCO/annotations/panoptic_val2017.json
datum stats -p project
datum export -p project -o dataset -f voc --overwrite -- --save-images

Example 2. How to create custom COCO-like dataset

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Mask, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id='000000000001',
                image=np.ones((1, 5, 3)),
                subset='val',
                attributes={'id': 40},
                annotations=[
                    Mask(image=np.array([[0, 0, 1, 1, 0]]), label=3,
                        id=7, group=7, attributes={'is_crowd': False}),
                    Mask(image=np.array([[0, 1, 0, 0, 1]]), label=1,
                        id=20, group=20, attributes={'is_crowd': True}),
                ]
            ),
    ], categories=['a', 'b', 'c', 'd'])

dataset.export('./dataset', format='coco_panoptic')

Examples of using this format from the code can be found in the format tests

6 - Image zip

Format specification

The image zip format allows exporting and importing unannotated image datasets to and from zip archives. The format doesn’t support any annotations or attributes.

Load Image zip dataset

There are a few ways to load unannotated datasets into your Datumaro project:

From an existing archive:

datum import -o project -f image_zip -i ./images.zip

From a directory with zip archives. Datumaro will load images from all zip files in the directory:

datum import -o project -f image_zip -i ./foo

The directory with zip archives should have the following structure:

└── foo/
    ├── archive1.zip/
    |   ├── image_1.jpg
    |   ├── image_2.png
    |   ├── subdir/
    |   |   ├── image_3.jpg
    |   |   └── ...
    |   └── ...
    ├── archive2.zip/
    |   ├── image_101.jpg
    |   ├── image_102.jpg
    |   └── ...
    ...

Images in the archives must have a supported extension; see the user manual for the list of supported extensions.

Export to other formats

Datumaro can load dataset images from a zip archive and convert it to another supported dataset format, for example:

datum import -o project -f image_zip -i ./images.zip
datum export -f coco -o ./new_dir -- --save-images

Export unannotated dataset to zip archive

Example: exporting images from VOC dataset to zip archives:

datum import -o project -f voc -i ./VOC2012
datum export -f image_zip -o ./ --overwrite -- --name voc_images.zip \
    --compression ZIP_DEFLATED

Extra options for export to image_zip format:

--save-images allows exporting the dataset along with its images (default: False);
--image-ext <IMAGE_EXT> allows specifying the image extension for the exported dataset (default: use the original or .jpg if there is none);
--name sets the name of the output zip file (default: default.zip);
--compression allows specifying the archive compression method. Available methods: ZIP_STORED, ZIP_DEFLATED, ZIP_BZIP2, ZIP_LZMA (default: ZIP_STORED). See the zip documentation for more information.
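
The compression method names match constants of Python's standard zipfile module. Assuming that correspondence (it is not stated explicitly above), a comparable archive can be produced by hand as in this sketch; `pack_images` is a hypothetical helper and the placeholder bytes stand in for real image data.

```python
import io
import zipfile

# Map the option values listed above to the zipfile constants of the same name.
COMPRESSION = {
    'ZIP_STORED': zipfile.ZIP_STORED,
    'ZIP_DEFLATED': zipfile.ZIP_DEFLATED,
    'ZIP_BZIP2': zipfile.ZIP_BZIP2,
    'ZIP_LZMA': zipfile.ZIP_LZMA,
}

def pack_images(images, compression='ZIP_STORED'):
    """Write {archive_path: bytes} into an in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    method = COMPRESSION[compression]
    with zipfile.ZipFile(buffer, 'w', compression=method) as zf:
        for path, data in images.items():
            zf.writestr(path, data)
    return buffer.getvalue()

# Placeholder bytes stand in for real image data.
archive = pack_images(
    {'image_1.jpg': b'\x00' * 64, 'subdir/image_3.jpg': b'\x00' * 64},
    compression='ZIP_DEFLATED',
)
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    names = zf.namelist()
```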

Examples

Examples of using this format from the code can be found in the format tests

7 - Velodyne Points / KITTI Raw 3D

Format specification

Velodyne Points / KITTI Raw 3D data format:

Supported annotation types:

Cuboid3d (represent tracks)

Supported annotation attributes:

truncation (write, string), possible values: truncation_unset, in_image, truncated, out_image, behind_image (case-independent).
occlusion (write, string), possible values: occlusion_unset, visible, partly, fully (case-independent). This attribute has priority over occluded.
occluded (read/write, boolean)
keyframe (read/write, boolean). Responsible for occlusion_kf field.
track_id (read/write, integer). Indicates the group over frames for annotations, represent tracks.

Supported image attributes:

frame (read/write, integer). Indicates frame number of the image.

Import KITTI Raw dataset

The velodyne points/KITTI Raw dataset is available for downloading here and here.

KITTI Raw dataset directory should have the following structure:

└─ Dataset/
    ├── image_00/ # optional, aligned images from different cameras
    │   └── data/
    │       ├── <name1.ext>
    │       └── <name2.ext>
    ├── image_01/
    │   └── data/
    │       ├── <name1.ext>
    │       └── <name2.ext>
    ...
    │
    ├── velodyne_points/ # optional, 3d point clouds
    │   └── data/
    │       ├── <name1.pcd>
    │       └── <name2.pcd>
    ├── tracklet_labels.xml
    └── frame_list.txt # optional, required for custom image names

The format does not support arbitrary image names and paths, but Datumaro provides an option to use a special index file to allow this.

frame_list.txt contents:

12345 relative/path/to/name1/from/data
46 relative/path/to/name2/from/data
...
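
Reading such an index file takes only a few lines. The sketch below assumes each line holds a frame number followed by a relative path, as in the sample above; `parse_frame_list` is a hypothetical helper, not part of Datumaro's API.

```python
def parse_frame_list(text):
    """Parse frame_list.txt contents into {frame_number: relative_path}."""
    frames = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first whitespace run only, so paths may contain spaces.
        frame, path = line.split(maxsplit=1)
        frames[int(frame)] = path
    return frames

frames = parse_frame_list(
    "12345 relative/path/to/name1\n46 relative/path/to/name2\n"
)
```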

There are two ways to create a Datumaro project and add a KITTI dataset to it:

datum import --format kitti_raw --input-path <path/to/dataset>
# or
datum create
datum add path -f kitti_raw <path/to/dataset>

To make sure that the selected dataset has been added to the project, you can run datum info, which will display the project and dataset information.

Export to other formats

Datumaro can convert KITTI Raw dataset into any other format Datumaro supports.

Such conversion will only be successful if the output format can represent the type of dataset you want to convert, e.g. 3D point clouds can be saved in Supervisely Point Clouds format, but not in COCO keypoints.

There are a few ways to convert a KITTI Raw dataset to another dataset format:

datum import -f kitti_raw -i <path/to/kitti_raw> -o proj/
datum export -f sly_pointcloud -o <path/to/output/dir> -p proj/
# or
datum convert -if kitti_raw -i <path/to/kitti_raw> -f sly_pointcloud

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to KITTI Raw

There are a few ways to convert a dataset to KITTI Raw format:

# export dataset into KITTI Raw format from existing project
datum export -p <path/to/project> -f kitti_raw -o <path/to/export/dir> \
    -- --save-images
# converting to KITTI Raw format from other format
datum convert -if sly_pointcloud -i <path/to/sly_pcd/dataset> \
    -f kitti_raw -o <path/to/export/dir> -- --save-images --reindex

Extra options for exporting in KITTI Raw format:

--save-images allows exporting the dataset along with its images. This will include point clouds and related images (by default False)
--image-ext IMAGE_EXT allows specifying the image extension for the exported dataset (by default, keep the original or use .png if there is none)
--reindex assigns new indices to frames and tracks. Allows annotations without the track_id attribute (they will be exported as single-frame tracks).
--allow-attrs allows writing arbitrary annotation attributes. They will be written in the <annotations> section of <poses><item> (disabled by default)

Examples

Example 1. Import dataset, compute statistics

datum create -o project
datum add path -p project -f kitti_raw ../../kitti_raw/
datum stats -p project

Example 2. Convert Supervisely Pointclouds to KITTI Raw

datum convert -if sly_pointcloud -i ../sly_pcd/ \
    -f kitti_raw -o my_kitti/ -- --save-images --allow-attrs

Example 3. Create a custom dataset

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Cuboid3d, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id='some/name/qq',
        annotations=[
            Cuboid3d(position=[13.54, -9.41, 0.24], label=0,
                attributes={'occluded': False, 'track_id': 1}),

            Cuboid3d(position=[3.4, -2.11, 4.4], label=1,
                attributes={'occluded': True, 'track_id': 2})
        ],
        pcd='path/to/pcd1.pcd',
        related_images=[np.ones((10, 10)), 'path/to/image2.png', 'image3.jpg'],
        attributes={'frame': 0}
    ),
], categories=['cat', 'dog'])

dataset.export('my_dataset/', format='kitti_raw', save_images=True)

Examples of using this format from the code can be found in the format tests

8 - KITTI

Format specification

The KITTI dataset has many annotations for different tasks. Datumaro supports only a few of them.

Supported tasks / formats:

Object Detection - kitti_detection. The format specification is available in README.md here.
Segmentation - kitti_segmentation. The format specification is available in README.md here.
Raw 3D / Velodyne Points - described here

Supported annotation types:

Bbox (object detection)
Mask (segmentation)

Supported attributes:

truncated (boolean) - indicates that the bounding box specified for the object does not correspond to the full extent of the object
occluded (boolean) - indicates that a significant portion of the object within the bounding box is occluded by another object

Load KITTI dataset

The KITTI left color images for object detection are available here. The KITTI object detection labels are available here. The KITTI segmentation dataset is available here.

There are two ways to create a Datumaro project and add a KITTI dataset to it:

datum import --format kitti --input-path <path/to/dataset>
# or
datum create
datum add path -f kitti <path/to/dataset>

It is possible to specify the project name and project directory. Run datum create --help for more information.

KITTI segmentation dataset directory should have the following structure:

└─ Dataset/
    ├── testing/
    │   └── image_2/
    │       ├── <name_1>.<img_ext>
    │       ├── <name_2>.<img_ext>
    │       └── ...
    └── training/
        ├── image_2/ # left color camera images
        │   ├── <name_1>.<img_ext>
        │   ├── <name_2>.<img_ext>
        │   └── ...
        ├── label_2/ # left color camera label files
        │   ├── <name_1>.txt
        │   ├── <name_2>.txt
        │   └── ...
        ├── instance/ # instance segmentation masks
        │   ├── <name_1>.png
        │   ├── <name_2>.png
        │   └── ...
        ├── semantic/ # semantic segmentation masks (labels are encoded by its id)
        │   ├── <name_1>.png
        │   ├── <name_2>.png
        │   └── ...
        └── semantic_rgb/ # semantic segmentation masks (labels are encoded by its color)
            ├── <name_1>.png
            ├── <name_2>.png
            └── ...
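
For orientation, each line of a label_2/<name>.txt file describes one object with 15 space-separated fields (result files add a 16th, the score), as laid out in the KITTI object detection devkit README. The sketch below parses one such line using only the standard library. The function name parse_kitti_label_line and the returned dictionary layout are illustrative assumptions, not part of the Datumaro API, and the way Datumaro maps the raw truncated and occluded values to boolean attributes is not shown here.

```python
def parse_kitti_label_line(line):
    """Parse one line of a KITTI object detection label file.

    Field order follows the KITTI object devkit README:
    type, truncated, occluded, alpha, bbox (4), dimensions (3),
    location (3), rotation_y and, in result files only, score.
    The dictionary layout is an assumption made for this sketch.
    """
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError("expected 15 or 16 fields, got %d" % len(fields))

    values = [float(v) for v in fields[1:]]
    return {
        'type': fields[0],
        'truncated': values[0],              # raw value from the file
        'occluded': int(values[1]),          # raw integer occlusion state
        'alpha': values[2],                  # observation angle (rad)
        'bbox': tuple(values[3:7]),          # left, top, right, bottom (px)
        'dimensions': tuple(values[7:10]),   # height, width, length (m)
        'location': tuple(values[10:13]),    # x, y, z in camera coords (m)
        'rotation_y': values[13],
        'score': values[14] if len(values) == 15 else None,
    }


sample = ("Car 0.00 1 -1.58 587.01 173.33 614.12 200.12 "
          "1.65 1.67 3.64 -0.65 1.71 46.70 -1.59")
obj = parse_kitti_label_line(sample)
print(obj['type'], obj['bbox'], obj['occluded'])
```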

You can import the dataset for a specific KITTI task instead of the whole dataset, for example:

datum add path -f kitti_detection <path/to/dataset>

To make sure that the selected dataset has been added to the project, you can run datum info, which will display the project and dataset information.

Export to other formats

Datumaro can convert KITTI dataset into any other format Datumaro supports.

Such a conversion will only be successful if the output format can represent the type of dataset you want to convert, e.g. segmentation annotations can be saved in Cityscapes format, but not as COCO keypoints.

There are a few ways to convert a KITTI dataset to another dataset format:

datum project import -f kitti -i <path/to/kitti>
datum export -f cityscapes -o <path/to/output/dir>
# or
datum convert -if kitti -i <path/to/kitti> -f cityscapes -o <path/to/output/dir>

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to KITTI

There are a few ways to convert a dataset to KITTI format:

# export dataset into KITTI format from existing project
datum export -p <path/to/project> -f kitti -o <path/to/export/dir> \
    -- --save-images
# converting to KITTI format from other format
datum convert -if cityscapes -i <path/to/cityscapes/dataset> \
    -f kitti -o <path/to/export/dir> -- --save-images

Extra options for export to KITTI format:

--save-images allows exporting the dataset with images saved (by default False);
--image-ext IMAGE_EXT allows specifying the image extension for the exported dataset (by default, keep the original or use .png if there is none);
--apply-colormap APPLY_COLORMAP allows using a colormap for class masks (in the semantic_rgb folder, by default True);
--label-map allows defining a custom colormap. Example:

# mycolormap.txt :
# 0 0 255 sky
# 255 0 0 person
#...
datum export -f kitti -- --label-map mycolormap.txt

# or you can use the original KITTI colormap:
datum export -f kitti -- --label-map kitti

--tasks TASKS allows specifying the tasks to export; by default, Datumaro uses all tasks. Example:

datum import -o project -f kitti -i <dataset>
datum export -p project -f kitti -- --tasks detection

--allow-attributes ALLOW_ATTRIBUTES allows exporting attributes (by default True).
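
The custom colormap file passed to --label-map stores one "R G B label" entry per line, as in the mycolormap.txt example above. The following sketch shows one way to read such a file into a dictionary before exporting. It is not Datumaro's own parser: the function name parse_colormap, the treatment of blank and '#' lines as ignorable, and the range and duplicate checks are assumptions made for illustration.

```python
def parse_colormap(text):
    """Parse 'R G B label' lines into an ordered {label: (r, g, b)} dict."""
    label_map = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith('#'):
            continue  # assumption: blank and '#' lines are ignored
        # Three color components first, the rest of the line is the label.
        r, g, b, name = line.split(maxsplit=3)
        color = (int(r), int(g), int(b))
        if not all(0 <= c <= 255 for c in color):
            raise ValueError("color out of range in line: %r" % raw_line)
        if name in label_map:
            raise ValueError("duplicate label: %r" % name)
        label_map[name] = color
    return label_map


colormap = parse_colormap("""
0 0 255 sky
255 0 0 person
""")
print(colormap)
```

Checking the file this way before running datum export can catch malformed lines early; the dictionary keeps the order in which labels appear in the file.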

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the KITTI format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with a KITTI dataset:

Example 1. How to load an original KITTI dataset and convert to Cityscapes

datum create -o project
datum add path -p project -f kitti ./KITTI/
datum stats -p project
datum export -p project -o dataset -f cityscapes -- --save-images

Example 2. How to create a custom KITTI-like dataset

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Mask, DatasetItem

import datumaro.plugins.kitti_format as KITTI

label_map = {}
label_map['background'] = (0, 0, 0)
label_map['label_1'] = (1, 2, 3)
label_map['label_2'] = (3, 2, 1)
categories = KITTI.make_kitti_categories(label_map)

dataset = Dataset.from_iterable([
    DatasetItem(id=1,
                image=np.ones((1, 5, 3)),
                annotations=[
                    Mask(image=np.array([[1, 0, 0, 1, 1]]), label=1, id=0,
                        attributes={'is_crowd': False}),
                    Mask(image=np.array([[0, 1, 1, 0, 0]]), label=2, id=0,
                        attributes={'is_crowd': False}),
                ]
            ),
    ], categories=categories)

dataset.export('./dataset', format='kitti')

Examples of using this format from the code can be found in the format tests.

9 - MNIST

Format specification

MNIST format specification is available here. Fashion MNIST format specification is available here. MNIST in CSV format specification is available here.

The dataset has a few data formats available. Datumaro supports the binary (IDX ubyte) format and the CSV variant. Each data format is covered by a separate Datumaro format.

Supported formats:

Binary (IDX ubyte) - mnist
CSV - mnist_csv

Supported annotation types:

Label

The format only supports single channel 28 x 28 images.
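
To check the image size of a file before importing it, the header of a gzip-compressed IDX image file can be inspected directly. The sketch below assumes the IDX layout used by the original MNIST files: a big-endian 32-bit magic number (2051 for unsigned-byte image files) followed by the image count, row count and column count. The function names are hypothetical helpers, not Datumaro API.

```python
import gzip
import io
import struct

IDX_IMAGES_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions


def read_idx_image_header(fileobj):
    """Return (count, rows, cols) from a gzip-compressed IDX3 image file.

    fileobj may be a path or an open binary file object.
    """
    with gzip.open(fileobj, 'rb') as f:
        magic, count, rows, cols = struct.unpack('>IIII', f.read(16))
    if magic != IDX_IMAGES_MAGIC:
        raise ValueError("not an IDX3 ubyte image file (magic=%d)" % magic)
    return count, rows, cols


def is_mnist_sized(fileobj):
    """Check that the images in the file are 28 x 28 pixels."""
    _, rows, cols = read_idx_image_header(fileobj)
    return (rows, cols) == (28, 28)


# Build a tiny in-memory file with two blank 28 x 28 images to try it out.
payload = struct.pack('>IIII', IDX_IMAGES_MAGIC, 2, 28, 28) + bytes(2 * 28 * 28)
sample = io.BytesIO(gzip.compress(payload))
print(read_idx_image_header(sample))
```

With a real dataset, pass the path to train-images-idx3-ubyte.gz or t10k-images-idx3-ubyte.gz instead of the in-memory sample.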

Load MNIST dataset

The MNIST dataset is available for free download:

train-images-idx3-ubyte.gz: training set images
train-labels-idx1-ubyte.gz: training set labels
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels

The Fashion MNIST dataset is available for free download:

train-images-idx3-ubyte.gz: training set images
train-labels-idx1-ubyte.gz: training set labels
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels

The MNIST in CSV dataset is available for free download:

There are two ways to create a Datumaro project and add an MNIST dataset to it:

datum import --format mnist --input-path <path/to/dataset>
# or
datum create
datum add path -f mnist <path/to/dataset>

There are two ways to create a Datumaro project and add an MNIST in CSV dataset to it:

datum import --format mnist_csv --input-path <path/to/dataset>
# or
datum create
datum add path -f mnist_csv <path/to/dataset>

It is possible to specify the project name and project directory; run datum create --help for more information.

MNIST dataset directory should have the following structure:

└─ Dataset/
    ├── labels.txt # list of non-digit labels (optional)
    ├── t10k-images-idx3-ubyte.gz
    ├── t10k-labels-idx1-ubyte.gz
    ├── train-images-idx3-ubyte.gz
    └── train-labels-idx1-ubyte.gz

MNIST in CSV dataset directory should have the following structure:

└─ Dataset/
    ├── labels.txt # list of non-digit labels (optional)
    ├── mnist_test.csv
    └── mnist_train.csv

If the dataset needs non-digit labels, add a labels.txt file to the dataset folder. For example, labels.txt for Fashion MNIST has the following contents:

T-shirt/top
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot

Export to other formats

Datumaro can convert an MNIST dataset into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports the classification task (e.g. CIFAR-10/100, ImageNet, PascalVOC, etc.). There are a few ways to convert an MNIST dataset to another dataset format:

datum project import -f mnist -i <path/to/mnist>
datum export -f imagenet -o <path/to/output/dir>
# or
datum convert -if mnist -i <path/to/mnist> -f imagenet -o <path/to/output/dir>

These commands also work for MNIST in CSV if you use mnist_csv instead of mnist.

Export to MNIST

There are a few ways to convert a dataset to MNIST format:

# export dataset into MNIST format from existing project
datum export -p <path/to/project> -f mnist -o <path/to/export/dir> \
    -- --save-images
# converting to MNIST format from other format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
    -f mnist -o <path/to/export/dir> -- --save-images

Extra options for export to MNIST format:

--save-images allows exporting the dataset with images saved (by default False);
--image-ext <IMAGE_EXT> allows specifying the image extension for the exported dataset (by default .png).

These commands also work for MNIST in CSV if you use mnist_csv instead of mnist.

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the MNIST format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with an MNIST dataset:

Example 1. How to create a custom MNIST-like dataset

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Label, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id=0, image=np.ones((28, 28)),
        annotations=[Label(2)]
    ),
    DatasetItem(id=1, image=np.ones((28, 28)),
        annotations=[Label(7)]
    )
], categories=[str(label) for label in range(10)])

dataset.export('./dataset', format='mnist')

Example 2. How to filter and convert MNIST dataset to ImageNet

Convert an MNIST dataset to ImageNet format, keeping only images of class 3:

# Download MNIST dataset:
# https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
# https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
datum convert --input-format mnist --input-path <path/to/mnist> \
              --output-format imagenet \
              --filter '/item[annotation/label="3"]'

Examples of using this format from the code can be found in the binary format tests and csv format tests.

10 - Open Images

Format specification

A description of the Open Images Dataset (OID) format is available on its website. Datumaro supports versions 4, 5 and 6.

Supported annotation types:

Label (human-verified image-level labels)
Bbox (bounding boxes)
Mask (segmentation masks)

Supported annotation attributes:

Labels
- score (read/write, float). The confidence level from 0 to 1. A score of 0 indicates that the image does not contain objects of the corresponding class.
Bounding boxes
- score (read/write, float). The confidence level from 0 to 1. In the original dataset this is always equal to 1, but custom datasets may be created with arbitrary values.
- occluded (read/write, boolean). Whether the object is occluded by another object.
- truncated (read/write, boolean). Whether the object extends beyond the boundary of the image.
- is_group_of (read/write, boolean). Whether the object represents a group of objects of the same class.
- is_depiction (read/write, boolean). Whether the object is a depiction (such as a drawing) rather than a real object.
- is_inside (read/write, boolean). Whether the object is seen from the inside.
Masks
- box_id (read/write, string). An identifier for the bounding box associated with the mask.
- predicted_iou (read/write, float). Predicted IoU value with respect to the ground truth.

Load Open Images dataset

The Open Images dataset is available for free download.

See the open-images-dataset GitHub repository for information on how to download the images.

Datumaro also requires the image description files, which can be downloaded from the following URLs:

Datumaro expects at least one of the files above to be present.

In addition, the following metadata file must be present as well:

class descriptions

You can optionally download the following additional metadata file:

class hierarchy

Annotations can be downloaded from the following URLs:

train image labels
validation image labels
test image labels
train bounding boxes
validation bounding boxes
test bounding boxes
train segmentation masks (metadata)
train segmentation masks (images): 0 1 2 3 4 5 6 7 8 9 a b c d e f
validation segmentation masks (metadata)
validation segmentation masks (images): 0 1 2 3 4 5 6 7 8 9 a b c d e f
test segmentation masks (metadata)
test segmentation masks (images): 0 1 2 3 4 5 6 7 8 9 a b c d e f

All annotation files are optional, except that if the mask metadata files for a given subset are downloaded, all corresponding images must be downloaded as well, and vice versa.

There are two ways to create a Datumaro project and add OID to it:

datum import --format open_images --input-path <path/to/dataset>
# or
datum create
datum add path -f open_images <path/to/dataset>

It is possible to specify project name and project directory; run datum create --help for more information.

Open Images dataset directory should have the following structure:

└─ Dataset/
    ├── annotations/
    │   ├── bbox_labels_600_hierarchy.json
    │   ├── image_ids_and_rotation.csv
    │   ├── oidv6-class-descriptions.csv
    │   ├── *-annotations-bbox.csv
    │   ├── *-annotations-human-imagelabels.csv
    │   └── *-annotations-object-segmentation.csv
    ├── images/
    │   ├── test/
    │   │   ├── <image_name1.jpg>
    │   │   ├── <image_name2.jpg>
    │   │   └── ...
    │   ├── train/
    │   │   ├── <image_name1.jpg>
    │   │   ├── <image_name2.jpg>
    │   │   └── ...
    │   └── validation/
    │       ├── <image_name1.jpg>
    │       ├── <image_name2.jpg>
    │       └── ...
    └── masks/
        ├── test/
        │   ├── <mask_name1.png>
        │   ├── <mask_name2.png>
        │   └── ...
        ├── train/
        │   ├── <mask_name1.png>
        │   ├── <mask_name2.png>
        │   └── ...
        └── validation/
            ├── <mask_name1.png>
            ├── <mask_name2.png>
            └── ...

The mask images must be extracted from the ZIP archives linked above.

To use per-subset image description files instead of image_ids_and_rotation.csv, place them in the annotations subdirectory.

Creating an image metadata file

To load bounding box and segmentation mask annotations, Datumaro needs to know the sizes of the corresponding images. By default, it will determine these sizes by loading each image from disk, which requires the images to be present and makes the loading process slow.

If you want to load the aforementioned annotations on a machine where the images are not available, or just to speed up the dataset loading process, you can extract the image size information in advance and record it in an image metadata file. This file must be placed at annotations/images.meta, and must contain one line per image, with the following structure:

<ID> <height> <width>

Where <ID> is the file name of the image without the extension, and <height> and <width> are the dimensions of that image. <ID> may be quoted with either single or double quotes.

The image metadata file, if present, will be used to determine the image sizes without loading the images themselves.
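
As an illustration of the line format described above, the sketch below builds and parses images.meta lines with the standard library. Note that the height comes before the width. The helper names format_meta_line and parse_meta_line are hypothetical and this is not the parser Datumaro itself uses; shlex is used here only because it handles both single- and double-quoted IDs.

```python
import shlex


def format_meta_line(image_id, height, width):
    """Build one '<ID> <height> <width>' line with a double-quoted ID."""
    if '"' in image_id:
        raise ValueError("this sketch does not escape double quotes in IDs")
    return '"%s" %d %d' % (image_id, height, width)


def parse_meta_line(line):
    """Split a line into (image_id, height, width).

    shlex.split strips single or double quotes around the ID.
    """
    image_id, height, width = shlex.split(line)
    return image_id, int(height), int(width)


line = format_meta_line('0000000000000001', 480, 640)
print(line)
print(parse_meta_line(line))
print(parse_meta_line("'img with space' 10 20"))
```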

Here’s one way to create the images.meta file using ImageMagick, assuming that the images are present on the current machine:

# run this from the dataset directory
find images -name '*.jpg' -exec \
    identify -format '"%[basename]" %[height] %[width]\n' {} + \
    > annotations/images.meta

Export to other formats

Datumaro can convert OID into any other format Datumaro supports. To get the expected result, convert the dataset to a format that supports image-level labels. There are a few ways to convert OID to another dataset format:

datum project import -f open_images -i <path/to/open_images>
datum export -f cvat -o <path/to/output/dir>
# or
datum convert -if open_images -i <path/to/open_images> -f cvat -o <path/to/output/dir>

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to Open Images

There are a few ways to convert an existing dataset to the Open Images format:

# export dataset into Open Images format from existing project
datum export -p <path/to/project> -f open_images -o <path/to/export/dir> \
  -- --save-images

# convert a dataset in another format to the Open Images format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
    -f open_images -o <path/to/export/dir> \
    -- --save-images

Extra options for export to the Open Images format:

--save-images - save image files when exporting the dataset (by default, False)
--image-ext IMAGE_EXT - save image files with the specified extension when exporting the dataset (by default, uses the original extension or .jpg if there isn’t one)

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the Open Images format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with the Open Images dataset:

Example 1. Load the Open Images dataset and convert to the CVAT format

datum create -o project
datum add path -p project -f open_images ./open-images-dataset/
datum stats -p project
datum export -p project -o dataset -f cvat --overwrite -- --save-images

Example 2. Create a custom OID-like dataset

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import (
    AnnotationType, Label, LabelCategories, DatasetItem,
)

dataset = Dataset.from_iterable(
    [
        DatasetItem(
            id='0000000000000001',
            image=np.ones((1, 5, 3)),
            subset='validation',
            annotations=[
                Label(0, attributes={'score': 1}),
                Label(1, attributes={'score': 0}),
            ],
        ),
    ],
    categories=['/m/0', '/m/1'],
)
dataset.export('./dataset', format='open_images')

Examples of using this format from the code can be found in the format tests.

11 - Pascal VOC

Format specification

Pascal VOC format specification is available here.

The dataset has annotations for multiple tasks. Each task has its own format in Datumaro, and there is also a combined voc format, which includes all the available tasks. The sub-formats have the same options as the “main” format and only limit the set of annotation files they work with. To work with multiple formats, use the corresponding option of the voc format.

Supported tasks / formats:

The combined format - voc
Image classification - voc_classification
Object detection - voc_detection
Action classification - voc_action
Class and instance segmentation - voc_segmentation
Person layout detection - voc_layout

Supported annotation types:

Label (classification)
Bbox (detection, action detection and person layout)
Mask (segmentation)

Supported annotation attributes:

occluded (boolean) - indicates that a significant portion of the object within the bounding box is occluded by another object
truncated (boolean) - indicates that the bounding box specified for the object does not correspond to the full extent of the object
difficult (boolean) - indicates that the object is considered difficult to recognize
action attributes (boolean) - jumping, reading and others. Indicate that the object does the corresponding action.
arbitrary attributes (string/number) - A Datumaro extension. Stored in the attributes section of the annotation xml file. Available for bbox annotations only.

Load Pascal VOC dataset

The Pascal VOC dataset is available for free download here.

There are two ways to create a Datumaro project and add a Pascal VOC dataset to it:

datum import --format voc --input-path <path/to/dataset>
# or
datum create
datum add path -f voc <path/to/dataset>

It is possible to specify the project name and project directory; run datum create --help for more information.

Pascal VOC dataset directory should have the following structure:

└─ Dataset/
   ├── label_map.txt # a list of non-Pascal labels (optional)
   │
   ├── Annotations/
   │     ├── ann1.xml # Pascal VOC format annotation file
   │     ├── ann2.xml
   │     └── ...
   ├── JPEGImages/
   │    ├── img1.jpg
   │    ├── img2.jpg
   │    └── ...
   ├── SegmentationClass/ # directory with semantic segmentation masks
   │    ├── img1.png
   │    ├── img2.png
   │    └── ...
   ├── SegmentationObject/ # directory with instance segmentation masks
   │    ├── img1.png
   │    ├── img2.png
   │    └── ...
   │
   └── ImageSets/
        ├── Main/ # directory with list of images for detection and classification task
        │   ├── test.txt  # list of image names in test subset  (without extension)
        │   ├── train.txt # list of image names in train subset (without extension)
        │   └── ...
        ├── Layout/ # directory with list of images for person layout task
        │   ├── test.txt
        │   ├── train.txt
        │   └── ...
        ├── Action/ # directory with list of images for action classification task
        │   ├── test.txt
        │   ├── train.txt
        │   └── ...
        └── Segmentation/ # directory with list of images for segmentation task
            ├── test.txt
            ├── train.txt
            └── ...

The ImageSets directory should contain at least one of the directories: Main, Layout, Action, Segmentation. These directories contain .txt files, each with a list of the images in a subset; the subset name is the same as the .txt file name. Subset names can be arbitrary.
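
The relationship between the list files and subset names can be sketched in a few lines of Python. The function read_voc_subsets below is a hypothetical helper, not Datumaro's loader: it assumes that the subset name is the file name without .txt and that the image name is the first whitespace-separated token on each line.

```python
import tempfile
from pathlib import Path


def read_voc_subsets(dataset_dir, task_dir='Main'):
    """Map subset name -> list of image names for one ImageSets task dir."""
    subsets = {}
    lists_dir = Path(dataset_dir) / 'ImageSets' / task_dir
    for list_file in sorted(lists_dir.glob('*.txt')):
        names = []
        for line in list_file.read_text().splitlines():
            parts = line.split()
            if parts:
                # assumption: the image name is the first token; some
                # lists carry extra columns after it
                names.append(parts[0])
        # The subset name is the list file name without the extension.
        subsets[list_file.stem] = names
    return subsets


# Try it on a throwaway directory that mimics the layout above.
with tempfile.TemporaryDirectory() as root:
    main_dir = Path(root) / 'ImageSets' / 'Main'
    main_dir.mkdir(parents=True)
    (main_dir / 'train.txt').write_text('img1\nimg2\n')
    (main_dir / 'my_subset.txt').write_text('img3\n')
    subsets = read_voc_subsets(root)

print(subsets)
```

This sketch treats every .txt file as a subset list, so any per-class list files in Main would also show up as subsets; it does not handle image names that contain spaces.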

In label_map.txt you can define a custom color map and non-Pascal labels, for example:

# label_map [label : color_rgb : parts : actions]
helicopter:::
elephant:0,124,134:head,ear,foot:
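
To make the field layout concrete, here is a minimal sketch of a parser for one label_map.txt line. It assumes four ':'-separated fields (label, color_rgb, parts, actions), a color written as three comma-separated integers, and '#' starting a comment. The function name parse_label_map_line is hypothetical, and Datumaro's own parser may differ in details.

```python
def parse_label_map_line(line):
    """Parse 'label:color_rgb:parts:actions' into a tuple.

    Returns (label, color, parts, actions); color is None when omitted.
    Returns None for blank or comment-only lines.
    """
    line = line.split('#', 1)[0].strip()  # assumption: '#' starts a comment
    if not line:
        return None
    fields = line.split(':')
    if len(fields) != 4:
        raise ValueError("expected 4 ':'-separated fields: %r" % line)
    label, color, parts, actions = fields
    rgb = tuple(int(c) for c in color.split(',')) if color else None
    if rgb is not None and len(rgb) != 3:
        raise ValueError("color must have 3 components: %r" % color)
    return (
        label,
        rgb,
        [p for p in parts.split(',') if p],
        [a for a in actions.split(',') if a],
    )


print(parse_label_map_line('helicopter:::'))
print(parse_label_map_line('elephant:0,124,134:head,ear,foot:'))
```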

It is also possible to import grayscale (1-channel) PNG masks. For grayscale masks, provide a list of labels with one line per color index, from 0 up to the maximum color index used in the images. The lines must be in the right order, so that the line index is equal to the color index. Lines can have arbitrary, but different, colors. If there are gaps in the used color indices in the annotations, they must be filled with arbitrary dummy labels. Example:

car:0,128,0:: # color index 0
aeroplane:10,10,128:: # color index 1
_dummy2:2,2,2:: # filler for color index 2
_dummy3:3,3,3:: # filler for color index 3
boat:108,0,100:: # color index 4
...
_dummy198:198,198,198:: # filler for color index 198
_dummy199:199,199,199:: # filler for color index 199
the_last_label:12,28,0:: # color index 200
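
Writing the filler lines by hand is error-prone when the indices are sparse. The sketch below generates such a list from the labels that are actually used. The function name build_grayscale_label_lines, the _dummyN naming and the (N, N, N) filler colors mirror the example above but are otherwise assumptions of this sketch.

```python
def build_grayscale_label_lines(labels_by_index):
    """Build label map lines so that line N describes color index N.

    labels_by_index: {color_index: (label, (r, g, b))} for the real labels.
    Unused indices get '_dummyN' filler labels with color (N, N, N).
    """
    used_colors = {color for _, color in labels_by_index.values()}
    if len(used_colors) != len(labels_by_index):
        raise ValueError("label colors must differ")

    lines = []
    for index in range(max(labels_by_index) + 1):
        if index in labels_by_index:
            label, color = labels_by_index[index]
        else:
            label, color = '_dummy%d' % index, (index, index, index)
            if color in used_colors:
                raise ValueError("filler color clashes at index %d" % index)
        # Empty parts and actions fields, hence the trailing '::'.
        lines.append('%s:%d,%d,%d::' % ((label,) + tuple(color)))
    return lines


lines = build_grayscale_label_lines({
    0: ('car', (0, 128, 0)),
    1: ('aeroplane', (10, 10, 128)),
    4: ('boat', (108, 0, 100)),
})
print('\n'.join(lines))
```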

You can import the dataset for a specific Pascal VOC task instead of the whole dataset, for example:

datum add path -f voc_detection <path/to/dataset/ImageSets/Main/train.txt>

To make sure that the selected dataset has been added to the project, you can run datum info, which will display the project and dataset information.

Export to other formats

Datumaro can convert Pascal VOC dataset into any other format Datumaro supports.

Such a conversion will only be successful if the output format can represent the type of dataset you want to convert, e.g. image classification annotations can be saved in ImageNet format, but not as COCO keypoints.

There are a few ways to convert a Pascal VOC dataset to another dataset format:

datum import -f voc -i <path/to/voc>
datum export -f coco -o <path/to/output/dir>
# or
datum convert -if voc -i <path/to/voc> -f coco -o <path/to/output/dir>

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to Pascal VOC

There are a few ways to convert an existing dataset to Pascal VOC format:

# export dataset into Pascal VOC format (classification) from existing project
datum export -p <path/to/project> -f voc -o <path/to/export/dir> -- --tasks classification

# converting to Pascal VOC format from other format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
    -f voc -o <path/to/export/dir> \
    -- --label-map voc --save-images

Extra options for export to Pascal VOC format:

--save-images - export the dataset with images saved (by default False)
--image-ext IMAGE_EXT - specify the image extension for the exported dataset (by default, use the original or .jpg if there is none)
--apply-colormap APPLY_COLORMAP - use a colormap for class and instance masks (by default True)
--allow-attributes ALLOW_ATTRIBUTES - allow export of attributes (by default True)
--keep-empty KEEP_EMPTY - write subset lists even if they are empty (by default False)
--tasks TASKS - specify the tasks to export; by default, Datumaro uses all tasks. Example:

datum import -o project -f voc -i ./VOC2012
datum export -p project -f voc -- --tasks detection,classification

--label-map - define a custom colormap. Example:

# mycolormap.txt [label : color_rgb : parts : actions]:
# cat:0,0,255::
# person:255,0,0:head:
datum export -f voc_segmentation -- --label-map mycolormap.txt

# or you can use the original VOC colormap:
datum export -f voc_segmentation -- --label-map voc

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the Pascal VOC format in particular. See the user manual for more information about these operations.

Here are a few examples of using Datumaro operations to solve particular problems with a Pascal VOC dataset:

Example 1. How to prepare an original dataset for training.

In this example, preparing the original dataset to train a semantic segmentation model includes loading, checking for duplicate images, setting the number of images, splitting into subsets, and exporting the result to Pascal VOC format.

datum create -o project
datum add path -p project -f voc_segmentation ./VOC2012/ImageSets/Segmentation/trainval.txt
datum stats -p project # check statistics.json -> repeated images
datum transform -p project -o ndr_project -t ndr -- -w trainval -k 2500
datum filter -p ndr_project -o trainval2500 -e '/item[subset="trainval"]'
datum transform -p trainval2500 -o final_project -t random_split -- -s train:.8 -s val:.2
datum export -p final_project -o dataset -f voc -- --label-map voc --save-images

Example 2. How to create a custom dataset

from datumaro.components.dataset import Dataset
from datumaro.util.image import Image
from datumaro.components.extractor import Bbox, Polygon, Label, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id='image1', image=Image(path='image1.jpg', size=(10, 20)),
       annotations=[Label(3),
           Bbox(1.0, 1.0, 10.0, 8.0, label=0, attributes={'difficult': True, 'running': True}),
           Polygon([1, 2, 3, 2, 4, 4], label=2, attributes={'occluded': True}),
           Polygon([6, 7, 8, 8, 9, 7, 9, 6], label=2),
        ]
    ),
], categories=['person', 'sky', 'water', 'lion'])

dataset.transform('polygons_to_masks')
dataset.export('./mydataset', format='voc', label_map='my_labelmap.txt')

"""
my_labelmap.txt:
# label:color_rgb:parts:actions
person:0,0,255:hand,foot:jumping,running
sky:128,0,0::
water:0,128,0::
lion:255,128,0::
"""

Example 3. Load, filter and convert from code

Load a Pascal VOC dataset and export the train subset, keeping only items that have the jumping attribute:

from datumaro.components.dataset import Dataset

dataset = Dataset.import_from('./VOC2012', format='voc')

train_dataset = dataset.get_subset('train').as_dataset()

def only_jumping(item):
    for ann in item.annotations:
        if ann.attributes.get('jumping'):
            return True
    return False

train_dataset.select(only_jumping)

train_dataset.export('./jumping_label_me', format='label_me', save_images=True)

Example 4. Get information about items in Pascal VOC 2012 dataset for segmentation task:

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import AnnotationType

dataset = Dataset.import_from('./VOC2012', format='voc')

def has_mask(item):
    for ann in item.annotations:
        if ann.type == AnnotationType.mask:
            return True
    return False

dataset.select(has_mask)

print("Pascal VOC 2012 has %s images for segmentation task:" % len(dataset))
for subset_name, subset in dataset.subsets().items():
    for item in subset:
        print(item.id, subset_name, end=";")

After executing this code, we can see that Pascal VOC 2012 has 5826 images for the segmentation task, which matches the official documentation.

Examples of using this format from the code can be found in tests.

12 - Supervisely Point Cloud

Format specification

Point Cloud data format:

specification.
example.

Supported annotation types:

cuboid_3d

Supported annotation attributes:

track_id (read/write, integer), responsible for object field
createdAt (write, string),
updatedAt (write, string),
labelerLogin (write, string), responsible for the corresponding fields in the annotation file.
arbitrary attributes

Supported image attributes:

description (read/write, string),
createdAt (write, string),
updatedAt (write, string),
labelerLogin (write, string), responsible for the corresponding fields in the annotation file.
frame (read/write, integer). Indicates frame number of the image.
arbitrary attributes

Import Supervisely Point Cloud dataset

An example dataset in Supervisely Point Cloud format is available for download:

https://drive.google.com/u/0/uc?id=1BtZyffWtWNR-mk_PHNPMnGgSlAkkQpBl&export=download

Point Cloud dataset directory should have the following structure:

└─ Dataset/
    ├── ds0/
    │   ├── ann/
    │   │   ├── <pcdname1.pcd.json>
    │   │   ├── <pcdname2.pcd.json>
    │   │   └── ...
    │   ├── pointcloud/
    │   │   ├── <pcdname1.pcd>
    │   │   ├── <pcdname2.pcd>
    │   │   └── ...
    │   ├── related_images/
    │   │   ├── <pcdname1_pcd>/
    │   │   │  ├── <image_name.ext.json>
    │   │   │  ├── <image_name.ext.json>
    │   │   └── ...
    ├── key_id_map.json
    └── meta.json

There are two ways to import Supervisely Point Cloud dataset:

datum import --format sly_pointcloud --input-path <path/to/dataset>
# or
datum create
datum add path -f sly_pointcloud <path/to/dataset>

To make sure that the selected dataset has been added to the project, you can run datum info, which will display the project and dataset information.

Export to other formats

Datumaro can convert Supervisely Point Cloud dataset into any other format Datumaro supports.

Such conversion will only be successful if the output format can represent the type of dataset you want to convert, e.g. 3D point clouds can be saved in KITTI Raw format, but not in COCO keypoints.

There are a few ways to convert a Supervisely Point Cloud dataset to other dataset formats:

datum import -f sly_pointcloud -i <path/to/sly_pcd/> -o proj/
datum export -f kitti_raw -o <path/to/output/dir> -p proj/
# or
datum convert -if sly_pointcloud -i <path/to/sly_pcd/> -f kitti_raw

Some formats provide extra options for conversion. These options are passed after double dash (--) in the command line. To get information about them, run

datum export -f <FORMAT> -- -h

Export to Supervisely Point Cloud

There are a few ways to convert a dataset to Supervisely Point Cloud format:

# export dataset into Supervisely Point Cloud format from existing project
datum export -p <path/to/project> -f sly_pointcloud -o <path/to/export/dir> \
    -- --save-images
# converting to Supervisely Point Cloud format from other format
datum convert -if kitti_raw -i <path/to/kitti_raw/dataset> \
    -f sly_pointcloud -o <path/to/export/dir> -- --save-images

Extra options for exporting in Supervisely Point Cloud format:

--save-images allows exporting the dataset with images saved. This will include point clouds and related images (by default False)
--image-ext IMAGE_EXT allows specifying the image extension for the exported dataset (by default, keep the original or use .png if there is none)
--reindex assigns new indices to frames and annotations.
--allow-undeclared-attrs allows writing arbitrary annotation attributes. By default, only attributes specified in the input dataset metainfo will be written.

Examples

Example 1. Import dataset, compute statistics

datum create -o project
datum add path -p project -f sly_pointcloud ../sly_dataset/
datum stats -p project

Example 2. Convert Supervisely Point Clouds to KITTI Raw

datum convert -if sly_pointcloud -i ../sly_pcd/ \
    -f kitti_raw -o my_kitti/ -- --save-images --reindex --allow-attrs

Example 3. Create a custom dataset

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Cuboid3d, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id='frame_1',
        annotations=[
            Cuboid3d(id=206, label=0,
                position=[320.86, 979.18, 1.04],
                attributes={'occluded': False, 'track_id': 1, 'x': 1}),

            Cuboid3d(id=207, label=1,
                position=[318.19, 974.65, 1.29],
                attributes={'occluded': True, 'track_id': 2}),
        ],
        pcd='path/to/pcd1.pcd',
        attributes={'frame': 0, 'description': 'zzz'}
    ),

    DatasetItem(id='frm2',
        annotations=[
            Cuboid3d(id=208, label=1,
                position=[23.04, 8.75, -0.78],
                attributes={'occluded': False, 'track_id': 2})
        ],
        pcd='path/to/pcd2.pcd', related_images=['image2.png'],
        attributes={'frame': 1}
    ),
], categories=['cat', 'dog'])

dataset.export('my_dataset/', format='sly_pointcloud', save_images=True,
    allow_undeclared_attrs=True)

Examples of using this format from code can be found in the format tests.

13 - YOLO

Format specification

The YOLO dataset format is used for training and validating object detection models. The specification for this format is available here. Official examples of working with YOLO datasets can be found here.
The YOLO dataset format supports the following types of annotations:
- Bounding boxes
The YOLO format doesn’t support attributes for annotations.
The format only supports subsets named train or valid.

Load YOLO dataset

There are a few ways to create a Datumaro project and add a YOLO dataset to it:

datum import -o project -f yolo -i <path/to/yolo/dataset>

# another way to do the same:
datum create -o project
datum add path -p project -f yolo <path/to/yolo/dataset>

# you can also add another YOLO dataset:
datum add path -p project -f yolo <path/to/other/yolo/dataset>

YOLO dataset directory should have the following structure:

└─ yolo_dataset/
   │
   ├── obj.names  # file with list of classes
   ├── obj.data   # file with dataset information
   ├── train.txt  # list of image paths in train subset
   ├── valid.txt  # list of image paths in valid subset
   │
   ├── obj_train_data/  # directory with annotations and images for train subset
   │    ├── image1.txt  # list of labeled bounding boxes for image1
   │    ├── image1.jpg
   │    ├── image2.txt
   │    ├── image2.jpg
   │    ├── ...
   │
   ├── obj_valid_data/  # directory with annotations and images for valid subset
   │    ├── image101.txt
   │    ├── image101.jpg
   │    ├── image102.txt
   │    ├── image102.jpg
   │    ├── ...

A YOLO dataset cannot contain subsets named anything other than train or valid. If an imported dataset contains such subsets, they are ignored. When a project is exported to the yolo format, all subsets other than train and valid are skipped. If the project has no subset separation, the data is saved in the train subset.
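
The subset rules above can be summarized in a small sketch. The helper name `yolo_subset_for` is hypothetical and is not part of Datumaro; it only mirrors the behaviour described in this section.

```python
from typing import Optional

# Subset names that the YOLO format can store, per the text above.
YOLO_SUBSETS = ('train', 'valid')


def yolo_subset_for(subset: Optional[str]) -> Optional[str]:
    """Return the YOLO subset an item would be saved to, or None if skipped.

    Hypothetical helper for illustration only; not a Datumaro function.
    """
    if not subset:
        # No subset separation: the data is saved in the train subset.
        return 'train'
    if subset in YOLO_SUBSETS:
        return subset
    # Any other subset name (e.g. 'val' or 'test') is skipped on export.
    return None


for name in (None, 'train', 'valid', 'val', 'test'):
    print(repr(name), '->', repr(yolo_subset_for(name)))
```

In practice this means a subset such as val has to be renamed to valid before export if its items should be kept.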

obj.data should have the following content. It does not need to list both subsets, but at least one of them must be present:

classes = 5 # optional
names = <path/to/obj.names>
train = <path/to/train.txt>
valid = <path/to/valid.txt>
backup = backup/ # optional
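
As a rough illustration of this layout, the sketch below reads such a file into a dictionary. It is not Datumaro's parser: the function name `parse_obj_data` is hypothetical, the sample paths are made up, and treating `#` as a comment marker is an assumption based on the sample above.

```python
def parse_obj_data(text: str) -> dict:
    """Parse 'key = value' lines, ignoring '#' comments and blank lines."""
    config = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()  # drop a trailing comment
        if not line:
            continue
        key, sep, value = line.partition('=')
        if not sep:
            raise ValueError(f"Expected 'key = value', got: {line!r}")
        config[key.strip()] = value.strip()
    # At least one of the two subsets has to be listed.
    if 'train' not in config and 'valid' not in config:
        raise ValueError("obj.data must list 'train' or 'valid'")
    return config


# Hypothetical file content with only the train subset present.
example = """\
classes = 5 # optional
names = data/obj.names
train = data/train.txt
backup = backup/ # optional
"""
config = parse_obj_data(example)
print(config)
```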

obj.names contains the list of classes, one per line. The zero-based line number of a class is its index:

label1  # label1 has index 0
label2  # label2 has index 1
label3  # label3 has index 2
...
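
A minimal sketch of the index rule, assuming a file with one class name per line and no comments. The function name `parse_obj_names` and the class names are illustrative only.

```python
def parse_obj_names(text: str) -> list:
    """Return class names; the position in the list is the class index."""
    return [line.strip() for line in text.strip().splitlines()]


labels = parse_obj_names("house\nbridge\ncrosswalk\ntraffic_light\n")

# Reverse lookup: class name -> zero-based index used in annotation files.
index_of = {name: index for index, name in enumerate(labels)}
print(index_of['crosswalk'])  # prints 2
```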

Files train.txt and valid.txt should have the following structure:

<path/to/image1.jpg>
<path/to/image2.jpg>
...
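
If the list files are missing, they can be rebuilt from the image directories. The sketch below is one possible way to do it with the standard library; the helper `write_subset_list`, the set of image extensions, and writing paths relative to the dataset root are all assumptions, and a given tool may expect a different path prefix.

```python
from pathlib import Path
import tempfile

IMAGE_EXTS = {'.jpg', '.jpeg', '.png'}  # assumed set of image extensions


def write_subset_list(dataset_dir: Path, subset: str) -> Path:
    """Write <subset>.txt listing every image in obj_<subset>_data/."""
    image_dir = dataset_dir / f'obj_{subset}_data'
    images = sorted(p for p in image_dir.iterdir()
                    if p.suffix.lower() in IMAGE_EXTS)
    # Paths are written relative to the dataset root (an assumption).
    lines = [p.relative_to(dataset_dir).as_posix() for p in images]
    list_path = dataset_dir / f'{subset}.txt'
    list_path.write_text('\n'.join(lines) + '\n')
    return list_path


# Demo on a throwaway directory with two images and their annotation files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'obj_train_data').mkdir()
    for name in ('image1.jpg', 'image1.txt', 'image2.jpg', 'image2.txt'):
        (root / 'obj_train_data' / name).touch()
    listing = write_subset_list(root, 'train').read_text()

print(listing)
```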

The .txt files in the obj_train_data/ and obj_valid_data/ directories should contain the labeled bounding boxes for the corresponding images, one box per line:

# image1.txt:
# <label_index> <x_center> <y_center> <width> <height>
0 0.250000 0.400000 0.300000 0.400000
3 0.600000 0.400000 0.400000 0.266667

Here x_center, y_center, width, and height are normalized: x_center and width are relative to the image width, y_center and height to the image height. x_center and y_center give the center of the rectangle, not its top-left corner.
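
To make the coordinate convention concrete, the following sketch converts one annotation line back to pixel coordinates. The function name `yolo_to_pixel_bbox` is hypothetical, and the 10x15 pixel image size is chosen only for the example.

```python
def yolo_to_pixel_bbox(line: str, image_width: int, image_height: int):
    """Convert a YOLO annotation line to (label_index, x, y, w, h) in pixels.

    (x, y) is the top-left corner of the box.
    """
    label, xc, yc, w, h = line.split()
    w_px = float(w) * image_width
    h_px = float(h) * image_height
    # The stored point is the box center, so shift by half the box size.
    x_px = float(xc) * image_width - w_px / 2
    y_px = float(yc) * image_height - h_px / 2
    return int(label), x_px, y_px, w_px, h_px


label, x, y, w, h = yolo_to_pixel_bbox(
    '3 0.600000 0.400000 0.400000 0.266667', image_width=10, image_height=15)
print(label, round(x, 2), round(y, 2), round(w, 2), round(h, 2))
```

For this image size the line describes a box of roughly 4x4 pixels whose top-left corner is near (4, 4). Small deviations come from the six-digit rounding in the file.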

Export to other formats

Datumaro can convert a YOLO dataset into any other format Datumaro supports. For the conversion to succeed, the output format must support the object detection task (e.g. Pascal VOC, COCO, TF Detection API, etc.).

Examples:

datum import -o project -f yolo -i <path/to/yolo/dataset>
datum export -p project -f voc -o <path/to/output/voc/dataset>

datum convert -if yolo -i <path/to/yolo/dataset> \
              -f coco_instances -o <path/to/output/coco/dataset>

Export to YOLO format

Datumaro can convert an existing dataset to YOLO format if the dataset supports the object detection task.

Example:

datum import -o project -f coco_instances -i <path/to/coco/dataset>
datum export -p project -f yolo -o <path/to/output/yolo/dataset> -- --save-images

Extra options for export to YOLO format:

--save-images saves the images along with the annotations (default: False);
--image-ext <IMAGE_EXT> sets the image extension to use when exporting (default: keep the original extension, or use .jpg if there is none).

Examples

Example 1. Prepare a PASCAL VOC dataset for export to YOLO format

datum import -o project -f voc -i ./VOC2012
datum filter -p project -e '/item[subset="train" or subset="val"]' -o trainval_voc
datum transform -p trainval_voc -o trainvalid_voc \
    -t map_subsets -- -s train:train -s val:valid
datum export -p trainvalid_voc -f yolo -o ./yolo_dataset -- --save-images

Example 2. Remove some class from YOLO dataset

Remove all cat annotations, drop the items that are left without annotations, and remove cat from the list of classes:

datum import -o project -f yolo -i ./yolo_dataset
datum filter -p project -o filtered -m i+a -e '/item/annotation[label!="cat"]'
datum transform -p filtered -o without_cat -t remap_labels -- -l cat:
datum export -p without_cat -f yolo -o ./yolo_without_cats

Example 3. Create custom dataset in YOLO format

import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Bbox, DatasetItem

dataset = Dataset.from_iterable([
    DatasetItem(id='image_001', subset='train',
        image=np.ones((20, 20, 3)),
        annotations=[
            Bbox(3.0, 1.0, 8.0, 5.0, label=1),
            Bbox(1.0, 1.0, 10.0, 1.0, label=2)
        ]
    ),
    DatasetItem(id='image_002', subset='train',
        image=np.ones((15, 10, 3)),
        annotations=[
            Bbox(4.0, 4.0, 4.0, 4.0, label=3)
        ]
    )
], categories=['house', 'bridge', 'crosswalk', 'traffic_light'])

dataset.export('../yolo_dataset', format='yolo', save_images=True)
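
To see what the exported annotation lines should look like, the pixel boxes above can be normalized by hand. This is a sketch of the arithmetic only, not Datumaro's exporter code; the helper `pixel_bbox_to_yolo` is hypothetical and the six-digit formatting is an assumption taken from the sample annotation file in the format description.

```python
def pixel_bbox_to_yolo(label, x, y, w, h, image_width, image_height):
    """Format a pixel box (top-left x, y, width, height) as a YOLO line."""
    x_center = (x + w / 2) / image_width
    y_center = (y + h / 2) / image_height
    return '%d %.6f %.6f %.6f %.6f' % (
        label, x_center, y_center, w / image_width, h / image_height)


# Bbox(3.0, 1.0, 8.0, 5.0, label=1) on the 20x20 image
print(pixel_bbox_to_yolo(1, 3.0, 1.0, 8.0, 5.0, 20, 20))
# -> 1 0.350000 0.175000 0.400000 0.250000

# Bbox(4.0, 4.0, 4.0, 4.0, label=3) on the image that is 10 wide and 15 high
print(pixel_bbox_to_yolo(3, 4.0, 4.0, 4.0, 4.0, 10, 15))
# -> 3 0.600000 0.400000 0.400000 0.266667
```

Note that a NumPy image of shape (15, 10, 3) is 15 pixels high and 10 pixels wide, so the width used for normalization is 10.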

Example 4. Get information about objects on each image

If you only want the label names for each image, you can get them from code:

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import AnnotationType

dataset = Dataset.import_from('./yolo_dataset', format='yolo')
cats = dataset.categories()[AnnotationType.label]

for item in dataset:
    for ann in item.annotations:
        print(item.id, cats[ann.label].name)

If you want complete information about each item, you can run:

datum import -o project -f yolo -i ./yolo_dataset
datum filter -p project --dry-run -e '/item'