COCO

Format specification

COCO format specification is available here.

The dataset has annotations for multiple tasks. Each task has its own format in Datumaro, and there is also a combined coco format, which includes all the available tasks. The sub-formats have the same options as the “main” format and only limit the set of annotation files they work with. To work with multiple formats, use the corresponding option of the coco format.

Supported tasks / formats:

The combined format - coco
Image Captioning - coco_caption
Object Detection - coco_instances
Panoptic Segmentation - coco_panoptic
Keypoint Detection - coco_person_keypoints
Stuff Segmentation - coco_stuff
Image Info - coco_image_info
Image classification (Datumaro extension) - coco_labels. The format is like Object Detection, but uses only category_id and score annotation fields.

Supported annotation types (depending on the task):

Caption (captions)
Label (label, Datumaro extension)
Bbox (instances, person keypoints)
Polygon (instances, person keypoints)
Mask (instances, person keypoints, panoptic, stuff)
Points (person keypoints)

Supported annotation attributes:

is_crowd (boolean; on bbox, polygon and mask annotations) - Indicates that the annotation covers multiple instances of the same class.
score (number; range [0; 1]) - Indicates the confidence in this annotation. Ground truth annotations always have 1.
arbitrary attributes (string/number) - A Datumaro extension. Stored in the attributes section of the annotation descriptor.

Import COCO dataset

The COCO dataset is available for free download:

Images:

Annotations:

A Datumaro project with a COCO source can be created in the following way:

datum create
datum import --format coco <path/to/dataset>

It is possible to specify project name and project directory. Run datum create --help for more information.

Extra options for adding a source in the COCO format:

--keep-original-category-ids: Add dummy label categories so that category indexes in the imported data source correspond to the category IDs in the original annotation file.

A COCO dataset directory should have the following structure:

└─ Dataset/
    ├── dataset_meta.json # a list of custom labels (optional)
    ├── images/
    │   ├── train/
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    │   └── val/
    │       ├── <image_name1.ext>
    │       ├── <image_name2.ext>
    │       └── ...
    └── annotations/
        ├── <task>_<subset_name>.json
        └── ...

For the panoptic task, a dataset directory should have the following structure:

└─ Dataset/
    ├── dataset_meta.json # a list of custom labels (optional)
    ├── images/
    │   ├── train/
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    │   ├── val/
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    └── annotations/
        ├── panoptic_train/
        │   ├── <image_name1.ext>
        │   ├── <image_name2.ext>
        │   └── ...
        ├── panoptic_train.json
        ├── panoptic_val/
        │   ├── <image_name1.ext>
        │   ├── <image_name2.ext>
        │   └── ...
        └── panoptic_val.json

Annotation files must have the names like <task_name>_<subset_name>.json. The year is treated as a part of the subset name. If the annotation file name does’t match this pattern, use one of the task-specific formats instead of plain coco: coco_captions, coco_image_info, coco_instances, coco_labels, coco_panoptic, coco_person_keypoints, coco_stuff. In this case all items of the dataset will be added to the default subset.

To add custom classes, you can use dataset_meta.json.

You can import a dataset for one or several tasks instead of the whole dataset. This option also allows to import annotation files with non-default names. For example:

datum create
datum import --format coco_stuff -r <relpath/to/stuff.json> <path/to/dataset>

To make sure that the selected dataset has been added to the project, you can run datum project info, which will display the project information.

Notes:

COCO categories can have any integer ids, however, Datumaro will count annotation category id 0 as “not specified”. This does not contradict the original annotations, because they have category indices starting from 1.

Export to other formats

Datumaro can convert COCO dataset into any other format Datumaro supports. To get the expected result, convert the dataset to formats that support the specified task (e.g. for panoptic segmentation - VOC, CamVID)

There are several ways to convert a COCO dataset to other dataset formats using CLI:

datum create
datum import -f coco <path/to/coco>
datum export -f voc -o <output/dir>

datum convert -if coco -i <path/to/coco> -f voc -o <output/dir>

Or, using Python API:

import datumaro as dm

dataset = dm.Dataset.import_from('<path/to/dataset>', 'coco')
dataset.export('save_dir', 'voc', save_media=True)

Export to COCO

There are several ways to convert a dataset to COCO format:

# export dataset into COCO format from existing project
datum export -p <path/to/project> -f coco -o <output/dir> \
    -- --save-media

# converting to COCO format from other format
datum convert -if voc -i <path/to/dataset> \
    -f coco -o <output/dir> -- --save-media

Extra options for exporting to COCO format:

--save-media allow to export dataset with saving media files (by default False)
--image-ext IMAGE_EXT allow to specify image extension for exporting dataset (by default - keep original or use .jpg, if none)
--save-dataset-meta - allow to export dataset with saving dataset meta file (by default False)
--segmentation-mode MODE allow to specify save mode for instance segmentation:
- ‘guess’: guess the mode for each instance (using ‘is_crowd’ attribute as hint)
- ‘polygons’: save polygons (merge and convert masks, prefer polygons)
- ‘mask’: save masks (merge and convert polygons, prefer masks) (by default guess)
--crop-covered allow to crop covered segments so that background objects segmentation was more accurate (by default False)
--allow-attributes ALLOW_ATTRIBUTES allow export of attributes (by default True). The parameter enables or disables writing the custom annotation attributes to the “attributes” annotation field. This field is an extension to the original COCO format
--reindex REINDEX allow to assign new indices to images and annotations, useful to avoid merge conflicts (by default False). This option allows to control if the images and annotations must be given new indices. It can be useful, when you want to preserve the original indices in the produced dataset. Consider having this option enabled when converting from other formats or merging datasets to avoid conflicts
--merge-images allow to save all images into a single directory (by default False). The parameter controls the output directory for images. When enabled, the dataset images are saved into a single directory, otherwise they are saved in separate directories by subsets.
--tasks TASKS allow to specify tasks for export dataset, by default Datumaro uses all tasks. Example:

datum create
datum import -f coco <path/to/dataset>
datum export -f coco -- --tasks instances,stuff

Examples

Datumaro supports filtering, transformation, merging etc. for all formats and for the COCO format in particular. Follow the user manual to get more information about these operations.

There are several examples of using Datumaro operations to solve particular problems with a COCO dataset:

Example 1. How to load an original panoptic COCO dataset and convert to Pascal VOC

datum create -o project
datum import -p project -f coco_panoptic ./COCO/annotations/panoptic_val2017.json
datum stats -p project
datum export -p project -f voc -- --save-media

Example 2. How to create custom COCO-like dataset

import numpy as np
import datumaro as dm

dataset = dm.Dataset.from_iterable([
  dm.DatasetItem(id='000000000001',
    image=np.ones((1, 5, 3)),
    subset='val',
    attributes={'id': 40},
    annotations=[
      dm.Mask(image=np.array([[0, 0, 1, 1, 0]]), label=3,
        id=7, group=7, attributes={'is_crowd': False}),
      dm.Mask(image=np.array([[0, 1, 0, 0, 1]]), label=1,
        id=20, group=20, attributes={'is_crowd': True}),
    ]
  ),
], categories=['a', 'b', 'c', 'd'])

dataset.export('./dataset', format='coco_panoptic')

Examples of using this format from the code can be found in the format tests