Dataset Management Framework Documentation
Welcome to the documentation for the Dataset Management Framework (Datumaro).
Datumaro is a free framework and CLI tool for building, transforming,
and analyzing datasets.
It is developed and used by Intel to build, transform, and analyze annotations
and datasets in a large number of supported formats.
Our documentation provides information for AI researchers, developers,
and teams who are working with datasets and annotations.
flowchart LR
datasets[(VOC dataset<br/>+<br/>COCO dataset<br/>+<br/>CVAT annotation)]
datumaro{Datumaro}
dataset[dataset]
annotation[Annotation tool]
training[Model training]
publication[Publication, statistics etc]
datasets-->datumaro
datumaro-->dataset
dataset-->annotation & training & publication
Basic information and sections needed for a quick start.
This section contains documents for Datumaro users.
Documentation for Datumaro developers.
1 - Getting started
To read about the design concept and features of Datumaro, go to the design section.
Installation
Dependencies
- Python (3.6+)
- Optional: OpenVINO, TensorFlow, PyTorch, MxNet, Caffe, Accuracy Checker
Optionally, create a virtual environment:
python -m pip install virtualenv
python -m virtualenv venv
. venv/bin/activate
Install Datumaro package:
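pip install datumaro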
Usage
There are several options available:
Datumaro as a standalone tool allows you to perform various dataset operations
from the command-line interface:
datum --help
python -m datumaro --help
Python module
Datumaro can be used in custom scripts as a Python module. Used this way, it
makes its features available to an existing codebase, enabling dataset
reading, exporting, and iteration, simplifying the integration of custom
formats, and providing high-performance operations:
from datumaro.components.project import Project
# load a Datumaro project
project = Project.load('directory')
# create a dataset
dataset = project.make_dataset()
# keep only annotated images
dataset.select(lambda item: len(item.annotations) != 0)
# change dataset labels
dataset.transform('remap_labels',
{'cat': 'dog', # rename cat to dog
'truck': 'car', # rename truck to car
'person': '', # remove this label
}, default='delete') # remove everything else
# iterate over dataset elements
for item in dataset:
print(item.id, item.annotations)
# export the resulting dataset in COCO format
dataset.export('dst/dir', 'coco')
Check our developer manual for additional
information.
Examples
- Convert PASCAL VOC dataset to COCO format, keep only images with the cat class present:
# Download VOC dataset:
# http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
datum convert --input-format voc --input-path <path/to/voc> \
--output-format coco \
--filter '/item[annotation/label="cat"]' \
-- --reindex 1 # avoid annotation id conflicts
- Convert only non-occluded annotations from a CVAT project to TFrecord:
# export Datumaro dataset in CVAT UI, extract somewhere, go to the project dir
datum filter -e '/item/annotation[occluded="False"]' \
--mode items+anno --output-dir not_occluded
datum export --project not_occluded \
--format tf_detection_api -- --save-images
- Annotate MS COCO dataset, extract an image subset, re-annotate it in CVAT, update the old dataset:
# Download COCO dataset http://cocodataset.org/#download
# Put images to coco/images/ and annotations to coco/annotations/
datum import --format coco --input-path <path/to/coco>
datum export --filter '/image[images_I_dont_like]' --format cvat \
--output-dir reannotation
# import dataset and images to CVAT, re-annotate
# export Datumaro project, extract to 'reannotation-upd'
datum merge reannotation-upd
datum export --format coco
- Annotate instance polygons in CVAT, export as masks in COCO:
datum convert --input-format cvat --input-path <path/to/cvat.xml> \
--output-format coco -- --segmentation-mode masks
- Apply an OpenVINO detection model to some COCO-like dataset, then compare annotations with ground truth and visualize in TensorBoard:
datum import --format coco --input-path <path/to/coco>
# create model results interpretation script
datum model add mymodel openvino \
--weights model.bin --description model.xml \
--interpretation-script parse_results.py
datum model run --model mymodel --output-dir mymodel_inference/
datum diff mymodel_inference/ --format tensorboard --output-dir diff
- Change colors in PASCAL VOC-like .png masks:
datum import --format voc --input-path <path/to/voc/dataset>
# Create a color map file with desired colors:
#
# label : color_rgb : parts : actions
# cat:0,0,255::
# dog:255,0,0::
#
# Save as mycolormap.txt
datum export --format voc_segmentation -- --label-map mycolormap.txt
# add "--apply-colormap=0" to save grayscale (indexed) masks
# check "--help" option for more info
# use "datum --loglevel debug" for extra conversion info
- Create a custom COCO-like dataset:
import numpy as np
from datumaro.components.extractor import (DatasetItem,
Bbox, LabelCategories, AnnotationType)
from datumaro.components.dataset import Dataset
dataset = Dataset(categories={
AnnotationType.label: LabelCategories.from_iterable(['cat', 'dog'])
})
dataset.put(DatasetItem(id=0, image=np.ones((5, 5, 3)), annotations=[
Bbox(1, 2, 3, 4, label=0),
]))
dataset.export('test_dataset', 'coco')
2 - Datumaro Design
Concept
Datumaro is:
- a tool to build composite datasets and iterate over them
- a tool to create and maintain datasets
  - Version control of annotations and images
  - Publication (with removal of sensitive information)
  - Editing
  - Joining and splitting
  - Exporting, format changing
  - Image preprocessing
- a dataset storage
- a tool to debug datasets
  - A network can be used to generate informative data subsets (e.g. with false-positives) to be analyzed further
Requirements
- User interfaces
  - a library
  - a console tool with visualization means
- Targets: single datasets, composite datasets, single images / videos
- Built-in support for well-known annotation formats and datasets: CVAT, COCO, PASCAL VOC, Cityscapes, ImageNet
- Extensibility with user-provided components
- Lightweightness - it should be easy to start working with Datumaro
  - Minimal dependency on environment and configuration
  - It should be easier to use Datumaro than writing own code for computation of statistics or dataset manipulations
Functionality and ideas
- Blur sensitive areas on dataset images
- Dataset annotation filters, relabelling etc.
- Dataset augmentation
- Calculation of statistics
- “Edit” command to modify annotations
- Versioning (for images, annotations, subsets, sources etc., comparison)
- Documentation generation
- Provision of iterators for user code
- Dataset downloading
- Dataset generation
- Dataset building (export in a specific format, indexation, statistics, documentation)
- Dataset exporting to other formats
- Dataset debugging (run inference, generate dataset slices, compute statistics)
- “Explainable AI” - highlight network attention areas (paper)
  - Black-box approach
    - Classification, Detection, Segmentation, Captioning
  - White-box approach
Research topics
- exploration of network prediction uncertainty (aka Bayesian approach)
  Use case: explanation of network “quality”, “stability”, “certainty”
- adversarial attacks on networks
- dataset minification / reduction
  Use case: removal of redundant information to reach the same network quality with lesser training time
- dataset expansion and filtration of additions
  Use case: add only important data
- guidance for key frame selection for tracking (paper)
  Use case: more effective annotation, better predictions
RC 1 vision
CVAT integration
Datumaro needs to be integrated with CVAT,
extending CVAT UI capabilities regarding task and project operations.
It should be capable of downloading and processing data from CVAT.
        User
          |
          v
 +------------------+
 |       CVAT       |
 +--------v---------+       +------------------+       +--------------+
 | Datumaro module  | ----> | Datumaro project | <---> | Datumaro CLI | <--- User
 +------------------+       +------------------+       +--------------+
Interfaces
Features
- Dataset format support (reading, writing)
- Dataset visualization (show)
- Calculation of statistics for datasets
- Dataset building
- Dataset comparison (diff)
- Dataset and model debugging
- CVAT-integration features
Optional features
Properties
- Lightweightness
- Modularity
- Extensibility
3.1 - Installation
Dependencies
- Python (3.6+)
- Optional: OpenVINO, TensorFlow, PyTorch, MxNet, Caffe, Accuracy Checker
Installation steps
Optionally, set up a virtual environment:
python -m pip install virtualenv
python -m virtualenv venv
. venv/bin/activate
Install:
# From PyPI:
pip install datumaro
# From the GitHub repository:
pip install 'git+https://github.com/openvinotoolkit/datumaro'
You can change the installation branch by appending ...@<branch_name> to the URL.
Also use the --force-reinstall parameter in this case.
3.2 - Interfaces
As a standalone tool:
As a python module:
The directory containing Datumaro should be in the PYTHONPATH
environment variable, or cvat/datumaro/ should be the current directory.
python -m datumaro --help
python datumaro/ --help
python datum.py --help
As a python library:
3.3 - Supported dataset formats and annotations
List of supported formats:
- MS COCO (image_info, instances, person_keypoints, captions, labels, panoptic, stuff)
- PASCAL VOC (classification, detection, segmentation (class, instances), action_classification, person_layout)
- YOLO (bboxes)
- TF Detection API (bboxes, masks)
- WIDER Face (bboxes)
- VGGFace2 (landmarks, bboxes)
- MOT sequences
- MOTS (png)
- ImageNet (classification, detection)
- CIFAR-10/100 (classification (python version))
- MNIST (classification)
- MNIST in CSV (classification)
- CamVid (segmentation)
- Cityscapes (segmentation)
- KITTI (segmentation, detection)
- KITTI 3D (raw / tracklets / velodyne points)
- Supervisely (pointcloud)
- CVAT
- LabelMe
- ICDAR13/15 (word_recognition, text_localization, text_segmentation)
- Market-1501 (person re-identification)
- LFW (classification, person re-identification, landmarks)
List of supported annotation types:
- Labels
- Bounding boxes
- Polygons
- Polylines
- (Segmentation) Masks
- (Key-)Points
- Captions
3.4 - Supported data formats
Datumaro only works with 2D RGB(A) images.
To create an unlabelled dataset from an arbitrary directory with images, use the
ImageDir format:
datum create -o <project/dir>
datum add path -p <project/dir> -f image_dir <directory/path/>
or, if you work with the Datumaro API:
To use it with a project:
from datumaro.components.project import Project
project = Project()
project.add_source('source1', {
'format': 'image_dir',
'url': 'directory/path/'
})
dataset = project.make_dataset()
And to use it as a dataset:
from datumaro.components.dataset import Dataset
dataset = Dataset.import_from('directory/path/', 'image_dir')
This will search for images in the directory recursively and add
them as dataset entries with names like <subdir1>/<subsubdir1>/<image_name1>.
The list of formats matches the list of supported image formats in OpenCV.
.jpg, .jpeg, .jpe, .jp2, .png, .bmp, .dib, .tif, .tiff, .tga, .webp, .pfm,
.sr, .ras, .exr, .hdr, .pic, .pbm, .pgm, .ppm, .pxm, .pnm
After being added to a project, images can be split into subsets, renamed
with transformations, filtered, joined with existing annotations, etc.
To use a video as an input, one should either create an Extractor plugin,
which splits a video into frames, or split the video manually and import images.
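For illustration, a minimal frame-splitting Extractor might look like the
sketch below. It is hypothetical (not a built-in plugin) and assumes OpenCV
is installed; the VideoFramesExtractor name is made up:
import cv2
from datumaro.components.extractor import SourceExtractor, DatasetItem

# A hypothetical frame-splitting extractor (not part of Datumaro)
class VideoFramesExtractor(SourceExtractor):
    def __init__(self, url):
        super().__init__()
        reader = cv2.VideoCapture(url)
        index = 0
        while True:
            success, frame = reader.read()
            if not success:
                break
            # OpenCV decodes frames as BGR; convert to RGB for Datumaro
            self._items.append(DatasetItem(id='frame_%06d' % index,
                image=frame[:, :, ::-1]))
            index += 1
        reader.release()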
3.5 - Command line workflow
The key object is a project, so most CLI commands operate on projects.
However, there are a few commands that operate on datasets directly.
A project is a combination of a project’s own dataset, a number of
external data sources and an environment.
An empty Project can be created with the project create command;
an existing dataset can be imported with the project import command.
A typical way to obtain projects is to export tasks in CVAT UI.
If you want to interact with models, you need to add them to project first.
Project structure
└── project/
    ├── .datumaro/
    │   ├── config.yml
    │   ├── .git/
    │   ├── models/
    │   └── plugins/
    │       ├── plugin1/
    │       │   ├── file1.py
    │       │   └── file2.py
    │       ├── plugin2.py
    │       ├── custom_extractor1.py
    │       └── ...
    ├── dataset/
    └── sources/
        ├── source1
        └── ...
3.6 - Command reference
Note: command invocation syntax is subject to change;
always refer to the command's --help output.
Available CLI commands:
flowchart LR
d{datum}
p((project))
s((source))
m((model))
d==>p
p==create===>str1([Creates a Datumaro project])
p==import===>str2([Generates a project from other project or dataset in specific format])
p==export===>str3([Saves dataset in a specific format])
p==extract===>str4([Extracts subproject by filter])
p==merge===>str5([Adds new items to project])
p==diff===>str6([Compares two projects])
p==transform===>str7([Applies specific transformation to the dataset])
p==info===>str8([Outputs valuable info])
d==>s
s==add===>str9([Adds data source by its URL])
s==remove===>str10([Removes source dataset])
d==>m
m==add===>str11([Registers model for inference])
m==remove===>str12([Removes model from project])
m==run===>str13([Executes network for inference])
d==>c(create)===>str14([Calls project create])
d==>a(add)===>str15([Calls source add])
d==>r(remove)===>str16([Calls source remove])
d==>e(export)===>str17([Calls project export])
d==>exp(explain)===>str18([Runs inference explanation])
3.6.1 - Convert datasets
This command converts a dataset from one format into another.
In fact, it is a combination of project import and project export,
and just provides a simpler way to obtain the same result when no extra
options are needed. A list of supported formats can be found in the --help
output of this command.
Usage:
datum convert --help
datum convert \
-i <input path> \
-if <input format> \
-o <output path> \
-f <output format> \
-- [extra parameters for output format]
Example: convert a VOC-like dataset to a COCO-like one:
datum convert --input-format voc --input-path <path/to/voc/> \
--output-format coco
3.6.2 - Create project
The command creates an empty project. Once a Project is created, there are
a few options to interact with it.
Usage:
datum create --help
datum create \
-o <project_dir>
Example: create an empty project my_dataset
datum create -o my_dataset/
3.6.3 - Add and remove data
A Project can contain a number of external Data Sources. Each Data Source
describes a way to produce dataset items. A Project combines dataset items from
all the sources and its own dataset into one composite dataset. You can manage
project sources with the commands of the source command-line context.
Datasets come in a wide variety of formats. Each dataset
format defines its own data structure and rules on how to
interpret the data. For example, the following data structure
is used in COCO format:
/dataset/
- /images/<id>.jpg
- /annotations/
Supported formats are listed in the command help. Check extending tips
for information on extra format support.
Usage:
datum add --help
datum remove --help
datum add \
path <path> \
-p <project dir> \
-f <format> \
-n <name>
datum remove \
-p <project dir> \
-n <name>
Example: create a project from a bunch of different annotations and images,
and generate TFrecord for TF Detection API for model training
datum create
# 'default' is the name of the subset below
datum add path <path/to/coco/instances_default.json> -f coco_instances
datum add path <path/to/cvat/default.xml> -f cvat
datum add path <path/to/voc> -f voc_detection
datum add path <path/to/datumaro/default.json> -f datumaro
datum add path <path/to/images/dir> -f image_dir
datum export -f tf_detection_api
3.6.4 - Filter project
This command extracts a sub-Project from a Project. The new project
includes only the items satisfying some condition. XPath is used as the
query format.
There are several filtering modes available (the -m/--mode parameter).
Supported modes:
- i, items
- a, annotations
- i+a, a+i, items+annotations, annotations+items
When filtering annotations, use the items+annotations
mode to indicate that annotation-less dataset items should be
removed. To select an annotation, write an XPath that
returns annotation elements (see examples).
Usage:
datum filter --help
datum filter \
-p <project dir> \
-e '<xpath filter expression>'
Example: extract a dataset with only the images whose width < height
datum filter \
-p test_project \
-e '/item[image/width < image/height]'
Example: extract a dataset with only images of the train subset.
datum filter \
-p test_project \
-e '/item[subset="train"]'
Example: extract a dataset with only large annotations of the cat class and any non-persons
datum filter \
-p test_project \
--mode annotations -e '/item/annotation[(label="cat" and area > 99.5) or label!="person"]'
Example: extract a dataset with only occluded annotations, remove empty images
datum filter \
-p test_project \
-m i+a -e '/item/annotation[occluded="True"]'
Item representations can be printed with the --dry-run parameter:
<item>
<id>290768</id>
<subset>minival2014</subset>
<image>
<width>612</width>
<height>612</height>
<depth>3</depth>
</image>
<annotation>
<id>80154</id>
<type>bbox</type>
<label_id>39</label_id>
<x>264.59</x>
<y>150.25</y>
<w>11.199999999999989</w>
<h>42.31</h>
<area>473.87199999999956</area>
</annotation>
<annotation>
<id>669839</id>
<type>bbox</type>
<label_id>41</label_id>
<x>163.58</x>
<y>191.75</y>
<w>76.98999999999998</w>
<h>73.63</h>
<area>5668.773699999998</area>
</annotation>
...
</item>
3.6.5 - Update project (merge)
This command updates items in a project from another one
(check Merge Projects for complex merging).
Usage:
datum merge --help
datum merge \
-p <project dir> \
-o <output dir> \
<other project dir>
Example: update annotations in first_project with annotations from second_project and save the result as merged_project
datum merge \
-p first_project \
-o merged_project \
second_project
3.6.6 - Import project
This command creates a Project from an existing dataset.
Supported formats are listed in the command help. Check extending tips
for information on extra format support.
Usage:
datum import --help
datum import \
-i <dataset_path> \
-o <project_dir> \
-f <format>
Example: create a project from COCO-like dataset
datum import \
-i /home/coco_dir \
-o /home/project_dir \
-f coco
An MS COCO-like dataset should have the following directory structure:
COCO/
├── annotations/
│ ├── instances_val2017.json
│ ├── instances_train2017.json
├── images/
│ ├── val2017
│ ├── train2017
Everything after the last _ is considered a subset name in the COCO format;
for example, instances_val2017.json corresponds to the val2017 subset.
3.6.7 - Export project
This command exports a Project as a dataset in some format.
Supported formats are listed in the command help. Check extending tips
for information on extra format support.
Usage:
datum export --help
datum export \
-p <project dir> \
-o <output dir> \
-f <format> \
-- [additional format parameters]
Example: save project as VOC-like dataset, include images, convert images to PNG
datum export \
-p test_project \
-o test_project-export \
-f voc \
-- --save-images --image-ext='.png'
3.6.8 - Merge projects
This command merges items from 2 or more projects and checks annotations for
errors.
Spatial annotations are compared by distance and intersected; labels and
attributes are selected by voting.
Merge conflicts, missing items and annotations, and other errors are saved
into a .json file.
Usage:
datum merge --help
datum merge <project dirs>
Example: merge 4 (partially-)intersecting projects,
- consider voting succeeded when there are 3+ same votes
- consider shapes intersecting when IoU >= 0.6
- check annotation groups to have person, hand, head and foot (? for optional)
datum merge project1/ project2/ project3/ project4/ \
--quorum 3 \
-iou 0.6 \
--groups 'person,hand?,head,foot?'
3.6.9 - Compare projects
The command compares two datasets and saves the results in the
specified directory. The current project is considered to be
“ground truth”.
datum diff --help
datum diff <other_project_dir> -o <save_dir>
Example: compare a dataset with model inference
datum import <...>
datum model add mymodel <...>
datum model run -m mymodel -o inference
datum diff inference -o diff
3.6.10 - Obtaining project info
This command outputs project status information.
Usage:
datum info --help
datum info \
-p <project dir>
Example:
datum info -p /test_project
Project:
name: test_project
location: /test_project
Sources:
source 'instances_minival2014':
format: coco_instances
url: /coco_like/annotations/instances_minival2014.json
Dataset:
length: 5000
categories: label
label:
count: 80
labels: person, bicycle, car, motorcycle (and 76 more)
subsets: minival2014
subset 'minival2014':
length: 5000
categories: label
label:
count: 80
labels: person, bicycle, car, motorcycle (and 76 more)
3.6.11 - Obtaining project statistics
This command computes various project statistics, such as:
- image mean and std. dev.
- class and attribute balance
- mask pixel balance
- segment area distribution
Usage:
datum stats --help
datum stats \
-p <project dir>
Example:
datum stats -p test_project
{
"annotations": {
"labels": {
"attributes": {
"gender": {
"count": 358,
"distribution": {
"female": [
149,
0.41620111731843573
],
"male": [
209,
0.5837988826815642
]
},
"values count": 2,
"values present": [
"female",
"male"
]
},
"view": {
"count": 340,
"distribution": {
"__undefined__": [
4,
0.011764705882352941
],
"front": [
54,
0.1588235294117647
],
"left": [
14,
0.041176470588235294
],
"rear": [
235,
0.6911764705882353
],
"right": [
33,
0.09705882352941177
]
},
"values count": 5,
"values present": [
"__undefined__",
"front",
"left",
"rear",
"right"
]
}
},
"count": 2038,
"distribution": {
"car": [
340,
0.16683022571148184
],
"cyclist": [
194,
0.09519136408243375
],
"head": [
354,
0.17369970559371933
],
"ignore": [
100,
0.04906771344455348
],
"left_hand": [
238,
0.11678115799803729
],
"person": [
358,
0.17566241413150147
],
"right_hand": [
77,
0.037782139352306184
],
"road_arrows": [
326,
0.15996074582924436
],
"traffic_sign": [
51,
0.025024533856722278
]
}
},
"segments": {
"area distribution": [
{
"count": 1318,
"max": 11425.1,
"min": 0.0,
"percent": 0.9627465303140978
},
{
"count": 1,
"max": 22850.2,
"min": 11425.1,
"percent": 0.0007304601899196494
},
{
"count": 0,
"max": 34275.3,
"min": 22850.2,
"percent": 0.0
},
{
"count": 0,
"max": 45700.4,
"min": 34275.3,
"percent": 0.0
},
{
"count": 0,
"max": 57125.5,
"min": 45700.4,
"percent": 0.0
},
{
"count": 0,
"max": 68550.6,
"min": 57125.5,
"percent": 0.0
},
{
"count": 0,
"max": 79975.7,
"min": 68550.6,
"percent": 0.0
},
{
"count": 0,
"max": 91400.8,
"min": 79975.7,
"percent": 0.0
},
{
"count": 0,
"max": 102825.90000000001,
"min": 91400.8,
"percent": 0.0
},
{
"count": 50,
"max": 114251.0,
"min": 102825.90000000001,
"percent": 0.036523009495982466
}
],
"avg. area": 5411.624543462382,
"pixel distribution": {
"car": [
13655,
0.0018431496518735067
],
"cyclist": [
939005,
0.12674674030446592
],
"head": [
0,
0.0
],
"ignore": [
5501200,
0.7425510702956085
],
"left_hand": [
0,
0.0
],
"person": [
954654,
0.12885903974805205
],
"right_hand": [
0,
0.0
],
"road_arrows": [
0,
0.0
],
"traffic_sign": [
0,
0.0
]
}
}
},
"annotations by type": {
"bbox": {
"count": 548
},
"caption": {
"count": 0
},
"label": {
"count": 0
},
"mask": {
"count": 0
},
"points": {
"count": 669
},
"polygon": {
"count": 821
},
"polyline": {
"count": 0
}
},
"annotations count": 2038,
"dataset": {
"image mean": [
107.06903686941979,
79.12831698580979,
52.95829558185416
],
"image std": [
49.40237673503467,
43.29600731496902,
35.47373007603151
],
"images count": 100
},
"images count": 100,
"subsets": {},
"unannotated images": [
"img00051",
"img00052",
"img00053",
"img00054",
"img00055",
],
"unannotated images count": 5,
"unique images count": 97,
"repeating images count": 3,
"repeating images": [
[("img00057", "default"), ("img00058", "default")],
[("img00059", "default"), ("img00060", "default")],
[("img00061", "default"), ("img00062", "default")],
],
}
3.6.12 - Validate project annotations
This command inspects annotations with respect to the task type
and stores the result in JSON file.
The task types supported are classification
, detection
, and segmentation
.
The validation result contains:
- annotation statistics based on the task type
- validation reports, such as
  - items not having annotations
  - items having undefined annotations
  - imbalanced distribution in class/attributes
  - too small or large values
- summary
Usage:
There are five configurable parameters for validation:
- few_samples_thr: threshold for giving a warning for the minimum number of samples per class
- imbalance_ratio_thr: threshold for giving an imbalanced data warning
- far_from_mean_thr: threshold for giving a warning that data is far from the mean
- dominance_ratio_thr: threshold for giving a bounding box imbalance warning
- topk_bins: ratio of bins with the highest number of data to total bins in the histogram
datum validate --help
datum validate -p <project dir> -t <task_type> -- \
-fs <few_samples_thr> \
-ir <imbalance_ratio_thr> \
-m <far_from_mean_thr> \
-dr <dominance_ratio_thr> \
-k <topk_bins>
Example: give a warning when the imbalance ratio of data for a classification
task is over 40
datum validate -p prj-cls -t classification -- \
-ir 40
Here is the list of validation items (a.k.a. anomaly types):

| Anomaly Type | Description | Task Type |
| --- | --- | --- |
| MissingLabelCategories | Metadata (ex. LabelCategories) should be defined | common |
| MissingAnnotation | No annotation found for an item | common |
| MissingAttribute | An attribute key is missing for an item | common |
| MultiLabelAnnotations | Item needs a single label | classification |
| UndefinedLabel | A label not defined in the metadata is found for an item | common |
| UndefinedAttribute | An attribute not defined in the metadata is found for an item | common |
| LabelDefinedButNotFound | A label is defined, but not actually found | common |
| AttributeDefinedButNotFound | An attribute is defined, but not actually found | common |
| OnlyOneLabel | The dataset consists of only one label | common |
| OnlyOneAttributeValue | The dataset consists of only one attribute value | common |
| FewSamplesInLabel | The number of samples in a label might be too low | common |
| FewSamplesInAttribute | The number of samples in an attribute might be too low | common |
| ImbalancedLabels | There is an imbalance in the label distribution | common |
| ImbalancedAttribute | There is an imbalance in the attribute distribution | common |
| ImbalancedDistInLabel | Values (ex. bbox width) are not evenly distributed for a label | detection, segmentation |
| ImbalancedDistInAttribute | Values (ex. bbox width) are not evenly distributed for an attribute | detection, segmentation |
| NegativeLength | The width or height of a bounding box is negative | detection |
| InvalidValue | There is an invalid (ex. inf, nan) value for bounding box info | detection |
| FarFromLabelMean | An annotation has a value too far from the average for its label | detection, segmentation |
| FarFromAttrMean | An annotation has a value too far from the average for its attribute | detection, segmentation |
Validation Result Format:
{
'statistics': {
## common statistics
'label_distribution': {
'defined_labels': <dict>, # <label:str>: <count:int>
'undefined_labels': <dict>
# <label:str>: {
# 'count': <int>,
# 'items_with_undefined_label': [<item_key>, ]
# }
},
'attribute_distribution': {
'defined_attributes': <dict>,
# <label:str>: {
# <attribute:str>: {
# 'distribution': {<attr_value:str>: <count:int>, },
# 'items_missing_attribute': [<item_key>, ]
# }
# }
'undefined_attributes': <dict>
# <label:str>: {
# <attribute:str>: {
# 'distribution': {<attr_value:str>: <count:int>, },
# 'items_with_undefined_attr': [<item_key>, ]
# }
# }
},
'total_ann_count': <int>,
'items_missing_annotation': <list>, # [<item_key>, ]
## statistics for classification task
'items_with_multiple_labels': <list>, # [<item_key>, ]
## statistics for detection task
'items_with_invalid_value': <dict>,
# '<item_key>': {<ann_id:int>: [ <property:str>, ], }
# - properties: 'x', 'y', 'width', 'height',
# 'area(wxh)', 'ratio(w/h)', 'short', 'long'
# - 'short' is min(w,h) and 'long' is max(w,h).
'items_with_negative_length': <dict>,
# '<item_key>': { <ann_id:int>: { <'width'|'height'>: <value>, }, }
'bbox_distribution_in_label': <dict>, # <label:str>: <bbox_template>
'bbox_distribution_in_attribute': <dict>,
# <label:str>: {<attribute:str>: { <attr_value>: <bbox_template>, }, }
'bbox_distribution_in_dataset_item': <dict>,
# '<item_key>': <bbox count:int>
## statistics for segmentation task
'items_with_invalid_value': <dict>,
# '<item_key>': {<ann_id:int>: [ <property:str>, ], }
# - properties: 'area', 'width', 'height'
'mask_distribution_in_label': <dict>, # <label:str>: <mask_template>
'mask_distribution_in_attribute': <dict>,
# <label:str>: {
# <attribute:str>: { <attr_value>: <mask_template>, }
# }
'mask_distribution_in_dataset_item': <dict>,
# '<item_key>': <mask/polygon count: int>
},
'validation_reports': <list>, # [ <validation_error_format>, ]
# validation_error_format = {
# 'anomaly_type': <str>,
# 'description': <str>,
# 'severity': <str>, # 'warning' or 'error'
# 'item_id': <str>, # optional, when it is related to a DatasetItem
# 'subset': <str>, # optional, when it is related to a DatasetItem
# }
'summary': {
'errors': <count: int>,
'warnings': <count: int>
}
}
item_key is defined as:
item_key = (<DatasetItem.id:str>, <DatasetItem.subset:str>)
bbox_template and mask_template are defined as:
bbox_template = {
'width': <numerical_stat_template>,
'height': <numerical_stat_template>,
'area(wxh)': <numerical_stat_template>,
'ratio(w/h)': <numerical_stat_template>,
'short': <numerical_stat_template>, # short = min(w, h)
'long': <numerical_stat_template> # long = max(w, h)
}
mask_template = {
'area': <numerical_stat_template>,
'width': <numerical_stat_template>,
'height': <numerical_stat_template>
}
numerical_stat_template is defined as:
numerical_stat_template = {
'items_far_from_mean': <dict>,
# {'<item_key>': {<ann_id:int>: <value:float>, }, }
'mean': <float>,
'stdev': <float>,
'min': <float>,
'max': <float>,
'median': <float>,
'histogram': {
'bins': <list>, # [<float>, ]
'counts': <list>, # [<int>, ]
}
}
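For example, a report saved by this command can be summarized with a short
script. This is a sketch; the output file name below is hypothetical, use
the path reported by the command:
import json

# A minimal sketch: summarize a `datum validate` report
with open('validation_results.json') as f:  # hypothetical file name
    report = json.load(f)

print(report['summary'])  # e.g. {'errors': 0, 'warnings': 5}
for r in report['validation_reports'][:5]:
    print(r['severity'], r['anomaly_type'], '-', r['description'])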
3.6.13 - Register model
Supported models:
- OpenVINO
- Custom models via custom launchers

Usage:
Example: register an OpenVINO model
A model consists of a graph description and weights. There is also a script
used to convert model outputs to internal data structures.
datum create
datum model add \
-n <model_name> -l open_vino -- \
-d <path_to_xml> -w <path_to_bin> -i <path_to_interpretation_script>
Interpretation script for an OpenVINO detection model (convert.py):
You can find OpenVINO model interpreter samples in
datumaro/plugins/openvino/samples (instruction).
from datumaro.components.extractor import *
max_det = 10
conf_thresh = 0.1
def process_outputs(inputs, outputs):
# inputs = model input, array of images, shape = (N, C, H, W)
# outputs = model output, shape = (N, 1, K, 7)
# results = conversion result, [ [ Annotation, ... ], ... ]
results = []
for input, output in zip(inputs, outputs):
input_height, input_width = input.shape[:2]
detections = output[0]
image_results = []
for i, det in enumerate(detections):
label = int(det[1])
conf = float(det[2])
if conf <= conf_thresh:
continue
x = max(int(det[3] * input_width), 0)
y = max(int(det[4] * input_height), 0)
w = min(int(det[5] * input_width - x), input_width)
h = min(int(det[6] * input_height - y), input_height)
image_results.append(Bbox(x, y, w, h,
label=label, attributes={'score': conf} ))
results.append(image_results[:max_det])
return results
def get_categories():
# Optionally, provide output categories - label map etc.
# Example:
label_categories = LabelCategories()
label_categories.add('person')
label_categories.add('car')
return { AnnotationType.label: label_categories }
3.6.14 - Run inference
This command applies a model to dataset images and produces a new project.
Usage:
datum model run --help
datum model run \
-p <project dir> \
-m <model_name> \
-o <save_dir>
Example: launch inference on a dataset
datum import <...>
datum model add mymodel <...>
datum model run -m mymodel -o inference
3.6.15 - Run inference explanation
Runs an explainable AI algorithm for a model.
This tool is supposed to help an AI developer to debug a model and a dataset.
Basically, it executes inference and tries to find problems in the trained
model - determine decision boundaries and belief intervals for the classifier.
Currently, the only available algorithm is RISE (article),
which runs inference and then re-runs the model multiple times on each
image to produce a heatmap of activations for each output of the
first inference. As a result, we obtain several heatmaps which show
how image pixels affected the inference result. This algorithm doesn't
require any special information about the model, but it requires the model to
return all the outputs and confidences. The algorithm only supports
classification and detection models.
The following use cases are available:
- RISE for classification
- RISE for object detection
Usage:
datum explain --help
datum explain \
-m <model_name> \
-o <save_dir> \
-t <target> \
<method> \
<method_params>
Example: run inference explanation on a single image with visualization
datum create <...>
datum model add mymodel <...>
datum explain -t image.png -m mymodel \
rise --max-samples 1000 --progressive
Note: this algorithm requires the model to return
all (or a reasonable number of) the outputs and confidences unfiltered,
i.e. all the Label annotations for classification models and
all the Bboxes for detection models.
You can find examples of the expected model outputs in tests/test_RISE.py
For OpenVINO models the output processing script would look like this:
Classification scenario:
from datumaro.components.extractor import *
from datumaro.util.annotation_util import softmax
def process_outputs(inputs, outputs):
# inputs = model input, array of images, shape = (N, C, H, W)
# outputs = model output, logits, shape = (N, n_classes)
# results = conversion result, [ [ Annotation, ... ], ... ]
results = []
for input, output in zip(inputs, outputs):
    image_results = []
    confs = softmax(output[0])
    for label, conf in enumerate(confs):
        image_results.append(Label(int(label),
            attributes={'score': float(conf)}))
    results.append(image_results)
return results
Object Detection scenario:
from datumaro.components.extractor import *
# return a significant number of output boxes to make multiple runs
# statistically correct and meaningful
max_det = 1000
def process_outputs(inputs, outputs):
# inputs = model input, array of images, shape = (N, C, H, W)
# outputs = model output, shape = (N, 1, K, 7)
# results = conversion result, [ [ Annotation, ... ], ... ]
results = []
for input, output in zip(inputs, outputs):
input_height, input_width = input.shape[:2]
detections = output[0]
image_results = []
for i, det in enumerate(detections):
label = int(det[1])
conf = float(det[2])
x = max(int(det[3] * input_width), 0)
y = max(int(det[4] * input_height), 0)
w = min(int(det[5] * input_width - x), input_width)
h = min(int(det[6] * input_height - y), input_height)
image_results.append(Bbox(x, y, w, h,
label=label, attributes={'score': conf} ))
results.append(image_results[:max_det])
return results
3.6.16 - Transform Project
This command modifies images or annotations in a project all at once.
datum transform --help
datum transform \
-p <project_dir> \
-o <output_dir> \
-t <transform_name> \
-- [extra transform options]
Example: split a dataset randomly into train and test subsets, ratio is 2:1
datum transform -t random_split -- --subset train:.67 --subset test:.33
Example: split a dataset in task-specific manner. The tasks supported are
classification, detection, segmentation and re-identification.
datum transform -t split -- \
-t classification --subset train:.5 --subset val:.2 --subset test:.3
datum transform -t split -- \
-t detection --subset train:.5 --subset val:.2 --subset test:.3
datum transform -t split -- \
-t segmentation --subset train:.5 --subset val:.2 --subset test:.3
datum transform -t split -- \
-t reid --subset train:.5 --subset val:.2 --subset test:.3 \
--query .5
Example: convert polygons to masks, masks to boxes etc.:
datum transform -t boxes_to_masks
datum transform -t masks_to_polygons
datum transform -t polygons_to_masks
datum transform -t shapes_to_boxes
Example: remap dataset labels, person to car and cat to dog, keep bus, remove others
datum transform -t remap_labels -- \
-l person:car -l bus:bus -l cat:dog \
--default delete
Example: rename dataset items by a regular expression
- Replace pattern with replacement
- Remove frame_ from item ids
datum transform -t rename -- -e '|pattern|replacement|'
datum transform -t rename -- -e '|frame_(\d+)|\\1|'
Example: sample dataset items until the target number of samples is reached,
using a sampling method chosen by the user, and divide the dataset into
sampled and unsampled subsets.
There are five sampling methods for the -m option:
- topk: return the k data points with the highest uncertainty
- lowk: return the k data points with the lowest uncertainty
- randk: return k random data points
- mixk: return half by the topk method and the rest by the lowk method
- randtopk: first, randomly select 3 times the number k, and return the topk among them
datum transform -t sampler -- \
-a entropy \
-i train \
-o sampled \
-u unsampled \
-m topk \
-k 20
Example: limit the number of outputs to 100 after NDR (near-duplicate removal)
There are two methods for the NDR -e option:
- random: sample from the removed data randomly
- similarity: sample from the removed data in ascending order of similarity
There are two methods for the NDR -u option:
- uniform: sample data with a uniform distribution
- inverse: sample data with the reciprocal of the number of samples
datum transform -t ndr -- \
-w train \
-a gradient \
-k 100 \
-e random \
-u uniform
3.7 - Extending
There are a few ways to extend and customize Datumaro behavior, which are
supported by plugins. Check our contribution guide for
details on plugin implementation. In general, a plugin is a Python code file.
It must be put into a plugin directory:
- <project_dir>/.datumaro/plugins for project-specific plugins
- <datumaro_dir>/plugins for global plugins
Built-in plugins
Datumaro provides several builtin plugins. Plugins can have dependencies,
which need to be installed separately.
TensorFlow
The plugin provides support for the TensorFlow Detection API format, which
includes boxes and masks. It depends on TensorFlow, which can be installed
with pip:
pip install tensorflow
# or
pip install tensorflow-gpu
# or
pip install datumaro[tf]
# or
pip install datumaro[tf-gpu]
Accuracy Checker
This plugin allows using Accuracy Checker
to launch deep learning models from various frameworks
(Caffe, MxNet, PyTorch, OpenVINO, …) through Accuracy Checker's API.
The plugin depends on Accuracy Checker, which can be installed with pip:
pip install 'git+https://github.com/openvinotoolkit/open_model_zoo.git#subdirectory=tools/accuracy_checker'
OpenVINO™
This plugin provides support for model inference with OpenVINO™.
The plugin depends on the OpenVINO™ Toolkit, which can be installed by
following these instructions.
Dataset reading is supported by Extractors and Importers.
An Extractor produces a list of dataset items corresponding
to the dataset. An Importer creates a project from the data source location.
It is possible to add custom Extractors and Importers. To do this, you need
to put Extractor and Importer implementation scripts into a plugin directory.
Dataset writing is supported by Converters.
A Converter produces a dataset of a specific format from dataset items.
It is possible to add custom Converters. To do this, you need to put a
Converter implementation script into a plugin directory.
A Transform is a function for altering a dataset and producing a new one.
It can update dataset items, annotations, classes, and other properties.
A list of available transforms for dataset conversions can be extended by
adding a Transform implementation script into a plugin directory.
Model launchers
A list of available launchers for model execution can be extended by adding
a Launcher implementation script into a plugin directory.
4 - Dataset Management Framework (Datumaro) API and developer manual
Basics
The central part of the library is the Dataset class, which represents
a dataset and allows iterating over its elements.
DatasetItem, an element of a dataset, represents a single
dataset entry with annotations - an image, video sequence, audio track etc.
It can contain only annotated data or meta information, only annotations, or
all of this.
Basic library usage and data flow:
Extractors -> Dataset -> Converter
|
Filtration
Transformations
Statistics
Merging
Inference
Quality Checking
Comparison
...
- Data is read (or produced) by one or many Extractors and merged into a Dataset
- The dataset is processed in some way
- The dataset is saved with a Converter
Datumaro has a number of dataset and annotation features:
- iteration over dataset elements
- filtering of datasets and annotations by custom criteria
- working with subsets (e.g. train, val, test)
- computing of dataset statistics
- comparison and merging of datasets
- various annotation operations
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Bbox, Polygon, DatasetItem
# Import and export a dataset
dataset = Dataset.import_from('src/dir', 'voc')
dataset.export('dst/dir', 'coco')
# Create a dataset, convert polygons to masks, save in PASCAL VOC format
dataset = Dataset.from_iterable([
DatasetItem(id='image1', annotations=[
Bbox(x=1, y=2, w=3, h=4, label=1),
Polygon([1, 2, 3, 2, 4, 4], label=2, attributes={'occluded': True}),
]),
], categories=['cat', 'dog', 'person'])
dataset.transform('polygons_to_masks')
dataset.export('dst/dir', 'voc')
The Dataset class
The Dataset class from the datumaro.components.dataset module represents
a dataset, consisting of multiple DatasetItems. Annotations are
represented by members of the datumaro.components.extractor module,
such as Label, Mask or Polygon. A dataset can contain items from one or
multiple subsets (e.g. train, test, val etc.); the list of dataset subsets
is available in dataset.subsets.
Datasets typically have annotations, and these annotations can
require additional information to be interpreted correctly. For instance, this
can include class names, the class hierarchy, keypoint connections,
class colors for masks, and class attributes.
This information is stored in dataset.categories, which is a mapping from
AnnotationType to a corresponding ...Categories class. Each annotation type
can have its own Categories. Typically, there will be a LabelCategories object.
Annotations and other categories address dataset labels
by their indices in this object.
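For instance, label names and indices can be looked up in this object.
A minimal sketch, assuming dataset is a Dataset created or imported as in
the examples here:
from datumaro.components.extractor import AnnotationType

# A minimal sketch, assuming `dataset` is a Dataset instance
label_categories = dataset.categories()[AnnotationType.label]
print([label.name for label in label_categories.items])
index, _ = label_categories.find('cat')  # index used by annotations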
The main operation for a dataset is iteration over its elements.
An item corresponds to a single image, a video sequence, etc. There are also
a few other operations available, such as filtration (dataset.select) and
transformations (dataset.transform). A dataset can be created from extractors
or other datasets with Dataset.from_extractors() and directly from items with
Dataset.from_iterable(). A dataset is an extractor itself. If it is created
from multiple extractors, their categories must match, and their contents
will be merged.

A dataset item is an element of a dataset. Its id is the name of the
corresponding image. There can be some image attributes,
an image and annotations.
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Bbox, Polygon, DatasetItem
# create a dataset from other datasets
dataset = Dataset.from_extractors(dataset1, dataset2)
# or directly from items
dataset = Dataset.from_iterable([
DatasetItem(id='image1', annotations=[
Bbox(x=1, y=2, w=3, h=4, label=1),
Polygon([1, 2, 3, 2, 4, 4], label=2),
]),
], categories=['cat', 'dog', 'person'])
# keep only annotated images
dataset.select(lambda item: len(item.annotations) != 0)
# change dataset labels
dataset.transform('remap_labels',
{'cat': 'dog', # rename cat to dog
'truck': 'car', # rename truck to car
'person': '', # remove this label
}, default='delete')
# iterate over elements
for item in dataset:
print(item.id, item.annotations)
# iterate over subsets as Datasets
for subset_name, subset in dataset.subsets().items():
for item in subset:
print(item.id, item.annotations)
Projects
Projects are intended for complex use of Datumaro. They provide means of
persistence and extension, and CLI operations for Datasets. A project can
be converted to a Dataset with project.make_dataset. Project datasets
can have multiple data sources, which are merged on dataset creation. They
can have a hierarchy. Project configuration is available in project.config.
A dataset can be saved in the datumaro_project format.
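A minimal sketch of this workflow (the project path is a placeholder):
from datumaro.components.project import Project

# Load an existing project, inspect its config, build the merged dataset
project = Project.load('project/dir')
print(project.config.project_name)
dataset = project.make_dataset()  # merges all sources into one Dataset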
The Environment class is responsible for accessing built-in and
project-specific plugins. For a project, there is an instance of the
related Environment in project.env.
Library contents
The framework provides functions to read and write datasets in specific formats.
It is supported by Extractors, Importers, and Converters.

Dataset reading is supported by Extractors and Importers:
- An Extractor produces a list of DatasetItems corresponding to the dataset. Annotations are available in the DatasetItem.annotations list
- An Importer creates a project from a data source location

It is possible to add custom Extractors and Importers. To do this, you need
to put Extractor and Importer implementations into a plugin directory.

Dataset writing is supported by Converters.
A Converter produces a dataset of a specific format from dataset items.
It is possible to add custom Converters. To do this, you need to put a
Converter implementation script into a plugin directory.

A Transform is a function for altering a dataset and producing a new one.
It can update dataset items, annotations, classes, and other properties.
The list of available transforms for dataset conversions can be extended by
adding a Transform implementation script into a plugin directory.

Model launchers
The list of available launchers for model execution can be extended by
adding a Launcher implementation script into a plugin directory.
Plugins
Datumaro comes with a number of built-in formats and other tools,
but it can also be extended by plugins. Plugins are optional components,
whose dependencies are not installed by default.
In Datumaro there are several types of plugins:
- extractor - produces dataset items from a data source
- importer - recognizes the dataset type and creates a project
- converter - exports a dataset to a specific format
- transformation - modifies dataset items or other properties
- launcher - executes models

A plugin is a regular Python module. It must be present in a plugin directory:
- <project_dir>/.datumaro/plugins for project-specific plugins
- <datumaro_dir>/plugins for global plugins
A plugin can be used either via the Environment
class instance,
or by regular module importing:
from datumaro.components.project import Environment, Project
from datumaro.plugins.yolo_format.converter import YoloConverter
# Import a dataset
dataset = Environment().make_importer('voc')(src_dir).make_dataset()
# Load an existing project, save the dataset in some project-specific format
project = Project.load('project/dir')
project.env.converters.get('custom_format').convert(dataset, save_dir=dst_dir)
# Save the dataset in some built-in format
Environment().converters.get('yolo').convert(dataset, save_dir=dst_dir)
YoloConverter.convert(dataset, save_dir=dst_dir)
Writing a plugin
A plugin is a Python module with any name, which exports some symbols. Symbols
starting with _ are not exported by default. To export a symbol,
inherit it from one of the special classes:
from datumaro.components.extractor import Importer, Extractor, Transform
from datumaro.components.launcher import Launcher
from datumaro.components.converter import Converter
The exports list of the module can be used to override the default behaviour:
class MyComponent1: ...
class MyComponent2: ...
exports = [MyComponent2] # exports only MyComponent2
There is also an additional class to modify plugin appearance in the command line:
from datumaro.components.cli_plugin import CliPlugin
class MyPlugin(Converter, CliPlugin):
"""
Optional documentation text, which will appear in command-line help
"""
NAME = 'optional_custom_plugin_name'
def build_cmdline_parser(self, **kwargs):
parser = super().build_cmdline_parser(**kwargs)
# set up argparse.ArgumentParser instance
# the parsed args are supposed to be used as invocation options
return parser
Plugin example
datumaro/plugins/
- my_plugin1/file1.py
- my_plugin1/file2.py
- my_plugin2.py
my_plugin1/file2.py
contents:
from datumaro.components.extractor import Transform, CliPlugin
from .file1 import something, useful
class MyTransform(Transform, CliPlugin):
    """
    Some description. The text will be displayed in the command line output.
    """

    NAME = "custom_name" # could be generated automatically

    @classmethod
    def build_cmdline_parser(cls, **kwargs):
        parser = super().build_cmdline_parser(**kwargs)
        parser.add_argument('-q', help="Very useful parameter")
        return parser

    def __init__(self, extractor, q):
        super().__init__(extractor)
        self.q = q

    def transform_item(self, item):
        return item
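Once the plugin file is in the project's plugin directory, the transform can
be invoked by its NAME. A sketch, assuming the dataset was produced from a
project that contains the plugin, so the transform is registered:
# A sketch: apply the custom transform by its registered name
dataset.transform('custom_name', q='some value')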
my_plugin2.py
contents:
from datumaro.components.converter import Converter
from datumaro.components.extractor import Extractor
class MyFormat: ...
class _MyFormatConverter(Converter): ...
class MyFormatExtractor(Extractor): ...
exports = [MyFormat] # explicit exports declaration
# MyFormatExtractor and _MyFormatConverter won't be exported
Command-line
Basically, the interface is divided into contexts and single commands.
Contexts are semantically grouped commands related to a single topic or target.
Single commands are handy, shorter alternatives for the most used commands,
and also special commands which are hard to fit into any specific context.
Docker is an example of a similar approach.
flowchart LR
d{datum}
p((project))
s((source))
m((model))
d==>p
p==create===>str1([Creates a Datumaro project])
p==import===>str2([Generates a project from other project or dataset in specific format])
p==export===>str3([Saves dataset in a specific format])
p==extract===>str4([Extracts subproject by filter])
p==merge===>str5([Adds new items to project])
p==diff===>str6([Compares two projects])
p==transform===>str7([Applies specific transformation to the dataset])
p==info===>str8([Outputs valuable info])
d==>s
s==add===>str9([Adds data source by its URL])
s==remove===>str10([Removes source dataset])
d==>m
m==add===>str11([Registers model for inference])
m==remove===>str12([Removes model from project])
m==run===>str13([Executes network for inference])
d==>c(create)===>str14([Calls project create])
d==>a(add)===>str15([Calls source add])
d==>r(remove)===>str16([Calls source remove])
d==>e(export)===>str17([Calls project export])
d==>exp(explain)===>str18([Runs inference explanation])
Model-View-ViewModel (MVVM) UI pattern is used.
flowchart LR
c((CLI))<--CliModel--->d((Domain))
g((GUI))<--GuiModel--->d
a((API))<--->d
t((Tests))<--->d
5 - Formats
5.1 - ADE20k (v2017)
Supported annotation types:
Supported annotation attributes:
- occluded (boolean): whether the object is occluded by another object
- other arbitrary boolean attributes, which can be specified in the annotation file <image_name>_atr.txt
Load ADE20K 2017 dataset
There are two ways to create a Datumaro project and add ADE20K to it:
datum import --format ade20k2017 --input-path <path/to/dataset>
# or
datum create
datum add path -f ade20k2017 <path/to/dataset>
It is also possible to load the dataset using the Python API:
from datumaro.components.dataset import Dataset
ade20k_dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2017')
ADE20K dataset directory should have the following structure:
dataset/
├── subset1/
│ └── super_label_1/
│ ├── img1.jpg
│ ├── img1_atr.txt
│ ├── img1_parts_1.png
│ ├── img1_seg.png
│ ├── img2.jpg
│ ├── img2_atr.txt
│ └── ...
└── subset2/
├── img3.jpg
├── img3_atr.txt
├── img3_parts_1.png
├── img3_parts_2.png
├── img4.jpg
├── img4_atr.txt
├── img4_seg.png
└── ...
The mask images <image_name>_seg.png contain information about the object
class segmentation masks and also separate each class into instances.
The channels R and G encode the object class masks.
The channel B encodes the instance object masks.
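For illustration, these masks could be decoded as follows (a sketch
following the official ADE20K toolkit convention; the file name is an
example):
import numpy as np
from PIL import Image

# Decode <image_name>_seg.png: class index from R and G, instances from B
seg = np.asarray(Image.open('img1_seg.png')).astype(np.uint16)
r, g, b = seg[..., 0], seg[..., 1], seg[..., 2]
class_mask = (r // 10) * 256 + g  # per-pixel object class index
instance_mask = b                 # per-pixel instance index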
The mask images <image_name>_parts_N.png contain segmentation masks for parts
of objects, where N is a number indicating the level in the part hierarchy.
The annotation files <image_name>_atr.txt describe the content of each
image. Each line in the text file contains:
- column 1: instance number,
- column 2: part level (0 for objects),
- column 3: occluded (1 for true),
- column 4: original raw name (might provide a more detailed categorization),
- column 5: class name (parsed using wordnet),
- column 6: double-quoted list of attributes, separated by commas.
Columns are separated by the # character. See an example of the dataset here.
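A sketch of parsing one such line (the line content below is made up for
illustration):
# Parse one line of <image_name>_atr.txt (example values)
line = '001 # 0 # 1 # armchair # chair # "wooden, soft"'
instance, part_level, occluded, raw_name, class_name, attrs = \
    [field.strip() for field in line.split('#')]
attributes = [a.strip() for a in attrs.strip('"').split(',')]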
Datumaro can convert ADE20K into any other format Datumaro supports.
To get the expected result, convert the dataset to a format
that supports segmentation masks.
There are a few ways to convert ADE20k 2017 to other dataset formats using the CLI:
datum import -f ade20k2017 -i <path/to/dataset>
datum export -f coco -o ./save_dir -- --save-images
# or
datum convert -if ade20k2017 -i <path/to/dataset> -f coco -o ./save_dir \
--save-images
Or using the Python API:
from datumaro.components.dataset import Dataset
dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2017')
dataset.export('save_dir', 'coco')
Examples
Examples of using this format from the code can be found in
the format tests
5.2 - ADE20k (v2020)
The original ADE20K 2020 dataset is available
here.
Also the consistency set (for checking the annotation consistency)
is available here.
Supported annotation types:
Supported annotation attributes:
- occluded (boolean): whether the object is occluded by another object
- other arbitrary boolean attributes, which can be specified in the annotation file <image_name>.json
Load ADE20K dataset
There are two ways to create a Datumaro project and add ADE20K to it:
datum import --format ade20k2020 --input-path <path/to/dataset>
# or
datum create
datum add path -f ade20k2020 <path/to/dataset>
It is also possible to load the dataset using the Python API:
from datumaro.components.dataset import Dataset
ade20k_dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2020')
The ADE20K dataset directory should have the following structure:
dataset/
├── subset1/
│ ├── img1/ # directory with instance masks for img1
│ | ├── instance_001_img1.png
│ | ├── instance_002_img1.png
│ | └── ...
│ ├── img1.jpg
│ ├── img1.json
│ ├── img1_seg.png
│ ├── img1_parts_1.png
│ |
│ ├── img2/ # directory with instance masks for img2
│ | ├── instance_001_img2.png
│ | ├── instance_002_img2.png
│ | └── ...
│ ├── img2.jpg
│ ├── img2.json
│ └── ...
│
└── subset2/
├── super_label_1/
| ├── img3/ # directory with instance masks for img3
| | ├── instance_001_img3.png
| | ├── instance_002_img3.png
| | └── ...
| ├── img3.jpg
| ├── img3.json
| ├── img3_seg.png
| ├── img3_parts_1.png
| └── ...
|
├── img4/ # directory with instance masks for img4
| ├── instance_001_img4.png
| ├── instance_002_img4.png
| └── ...
├── img4.jpg
├── img4.json
├── img4_seg.png
└── ...
The mask images <image_name>_seg.png contain information about the object
class segmentation masks and also separate each class into instances.
The channels R and G encode the object class masks.
The channel B encodes the instance object masks.
The mask images <image_name>_parts_N.png contain segmentation masks for
parts of objects, where N is a number indicating the level in the part
hierarchy.
The <image_name> directory contains instance masks for each
object in the image. These masks are one-channel images,
each pixel of which indicates an affinity to a specific object.
The annotation files <image_name>.json describe the content of each image.
See our tests asset for an example of this file,
or check the ADE20K toolkit for it.
Datumaro can convert ADE20K into any other format Datumaro supports.
To get the expected result, convert the dataset to a format
that supports segmentation masks.
There are a few ways to convert ADE20k to other dataset formats using the CLI:
datum import -f ade20k2020 -i <path/to/dataset>
datum export -f coco -o ./save_dir -- --save-images
# or
datum convert -if ade20k2020 -i <path/to/dataset> -f coco -o ./save_dir \
--save-images
Or using the Python API:
from datumaro.components.dataset import Dataset
dataset = Dataset.import_from('<path/to/dataset>', 'ade20k2020')
dataset.export('save_dir', 'voc')
Examples
Examples of using this format from the code can be found in
the format tests
5.3 - CIFAR
CIFAR format specification is available here.
Supported annotation types:
Datumaro supports the Python version of CIFAR-10/100.
The difference between CIFAR-10 and CIFAR-100 is how labels are stored
in the meta files (batches.meta or meta) and in the annotation files.
The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image
comes with a “fine” label (the class to which it belongs) and a “coarse” label
(the superclass to which it belongs). In CIFAR-10 there are no superclasses.
The CIFAR format contains 32 x 32 images. As an extension, Datumaro supports
reading and writing of arbitrary-sized images.
Load CIFAR dataset
The CIFAR dataset is available for free download:
There are two ways to create a Datumaro project and add a CIFAR dataset to it:
datum import --format cifar --input-path <path/to/dataset>
# or
datum create
datum add path -f cifar <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
CIFAR-10 dataset directory should have the following structure:
└─ Dataset/
├── batches.meta
├── <subset_name1>
├── <subset_name2>
└── ...
CIFAR-100 dataset directory should have the following structure:
└─ Dataset/
├── meta
├── <subset_name1>
├── <subset_name2>
└── ...
Dataset files use the Pickle data format.
Meta files:
CIFAR-10:
num_cases_per_batch: 1000
label_names: list of strings (['airplane', 'automobile', 'bird', ...])
num_vis: 3072
CIFAR-100:
fine_label_names: list of strings (['apple', 'aquarium_fish', ...])
coarse_label_names: list of strings (['aquatic_mammals', 'fish', ...])
Annotation files:
Common:
'batch_label': 'training batch 1 of <N>'
'data': numpy.ndarray of uint8, layout N x C x H x W
'filenames': list of strings
If images have non-default size (32x32) (Datumaro extension):
'image_sizes': list of (H, W) tuples
CIFAR-10:
'labels': list of strings
CIFAR-100:
'fine_labels': list of integers
'coarse_labels': list of integers
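As a quick illustration of this layout, an annotation file can be inspected
directly with the Python standard library (a minimal sketch; data_batch_1 is
one of the CIFAR-10 subset files):
import pickle
# read one CIFAR-10 batch file directly, outside of Datumaro
with open('data_batch_1', 'rb') as f:
    batch = pickle.load(f, encoding='latin1')
print(batch['batch_label'])   # e.g. 'training batch 1 of 5'
print(batch['data'].shape)    # uint8 image data
print(batch['filenames'][:3])
print(batch['labels'][:3])    # CIFAR-100 uses 'fine_labels'/'coarse_labels'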
Datumaro can convert a CIFAR dataset into any other format Datumaro supports.
To get the expected result, convert the dataset to formats
that support the classification task (e.g. MNIST, ImageNet, Pascal VOC,
etc.). There are a few ways to convert a CIFAR dataset to other formats:
datum project import -f cifar -i <path/to/cifar>
datum export -f imagenet -o <path/to/output/dir>
# or
datum convert -if cifar -i <path/to/cifar> -f imagenet -o <path/to/output/dir>
Export to CIFAR
There are few ways to convert dataset to CIFAR format:
# export dataset into CIFAR format from existing project
datum export -p <path/to/project> -f cifar -o <path/to/export/dir> \
-- --save-images
# converting to CIFAR format from other format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
-f cifar -o <path/to/export/dir> -- --save-images
Extra options for export to CIFAR format:
- --save-images - allow to export dataset with saving images (by default False);
- --image-ext <IMAGE_EXT> - allow to specify image extension for exporting dataset (by default .png).
The format (CIFAR-10 or CIFAR-100) in which the dataset will be
exported depends on the presence of superclasses in the LabelCategories.
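For example, here is a sketch of creating a CIFAR-100-like dataset with
superclasses (the label names are illustrative; this assumes
LabelCategories.add accepts a parent argument, as in the Datumaro
extractor API):
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import (
    AnnotationType, DatasetItem, Label, LabelCategories,
)
# labels with a parent (superclass) make the exporter produce CIFAR-100
label_cat = LabelCategories()
label_cat.add('aquarium_fish', parent='fish')
label_cat.add('flatfish', parent='fish')
dataset = Dataset.from_iterable([
    DatasetItem(id=0, image=np.ones((32, 32, 3)),
        annotations=[Label(0)]
    ),
], categories={AnnotationType.label: label_cat})
dataset.export('./dataset', format='cifar')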
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the CIFAR format in particular. Follow user manual
to get more information about these operations.
There are a few examples of using Datumaro operations to solve
particular problems with a CIFAR dataset:
Example 1. How to create custom CIFAR-like dataset
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Label, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id=0, image=np.ones((32, 32, 3)),
annotations=[Label(3)]
),
DatasetItem(id=1, image=np.ones((32, 32, 3)),
annotations=[Label(8)]
)
], categories=['airplane', 'automobile', 'bird', 'cat', 'deer',
'dog', 'frog', 'horse', 'ship', 'truck'])
dataset.export('./dataset', format='cifar')
Example 2. How to filter and convert CIFAR dataset to ImageNet
Convert a CIFAR dataset to ImageNet format, keeping only images with the
dog class present:
# Download CIFAR-10 dataset:
# https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
datum convert --input-format cifar --input-path <path/to/cifar> \
--output-format imagenet \
--filter '/item[annotation/label="dog"]'
Examples of using this format from the code can be found in
the format tests
5.4 - Cityscapes
Cityscapes format overview is available here.
Cityscapes format specification is available here.
Supported annotation types:
Supported annotation attributes:
- is_crowd (boolean). Specifies if the annotation label can distinguish between different instances. If False, the annotation id field encodes the instance id.
Load Cityscapes dataset
The Cityscapes dataset is available for free download.
There are two ways to create a Datumaro project and add a Cityscapes dataset to it:
datum import --format cityscapes --input-path <path/to/dataset>
# or
datum create
datum add path -f cityscapes <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
Cityscapes dataset directory should have the following structure:
└─ Dataset/
├── imgsFine/
│ ├── leftImg8bit
│ │ ├── <split: train,val, ...>
│ │ | ├── {city1}
│ │ │ | ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_leftImg8bit.png
│ │ │ │ └── ...
│ │ | ├── {city2}
│ │ │ └── ...
│ │ └── ...
└── gtFine/
├── <split: train,val, ...>
│ ├── {city1}
│ | ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_gtFine_color.png
│ | ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_gtFine_instanceIds.png
│ | ├── {city1}_{seq:[0...6]}_{frame:[0...6]}_gtFine_labelIds.png
│ │ └── ...
│ ├── {city2}
│ └── ...
└── ...
Annotated file description:
- *_leftImg8bit.png - left images in 8-bit LDR format
- *_color.png - class labels encoded by their color
- *_labelIds.png - class labels encoded by their index
- *_instanceIds.png - class and instance labels encoded by an instance ID.
The pixel values encode both the class and the individual instance: the
integer part of dividing each ID by 1000 gives the class ID, and the
remainder is the instance ID. If a certain annotation describes multiple
instances, the pixels hold the plain class ID.
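A minimal sketch of decoding such a mask with NumPy and Pillow (the file
name here is hypothetical):
import numpy as np
from PIL import Image
# pixel = class_id * 1000 + instance_id for instance-level regions;
# pixels with values below 1000 hold a plain class id
ids = np.asarray(Image.open('aachen_000000_000019_gtFine_instanceIds.png'))
class_ids = ids // 1000    # integer part of the division by 1000
instance_ids = ids % 1000  # remainder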
To make sure that the selected dataset has been added to the project, you can
run datum info, which will display the project and dataset information.
Datumaro can convert Cityscapes dataset into any other format Datumaro supports.
To get the expected result, convert the dataset to formats
that support the segmentation task (e.g. PascalVOC, CamVID, etc.)
There are a few ways to convert a Cityscapes dataset to other formats:
datum project import -f cityscapes -i <path/to/cityscapes>
datum export -f voc -o <path/to/output/dir>
# or
datum convert -if cityscapes -i <path/to/cityscapes> -f voc -o <path/to/output/dir>
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to Cityscapes
There are a few ways to convert a dataset to Cityscapes format:
# export dataset into Cityscapes format from existing project
datum export -p <path/to/project> -f cityscapes -o <path/to/export/dir> \
-- --save-images
# converting to Cityscapes format from other format
datum convert -if voc -i <path/to/voc/dataset> \
-f cityscapes -o <path/to/export/dir> -- --save-images
Extra options for export to Cityscapes format:
- --save-images - allow to export dataset with saving images (by default False);
- --image-ext IMAGE_EXT - allow to specify image extension for exporting dataset (by default - keep original or use .png, if none);
- --label_map - allow to define a custom colormap. Example:
# mycolormap.txt :
# 0 0 255 sky
# 255 0 0 person
#...
datum export -f cityscapes -- --label-map mycolormap.txt
# or you can use the original cityscapes colormap:
datum export -f cityscapes -- --label-map cityscapes
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the Cityscapes format in particular. Follow
user manual
to get more information about these operations.
There are a few examples of using Datumaro operations to solve
particular problems with a Cityscapes dataset:
Example 1. Load the original Cityscapes dataset and convert to Pascal VOC
datum create -o project
datum add path -p project -f cityscapes ./Cityscapes/
datum stats -p project
datum export -p project -o dataset -f voc -- --save-images
Example 2. Create a custom Cityscapes-like dataset
from collections import OrderedDict
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Mask, DatasetItem
import datumaro.plugins.cityscapes_format as Cityscapes
label_map = OrderedDict()
label_map['background'] = (0, 0, 0)
label_map['label_1'] = (1, 2, 3)
label_map['label_2'] = (3, 2, 1)
categories = Cityscapes.make_cityscapes_categories(label_map)
dataset = Dataset.from_iterable([
DatasetItem(id=1,
image=np.ones((1, 5, 3)),
annotations=[
Mask(image=np.array([[1, 0, 0, 1, 1]]), label=1),
Mask(image=np.array([[0, 1, 1, 0, 0]]), label=2, id=2,
attributes={'is_crowd': False}),
]
),
], categories=categories)
dataset.export('./dataset', format='cityscapes')
Examples of using this format from the code can be found in
the format tests
5.5 - COCO
COCO format specification is available here.
The dataset has annotations for multiple tasks. Each task has its own format
in Datumaro, and there is also a combined coco
format, which includes all
the available tasks. The sub-formats have the same options as the “main”
format and only limit the set of annotation files they work with. To work with
multiple formats, use the corresponding option of the coco
format.
Supported tasks / formats:
Supported annotation types (depending on the task):
- Caption (captions)
- Label (label, Datumaro extension)
- Bbox (instances, person keypoints)
- Polygon (instances, person keypoints)
- Mask (instances, person keypoints, panoptic, stuff)
- Points (person keypoints)
Supported annotation attributes:
- is_crowd (boolean; on bbox, polygon and mask annotations) - Indicates that the annotation covers multiple instances of the same class.
- score (number; range [0; 1]) - Indicates the confidence in this annotation. Ground truth annotations always have 1.
- arbitrary attributes (string/number) - A Datumaro extension. Stored in the attributes section of the annotation descriptor.
Load COCO dataset
The COCO dataset is available for free download:
Images:
Annotations:
There are two ways to create a Datumaro project and add a COCO dataset to it:
datum import --format coco --input-path <path/to/dataset>
# or
datum create
datum add path -f coco <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
A COCO dataset directory should have the following layout:
└─ Dataset/
├── images/
│ ├── train<year>/
│ │ ├── <image_name1.ext>
│ │ ├── <image_name2.ext>
│ │ └── ...
│ └── val<year>/
│ ├── <image_name1.ext>
│ ├── <image_name2.ext>
│ └── ...
└── annotations/
├── <task>_<subset_name><year>.json
└── ...
For the panoptic task, a dataset directory should have the following layout:
└─ Dataset/
├── images/
│ ├── train<year>
│ │ ├── <image_name1.ext>
│ │ ├── <image_name2.ext>
│ │ └── ...
│ ├── val<year>
│ │ ├── <image_name1.ext>
│ │ ├── <image_name2.ext>
│ │ └── ...
└── annotations/
├── panoptic_train<year>/
│ ├── <image_name1.ext>
│ ├── <image_name2.ext>
│ └── ...
├── panoptic_train<year>.json
├── panoptic_val<year>/
│ ├── <image_name1.ext>
│ ├── <image_name2.ext>
│ └── ...
└── panoptic_val<year>.json
Annotation files must have names like <task>_<subset_name><year>.json
(for example, instances_train2017.json).
You can import a dataset for one or a few tasks
instead of the whole dataset. This option also allows importing annotation
files with non-default names. For example:
datum import --format coco_stuff --input-path <path/to/stuff.json>
To make sure that the selected dataset has been added to the project, you can
run datum info, which will display the project and dataset information.
Notes:
- COCO categories can have any integer ids, however, Datumaro will count
annotation category id 0 as “not specified”. This does not contradict
the original annotations, because they have category indices starting from 1.
Datumaro can convert a COCO dataset into any other format Datumaro supports.
To get the expected result, convert the dataset to formats
that support the specified task (e.g. for panoptic segmentation - VOC, CamVID).
There are a few ways to convert a COCO dataset to other formats:
datum project import -f coco -i <path/to/coco>
datum export -f voc -o <path/to/output/dir>
# or
datum convert -if coco -i <path/to/coco> -f voc -o <path/to/output/dir>
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to COCO
There are a few ways to convert a dataset to COCO format:
# export dataset into COCO format from existing project
datum export -p <path/to/project> -f coco -o <path/to/export/dir> \
-- --save-images
# converting to COCO format from other format
datum convert -if voc -i <path/to/voc/dataset> \
-f coco -o <path/to/export/dir> -- --save-images
Extra options for export to COCO format:
- --save-images - allow to export dataset with saving images (by default False);
- --image-ext IMAGE_EXT - allow to specify image extension for exporting dataset (by default - keep original or use .jpg, if none);
- --segmentation-mode MODE - allow to specify save mode for instance segmentation:
  - 'guess': guess the mode for each instance (using the 'is_crowd' attribute as a hint)
  - 'polygons': save polygons (merge and convert masks, prefer polygons)
  - 'mask': save masks (merge and convert polygons, prefer masks)
  (by default guess);
- --crop-covered - allow to crop covered segments so that background object segmentation is more accurate (by default False);
- --allow-attributes ALLOW_ATTRIBUTES - allow export of attributes (by default True);
- --reindex REINDEX - allow to assign new indices to images and annotations, useful to avoid merge conflicts (by default False);
- --merge-images - allow to save all images into a single directory (by default False);
- --tasks TASKS - allow to specify tasks for export dataset, by default Datumaro uses all tasks. Example:
datum import -o project -f coco -i <dataset>
datum export -p project -f coco -- --tasks instances,stuff
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the COCO format in particular. Follow
user manual
to get more information about these operations.
There are a few examples of using Datumaro operations to solve
particular problems with a COCO dataset:
Example 1. How to load an original panoptic COCO dataset and convert to Pascal VOC
datum create -o project
datum add path -p project -f coco_panoptic ./COCO/annotations/panoptic_val2017.json
datum stats -p project
datum export -p project -o dataset -f voc --overwrite -- --save-images
Example 2. How to create custom COCO-like dataset
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Mask, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id='000000000001',
image=np.ones((1, 5, 3)),
subset='val',
attributes={'id': 40},
annotations=[
Mask(image=np.array([[0, 0, 1, 1, 0]]), label=3,
id=7, group=7, attributes={'is_crowd': False}),
Mask(image=np.array([[0, 1, 0, 0, 1]]), label=1,
id=20, group=20, attributes={'is_crowd': True}),
]
),
], categories=['a', 'b', 'c', 'd'])
dataset.export('./dataset', format='coco_panoptic')
Examples of using this format from the code can be found in
the format tests
5.6 - Image zip
The image zip format allows exporting/importing unannotated datasets
with images to/from a zip archive. The format doesn’t support any
annotations or attributes.
Load Image zip dataset
A few ways to load unannotated datasets into your Datumaro project:
- From an existing archive:
datum import -o project -f image_zip -i ./images.zip
- From a directory with zip archives. Datumaro will load images from
all zip files in the directory:
datum import -o project -f image_zip -i ./foo
The directory with zip archives should have the following structure:
└── foo/
├── archive1.zip/
| ├── image_1.jpg
| ├── image_2.png
| ├── subdir/
| | ├── image_3.jpg
| | └── ...
| └── ...
├── archive2.zip/
| ├── image_101.jpg
| ├── image_102.jpg
| └── ...
...
Images in the archives should have a supported extension;
follow the user manual to see the supported
extensions.
Datumaro can load dataset images from a zip archive and convert them to
another supported dataset format, for example:
datum import -o project -f image_zip -i ./images.zip
datum export -f coco -o ./new_dir -- --save-images
Export unannotated dataset to zip archive
Example: exporting images from VOC dataset to zip archives:
datum import -o project -f voc -i ./VOC2012
datum export -f image_zip -o ./ --overwrite -- --name voc_images.zip \
--compression ZIP_DEFLATED
Extra options for export to image_zip format:
- --save-images - allow to export dataset with saving images (default: False);
- --image-ext <IMAGE_EXT> - allow to specify image extension for exporting dataset (default: use original or .jpg, if none);
- --name - name of the output zipfile (default: default.zip);
- --compression - allow to specify the archive compression method. Available methods: ZIP_STORED, ZIP_DEFLATED, ZIP_BZIP2, ZIP_LZMA (default: ZIP_STORED). Follow zip documentation for more information.
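To double-check the result of an export, the archive can be inspected with
the Python standard library (a minimal sketch; voc_images.zip refers to the
export example above):
import zipfile
# list the files stored in an exported archive
with zipfile.ZipFile('voc_images.zip') as zf:
    for info in zf.infolist():
        print(info.filename, info.file_size)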
Examples
Examples of using this format from the code can be found in
the format tests
5.7 - Velodyne Points / KITTI Raw 3D
Velodyne Points / KITTI Raw 3D data format:
Supported annotation types:
- Cuboid3d (represents tracks)
Supported annotation attributes:
- truncation (write, string), possible values: truncation_unset, in_image, truncated, out_image, behind_image (case-independent).
- occlusion (write, string), possible values: occlusion_unset, visible, partly, fully (case-independent). This attribute has priority over occluded.
- occluded (read/write, boolean)
- keyframe (read/write, boolean). Responsible for the occlusion_kf field.
- track_id (read/write, integer). Indicates the group over frames for annotations; represents tracks.
Supported image attributes:
- frame (read/write, integer). Indicates the frame number of the image.
Import KITTI Raw dataset
The velodyne points/KITTI Raw dataset is available for downloading
here and
here.
KITTI Raw dataset directory should have the following structure:
└─ Dataset/
├── image_00/ # optional, aligned images from different cameras
│ └── data/
│ ├── <name1.ext>
│ └── <name2.ext>
├── image_01/
│ └── data/
│ ├── <name1.ext>
│ └── <name2.ext>
...
│
├── velodyne_points/ # optional, 3d point clouds
│ └── data/
│ ├── <name1.pcd>
│ └── <name2.pcd>
├── tracklet_labels.xml
└── frame_list.txt # optional, required for custom image names
The format does not support arbitrary image names and paths, but Datumaro
provides an option to use a special index file to allow this.
frame_list.txt contents:
12345 relative/path/to/name1/from/data
46 relative/path/to/name2/from/data
...
There are two ways to create a Datumaro project and add a KITTI Raw dataset to it:
datum import --format kitti_raw --input-path <path/to/dataset>
# or
datum create
datum add path -f kitti_raw <path/to/dataset>
To make sure that the selected dataset has been added to the project,
you can run datum info, which will display the project and dataset
information.
Datumaro can convert KITTI Raw dataset into any other
format Datumaro supports.
Such conversion will only be successful if the output
format can represent the type of dataset you want to convert,
e.g. 3D point clouds can be saved in Supervisely Point Clouds format,
but not in COCO keypoints.
There are a few ways to convert a KITTI Raw dataset to other formats:
datum import -f kitti_raw -i <path/to/kitti_raw> -o proj/
datum export -f sly_pointcloud -o <path/to/output/dir> -p proj/
# or
datum convert -if kitti_raw -i <path/to/kitti_raw> -f sly_pointcloud
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to KITTI Raw
There are a few ways to convert a dataset to KITTI Raw format:
# export dataset into KITTI Raw format from existing project
datum export -p <path/to/project> -f kitti_raw -o <path/to/export/dir> \
-- --save-images
# converting to KITTI Raw format from other format
datum convert -if sly_pointcloud -i <path/to/sly_pcd/dataset> \
-f kitti_raw -o <path/to/export/dir> -- --save-images --reindex
Extra options for exporting in KITTI Raw format:
- --save-images - allow to export dataset with saving images. This will include point clouds and related images (by default False);
- --image-ext IMAGE_EXT - allow to specify image extension for exporting dataset (by default - keep original or use .png, if none);
- --reindex - assigns new indices to frames and tracks. Allows annotations without the track_id attribute (they will be exported as single-frame tracks);
- --allow-attrs - allows writing arbitrary annotation attributes. They will be written in the <annotations> section of <poses><item> (disabled by default).
Examples
Example 1. Import dataset, compute statistics
datum create -o project
datum add path -p project -f kitti_raw ../../kitti_raw/
datum stats -p project
Example 2. Convert Supervisely Pointclouds to KITTI Raw
datum convert -if sly_pointcloud -i ../sly_pcd/ \
-f kitti_raw -o my_kitti/ -- --save-images --allow-attrs
Example 3. Create a custom dataset
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Cuboid3d, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id='some/name/qq',
annotations=[
Cuboid3d(position=[13.54, -9.41, 0.24], label=0,
attributes={'occluded': False, 'track_id': 1}),
Cuboid3d(position=[3.4, -2.11, 4.4], label=1,
attributes={'occluded': True, 'track_id': 2})
],
pcd='path/to/pcd1.pcd',
related_images=[np.ones((10, 10)), 'path/to/image2.png', 'image3.jpg'],
attributes={'frame': 0}
),
], categories=['cat', 'dog'])
dataset.export('my_dataset/', format='kitti_raw', save_images=True)
Examples of using this format from the code can be found in
the format tests
5.8 - KITTI
The KITTI dataset has many annotations for different tasks. Datumaro supports
only a few of them.
Supported tasks / formats:
- Object Detection - kitti_detection. The format specification is available in README.md here.
- Segmentation - kitti_segmentation. The format specification is available in README.md here.
- Raw 3D / Velodyne Points - described here
Supported annotation types:
- Bbox (object detection)
- Mask (segmentation)
Supported attributes:
- truncated (boolean) - indicates that the bounding box specified for the object does not correspond to the full extent of the object;
- occluded (boolean) - indicates that a significant portion of the object within the bounding box is occluded by another object.
Load KITTI dataset
The KITTI left color images for object detection are available here.
The KITTI object detection labels are available here.
The KITTI segmentation dataset is available here.
There are two ways to create a Datumaro project and add a KITTI dataset to it:
datum import --format kitti --input-path <path/to/dataset>
# or
datum create
datum add path -f kitti <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
KITTI segmentation dataset directory should have the following structure:
└─ Dataset/
├── testing/
│ └── image_2/
│ ├── <name_1>.<img_ext>
│ ├── <name_2>.<img_ext>
│ └── ...
└── training/
├── image_2/ # left color camera images
│ ├── <name_1>.<img_ext>
│ ├── <name_2>.<img_ext>
│ └── ...
├── label_2/ # left color camera label files
│ ├── <name_1>.txt
│ ├── <name_2>.txt
│ └── ...
├── instance/ # instance segmentation masks
│ ├── <name_1>.png
│ ├── <name_2>.png
│ └── ...
├── semantic/ # semantic segmentation masks (labels are encoded by its id)
│ ├── <name_1>.png
│ ├── <name_2>.png
│ └── ...
└── semantic_rgb/ # semantic segmentation masks (labels are encoded by its color)
├── <name_1>.png
├── <name_2>.png
└── ...
You can import a dataset for specific KITTI tasks
instead of the whole dataset,
for example:
datum add path -f kitti_detection <path/to/dataset>
To make sure that the selected dataset has been added to the project, you can
run datum info, which will display the project and dataset information.
Datumaro can convert a KITTI dataset into any other format Datumaro supports.
Such conversion will only be successful if the output
format can represent the type of dataset you want to convert,
e.g. segmentation annotations can be saved in Cityscapes format,
but not as COCO keypoints.
There are a few ways to convert a KITTI dataset to other formats:
datum project import -f kitti -i <path/to/kitti>
datum export -f cityscapes -o <path/to/output/dir>
# or
datum convert -if kitti -i <path/to/kitti> -f cityscapes -o <path/to/output/dir>
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to KITTI
There are a few ways to convert a dataset to KITTI format:
# export dataset into KITTI format from existing project
datum export -p <path/to/project> -f kitti -o <path/to/export/dir> \
-- --save-images
# converting to KITTI format from other format
datum convert -if cityscapes -i <path/to/cityscapes/dataset> \
-f kitti -o <path/to/export/dir> -- --save-images
Extra options for export to KITTI format:
- --save-images - allow to export dataset with saving images (by default False);
- --image-ext IMAGE_EXT - allow to specify image extension for exporting dataset (by default - keep original or use .png, if none);
- --apply-colormap APPLY_COLORMAP - allow to use colormap for class masks (in the semantic_rgb folder, by default True);
- --label_map - allow to define a custom colormap. Example:
# mycolormap.txt :
# 0 0 255 sky
# 255 0 0 person
#...
datum export -f kitti -- --label-map mycolormap.txt
# or you can use the original kitti colormap:
datum export -f kitti -- --label-map kitti
- --tasks TASKS - allow to specify tasks for export dataset, by default Datumaro uses all tasks. Example:
datum import -o project -f kitti -i <dataset>
datum export -p project -f kitti -- --tasks detection
- --allow-attributes ALLOW_ATTRIBUTES - allow export of attributes (by default True).
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the KITTI format in particular. Follow
user manual
to get more information about these operations.
There are a few examples of using Datumaro operations to solve
particular problems with a KITTI dataset:
Example 1. How to load an original KITTI dataset and convert to Cityscapes
datum create -o project
datum add path -p project -f kitti ./KITTI/
datum stats -p project
datum export -p project -o dataset -f cityscapes -- --save-images
Example 2. How to create custom KITTI-like dataset
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Mask, DatasetItem
import datumaro.plugins.kitti_format as KITTI
label_map = {}
label_map['background'] = (0, 0, 0)
label_map['label_1'] = (1, 2, 3)
label_map['label_2'] = (3, 2, 1)
categories = KITTI.make_kitti_categories(label_map)
dataset = Dataset.from_iterable([
DatasetItem(id=1,
image=np.ones((1, 5, 3)),
annotations=[
Mask(image=np.array([[1, 0, 0, 1, 1]]), label=1, id=0,
attributes={'is_crowd': False}),
Mask(image=np.array([[0, 1, 1, 0, 0]]), label=2, id=0,
attributes={'is_crowd': False}),
]
),
], categories=categories)
dataset.export('./dataset', format='kitti')
Examples of using this format from the code can be found in
the format tests
5.9 - MNIST
MNIST format specification is available here.
Fashion MNIST format specification is available here.
MNIST in CSV format specification is available here.
The dataset has a few data formats available. Datumaro supports the
binary (Python pickle) format and the CSV variant. Each data format is covered
by a separate Datumaro format.
Supported formats:
- Binary (Python pickle) - mnist
- CSV - mnist_csv
Supported annotation types:
The format only supports single-channel 28 x 28 images.
Load MNIST dataset
The MNIST dataset is available for free download:
The Fashion MNIST dataset is available for free download:
The MNIST in CSV dataset is available for free download:
There are two ways to create a Datumaro project and add an MNIST dataset to it:
datum import --format mnist --input-path <path/to/dataset>
# or
datum create
datum add path -f mnist <path/to/dataset>
There are two ways to create a Datumaro project and add an MNIST in CSV
dataset to it:
datum import --format mnist_csv --input-path <path/to/dataset>
# or
datum create
datum add path -f mnist_csv <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
MNIST dataset directory should have the following structure:
└─ Dataset/
├── labels.txt # list of non-digit labels (optional)
├── t10k-images-idx3-ubyte.gz
├── t10k-labels-idx1-ubyte.gz
├── train-images-idx3-ubyte.gz
└── train-labels-idx1-ubyte.gz
MNIST in CSV dataset directory should have the following structure:
└─ Dataset/
├── labels.txt # list of non-digit labels (optional)
├── mnist_test.csv
└── mnist_train.csv
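The dataset can also be loaded directly with the Python API, following the
same pattern as the other formats (a minimal sketch):
from datumaro.components.dataset import Dataset
mnist_dataset = Dataset.import_from('<path/to/dataset>', 'mnist')
# for the CSV variant, use the 'mnist_csv' format name instead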
If the dataset needs non-digit labels, you need to add a labels.txt
file to the dataset folder. For example, labels.txt for Fashion MNIST has
the following contents:
T-shirt/top
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot
Datumaro can convert an MNIST dataset into any other format Datumaro supports.
To get the expected result, convert the dataset to formats
that support the classification task (e.g. CIFAR-10/100, ImageNet, Pascal VOC,
etc.). There are a few ways to convert an MNIST dataset to other formats:
datum project import -f mnist -i <path/to/mnist>
datum export -f imagenet -o <path/to/output/dir>
# or
datum convert -if mnist -i <path/to/mnist> -f imagenet -o <path/to/output/dir>
These commands also work for MNIST in CSV if you use mnist_csv instead of mnist.
Export to MNIST
There are a few ways to convert a dataset to MNIST format:
# export dataset into MNIST format from existing project
datum export -p <path/to/project> -f mnist -o <path/to/export/dir> \
-- --save-images
# converting to MNIST format from other format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
-f mnist -o <path/to/export/dir> -- --save-images
Extra options for export to MNIST format:
- --save-images - allow to export dataset with saving images (by default False);
- --image-ext <IMAGE_EXT> - allow to specify image extension for exporting dataset (by default .png).
These commands also work for MNIST in CSV if you use mnist_csv instead of mnist.
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the MNIST format in particular. Follow user manual
to get more information about these operations.
There are a few examples of using Datumaro operations to solve
particular problems with an MNIST dataset:
Example 1. How to create custom MNIST-like dataset
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Label, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id=0, image=np.ones((28, 28)),
annotations=[Label(2)]
),
DatasetItem(id=1, image=np.ones((28, 28)),
annotations=[Label(7)]
)
], categories=[str(label) for label in range(10)])
dataset.export('./dataset', format='mnist')
Example 2. How to filter and convert MNIST dataset to ImageNet
Convert an MNIST dataset to ImageNet format, keeping only images with the
3 class present:
# Download MNIST dataset:
# https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
# https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
datum convert --input-format mnist --input-path <path/to/mnist> \
--output-format imagenet \
--filter '/item[annotation/label="3"]'
Examples of using this format from the code can be found in
the binary format tests and csv format tests
5.10 - Open Images
A description of the Open Images Dataset (OID) format is available
on its website.
Datumaro supports versions 4, 5 and 6.
Supported annotation types:
- Label (human-verified image-level labels)
- Bbox (bounding boxes)
- Mask (segmentation masks)
Supported annotation attributes:
- Labels:
  - score (read/write, float). The confidence level from 0 to 1. A score of 0 indicates that the image does not contain objects of the corresponding class.
- Bounding boxes:
  - score (read/write, float). The confidence level from 0 to 1. In the original dataset this is always equal to 1, but custom datasets may be created with arbitrary values.
  - occluded (read/write, boolean). Whether the object is occluded by another object.
  - truncated (read/write, boolean). Whether the object extends beyond the boundary of the image.
  - is_group_of (read/write, boolean). Whether the object represents a group of objects of the same class.
  - is_depiction (read/write, boolean). Whether the object is a depiction (such as a drawing) rather than a real object.
  - is_inside (read/write, boolean). Whether the object is seen from the inside.
- Masks:
  - box_id (read/write, string). An identifier for the bounding box associated with the mask.
  - predicted_iou (read/write, float). Predicted IoU value with respect to the ground truth.
Load Open Images dataset
The Open Images dataset is available for free download.
See the open-images-dataset
GitHub repository
for information on how to download the images.
Datumaro also requires the image description files,
which can be downloaded from the following URLs:
Datumaro expects at least one of the files above to be present.
In addition, the following metadata file must be present as well:
You can optionally download the following additional metadata file:
Annotations can be downloaded from the following URLs:
All annotation files are optional,
except that if the mask metadata files for a given subset are downloaded,
all corresponding images must be downloaded as well, and vice versa.
There are two ways to create a Datumaro project and add OID to it:
datum import --format open_images --input-path <path/to/dataset>
# or
datum create
datum add path -f open_images <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
Open Images dataset directory should have the following structure:
└─ Dataset/
├── annotations/
│ └── bbox_labels_600_hierarchy.json
│ └── image_ids_and_rotation.csv
│ └── oidv6-class-descriptions.csv
│ └── *-annotations-bbox.csv
│ └── *-annotations-human-imagelabels.csv
│ └── *-annotations-object-segmentation.csv
├── images/
| ├── test/
| │ ├── <image_name1.jpg>
| │ ├── <image_name2.jpg>
| │ └── ...
| ├── train/
| │ ├── <image_name1.jpg>
| │ ├── <image_name2.jpg>
| │ └── ...
| └── validation/
| ├── <image_name1.jpg>
| ├── <image_name2.jpg>
| └── ...
└── masks/
├── test/
│ ├── <mask_name1.png>
│ ├── <mask_name2.png>
│ └── ...
├── train/
│ ├── <mask_name1.png>
│ ├── <mask_name2.png>
│ └── ...
└── validation/
├── <mask_name1.png>
├── <mask_name2.png>
└── ...
The mask images must be extracted from the ZIP archives linked above.
To use per-subset image description files instead of image_ids_and_rotation.csv,
place them in the annotations subdirectory.
To load bounding box and segmentation mask annotations,
Datumaro needs to know the sizes of the corresponding images.
By default, it will determine these sizes by loading each image from disk,
which requires the images to be present and makes the loading process slow.
If you want to load the aforementioned annotations on a machine where
the images are not available,
or just to speed up the dataset loading process,
you can extract the image size information in advance
and record it in an image metadata file.
This file must be placed at annotations/images.meta,
and must contain one line per image, with the following structure:
<ID> <height> <width>
Where <ID> is the file name of the image without the extension,
and <height> and <width> are the dimensions of that image.
<ID> may be quoted with either single or double quotes.
The image metadata file, if present, will be used to determine the image
sizes without loading the images themselves.
Here’s one way to create the images.meta file using ImageMagick,
assuming that the images are present on the current machine:
# run this from the dataset directory
find images -name '*.jpg' -exec \
identify -format '"%[basename]" %[height] %[width]\n' {} + \
> annotations/images.meta
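If ImageMagick is not available, a plain Python sketch with Pillow produces
the same file (assuming .jpg images, as in the layout above):
import os
from PIL import Image
# run this from the dataset directory
with open('annotations/images.meta', 'w') as meta:
    for root, _, files in os.walk('images'):
        for name in files:
            if not name.endswith('.jpg'):
                continue
            with Image.open(os.path.join(root, name)) as img:
                width, height = img.size
            image_id = os.path.splitext(name)[0]
            meta.write('"%s" %d %d\n' % (image_id, height, width))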
Datumaro can convert OID into any other format Datumaro supports.
To get the expected result, convert the dataset to a format
that supports image-level labels.
There are a few ways to convert OID to other dataset formats:
datum project import -f open_images -i <path/to/open_images>
datum export -f cvat -o <path/to/output/dir>
# or
datum convert -if open_images -i <path/to/open_images> -f cvat -o <path/to/output/dir>
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to Open Images
There are a few ways to convert an existing dataset to the Open Images format:
# export dataset into Open Images format from existing project
datum export -p <path/to/project> -f open_images -o <path/to/export/dir> \
-- --save-images
# convert a dataset in another format to the Open Images format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
-f open_images -o <path/to/export/dir> \
-- --save-images
Extra options for export to the Open Images format:
- --save-images - save image files when exporting the dataset (by default, False)
- --image-ext IMAGE_EXT - save image files with the specified extension when exporting the dataset (by default, uses the original extension or .jpg if there isn’t one)
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the Open Images format in particular. Follow
user manual
to get more information about these operations.
Here are a few examples of using Datumaro operations to solve
particular problems with the Open Images dataset:
Example 1. Load the Open Images dataset and convert it to CVAT format
datum create -o project
datum add path -p project -f open_images ./open-images-dataset/
datum stats -p project
datum export -p project -o dataset -f cvat --overwrite -- --save-images
Example 2. Create a custom OID-like dataset
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import (
AnnotationType, Label, LabelCategories, DatasetItem,
)
dataset = Dataset.from_iterable(
[
DatasetItem(
id='0000000000000001',
image=np.ones((1, 5, 3)),
subset='validation',
annotations=[
Label(0, attributes={'score': 1}),
Label(1, attributes={'score': 0}),
],
),
],
categories=['/m/0', '/m/1'],
)
dataset.export('./dataset', format='open_images')
Examples of using this format from the code can be found in
the format tests.
5.11 - Pascal VOC
Pascal VOC format specification is available
here.
The dataset has annotations for multiple tasks. Each task has its own format
in Datumaro, and there is also a combined voc
format, which includes all
the available tasks. The sub-formats have the same options as the “main”
format and only limit the set of annotation files they work with. To work with
multiple formats, use the corresponding option of the voc
format.
Supported tasks / formats:
- The combined format - voc
- Image classification - voc_classification
- Object detection - voc_detection
- Action classification - voc_action
- Class and instance segmentation - voc_segmentation
- Person layout detection - voc_layout
Supported annotation types:
- Label (classification)
- Bbox (detection, action detection and person layout)
- Mask (segmentation)
Supported annotation attributes:
- occluded (boolean) - indicates that a significant portion of the object within the bounding box is occluded by another object;
- truncated (boolean) - indicates that the bounding box specified for the object does not correspond to the full extent of the object;
- difficult (boolean) - indicates that the object is considered difficult to recognize;
- action attributes (boolean) - jumping, reading and others. Indicate that the object does the corresponding action;
- arbitrary attributes (string/number) - a Datumaro extension. Stored in the attributes section of the annotation xml file. Available for bbox annotations only.
Load Pascal VOC dataset
The Pascal VOC dataset is available for free download here.
There are two ways to create a Datumaro project and add a Pascal VOC dataset to it:
datum import --format voc --input-path <path/to/dataset>
# or
datum create
datum add path -f voc <path/to/dataset>
It is possible to specify the project name and project directory; run
datum create --help for more information.
Pascal VOC dataset directory should have the following structure:
└─ Dataset/
├── label_map.txt # a list of non-Pascal labels (optional)
│
├── Annotations/
│ ├── ann1.xml # Pascal VOC format annotation file
│ ├── ann2.xml
│ └── ...
├── JPEGImages/
│ ├── img1.jpg
│ ├── img2.jpg
│ └── ...
├── SegmentationClass/ # directory with semantic segmentation masks
│ ├── img1.png
│ ├── img2.png
│ └── ...
├── SegmentationObject/ # directory with instance segmentation masks
│ ├── img1.png
│ ├── img2.png
│ └── ...
│
└── ImageSets/
├── Main/ # directory with list of images for detection and classification task
│ ├── test.txt # list of image names in test subset (without extension)
| ├── train.txt # list of image names in train subset (without extension)
| └── ...
├── Layout/ # directory with list of images for person layout task
│ ├── test.txt
| ├── train.txt
| └── ...
├── Action/ # directory with list of images for action classification task
│ ├── test.txt
| ├── train.txt
| └── ...
└── Segmentation/ # directory with list of images for segmentation task
├── test.txt
├── train.txt
└── ...
The ImageSets directory should contain at least one of the directories:
Main, Layout, Action, Segmentation.
These directories contain .txt files with a list of images in a subset;
the subset name is the same as the .txt file name. Subset names can be
arbitrary.
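For example, ImageSets/Main/train.txt might look like this (the image names
are hypothetical; each line is an image name without the extension):
2008_000002
2008_000003
2008_000007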
In label_map.txt you can define a custom colormap and non-Pascal labels,
for example:
# label_map [label : color_rgb : parts : actions]
helicopter:::
elephant:0,124,134:head,ear,foot:
It is also possible to import grayscale (1-channel) PNG masks.
For grayscale masks provide a list of labels with the number of lines
equal to the maximum color index on images. The lines must be in the
right order so that line index is equal to the color index. Lines can
have arbitrary, but different, colors. If there are gaps in the used
color indices in the annotations, they must be filled with arbitrary
dummy labels. Example:
car:0,128,0:: # color index 0
aeroplane:10,10,128:: # color index 1
_dummy2:2,2,2:: # filler for color index 2
_dummy3:3,3,3:: # filler for color index 3
boat:108,0,100:: # color index 4
...
_dummy198:198,198,198:: # filler for color index 198
_dummy199:199,199,199:: # filler for color index 199
the_last_label:12,28,0:: # color index 200
You can import a dataset for specific Pascal VOC tasks
instead of the whole dataset,
for example:
datum add path -f voc_detection <path/to/dataset/ImageSets/Main/train.txt>
To make sure that the selected dataset has been added to the project, you
can run datum info, which will display the project and dataset information.
Datumaro can convert a Pascal VOC dataset into any other format
Datumaro supports.
Such conversion will only be successful if the output
format can represent the type of dataset you want to convert,
e.g. image classification annotations can be saved in ImageNet format,
but not as COCO keypoints.
There are a few ways to convert a Pascal VOC dataset to other formats:
datum import -f voc -i <path/to/voc>
datum export -f coco -o <path/to/output/dir>
# or
datum convert -if voc -i <path/to/voc> -f coco -o <path/to/output/dir>
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to Pascal VOC
There are a few ways to convert an existing dataset to Pascal VOC format:
# export dataset into Pascal VOC format (classification) from existing project
datum export -p <path/to/project> -f voc -o <path/to/export/dir> -- --tasks classification
# converting to Pascal VOC format from other format
datum convert -if imagenet -i <path/to/imagenet/dataset> \
-f voc -o <path/to/export/dir> \
-- --label_map voc --save-images
Extra options for export to Pascal VOC format:
- --save-images - allow to export dataset with saving images (by default False)
- --image-ext IMAGE_EXT - allow to specify image extension for exporting dataset (by default use original or .jpg if none)
- --apply-colormap APPLY_COLORMAP - allow to use colormap for class and instance masks (by default True)
- --allow-attributes ALLOW_ATTRIBUTES - allow export of attributes (by default True)
- --keep-empty KEEP_EMPTY - write subset lists even if they are empty (by default: False)
- --tasks TASKS - allow to specify tasks for export dataset, by default Datumaro uses all tasks. Example:
datum import -o project -f voc -i ./VOC2012
datum export -p project -f voc -- --tasks detection,classification
- --label_map - allow to define a custom colormap. Example:
# mycolormap.txt [label : color_rgb : parts : actions]:
# cat:0,0,255::
# person:255,0,0:head:
datum export -f voc_segmentation -- --label-map mycolormap.txt
# or you can use the original voc colormap:
datum export -f voc_segmentation -- --label-map voc
Examples
Datumaro supports filtering, transformation, merging etc. for all formats
and for the Pascal VOC format in particular. Follow
user manual
to get more information about these operations.
There are a few examples of using Datumaro operations to solve
particular problems with a Pascal VOC dataset:
Example 1. How to prepare an original dataset for training.
In this example, preparing the original dataset to train a semantic
segmentation model includes:
- loading,
- checking for duplicate images,
- setting the number of images,
- splitting into subsets,
- exporting the result to Pascal VOC format.
datum create -o project
datum add path -p project -f voc_segmentation ./VOC2012/ImageSets/Segmentation/trainval.txt
datum stats -p project # check statistics.json -> repeated images
datum transform -p project -o ndr_project -t ndr -- -w trainval -k 2500
datum filter -p ndr_project -o trainval2500 -e '/item[subset="trainval"]'
datum transform -p trainval2500 -o final_project -t random_split -- -s train:.8 -s val:.2
datum export -p final_project -o dataset -f voc -- --label-map voc --save-images
Example 2. How to create custom dataset
from datumaro.components.dataset import Dataset
from datumaro.util.image import Image
from datumaro.components.extractor import Bbox, Polygon, Label, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id='image1', image=Image(path='image1.jpg', size=(10, 20)),
annotations=[Label(3),
Bbox(1.0, 1.0, 10.0, 8.0, label=0, attributes={'difficult': True, 'running': True}),
Polygon([1, 2, 3, 2, 4, 4], label=2, attributes={'occluded': True}),
Polygon([6, 7, 8, 8, 9, 7, 9, 6], label=2),
]
),
], categories=['person', 'sky', 'water', 'lion'])
dataset.transform('polygons_to_masks')
dataset.export('./mydataset', format='voc', label_map='my_labelmap.txt')
"""
my_labelmap.txt:
# label:color_rgb:parts:actions
person:0,0,255:hand,foot:jumping,running
sky:128,0,0::
water:0,128,0::
lion:255,128,0::
"""
Example 3. Load, filter and convert from code
Load a Pascal VOC dataset, and export the train subset with items
which have the jumping attribute:
from datumaro.components.dataset import Dataset
dataset = Dataset.import_from('./VOC2012', format='voc')
train_dataset = dataset.get_subset('train').as_dataset()
def only_jumping(item):
for ann in item.annotations:
if ann.attributes.get('jumping'):
return True
return False
train_dataset.select(only_jumping)
train_dataset.export('./jumping_label_me', format='label_me', save_images=True)
Another example: get the number of images in the dataset available for the segmentation task:
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import AnnotationType
dataset = Dataset.import_from('./VOC2012', format='voc')
def has_mask(item):
for ann in item.annotations:
if ann.type == AnnotationType.mask:
return True
return False
dataset.select(has_mask)
print("Pascal VOC 2012 has %s images for segmentation task:" % len(dataset))
for subset_name, subset in dataset.subsets().items():
for item in subset:
print(item.id, subset_name, end=";")
After executing this code, we can see that there are 5826 images
in Pascal VOC 2012 for the segmentation task, and this result matches the
official documentation.
Examples of using this format from the code can be found in
tests
5.12 - Supervisely Point Cloud
Point Cloud data format:
Supported annotation types:
Supported annotation attributes:
- track_id (read/write, integer), responsible for the object field
- createdAt (write, string), updatedAt (write, string), labelerLogin (write, string), responsible for the corresponding fields in the annotation file
- arbitrary attributes
Supported image attributes:
- description (read/write, string), createdAt (write, string), updatedAt (write, string), labelerLogin (write, string), responsible for the corresponding fields in the annotation file
- frame (read/write, integer). Indicates the frame number of the image
- arbitrary attributes
Import Supervisely Point Cloud dataset
An example dataset in Supervisely Point Cloud format is available for download:
https://drive.google.com/u/0/uc?id=1BtZyffWtWNR-mk_PHNPMnGgSlAkkQpBl&export=download
Point Cloud dataset directory should have the following structure:
└─ Dataset/
├── ds0/
│ ├── ann/
│ │ ├── <pcdname1.pcd.json>
│ │ ├── <pcdname2.pcd.json>
│ │ └── ...
│ ├── pointcloud/
│ │ ├── <pcdname1.pcd>
│ │ ├── <pcdname2.pcd>
│ │ └── ...
│ ├── related_images/
│ │ ├── <pcdname1_pcd>/
│ │ | ├── <image_name.ext>
│ │ | ├── <image_name.ext.json>
│ │ └── ...
├── key_id_map.json
└── meta.json
There are two ways to import a Supervisely Point Cloud dataset:
datum import --format sly_pointcloud --input-path <path/to/dataset>
# or
datum create
datum add path -f sly_pointcloud <path/to/dataset>
To make sure that the selected dataset has been added to the project,
you can run datum info, which will display the project and dataset
information.
Datumaro can convert a Supervisely Point Cloud dataset into any other
format Datumaro supports.
Such conversion will only be successful if the output
format can represent the type of dataset you want to convert,
e.g. 3D point clouds can be saved in KITTI Raw format,
but not in COCO keypoints.
There are a few ways to convert a Supervisely Point Cloud dataset
to other dataset formats:
datum import -f sly_pointcloud -i <path/to/sly_pcd/> -o proj/
datum export -f kitti_raw -o <path/to/output/dir> -p proj/
# or
datum convert -if sly_pointcloud -i <path/to/sly_pcd/> -f kitti_raw
Some formats provide extra options for conversion.
These options are passed after a double dash (--) in the command line.
To get information about them, run
datum export -f <FORMAT> -- -h
Export to Supervisely Point Cloud
There are a few ways to convert a dataset to Supervisely Point Cloud format:
# export dataset into Supervisely Point Cloud format from existing project
datum export -p <path/to/project> -f sly_pointcloud -o <path/to/export/dir> \
-- --save-images
# converting to Supervisely Point Cloud format from other format
datum convert -if kitti_raw -i <path/to/kitti_raw/dataset> \
-f sly_pointcloud -o <path/to/export/dir> -- --save-images
Extra options for exporting in Supervisely Point Cloud format:
- --save-images - allow to export dataset with saving images. This will include point clouds and related images (by default False);
- --image-ext IMAGE_EXT - allow to specify image extension for exporting dataset (by default - keep original or use .png, if none);
- --reindex - assigns new indices to frames and annotations;
- --allow-undeclared-attrs - allows writing arbitrary annotation attributes. By default, only attributes specified in the input dataset metainfo will be written.
Examples
Example 1. Import dataset, compute statistics
datum create -o project
datum add path -p project -f sly_pointcloud ../sly_dataset/
datum stats -p project
Example 2. Convert Supervisely Point Clouds to KITTI Raw
datum convert -if sly_pointcloud -i ../sly_pcd/ \
-f kitti_raw -o my_kitti/ -- --save-images --reindex --allow-attrs
Example 3. Create a custom dataset
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Cuboid3d, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id='frame_1',
annotations=[
Cuboid3d(id=206, label=0,
position=[320.86, 979.18, 1.04],
attributes={'occluded': False, 'track_id': 1, 'x': 1}),
Cuboid3d(id=207, label=1,
position=[318.19, 974.65, 1.29],
attributes={'occluded': True, 'track_id': 2}),
],
pcd='path/to/pcd1.pcd',
attributes={'frame': 0, 'description': 'zzz'}
),
DatasetItem(id='frm2',
annotations=[
Cuboid3d(id=208, label=1,
position=[23.04, 8.75, -0.78],
attributes={'occluded': False, 'track_id': 2})
],
pcd='path/to/pcd2.pcd', related_images=['image2.png'],
attributes={'frame': 1}
),
], categories=['cat', 'dog'])
dataset.export('my_dataset/', format='sly_pointcloud', save_images=True,
allow_undeclared_attrs=True)
Examples of using this format from the code can be found in
the format tests
5.13 - YOLO
- The YOLO dataset format is for training and validating object detection
models. The specification for this format is available here.
You can also find some official examples of working with a YOLO dataset here;
- The YOLO dataset format supports the following types of annotations: bounding boxes;
- The YOLO format doesn’t support attributes for annotations;
- The format only supports subsets named train or valid.
Load YOLO dataset
A few ways to create a Datumaro project and add a YOLO dataset to it:
datum import -o project -f yolo -i <path/to/yolo/dataset>
# another way to do the same:
datum create -o project
datum add path -p project -f yolo -i <path/to/yolo/dataset>
# and you can add one more YOLO dataset:
datum add path -p project -f yolo -i <path/to/other/yolo/dataset>
YOLO dataset directory should have the following structure:
└─ yolo_dataset/
│
├── obj.names # file with list of classes
├── obj.data # file with dataset information
├── train.txt # list of image paths in train subset
├── valid.txt # list of image paths in valid subset
│
├── obj_train_data/ # directory with annotations and images for train subset
│ ├── image1.txt # list of labeled bounding boxes for image1
│ ├── image1.jpg
│ ├── image2.txt
│ ├── image2.jpg
│ ├── ...
│
├── obj_valid_data/ # directory with annotations and images for valid subset
│ ├── image101.txt
│ ├── image101.jpg
│ ├── image102.txt
│ ├── image102.jpg
│ ├── ...
A YOLO dataset cannot contain a subset with a name other than train or valid.
If an imported dataset contains such subsets, they will be ignored.
If you are exporting a project into the YOLO format,
all subsets different from train and valid will be skipped.
If there is no subset separation in a project, the data
will be saved in the train subset.
obj.data should have the following content; it is not necessary to have both
subsets, but at least one of them is required:
classes = 5 # optional
names = <path/to/obj.names>
train = <path/to/train.txt>
valid = <path/to/valid.txt>
backup = backup/ # optional
obj.names contains a list of classes.
The line number for the class is the same as its index:
label1 # label1 has index 0
label2 # label2 has index 1
label3 # label3 has index 2
...
- Files train.txt and valid.txt should have the following structure:
<path/to/image1.jpg>
<path/to/image2.jpg>
...
- Files in the directories obj_train_data/ and obj_valid_data/
should contain information about labeled bounding boxes
for images:
# image1.txt:
# <label_index> <x_center> <y_center> <width> <height>
0 0.250000 0.400000 0.300000 0.400000
3 0.600000 0.400000 0.400000 0.266667
Here x_center, y_center, width, and height are relative to the image’s
width and height. The x_center and y_center coordinates point to the center
of the rectangle, not its top-left corner.
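A minimal sketch of this coordinate conversion (to_yolo_line is a
hypothetical helper, not part of Datumaro):
def to_yolo_line(label_index, x, y, w, h, img_w, img_h):
    # (x, y) is the absolute top-left corner of the box; YOLO stores
    # the normalized center and the normalized size
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return '%d %.6f %.6f %.6f %.6f' % (
        label_index, x_center, y_center, w / img_w, h / img_h)
# a 30x40 box at (10, 20) in a 100x100 image gives the first
# annotation line from the example above:
print(to_yolo_line(0, 10, 20, 30, 40, 100, 100))
# -> 0 0.250000 0.400000 0.300000 0.400000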
Datumaro can convert a YOLO dataset into any other format
Datumaro supports.
For a successful conversion, the output format should support the
object detection task (e.g. Pascal VOC, COCO, TF Detection API etc.)
Examples:
datum import -o project -f yolo -i <path/to/yolo/dataset>
datum export -p project -f voc -o <path/to/output/voc/dataset>
datum convert -if yolo -i <path/to/yolo/dataset> \
-f coco_instances -o <path/to/output/coco/dataset>
Datumaro can convert an existing dataset to the YOLO format
if the dataset supports the object detection task.
Example:
datum import -p project -f coco_instances -i <path/to/coco/dataset>
datum export -p project -f yolo -o <path/to/output/yolo/dataset> -- --save-images
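The same conversion can also be done from Python. A minimal sketch using the Dataset API shown elsewhere in this documentation (the paths are placeholders):
from datumaro.components.dataset import Dataset
# load an existing COCO dataset
dataset = Dataset.import_from('path/to/coco/dataset', format='coco_instances')
# write it back out in YOLO format, together with the images
dataset.export('path/to/output/yolo/dataset', format='yolo', save_images=True)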
Extra options for export to YOLO format:
- --save-images allows exporting the dataset together with its images (default: False);
- --image-ext <IMAGE_EXT> allows specifying the image extension for the exported dataset (default: keep the original extension, or use .jpg if there is none).
Examples
Example 1. Prepare a PASCAL VOC dataset for export to YOLO format
datum import -o project -f voc -i ./VOC2012
datum filter -p project -e '/item[subset="train" or subset="val"]' -o trainval_voc
datum transform -p trainval_voc -o trainvalid_voc \
-t map_subsets -- -s train:train -s val:valid
datum export -p trainvalid_voc -f yolo -o ./yolo_dataset -- --save-images
Example 2. Remove a class from a YOLO dataset
Delete all items that contain cat objects, and remove cat from the list of classes:
datum import -o project -f yolo -i ./yolo_dataset
datum filter -p project -o filtered -m i+a -e '/item/annotation[label!="cat"]'
datum transform -p filtered -o without_cat -t remap_labels -- -l cat:
datum export -p without_cat -f yolo -o ./yolo_without_cats
Example 3. Create a custom dataset in YOLO format
import numpy as np
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import Bbox, DatasetItem
dataset = Dataset.from_iterable([
DatasetItem(id='image_001', subset='train',
image=np.ones((20, 20, 3)),
annotations=[
Bbox(3.0, 1.0, 8.0, 5.0, label=1),
Bbox(1.0, 1.0, 10.0, 1.0, label=2)
]
),
DatasetItem(id='image_002', subset='train',
image=np.ones((15, 10, 3)),
annotations=[
Bbox(4.0, 4.0, 4.0, 4.0, label=3)
]
)
], categories=['house', 'bridge', 'crosswalk', 'traffic_light'])
dataset.export('../yolo_dataset', format='yolo', save_images=True)
If you only need information about the label names for each image,
you can get it from code:
from datumaro.components.dataset import Dataset
from datumaro.components.extractor import AnnotationType
dataset = Dataset.import_from('./yolo_dataset', format='yolo')
# label categories map label indices to label names
cats = dataset.categories()[AnnotationType.label]
for item in dataset:
    for ann in item.annotations:
        print(item.id, cats[ann.label].name)
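Building on the same API, a small follow-up sketch that counts annotations per label (it assumes the dataset and cats variables from the snippet above):
from collections import Counter
counts = Counter(cats[ann.label].name
    for item in dataset
    for ann in item.annotations)
print(counts)  # e.g. Counter({'house': 3, 'bridge': 1})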
And if you want complete information about each item, you can run:
datum import -o project -f yolo -i ./yolo_dataset
datum filter -p project --dry-run -e '/item'
6 - Plugins
6.1 - OpenVINO™ Inference Interpreter
Interpreter samples to parse OpenVINO™ inference outputs. This section is also available on GitHub.
Models supported from interpreter samples
There are detection and image classification examples.
- Detection (SSD-based)
- Image Classification
  - Public Pre-Trained Models (OMZ) > Classification
You can find more OpenVINO™ Trained Models here.
To run inference with OpenVINO™, the model format should be Intermediate
Representation (IR). For Caffe/TensorFlow/MXNet/Kaldi/ONNX models, please see
the Model Conversion Instruction.
You need to implement your own interpreter samples to support other
OpenVINO™ Trained Models.
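As a rough illustration, a hypothetical classification interpreter is sketched below. It assumes the convention used by the bundled samples, where an interpreter module defines process_outputs() and get_categories(); the output parsing and the class names are model-specific placeholders:
from datumaro.components.extractor import AnnotationType, Label, LabelCategories

def process_outputs(inputs, outputs):
    # inputs: the image batch, outputs: the raw model outputs (model-specific)
    results = []
    for output in outputs:
        confs = output.flatten()
        label = int(confs.argmax())
        # one list of annotations per input image
        results.append([Label(label, attributes={'score': float(confs[label])})])
    return results

def get_categories():
    label_cat = LabelCategories()
    for name in ['cat', 'dog']:  # placeholder class names
        label_cat.add(name)
    return {AnnotationType.label: label_cat}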
Model download
- Prerequisites
# cd <openvino_dir>/deployment_tools/open_model_zoo/tools/downloader
# ./downloader.py --name <model_name>
#
# Examples
cd /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader
./downloader.py --name face-detection-0200
Model inference
7 - Contribution Guide
Installation
Prerequisites
git clone https://github.com/openvinotoolkit/datumaro
Optionally, create and activate a virtual environment (recommended):
python -m pip install virtualenv
python -m virtualenv venv
. venv/bin/activate
Then install all dependencies:
while read -r p; do pip install $p; done < requirements.txt
If you’re working inside of a CVAT environment:
. .env/bin/activate
while read -r p; do pip install $p; done < datumaro/requirements.txt
Install Datumaro:
pip install -e /path/to/the/cloned/repo/
Optional dependencies
These components are only required for plugins and not installed by default:
- OpenVINO
- Accuracy Checker
- TensorFlow
- PyTorch
- MxNet
- Caffe
Usage
datum --help
python -m datumaro --help
python datumaro/ --help
python datum.py --help
Code style
Try to be readable and consistent with the existing codebase.
The project mostly follows PEP8, with minor differences.
Continuation lines use one standard indentation step by default,
or any other if it improves readability; for long conditionals, use 2 steps.
No trailing whitespace; at most 80 characters per line.
Example:
def do_important_work(parameter1, parameter2, parameter3,
option1=None, option2=None, option3=None) -> str:
"""
Optional description. Mandatory for API.
Use comments for implementation specific information, use docstrings
to give information to user / developer.
Returns: status (str) - Possible values: 'done', 'failed'
"""
... do stuff ...
# Use +1 level of indentation for continuation lines
variable_with_a_long_but_meaningful_name = \
function_with_a_long_but_meaningful_name(arg1, arg2, arg3,
kwarg1=value_with_a_long_name, kwarg2=value_with_a_long_name)
# long conditions, loops, with etc. also use +1 level of indentation
if condition1 and long_condition2 or \
not condition3 and condition4 and condition5 or \
condition6 and condition7:
... do other stuff ...
elif other_conditions:
... some other things ...
# in some cases special formatting can improve code readability
specific_case_formatting = np.array([
[0, 1, 1, 0],
[1, 1, 0, 0],
[1, 1, 0, 1],
], dtype=np.int32)
return status
Environment
The recommended editor is VS Code with the Python language plugin.
Testing
It is expected that all Datumaro functionality is covered and checked by
unit tests. Tests are placed in the tests/
directory. Additional
pre-generated files for tests can be stored in the tests/assets/
directory.
CLI tests are separated from the core tests, they are stored in the
tests/cli/
directory.
Currently, we use pytest
for testing, but we
also compatible with unittest
.
To run tests use:
pytest -v
# or
python -m pytest -v
Test cases
Test marking
For better integration with CI and requirements tracking,
we use special annotations for tests.
A test needs to be linked with the requirement it relates to. To link a
test, use:
from unittest import TestCase
from .requirements import Requirements, mark_requirement
class MyTests(TestCase):
@mark_requirement(Requirements.DATUM_GENERAL_REQ)
def test_my_requirement(self):
... do stuff ...
Such a mark applies the markings from the specified requirement.
They can be overridden for a specific test:
import pytest
@pytest.mark.priority_low
@mark_requirement(Requirements.DATUM_GENERAL_REQ)
def test_my_requirement(self):
... do stuff ...
Requirements
Requirements and other links need to be added to tests/requirements.py:
DATUM_244 = "Add Snyk integration"
DATUM_BUG_219 = "Return format is not uniform"
# Fully defined in GitHub issues:
@pytest.mark.reqids(Requirements.DATUM_244, Requirements.DATUM_333)
# And defined any other way:
@pytest.mark.reqids(Requirements.DATUM_GENERAL_REQ)
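For reference, a hypothetical fragment of tests/requirements.py, assuming Requirements is a plain class of string constants (as the decorator usage above suggests); the names and descriptions are illustrative:
class Requirements:
    DATUM_GENERAL_REQ = "Datumaro general functionality"  # catch-all requirement
    # requirements fully defined in GitHub issues:
    DATUM_244 = "Add Snyk integration"
    # bugs:
    DATUM_BUG_219 = "Return format is not uniform"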
Available annotations for tests and requirements
Markings are defined in tests/conftest.py.
A list of requirements and bugs
@pytest.mark.reqids(Requirements.DATUM_123)
@pytest.mark.bugs(Requirements.DATUM_BUG_456)
A priority
@pytest.mark.priority_low
@pytest.mark.priority_medium
@pytest.mark.priority_high
Component
The marking used to indicate different system components:
@pytest.mark.components(DatumaroComponent.Datumaro)
Skipping tests
@pytest.mark.skip(SkipMessages.NOT_IMPLEMENTED)
Parametrized runs
Parameters are used to run the same test with different parameters, e.g. (the test function below is a placeholder):
import numpy as np
import pytest
@pytest.mark.parametrize("numpy_array, batch_size", [
    (np.zeros([2]), 0),
    (np.zeros([2]), 1),
    (np.zeros([2]), 2),
    (np.zeros([2]), 5),
    (np.zeros([5]), 2),
])
def test_can_process_in_batches(numpy_array, batch_size):
    ... do stuff ...
Test documentation
Tests are documented with docstrings. Test descriptions must contain
the following sections: Description, Expected results, and Steps.
def test_can_convert_polygons_to_mask(self):
"""
<b>Description:</b>
Ensure that the dataset polygon annotation can be properly converted
into dataset segmentation mask.
<b>Expected results:</b>
Dataset segmentation mask converted from dataset polygon annotation
is equal to an expected mask.
<b>Steps:</b>
1. Prepare dataset with polygon annotation
2. Prepare dataset with expected mask segmentation mode
3. Convert source dataset to target, with conversion of annotation
from polygon to mask.
4. Verify that resulting segmentation mask is equal to the expected mask.
"""
8 - Release notes
Release notes for the version under development can be read
in the CHANGELOG.md of the develop branch.