Getting started
To read about the design concept and features of Datumaro, go to the design section.
Installation
Dependencies
- Python (3.7+)
- Optional: OpenVINO, TensorFlow, PyTorch, MxNet, Caffe, Accuracy Checker, Git
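To see which of these dependencies are present before installing, a quick check along the following lines can help. This is only a sketch: the import names in `OPTIONAL_MODULES` (for example `accuracy_checker`) are assumptions about how each optional package is exposed and may differ between releases.

```python
import importlib.util
import shutil
import sys

# Assumed import names for the optional dependencies listed above.
OPTIONAL_MODULES = {
    "OpenVINO": "openvino",
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
    "MxNet": "mxnet",
    "Caffe": "caffe",
    "Accuracy Checker": "accuracy_checker",
}


def check_environment():
    """Return (python_ok, {dependency name: available}) without importing the packages."""
    python_ok = sys.version_info >= (3, 7)
    available = {}
    for name, module in OPTIONAL_MODULES.items():
        try:
            available[name] = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            available[name] = False
    # Git is an executable rather than a Python package, so look it up on PATH.
    available["Git"] = shutil.which("git") is not None
    return python_ok, available


python_ok, available = check_environment()
print("Python 3.7+:", "yes" if python_ok else "no")
for name, found in available.items():
    print(f"{name}: {'found' if found else 'not found'}")
```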
Optionally, create a virtual environment:
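One way to do this is with the standard-library `venv` module, shown here as a minimal sketch. The helper name `create_environment` and the default directory name `venv` are illustrative choices, not part of Datumaro.

```python
import venv
from pathlib import Path


def create_environment(path="venv", with_pip=True):
    """Create a virtual environment at `path` and return its directory."""
    env_dir = Path(path)
    # with_pip=True bootstraps pip into the new environment via ensurepip.
    venv.create(str(env_dir), with_pip=with_pip)
    return env_dir
```

Calling `create_environment()` creates a `venv` directory in the current working directory. On Linux and macOS it is then typically activated in the shell with `. venv/bin/activate`.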
Install the Datumaro package:
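Assuming the package is installed from PyPI under the name `datumaro`, the installation can be scripted roughly as follows. The helper names are hypothetical; using `sys.executable` ensures pip targets the interpreter (and virtual environment) that is currently running.

```python
import subprocess
import sys


def pip_install_command(package="datumaro"):
    """Build the pip command for the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package="datumaro"):
    """Run pip and raise CalledProcessError if the installation fails."""
    subprocess.run(pip_install_command(package), check=True)
```

Calling `install()` performs the actual installation and needs network access.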
Read full installation instructions in the user manual.
Usage
There are several options available:
Standalone tool
Datumaro as a standalone tool lets you perform various dataset operations from the command line:
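The command-line entry point is `datum`, and `datum --help` lists the available commands. As a small sketch, the snippet below checks from Python whether the tool is on `PATH` and captures its help text; the wrapper function `datum_help` is illustrative only.

```python
import shutil
import subprocess


def datum_help():
    """Return the output of `datum --help`, or None if `datum` is not on PATH."""
    executable = shutil.which("datum")
    if executable is None:
        return None
    result = subprocess.run([executable, "--help"], capture_output=True, text=True)
    return result.stdout


help_text = datum_help()
print(help_text if help_text is not None else "datum is not installed")
```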
Python module
Datumaro can be used in custom scripts as a Python module. Used this way, its features become available from an existing codebase, enabling dataset reading, exporting, and iteration, simplifying integration of custom formats, and providing high-performance operations:
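A minimal sketch of reading, filtering, iterating, and exporting is given below. It assumes a Datumaro release that provides `Dataset.import_from`, `Dataset.select`, and `Dataset.export` in `datumaro.components.dataset`; the function names, directory arguments, and format names are placeholders, and the exact API may differ between versions.

```python
def is_annotated(item):
    """Predicate for Dataset.select: keep items that have at least one annotation."""
    return len(item.annotations) != 0


def export_annotated(src_dir, src_format, dst_dir, dst_format="coco"):
    """Read a dataset, keep annotated items, print them, and export the result."""
    # Imported lazily so that this sketch can be loaded without Datumaro installed.
    from datumaro.components.dataset import Dataset

    dataset = Dataset.import_from(src_dir, src_format)  # dataset reading
    dataset = dataset.select(is_annotated)              # keep only annotated items
    for item in dataset:                                # iteration
        print(item.id, item.annotations)
    dataset.export(dst_dir, dst_format)                 # exporting
```

For example, `export_annotated("path/to/voc", "voc", "dst/dir")` would write the annotated items of a VOC dataset in COCO format, given that both paths exist on your machine.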
See the list of components available for convenient importing.
Check our developer manual for additional information.
Examples
- Convert a PASCAL VOC dataset to COCO format, keeping only images in which the "cat" class is present
- Convert only non-occluded annotations from a CVAT project to TFrecord
- Annotate an MS COCO dataset, extract an image subset, re-annotate it in CVAT, update the old dataset
- Annotate instance polygons in CVAT, export as masks in COCO
- Apply an OpenVINO detection model to some COCO-like dataset, then compare annotations with ground truth and visualize in TensorBoard
- Change colors in PASCAL VOC-like .png masks
- Create a custom COCO-like dataset
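As an illustration of the first scenario in this list (VOC to COCO, keeping only images with the "cat" class), here is a hedged sketch using the Python API. It assumes `Dataset.import_from`, `Dataset.filter` with an XPath-like expression, and `Dataset.export` are available in the installed version; the helper names and paths are hypothetical.

```python
def label_filter(label):
    """Build an XPath-like expression matching items that have an annotation of `label`."""
    return '/item[annotation/label="{}"]'.format(label)


def convert_voc_to_coco(voc_dir, coco_dir, label="cat"):
    """Convert a VOC dataset to COCO, keeping only items annotated with `label`."""
    # Imported lazily so that this sketch can be loaded without Datumaro installed.
    from datumaro.components.dataset import Dataset

    dataset = Dataset.import_from(voc_dir, "voc")
    dataset = dataset.filter(label_filter(label))  # item-level filter, annotations are kept
    dataset.export(coco_dir, "coco")
```

The expression is kept in a separate function so it can be inspected or reused for other labels, for instance `label_filter("dog")`.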