Dataset Management Framework Documentation

Welcome to the documentation for the Dataset Management Framework (Datumaro).

Datumaro is a free framework and CLI tool for building, transforming, and analyzing datasets. It is developed and used by Intel, and it works with annotations and datasets in a large number of supported formats.

This documentation is intended for AI researchers, developers, and teams who work with datasets and annotations.

flowchart LR
    datasets[(VOC dataset<br/>+<br/>COCO dataset<br/>+<br/>CVAT annotation)]
    datumaro{Datumaro}
    dataset[dataset]
    annotation[Annotation tool]
    training[Model training]
    publication[Publication, statistics, etc.]
    datasets-->datumaro
    datumaro-->dataset
    dataset-->annotation & training & publication

Getting started

Basic information and the sections you need for a quick start.

User Manual

This section contains documents for Datumaro users.

Developer Manual

Documentation for Datumaro developers.