Describe downloadable datasets
This command reports reports various information about datasets that can be
downloaded with the download
command. The information is reported either as
human-readable text (the default) or as a JSON object. The format can be selected
with the --report-format
option.
When the JSON output format is selected, the output document has the following schema:
{
"<dataset name>": {
"default_output_format": "<Datumaro format name>",
"description": "<human-readable description>",
"download_size": <total size of the downloaded files in bytes>,
"home_url": "<URL of a web page describing the dataset>",
"human_name": "<human-readable dataset name>",
"num_classes": <number of classes in the dataset>,
"subsets": {
"<subset name>": {
"num_items": <number of items in the subset>
},
...
},
"version": "<version number>"
},
...
}
home_url
may be null
if there is no suitable web page for the dataset.
num_classes
may be null
if the dataset does not involve classification.
version
currently contains the version number supplied by TFDS.
In future versions of Datumaro, datasets might come from other sources;
the way version numbers will be set for those is to be determined.
New object members may be added in future versions of Datumaro.
Usage:
datum describe-downloads [-h] [--report-format {text,json}]
[--report-file REPORT_FILE]
Parameters:
-h
,--help
- Print the help message and exit.--report-format
(text
orjson
) - Format in which to report the information. By default,text
is used.--report-file
(string) - File to which to write the report. By default, the report is written to the standard output stream.