Filter project
This command creates a sub-Project from a Project. The new project includes only the items that satisfy a given condition. The condition is written as an XPath expression.
Several filtering modes are available (-m/--mode parameter).
Supported modes:
i, items
a, annotations
i+a, a+i, items+annotations, annotations+items
When filtering annotations, use the items+annotations mode to indicate that dataset items left without annotations should be removed. To select an annotation, write an XPath that returns annotation elements (see examples).
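The difference between the annotations and items+annotations modes can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the tool's implementation: the sample XML, the filter_annotations helper and its remove_empty flag are hypothetical, and the standard library's ElementTree supports only a small XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item dataset, loosely shaped like the item representation.
ITEMS_XML = """
<dataset>
  <item><id>1</id>
    <annotation><label>cat</label><occluded>True</occluded></annotation>
    <annotation><label>dog</label><occluded>False</occluded></annotation>
  </item>
  <item><id>2</id>
    <annotation><label>dog</label><occluded>False</occluded></annotation>
  </item>
</dataset>
"""


def filter_annotations(root, path, remove_empty=False):
    """Return {item id: [labels of the annotations matched by path]}.

    remove_empty=False imitates the annotations mode: every item is kept,
    even if none of its annotations match.
    remove_empty=True imitates the items+annotations mode: items left
    without annotations are dropped.
    """
    result = {}
    for item in root.findall("item"):
        kept = [a.findtext("label") for a in item.findall(path)]
        if kept or not remove_empty:
            result[item.findtext("id")] = kept
    return result


root = ET.fromstring(ITEMS_XML)
path = "annotation[occluded='True']"  # path relative to each item

annotations_mode = filter_annotations(root, path)
items_annotations_mode = filter_annotations(root, path, remove_empty=True)

print(annotations_mode)
print(items_annotations_mode)
```

In this sketch, item 2 survives the annotations mode with an empty annotation list, while the items+annotations mode drops it.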
Usage:
datum filter --help
datum filter \
-p <project dir> \
-e '<xpath filter expression>'
Example: extract a dataset with only images whose width < height
datum filter \
-p test_project \
-e '/item[image/width < image/height]'
Example: extract a dataset with only images of subset train
datum filter \
-p test_project \
-e '/item[subset="train"]'
Example: extract a dataset with only large annotations of class cat and any annotations whose class is not person
datum filter \
-p test_project \
--mode annotations -e '/item/annotation[(label="cat" and area > 99.5) or label!="person"]'
Example: extract a dataset with only occluded annotations, removing images left without annotations
datum filter \
-p test_project \
-m i+a -e '/item/annotation[occluded="True"]'
Item representations are available with the --dry-run parameter:
<item>
  <id>290768</id>
  <subset>minival2014</subset>
  <image>
    <width>612</width>
    <height>612</height>
    <depth>3</depth>
  </image>
  <annotation>
    <id>80154</id>
    <type>bbox</type>
    <label_id>39</label_id>
    <x>264.59</x>
    <y>150.25</y>
    <w>11.199999999999989</w>
    <h>42.31</h>
    <area>473.87199999999956</area>
  </annotation>
  <annotation>
    <id>669839</id>
    <type>bbox</type>
    <label_id>41</label_id>
    <x>163.58</x>
    <y>191.75</y>
    <w>76.98999999999998</w>
    <h>73.63</h>
    <area>5668.773699999998</area>
  </annotation>
  ...
</item>
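To check what a filter expression would see, it can help to read such a representation by hand. The following is a minimal sketch using Python's standard library. It assumes a shortened copy of the item above (coordinates and the elided part are left out), the area threshold of 1000 is an arbitrary illustrative value, and the comparisons are written in Python because ElementTree does not evaluate numeric XPath predicates.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the item representation above (assumption: only the
# fields used below are kept).
ITEM_XML = """
<item>
  <id>290768</id>
  <subset>minival2014</subset>
  <image><width>612</width><height>612</height><depth>3</depth></image>
  <annotation>
    <id>80154</id><type>bbox</type><label_id>39</label_id>
    <area>473.87199999999956</area>
  </annotation>
  <annotation>
    <id>669839</id><type>bbox</type><label_id>41</label_id>
    <area>5668.773699999998</area>
  </annotation>
</item>
"""

item = ET.fromstring(ITEM_XML)

# Mirrors the condition in /item[image/width < image/height].
width = float(item.findtext("image/width"))
height = float(item.findtext("image/height"))
is_portrait = width < height

# Mirrors a condition such as /item/annotation[area > 1000]
# (the threshold is a made-up value for this sketch).
large_ids = [
    a.findtext("id")
    for a in item.findall("annotation")
    if float(a.findtext("area")) > 1000
]

print(is_portrait)
print(large_ids)
```

For this item the width and height are equal, so the width < height condition does not hold, and only the second annotation exceeds the illustrative area threshold.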