Transform Project
This command allows to modify images or annotations in a project all at once.
datum transform --help
datum transform \
-p <project_dir> \
-o <output_dir> \
-t <transform_name> \
-- [extra transform options]
Example: split a dataset randomly to train and test subsets, ratio is 2:1
datum transform -t random_split -- --subset train:.67 --subset test:.33
Example: split a dataset in task-specific manner. The tasks supported are classification, detection, segmentation and re-identification.
datum transform -t split -- \
-t classification --subset train:.5 --subset val:.2 --subset test:.3
datum transform -t split -- \
-t detection --subset train:.5 --subset val:.2 --subset test:.3
datum transform -t split -- \
-t segmentation --subset train:.5 --subset val:.2 --subset test:.3
datum transform -t split -- \
-t reid --subset train:.5 --subset val:.2 --subset test:.3 \
--query .5
Example: convert polygons to masks, masks to boxes etc.:
datum transform -t boxes_to_masks
datum transform -t masks_to_polygons
datum transform -t polygons_to_masks
datum transform -t shapes_to_boxes
Example: remap dataset labels, person to car and cat to dog,
keep bus, remove others
datum transform -t remap_labels -- \
-l person:car -l bus:bus -l cat:dog \
--default delete
Example: rename dataset items by a regular expression
- Replace
patternwithreplacement - Remove
frame_from item ids
datum transform -t rename -- -e '|pattern|replacement|'
datum transform -t rename -- -e '|frame_(\d+)|\\1|'
Example: sampling dataset items as many as the number of target samples with
sampling method entered by the user, divide into sampled and unsampled
subsets
- There are five methods of sampling the m option.
topk: Return the k with high uncertainty datalowk: Return the k with low uncertainty datarandk: Return the random k datamixk: Return half to topk method and the rest to lowk methodrandtopk: First, select 3 times the number of k randomly, and return the topk among them.
datum transform -t sampler -- \
-a entropy \
-i train \
-o sampled \
-u unsampled \
-m topk \
-k 20
Example : control number of outputs to 100 after NDR
- There are two methods in NDR e option
random: sample from removed data randomlysimilarity: sample from removed data with ascending
- There are two methods in NDR u option
uniform: sample data with uniform distributioninverse: sample data with reciprocal of the number
datum transform -t ndr -- \
-w train \
-a gradient \
-k 100 \
-e random \
-u uniform