random sampler module
- class datumaro.plugins.sampler.random_sampler.RandomSampler(extractor: datumaro.components.extractor.IExtractor, count: int, *, subset: Optional[str] = None, seed: Optional[int] = None)
Bases: datumaro.components.extractor.Transform, datumaro.components.cli_plugin.CliPlugin
Sampler that keeps no more than the required number of items in the dataset.
Notes:
- Items are selected uniformly
- Requesting a sample larger than the total number of items returns all items
- Example: select subset of 20 images randomly
%(prog)s -k 20
- Example: select subset of 20 images, modify only ‘train’ subset
%(prog)s -k 20 -s train
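The selection rule in the notes for RandomSampler (a uniform choice, capped at the size of the dataset) can be illustrated with a small pure-Python sketch. This is not Datumaro's implementation: `sample_items` is a hypothetical helper, and a plain list stands in for the dataset items.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_items(
    items: Sequence[T], count: int, seed: Optional[int] = None
) -> List[T]:
    """Keep at most `count` items, chosen uniformly without replacement.

    Hypothetical helper for illustration only; not part of Datumaro.
    """
    if count < 0:
        raise ValueError("count must be non-negative")

    # A dedicated generator makes the result reproducible for a given seed.
    rng = random.Random(seed)

    # Requesting more than is available returns everything, unchanged.
    if count >= len(items):
        return list(items)

    return rng.sample(list(items), count)
```

Under these assumptions, calling `sample_items(list(range(100)), 20, seed=0)` yields 20 distinct values, and repeating the call with the same seed yields the same selection.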
- class datumaro.plugins.sampler.random_sampler.LabelRandomSampler(extractor: datumaro.components.extractor.IExtractor, *, count: Optional[int] = None, label_counts: Optional[Mapping[str, int]] = None, seed: Optional[int] = None)
Bases: datumaro.components.extractor.Transform, datumaro.components.cli_plugin.CliPlugin
Sampler that keeps at least the required number of annotations of each class in the dataset for each subset separately.
Consider using the “stats” command to get class distribution in the dataset.
Notes:
- Items can contain annotations of several selected classes (e.g. 3 bounding boxes per image). The number of annotations in the resulting dataset varies between max(class counts) and sum(class counts)
- If the input dataset does not have enough class annotations, the result will contain only what is available
- Items are selected uniformly
- For the reasons above, the resulting class distribution in the dataset may not be the same as requested
- The resulting dataset will only keep annotations for classes with specified count > 0
- Example: select at least 5 annotations of each class randomly
%(prog)s -k 5
- Example: select at least 5 images with “cat” annotations and 3 “person”
%(prog)s -l "cat:5" -l "person:3"
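The notes for LabelRandomSampler explain why the resulting class distribution can differ from the request: an item picked for one class may carry annotations of other requested classes, and a class may have fewer annotations than asked for. The sketch below shows that effect on a toy dataset. It is an assumption-laden illustration, not Datumaro's algorithm: `sample_by_label` is a hypothetical helper, items are modelled as a mapping from item id to a list of label names, and subsets are ignored.

```python
import random
from collections import Counter
from typing import Dict, List, Mapping, Optional


def sample_by_label(
    items: Mapping[str, List[str]],
    label_counts: Mapping[str, int],
    seed: Optional[int] = None,
) -> Dict[str, List[str]]:
    """Pick items until each requested label has at least its count.

    Hypothetical helper for illustration only; not part of Datumaro.
    """
    rng = random.Random(seed)

    # Only classes with a positive count are requested and kept.
    wanted = {label: n for label, n in label_counts.items() if n > 0}

    selected: Dict[str, List[str]] = {}
    have: Counter = Counter()

    for label, target in wanted.items():
        # Unselected items that carry this label, in uniformly random order.
        candidates = sorted(
            item_id
            for item_id, anns in items.items()
            if label in anns and item_id not in selected
        )
        rng.shuffle(candidates)

        for item_id in candidates:
            if have[label] >= target:
                break
            # Drop annotations of classes that were not requested.
            kept = [ann for ann in items[item_id] if ann in wanted]
            selected[item_id] = kept
            # Annotations of other requested classes count as well, which
            # is why totals can exceed the individual targets.
            have.update(kept)

    return selected
```

In this sketch a request such as `{"cat": 2, "person": 1}` returns at least two "cat" and one "person" annotation when they exist, while a request for more "cat" annotations than the data holds returns every item that contains one.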