sampler module

class datumaro.plugins.sampler.sampler.Sampler(extractor, algorithm, input_subset, sampled_subset, unsampled_subset, sampling_method, count, output_file)

Bases: datumaro.components.extractor.Transform, datumaro.components.cli_plugin.CliPlugin

Sampler that analyzes model inference results on the dataset and picks the best sample for training.

Notes:
- Each image’s inference result must contain the probability for all classes.
- Requesting a sample larger than the number of all images will return all images.

Example: select the most relevant data subset of 20 images based on model certainty, put the result into the ‘sample’ subset and all the rest into the ‘unsampled’ subset, using the ‘train’ subset as input.

%(prog)s --algorithm entropy --subset_name train --sample_name sample --unsampled_name unsampled --sampling_method topk -k 20
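The idea behind entropy-based top-k selection can be illustrated with a minimal, standalone sketch. This is not Datumaro's implementation: the function names `entropy` and `topk_by_entropy` and the plain-dict input format are hypothetical, and it assumes that "topk" means picking the k images with the highest prediction entropy (the least certain ones). It also mirrors the note above that requesting more items than exist returns all of them.

```python
import math


def entropy(probs):
    """Shannon entropy of one image's class-probability vector."""
    # Zero probabilities contribute nothing, so they are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def topk_by_entropy(predictions, k):
    """Split image ids into (sampled, unsampled) by descending entropy.

    predictions: hypothetical mapping of image id -> per-class probabilities.
    k: number of images to sample; if k exceeds the number of images,
       every image ends up in the sampled list.
    """
    ranked = sorted(
        predictions,
        key=lambda image_id: entropy(predictions[image_id]),
        reverse=True,
    )
    return ranked[:k], ranked[k:]


# Illustrative data only: three images, two classes.
predictions = {
    "img_a": [0.5, 0.5],  # most uncertain
    "img_b": [0.9, 0.1],
    "img_c": [1.0, 0.0],  # fully certain
}
sampled, unsampled = topk_by_entropy(predictions, k=2)
```

Here `sampled` holds the two least certain images and `unsampled` holds the rest, which corresponds to the ‘sample’ and ‘unsampled’ subsets in the command-line example.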

classmethod build_cmdline_parser(**kwargs)