Video Annotator (VA) is a framework for building video classifiers efficiently by combining active learning with vision-language models. It keeps domain experts involved in a continuous annotation process, which improves sample efficiency and reduces cost. VA supports rapid deployment and quality monitoring of models, so users can address edge cases quickly and gain a sense of ownership of and trust in the system. Experiments indicate that VA significantly outperforms baseline methods on video classification tasks.
Table of contents
Video annotator: a framework for efficiently building video classifiers using vision-language models and active learning
Introduction
Video understanding
Video Annotator (VA)
Experiments
Conclusion