PIVOT: Prompting for Video Continual Learning
School authors:
Álvaro Soto
External authors:
  • Andres Villa (Pontificia Universidad Catolica de Chile, King Abdullah University of Science & Technology)
  • Juan Leon Alcazar (King Abdullah University of Science & Technology)
  • Motasem Alfarra (King Abdullah University of Science & Technology)
  • Kumail Alhamoud (King Abdullah University of Science & Technology)
  • Julio Hurtado (University of Pisa)
  • Fabian Caba Heilbron (Adobe Systems Inc.)
  • Bernard Ghanem (King Abdullah University of Science & Technology)
Abstract:

Modern machine learning pipelines are limited by data availability, storage quotas, privacy regulations, and expensive annotation processes. These constraints make it difficult or impossible to train and update large-scale models on dynamic annotated datasets. Continual learning addresses this problem directly, with the ultimate goal of devising methods in which a deep neural network effectively learns relevant patterns for new (unseen) classes without significantly altering its performance on previously learned ones. In this paper, we address the problem of continual learning for video data. We introduce PIVOT, a novel method that leverages the extensive knowledge in pre-trained models from the image domain, thereby reducing the number of trainable parameters and the associated forgetting. Unlike previous methods, ours is the first approach that effectively uses prompting mechanisms for continual learning without any in-domain pre-training. Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.

UT WOS:001062531308053
Number of Citations 29
Pages 24214-24223
Year of Publication 2023
DOI https://doi.org/10.1109/CVPR52729.2023.02319