**Vision-Language Model**, **ICCV 2023** Self-regularization for foundational vision-language models during fine-tuning.
Jul 13, 2023
**Visual-Spatial and Temporal Perception**, **ICCV 2023** An efficient spatio-temporal focal modulation network for video recognition.
Jul 13, 2023
**Vision-Language Model**, **CVPR 2023** Adapting vision-language foundational models like CLIP for video recognition.
Feb 27, 2023