Oral "video representation learning" Papers
2 papers found
VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
Haichao Zhang, Yun Fu
NeurIPS 2025 (oral), arXiv:2503.16980
3 citations
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Sunil Hwang, Jaehong Yoon, Youngwan Lee et al.
ICML 2024 (oral)