Christoph Feichtenhofer
37 Papers
961 Total Citations
Papers (37)
Spatiotemporal Residual Networks for Video Action Recognition
NeurIPS 2016 (arXiv)
741
citations
Demystifying CLIP Data
ICLR 2024
205
citations
An Empirical Study of Autoregressive Pre-training from Videos
ICCV 2025
15
citations
Temporal Residual Networks for Dynamic Scene Recognition
CVPR 2017
0
citations
Spatiotemporal Multiplier Networks for Video Action Recognition
CVPR 2017
0
citations
What Have We Learned From Deep Representations for Action Recognition?
CVPR 2018 (arXiv)
0
citations
Long-Term Feature Banks for Detailed Video Understanding
CVPR 2019
0
citations
3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training
CVPR 2019
0
citations
A Multigrid Method for Efficiently Training Video Models
CVPR 2020 (arXiv)
0
citations
Ego-Topo: Environment Affordances From Egocentric Video
CVPR 2020
0
citations
X3D: Expanding Architectures for Efficient Video Recognition
CVPR 2020 (arXiv)
0
citations
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
CVPR 2021 (arXiv)
0
citations
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
CVPR 2022 (arXiv)
0
citations
Reversible Vision Transformers
CVPR 2022
0
citations
Masked Feature Prediction for Self-Supervised Visual Pre-Training
CVPR 2022 (arXiv)
0
citations
A ConvNet for the 2020s
CVPR 2022 (arXiv)
0
citations
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
CVPR 2022 (arXiv)
0
citations
Ego4D: Around the World in 3,000 Hours of Egocentric Video
CVPR 2022
0
citations
On the Benefits of 3D Pose and Tracking for Human Action Recognition
CVPR 2023 (arXiv)
0
citations
Scaling Language-Image Pre-Training via Masking
CVPR 2023 (arXiv)
0
citations
Multiview Compressive Coding for 3D Reconstruction
CVPR 2023 (arXiv)
0
citations
Detect to Track and Track to Detect
ICCV 2017 (arXiv)
0
citations
SlowFast Networks for Video Recognition
ICCV 2019
0
citations
Grounded Human-Object Interaction Hotspots From Video
ICCV 2019
0
citations
Multiscale Vision Transformers
ICCV 2021 (arXiv)
0
citations
Multiview Pseudo-Labeling for Semi-Supervised Learning From Video
ICCV 2021 (arXiv)
0
citations
The Effectiveness of MAE Pre-Pretraining for Billion-Scale Pretraining
ICCV 2023 (arXiv)
0
citations
CiT: Curation in Training for Effective Vision-Language Data
ICCV 2023 (arXiv)
0
citations
Diffusion Models as Masked Autoencoders
ICCV 2023 (arXiv)
0
citations
TrackFormer: Multi-Object Tracking With Transformers
CVPR 2022
0
citations
Dynamically Encoded Actions Based on Spacetime Saliency
CVPR 2015
0
citations
Convolutional Two-Stream Network Fusion for Video Action Recognition
CVPR 2016
0
citations
Learning Temporal Pose Estimation from Sparsely-Labeled Videos
NeurIPS 2019
0
citations
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
NeurIPS 2021
0
citations
Masked Autoencoders that Listen
NeurIPS 2022
0
citations
Masked Autoencoders As Spatiotemporal Learners
NeurIPS 2022
0
citations
MAViL: Masked Audio-Video Learners
NeurIPS 2023
0
citations