Anelia Angelova

25
Papers
279
Total Citations

Papers (25)

On Scaling Up a Multilingual Vision and Language Model

CVPR 2024
254
citations

Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities

CVPR 2024
25
citations

Evolving Losses for Unsupervised Video Representation Learning

CVPR 2020 (arXiv)
0
citations

KeyPose: Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects

CVPR 2020 (arXiv)
0
citations

SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping

CVPR 2021 (arXiv)
0
citations

Taskology: Utilizing Task Relations at Scale

CVPR 2021 (arXiv)
0
citations

Region-Aware Pretraining for Open-Vocabulary Object Detection With Vision Transformers

CVPR 2023 (arXiv)
0
citations

Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

CVPR 2023 (arXiv)
0
citations

Evolving Space-Time Neural Architectures for Videos

ICCV 2019
0
citations

Depth From Videos in the Wild: Unsupervised Monocular Depth Learning From Unknown Cameras

ICCV 2019
0
citations

ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors

ICCV 2019
0
citations

4D-Net for Learned Multi-Modal Alignment

ICCV 2021 (arXiv)
0
citations

Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval From a Single Image

ICCV 2021 (arXiv)
0
citations

Contrastive Feature Masking Open-Vocabulary Vision Transformer

ICCV 2023 (arXiv)
0
citations

Adversarial Generative Grammars for Human Activity Prediction

ECCV 2020
0
citations

What Matters in Unsupervised Optical Flow

ECCV 2020
0
citations

Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve

ECCV 2020
0
citations

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

ECCV 2020
0
citations

AssembleNet++: Assembling Modality Representations via Attention Connections

ECCV 2020
0
citations

Video Question Answering with Iterative Video-Text Co-Tokenization

ECCV 2022
0
citations

FindIt: Generalized Localization with Natural Language Queries

ECCV 2022
0
citations

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models

CVPR 2025
0
citations

Unsupervised Learning of Depth and Ego-Motion From Monocular Video Using 3D Geometric Constraints

CVPR 2018 (arXiv)
0
citations

TokenLearner: Adaptive Space-Time Tokenization for Videos

NeurIPS 2021
0
citations

Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

ICML 2017
0
citations