Poster Papers

VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking

Jens Hellekes, Manuel Mühlhaus, Reza Bahmanyar et al.

ECCV 2024 poster
3 citations

VFLAIR: A Research Library and Benchmark for Vertical Federated Learning

Tianyuan Zou, Zixuan Gu, Yu He et al.

ICLR 2024 poster

VF-NeRF: Viewshed Fields for Rigid NeRF Registration

Leo Segre, Shai Avidan

ECCV 2024 poster arXiv:2404.03349
2 citations

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

ECCV 2024 poster
52 citations

V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs

Penghao Wu, Saining Xie

CVPR 2024 poster
327 citations

ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders

Jefferson Hernandez, Ruben Villegas, Vicente Ordonez

ECCV 2024 poster
11 citations

VicTR: Video-conditioned Text Representations for Activity Recognition

Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani et al.

CVPR 2024 poster
36 citations

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

Jiaming Liu, Senqiao Yang, Peidong Jia et al.

ICLR 2024 poster

Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video

Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.

CVPR 2024 poster

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

Yue Fan, Xiaojian Ma, Rujie Wu et al.

ECCV 2024 poster

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Xiaohan Wang, Yuhui Zhang, Orr Zohar et al.

ECCV 2024 poster

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024 poster

VideoBooth: Diffusion-based Video Generation with Image Prompts

Yuming Jiang, Tianxing Wu, Shuai Yang et al.

CVPR 2024 poster
118 citations

VideoClusterNet: Self-Supervised and Adaptive Face Clustering for Videos

Devesh Bilwakumar Walawalkar, Pablo Garrido

ECCV 2024 poster arXiv:2407.12214
3 citations

VideoCon: Robust Video-Language Alignment via Contrast Captions

Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.

CVPR 2024 poster
28 citations

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Haoxin Chen, Yong Zhang, Xiaodong Cun et al.

CVPR 2024 poster

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

XuDong Wang, Ishan Misra, Ziyun Zeng et al.

CVPR 2024 poster
36 citations

Video Decomposition Prior: Editing Videos Layer by Layer

Gaurav Shrivastava, Ser-Nam Lim, Abhinav Shrivastava

ICLR 2024 poster

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain et al.

ECCV 2024 poster arXiv:2403.09334
28 citations

Video Frame Interpolation via Direct Synthesis with the Event-based Reference

Yuhan Liu, Yongjian Deng, Hao Chen et al.

CVPR 2024 poster

VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding

Syed Talal Wasim, Muzammal Naseer, Salman Khan et al.

CVPR 2024 poster

Video Harmonization with Triplet Spatio-Temporal Variation Patterns

Zonghui Guo, XinYu Han, Jie Zhang et al.

CVPR 2024 poster

Video Interpolation with Diffusion Models

Siddhant Jain, Daniel Watson, Aleksander Holynski et al.

CVPR 2024 poster
63 citations

Video Language Planning

Yilun Du, Sherry Yang, Pete Florence et al.

ICLR 2024 poster
144 citations

VideoLLM-online: Online Video Large Language Model for Streaming Video

Joya Chen, Zhaoyang Lv, Shiwei Wu et al.

CVPR 2024 poster
109 citations

VideoMAC: Video Masked Autoencoders Meet ConvNets

Gensheng Pei, Tao Chen, Xiruo Jiang et al.

CVPR 2024 poster
20 citations

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024 poster
23 citations

VideoMamba: State Space Model for Efficient Video Understanding

Kunchang Li, Xinhao Li, Yi Wang et al.

ECCV 2024 poster arXiv:2403.06977
401 citations

Video-P2P: Video Editing with Cross-attention Control

Shaoteng Liu, Yuechen Zhang, Wenbo Li et al.

CVPR 2024 poster
309 citations

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dan Kondratyuk, Lijun Yu, Xiuye Gu et al.

ICML 2024 poster

Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes

Gaurav Shrivastava, Abhinav Shrivastava

CVPR 2024 poster
16 citations

VideoPrism: A Foundational Visual Encoder for Video Understanding

Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.

ICML 2024 poster

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024 poster arXiv:2312.00937
37 citations

Video ReCap: Recursive Captioning of Hour-Long Videos

Md Mohaiminul Islam, Vu Bao Ngan Ho, Xitong Yang et al.

CVPR 2024 poster
82 citations

Video Recognition in Portrait Mode

Mingfei Han, Linjie Yang, Xiaojie Jin et al.

CVPR 2024 poster

VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams

Liao Wang, Kaixin Yao, Chengcheng Guo et al.

CVPR 2024 poster
21 citations

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

Xiang Fan, Anand Bhattad, Ranjay Krishna

ECCV 2024 poster
23 citations

VideoStudio: Generating Consistent-Content and Multi-Scene Videos

Fuchen Long, Zhaofan Qiu, Ting Yao et al.

ECCV 2024 poster

Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention

Xingyu Zhou, Leheng Zhang, Xiaorui Zhao et al.

CVPR 2024 poster

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

Yuchao Gu, Yipin Zhou, Bichen Wu et al.

CVPR 2024 poster
63 citations

VidLA: Video-Language Alignment at Scale

Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan et al.

CVPR 2024 poster
8 citations

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024 poster

VidToMe: Video Token Merging for Zero-Shot Video Editing

Xirui Li, Chao Ma, Xiaokang Yang et al.

CVPR 2024 poster
89 citations

View-Consistent 3D Editing with Gaussian Splatting

Yuxuan Wang, Xuanyu Yi, Zike Wu et al.

ECCV 2024 poster

View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields

Haodi He, Colton Stearns, Adam Harley et al.

ECCV 2024 poster
4 citations

View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network

Quan Zhang, Lei Wang, Vishal M. Patel et al.

CVPR 2024 poster
32 citations

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024 poster

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Jinke Li, Xiao He, Chonghua Zhou et al.

ECCV 2024 poster
26 citations

View From Above: Orthogonal-View aware Cross-view Localization

Shan Wang, Chuong Nguyen, Jiawei Liu et al.

CVPR 2024 poster

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Xianghui Yang, Gil Avraham, Yan Zuo et al.

CVPR 2024 poster