2024 Poster Papers
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan, Xiaojian Ma, Rujie Wu et al.
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
Xiaohan Wang, Yuhui Zhang, Orr Zohar et al.
Video-Based Human Pose Regression via Decoupled Space-Time Aggregation
Jijie He, Wenwu Yang
VideoBooth: Diffusion-based Video Generation with Image Prompts
Yuming Jiang, Tianxing Wu, Shuai Yang et al.
VideoClusterNet: Self-Supervised and Adaptive Face Clustering for Videos
Devesh Bilwakumar Walawalkar, Pablo Garrido
VideoCon: Robust Video-Language Alignment via Contrast Captions
Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen, Yong Zhang, Xiaodong Cun et al.
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation
XuDong Wang, Ishan Misra, Ziyun Zeng et al.
Video Decomposition Prior: Editing Videos Layer by Layer
Gaurav Shrivastava, Ser-Nam Lim, Abhinav Shrivastava
Video Editing via Factorized Diffusion Distillation
Uriel Singer, Amit Zohar, Yuval Kirstain et al.
Video Frame Interpolation via Direct Synthesis with the Event-based Reference
Yuhan Liu, Yongjian Deng, Hao Chen et al.
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
Syed Talal Wasim, Muzammal Naseer, Salman Khan et al.
Video Harmonization with Triplet Spatio-Temporal Variation Patterns
Zonghui Guo, XinYu Han, Jie Zhang et al.
Video Interpolation with Diffusion Models
Siddhant Jain, Daniel Watson, Aleksander Holynski et al.
Video Language Planning
Yilun Du, Sherry Yang, Pete Florence et al.
VideoLLM-online: Online Video Large Language Model for Streaming Video
Joya Chen, Zhaoyang Lv, Shiwei Wu et al.
VideoMAC: Video Masked Autoencoders Meet ConvNets
Gensheng Pei, Tao Chen, Xiruo Jiang et al.
VideoMamba: Spatio-Temporal Selective State Space Model
Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li, Xinhao Li, Yi Wang et al.
Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu, Yuechen Zhang, Wenbo Li et al.
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk, Lijun Yu, Xiuye Gu et al.
Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
Gaurav Shrivastava, Abhinav Shrivastava
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.
Video Question Answering with Procedural Programs
Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.
Video ReCap: Recursive Captioning of Hour-Long Videos
Md Mohaiminul Islam, Vu Bao Ngan Ho, Xitong Yang et al.
Video Recognition in Portrait Mode
Mingfei Han, Linjie Yang, Xiaojie Jin et al.
VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
Liao Wang, Kaixin Yao, Chengcheng Guo et al.
Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
Xiang Fan, Anand Bhattad, Ranjay Krishna
VideoStudio: Generating Consistent-Content and Multi-Scene Videos
Fuchen Long, Zhaofan Qiu, Ting Yao et al.
Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention
Xingyu Zhou, Leheng Zhang, Xiaorui Zhao et al.
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
Yuchao Gu, Yipin Zhou, Bichen Wu et al.
VidLA: Video-Language Alignment at Scale
Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan et al.
vid-TLDR: Training Free Token Merging for Light-weight Video Transformer
Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.
VidToMe: Video Token Merging for Zero-Shot Video Editing
Xirui Li, Chao Ma, Xiaokang Yang et al.
View-Consistent 3D Editing with Gaussian Splatting
Yuxuan Wang, Xuanyu Yi, Zike Wu et al.
View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields
Haodi He, Colton Stearns, Adam Harley et al.
View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network
Quan Zhang, Lei Wang, Vishal M. Patel et al.
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
Lukas Höllein, Aljaž Božič, Norman Müller et al.
ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers
Jinke Li, Xiao He, Chonghua Zhou et al.
View From Above: Orthogonal-View aware Cross-view Localization
Shan Wang, Chuong Nguyen, Jiawei Liu et al.
ViewFusion: Towards Multi-View Consistency via Interpolated Denoising
Xianghui Yang, Gil Avraham, Yan Zuo et al.
Viewing Transformers Through the Lens of Long Convolutions Layers
Itamar Zimerman, Lior Wolf
Viewpoint-Aware Visual Grounding in 3D Scenes
Xiangxi Shi, Zhonghua Wu, Stefan Lee
Viewpoint Textual Inversion: Discovering Scene Representations and 3D View Control in 2D Diffusion Models
James Burgess, Kuan-Chieh Wang, Serena Yeung-Levy
View Selection for 3D Captioning via Diffusion Ranking
Tiange Luo, Justin Johnson, Honglak Lee
ViG-Bias: Visually Grounded Bias Discovery and Mitigation
Badr-Eddine Marani, Mohamed HANINI, Nihitha Malayarukil et al.
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
Siming Yan, Min Bai, Weifeng Chen et al.
ViLA: Efficient Video-Language Alignment for Video Question Answering
Xijun Wang, Junbang Liang, Chun-Kai Wang et al.
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
Jiangbo Shi, Chen Li, Tieliang Gong et al.
VILA: On Pre-training for Visual Language Models
Ji Lin, Danny Yin, Wei Ping et al.