CVPR Poster Papers

OmniMotionGPT: Animal Motion Generation with Limited Data

Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan et al.

CVPR 2024 • poster • arXiv:2311.18303 • 15 citations

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

Jianqiang Wan, Sibo Song, Wenwen Yu et al.

CVPR 2024 • poster • arXiv:2403.19128

Omni-Q: Omni-Directional Scene Understanding for Unsupervised Visual Grounding

Sai Wang, Yutian Lin, Yu Wu

CVPR 2024 • poster • 4 citations

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.

CVPR 2024 • poster • arXiv:2404.00678

OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning

Haiyang Ying, Yixuan Yin, Jinzhi Zhang et al.

CVPR 2024 • poster • arXiv:2311.11666 • 66 citations

OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning

Siddharth Srivastava, Gaurav Sharma

CVPR 2024 • poster • arXiv:2507.13364

OmniViD: A Generative Framework for Universal Video Understanding

Junke Wang, Dongdong Chen, Chong Luo et al.

CVPR 2024 • poster • arXiv:2403.17935 • 29 citations

Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

Hancheng Ye, Chong Yu, Peng Ye et al.

CVPR 2024 • poster • arXiv:2403.15835 • 9 citations

One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

Minghua Liu, Ruoxi Shi, Linghao Chen et al.

CVPR 2024 • poster • arXiv:2311.07885

One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning

Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen et al.

CVPR 2024 • poster • 20 citations

OneFormer3D: One Transformer for Unified Point Cloud Segmentation

Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin et al.

CVPR 2024 • poster • arXiv:2311.14405

OneLLM: One Framework to Align All Modalities with Language

Jiaming Han, Kaixiong Gong, Yiyuan Zhang et al.

CVPR 2024 • poster • arXiv:2312.03700

One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls

Minghui Hu, Jianbin Zheng, Chuanxia Zheng et al.

CVPR 2024 • poster • arXiv:2311.15744 • 9 citations

One-Prompt to Segment All Medical Images

Wu, Min Xu

CVPR 2024 • poster • arXiv:2305.10300 • 47 citations

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models

Lin Li, Haoyan Guan, Jianing Qiu et al.

CVPR 2024 • poster • arXiv:2403.01849 • 44 citations

One-Shot Open Affordance Learning with Foundation Models

Gen Li, Deqing Sun, Laura Sevilla-Lara et al.

CVPR 2024 • poster • arXiv:2311.17776

One-Shot Structure-Aware Stylized Image Synthesis

Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.

CVPR 2024 • poster • arXiv:2402.17275 • 15 citations

One-step Diffusion with Distribution Matching Distillation

Tianwei Yin, Michaël Gharbi, Richard Zhang et al.

CVPR 2024 • poster • arXiv:2311.18828 • 543 citations

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024 • poster • arXiv:2311.18387

Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory

Fei Ye, Adrian Bors

CVPR 2024 • poster

On Scaling Up a Multilingual Vision and Language Model

Xi Chen, Josip Djolonga, Piotr Padlewski et al.

CVPR 2024 • poster • arXiv:2305.18565 • 254 citations

On the Content Bias in Fréchet Video Distance

Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.

CVPR 2024 • poster • arXiv:2404.12391

On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm

Peng Sun, Bei Shi, Daiwei Yu et al.

CVPR 2024 • poster • arXiv:2312.03526

On the Faithfulness of Vision Transformer Explanations

Junyi Wu, Weitai Kang, Hao Tang et al.

CVPR 2024 • poster • arXiv:2404.01415

On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving

Kaituo Feng, Changsheng Li, Dongchun Ren et al.

CVPR 2024 • poster • arXiv:2403.01238 • 14 citations

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.

CVPR 2024 • poster • arXiv:2404.08540

On the Robustness of Large Multimodal Models Against Image Adversarial Attacks

Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang et al.

CVPR 2024 • poster • arXiv:2312.03777 • 80 citations

On the Scalability of Diffusion-based Text-to-Image Generation

Hao Li, Yang Zou, Ying Wang et al.

CVPR 2024 • poster • arXiv:2404.02883

On the Test-Time Zero-Shot Generalization of Vision-Language Models: Do We Really Need Prompt Learning?

Maxime Zanella, Ismail Ben Ayed

CVPR 2024 • poster • arXiv:2405.02266 • 49 citations

On Train-Test Class Overlap and Detection for Image Retrieval

Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang et al.

CVPR 2024 • poster • arXiv:2404.01524

OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising

Haichao Zhang, Yi Xu, Hongsheng Lu et al.

CVPR 2024 • poster • arXiv:2404.02227

Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.

CVPR 2024 • poster • arXiv:2312.10671

Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships

Sebastian Koch, Narunas Vaskevicius, Mirco Colosi et al.

CVPR 2024 • poster • arXiv:2402.12259 • 61 citations

OpenEQA: Embodied Question Answering in the Era of Foundation Models

Arjun Majumdar, Anurag Ajay, Xiaohan Zhang et al.

CVPR 2024 • poster • 230 citations

Open-Set Domain Adaptation for Semantic Segmentation

Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.

CVPR 2024 • poster • arXiv:2405.19899 • 18 citations

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis et al.

CVPR 2024 • poster • arXiv:2404.18873 • 32 citations

Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models

Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan SanMiguel et al.

CVPR 2024 • poster • arXiv:2403.14291

Open-Vocabulary Segmentation with Semantic-Assisted Calibration

Yong Liu, Sule Bai, Guanbin Li et al.

CVPR 2024 • poster • arXiv:2312.04089

Open Vocabulary Semantic Scene Sketch Understanding

Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya

CVPR 2024 • poster • arXiv:2312.12463

Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

Xiangheng Shan, Dongyue Wu, Guilin Zhu et al.

CVPR 2024 • poster • arXiv:2406.09829 • 28 citations

Open-Vocabulary Video Anomaly Detection

Peng Wu, Xuerong Zhou, Guansong Pang et al.

CVPR 2024 • poster • arXiv:2311.07042 • 64 citations

Open-World Human-Object Interaction Detection via Multi-modal Prompts

Jie Yang, Bingliang Li, Ailing Zeng et al.

CVPR 2024 • poster • arXiv:2406.07221 • 31 citations

Open-World Semantic Segmentation Including Class Similarity

Matteo Sodano, Federico Magistri, Lucas Nunes et al.

CVPR 2024 • poster • arXiv:2403.07532

OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition

Yuchen Pan, Junjun Jiang, Kui Jiang et al.

CVPR 2024 • poster • arXiv:2402.18786 • 6 citations

Optimal Transport Aggregation for Visual Place Recognition

Sergio Izquierdo, Javier Civera

CVPR 2024 • poster • arXiv:2311.15937 • 138 citations

Optimizing Diffusion Noise Can Serve As Universal Motion Priors

Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan et al.

CVPR 2024 • poster • arXiv:2312.11994 • 68 citations

Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation

Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.

CVPR 2024 • poster • arXiv:2404.00417

OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning

Geng Xinyu, Jiaming Wang, Jiawei Gong et al.

CVPR 2024 • poster • arXiv:2403.13351 • 10 citations

Osprey: Pixel Understanding with Visual Instruction Tuning

Yuqian Yuan, Wentong Li, Jian Liu et al.

CVPR 2024 • poster • arXiv:2312.10032 • 147 citations

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

Tongjia Chen, Hongshan Yu, Zhengeng Yang et al.

CVPR 2024 • poster • arXiv:2312.00096