CVPR Poster Papers
OmniMotionGPT: Animal Motion Generation with Limited Data
Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan et al.
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan, Sibo Song, Wenwen Yu et al.
Omni-Q: Omni-Directional Scene Understanding for Unsupervised Visual Grounding
Sai Wang, Yutian Lin, Yu Wu
OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees
Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.
OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning
Haiyang Ying, Yixuan Yin, Jinzhi Zhang et al.
OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning
Siddharth Srivastava, Gaurav Sharma
OmniViD: A Generative Framework for Universal Video Understanding
Junke Wang, Dongdong Chen, Chong Luo et al.
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
Hancheng Ye, Chong Yu, Peng Ye et al.
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
Minghua Liu, Ruoxi Shi, Linghao Chen et al.
One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning
Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen et al.
OneFormer3D: One Transformer for Unified Point Cloud Segmentation
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin et al.
OneLLM: One Framework to Align All Modalities with Language
Jiaming Han, Kaixiong Gong, Yiyuan Zhang et al.
One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls
Minghui Hu, Jianbin Zheng, Chuanxia Zheng et al.
One-Prompt to Segment All Medical Images
Wu, Min Xu
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
Lin Li, Haoyan Guan, Jianing Qiu et al.
One-Shot Open Affordance Learning with Foundation Models
Gen Li, Deqing Sun, Laura Sevilla-Lara et al.
One-Shot Structure-Aware Stylized Image Synthesis
Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.
One-step Diffusion with Distribution Matching Distillation
Tianwei Yin, Michaël Gharbi, Richard Zhang et al.
On Exact Inversion of DPM-Solvers
Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.
Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory
Fei Ye, Adrian Bors
On Scaling Up a Multilingual Vision and Language Model
Xi Chen, Josip Djolonga, Piotr Padlewski et al.
On the Content Bias in Fréchet Video Distance
Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.
On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm
Peng Sun, Bei Shi, Daiwei Yu et al.
On the Faithfulness of Vision Transformer Explanations
Junyi Wu, Weitai Kang, Hao Tang et al.
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren et al.
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang et al.
On the Scalability of Diffusion-based Text-to-Image Generation
Hao Li, Yang Zou, Ying Wang et al.
On the Test-Time Zero-Shot Generalization of Vision-Language Models: Do We Really Need Prompt Learning?
Maxime Zanella, Ismail Ben Ayed
On Train-Test Class Overlap and Detection for Image Retrieval
Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang et al.
OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising
Haichao Zhang, Yi Xu, Hongsheng Lu et al.
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.
Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
Sebastian Koch, Narunas Vaskevicius, Mirco Colosi et al.
OpenEQA: Embodied Question Answering in the Era of Foundation Models
Arjun Majumdar, Anurag Ajay, Xiaohan Zhang et al.
Open-Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.
OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis et al.
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan SanMiguel et al.
Open-Vocabulary Segmentation with Semantic-Assisted Calibration
Yong Liu, Sule Bai, Guanbin Li et al.
Open Vocabulary Semantic Scene Sketch Understanding
Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya
Open-Vocabulary Semantic Segmentation with Image Embedding Balancing
Xiangheng Shan, Dongyue Wu, Guilin Zhu et al.
Open-Vocabulary Video Anomaly Detection
Peng Wu, Xuerong Zhou, Guansong Pang et al.
Open-World Human-Object Interaction Detection via Multi-modal Prompts
Jie Yang, Bingliang Li, Ailing Zeng et al.
Open-World Semantic Segmentation Including Class Similarity
Matteo Sodano, Federico Magistri, Lucas Nunes et al.
OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition
Yuchen Pan, Junjun Jiang, Kui Jiang et al.
Optimal Transport Aggregation for Visual Place Recognition
Sergio Izquierdo, Javier Civera
Optimizing Diffusion Noise Can Serve As Universal Motion Priors
Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan et al.
Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation
Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.
OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning
Geng Xinyu, Jiaming Wang, Jiawei Gong et al.
Osprey: Pixel Understanding with Visual Instruction Tuning
Yuqian Yuan, Wentong Li, Jian Liu et al.
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Tongjia Chen, Hongshan Yu, Zhengeng Yang et al.