Most Cited CVPR "neural network analysis" Papers

5,589 papers found • Page 19 of 28

#3601

Countering Personalized Text-to-Image Generation with Influence Watermarks

Hanwen Liu, Zhicheng Sun, Yadong Mu

CVPR 2024 • poster
#3602

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Tanvir Mahmud, Yapeng Tian, Diana Marculescu

CVPR 2024 • poster • arXiv:2404.01751
#3603

Initialization Matters for Adversarial Transfer Learning

Andong Hua, Jindong Gu, Zhiyu Xue et al.

CVPR 2024 • poster • arXiv:2312.05716
#3604

MindBridge: A Cross-Subject Brain Decoding Framework

Shizun Wang, Songhua Liu, Zhenxiong Tan et al.

CVPR 2024 • highlight • arXiv:2404.07850
#3605

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction

Shiyi Zhang, Sule Bai, Guangyi Chen et al.

CVPR 2024 • poster • arXiv:2404.14471
#3606

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

Duy Tho Le, Chenhui Gou, Stavya Datta et al.

CVPR 2024 • poster • arXiv:2404.01686
#3607

Minimal Perspective Autocalibration

Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri et al.

CVPR 2024 • poster • arXiv:2405.05605
#3608

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Hongchi Xia, Yang Fu, Sifei Liu et al.

CVPR 2024 • poster • arXiv:2401.12592
#3609

Aligning and Prompting Everything All at Once for Universal Visual Perception

Yunhang Shen, Chaoyou Fu, Peixian Chen et al.

CVPR 2024 • poster • arXiv:2312.02153
#3610

Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2024 • poster • arXiv:2404.11291
#3611

Label Propagation for Zero-shot Classification with Vision-Language Models

Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias

CVPR 2024 • poster • arXiv:2404.04072
#3612

IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation

Mengshun Hu, Kui Jiang, Zhihang Zhong et al.

CVPR 2024 • poster
#3613

Efficient Dataset Distillation via Minimax Diffusion

Jianyang Gu, Saeed Vahidian, Vyacheslav Kungurtsev et al.

CVPR 2024 • poster • arXiv:2311.15529
#3614

Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

Kai Xu, Ziwei Yu, Xin Wang et al.

CVPR 2024 • highlight • arXiv:2305.00163
#3615

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang et al.

CVPR 2024 • poster • arXiv:2312.13834
#3616

RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction

Baptiste Brument, Robin Bruneau, Yvain Queau et al.

CVPR 2024 • poster • arXiv:2312.01215
#3617

Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection

Junxi Chen, Liang Li, Li Su et al.

CVPR 2024 • poster
#3618

DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning

Sikai Bai, Jie ZHANG, Song Guo et al.

CVPR 2024 • poster • arXiv:2403.08506
#3619

HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction

Yi ZHOU, Hui Zhang, Jiaqian Yu et al.

CVPR 2024 • poster • arXiv:2403.08639
#3620

LTA-PCS: Learnable Task-Agnostic Point Cloud Sampling

Jiaheng Liu, Jianhao Li, Kaisiyuan Wang et al.

CVPR 2024 • poster
#3621

SRTube: Video-Language Pre-Training with Action-Centric Video Tube Features and Semantic Role Labeling

Juhee Lee, Jewon Kang

CVPR 2024 • poster
#3622

Generative Quanta Color Imaging

Vishal Purohit, Junjie Luo, Yiheng Chi et al.

CVPR 2024 • poster • arXiv:2403.19066
#3623

SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

Saarthak Kapse, Pushpak Pati, Srijan Das et al.

CVPR 2024 • poster • arXiv:2312.15010
#3624

MINIMA: Modality Invariant Image Matching

Jiangwei Ren, Xingyu Jiang, Zizhuo Li et al.

CVPR 2025 • poster • arXiv:2412.19412
#3625

Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch

Yijie Liu, Xinyi Shang, Yiqun Zhang et al.

CVPR 2025 • poster • arXiv:2503.13227
#3626

SATA: Spatial Autocorrelation Token Analysis for Enhancing the Robustness of Vision Transformers

Nikaan Nikzad, YI LIAO, Yongsheng Gao et al.

CVPR 2025 • poster • arXiv:2409.19850
#3627

Tiled Diffusion

Or Madar, Ohad Fried

CVPR 2025 • poster • arXiv:2412.15185
#3628

AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

Li Lin, Santosh Santosh, Mingyang Wu et al.

CVPR 2025 • poster • arXiv:2406.00783
#3629

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

Yankai Jiang, Peng Zhang, Donglin Yang et al.

CVPR 2025 • poster • arXiv:2505.02753
#3630

Condensing Action Segmentation Datasets via Generative Network Inversion

Guodong Ding, Rongyu Chen, Angela Yao

CVPR 2025 • poster • arXiv:2503.14112
#3631

HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation

Hermann Kumbong, Xian Liu, Tsung-Yi Lin et al.

CVPR 2025 • poster • arXiv:2506.04421
#3632

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Felix Taubner, Ruihang Zhang, Mathieu Tuli et al.

CVPR 2025 • poster • arXiv:2412.12093
#3633

Task-Specific Gradient Adaptation for Few-Shot One-Class Classification

Yunlong Li, Xiabi Liu, Liyuan Pan et al.

CVPR 2025 • poster
#3634

Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Jona Ballé, Luca Versari, Emilien Dupont et al.

CVPR 2025 • highlight • arXiv:2412.00505
#3635

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Sili Chen, Hengkai Guo, Shengnan Zhu et al.

CVPR 2025 • highlight • arXiv:2501.12375
#3636

Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Ali Salar, Qing Liu, Yingli Tian et al.

CVPR 2025 • poster • arXiv:2503.10350
#3637

Structured 3D Latents for Scalable and Versatile 3D Generation

Jianfeng XIANG, Zelong Lv, Sicheng Xu et al.

CVPR 2025 • highlight • arXiv:2412.01506
#3638

Practical Solutions to the Relative Pose of Three Calibrated Cameras

Charalambos Tzamos, Viktor Kocur, Yaqing Ding et al.

CVPR 2025 • poster • arXiv:2303.16078
#3639

Temporal Alignment-Free Video Matching for Few-shot Action Recognition

SuBeen Lee, WonJun Moon, Hyun Seok Seong et al.

CVPR 2025 • poster • arXiv:2504.05956
#3640

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Jingnan Shi, Rajat Talak, Harry Zhang et al.

CVPR 2025 • highlight • arXiv:2412.01052
#3641

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Keda Tao, Can Qin, Haoxuan You et al.

CVPR 2025 • poster • arXiv:2411.15024
#3642

CoLLM: A Large Language Model for Composed Image Retrieval

Chuong Huynh, Jinyu Yang, Ashish Tawari et al.

CVPR 2025 • poster • arXiv:2503.19910
#3643

CLIP-driven Coarse-to-fine Semantic Guidance for Fine-grained Open-set Semi-supervised Learning

Xiaokun Li, Yaping Huang, Qingji Guan

CVPR 2025 • poster
#3644

Relative Pose Estimation through Affine Corrections of Monocular Depth Priors

Yifan Yu, Shaohui Liu, Rémi Pautrat et al.

CVPR 2025 • highlight • arXiv:2501.05446
#3645

Locality-Aware Zero-Shot Human-Object Interaction Detection

Sanghyun Kim, Deunsol Jung, Minsu Cho

CVPR 2025 • poster • arXiv:2505.19503
#3646

Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance

Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias et al.

CVPR 2025 • poster • arXiv:2501.05379
#3647

LIM: Large Interpolator Model for Dynamic Reconstruction

Remy Sabathier, Niloy J. Mitra, David Novotny

CVPR 2025 • poster • arXiv:2503.22537
#3648

MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation

Yuxiang Fu, Qi Yan, Ke Li et al.

CVPR 2025 • poster • arXiv:2503.09950
#3649

Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions

Chan Hur, Jeong-hun Hong, Dong-hun Lee et al.

CVPR 2025 • poster • arXiv:2503.05186
#3650

SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation

Claudia Cuttano, Gabriele Trivigno, Gabriele Rosi et al.

CVPR 2025 • highlight • arXiv:2411.17646
#3651

MITracker: Multi-View Integration for Visual Object Tracking

Mengjie Xu, Yitao Zhu, Haotian Jiang et al.

CVPR 2025 • highlight • arXiv:2502.20111
#3652

Bias for Action: Video Implicit Neural Representations with Bias Modulation

Alper Kayabasi, Anil Kumar Vadathya, Guha Balakrishnan et al.

CVPR 2025 • poster • arXiv:2501.09277
#3653

LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table

Yusuke Matsui

CVPR 2025 • poster • arXiv:2506.04790
#3654

Towards Effective and Sparse Adversarial Attack on Spiking Neural Networks via Breaking Invisible Surrogate Gradients

Li Lun, Kunyu Feng, Qinglong Ni et al.

CVPR 2025 • poster • arXiv:2503.03272
#3655

DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis

Yuming Gu, Phong Tran, Yujian Zheng et al.

CVPR 2025 • poster • arXiv:2503.15667
#3656

h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform

Toan Nguyen, Kien Do, Duc Kieu et al.

CVPR 2025 • poster • arXiv:2503.02187
#3657

EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching

Dongki Jung, Jaehoon Choi, Yonghan Lee et al.

CVPR 2025 • poster • arXiv:2502.20685
#3658

SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images

Kaiyu Li, Ruixun Liu, Xiangyong Cao et al.

CVPR 2025 • poster • arXiv:2410.01768
#3659

RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos

Yuxin Yao, Zhi Deng, Junhui Hou

CVPR 2025 • poster • arXiv:2503.16822
#3660

Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Zhiyang Guo, Jinxu Xiang, Kai Ma et al.

CVPR 2025 • highlight • arXiv:2411.18197
#3661

Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization

Dongkwan Lee, Kyomin Hwang, Nojun Kwak

CVPR 2025 • poster • arXiv:2503.13915
#3662

Exposure-slot: Exposure-centric Representations Learning with Slot-in-Slot Attention for Region-aware Exposure Correction

Donggoo Jung, DAEHYUN KIM, Guanghui Wang et al.

CVPR 2025 • poster
#3663

Explainable Saliency: Articulating Reasoning with Contextual Prioritization

Nuo Chen, Ming Jiang, Qi Zhao

CVPR 2025 • poster
#3664

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Senqiao Yang, Yukang Chen, Zhuotao Tian et al.

CVPR 2025 • poster • arXiv:2412.04467
#3665

HistoFS: Non-IID Histopathologic Whole Slide Image Classification via Federated Style Transfer with RoI-Preserving

Farchan Hakim Raswa, Chun-Shien Lu, Jia-Ching Wang

CVPR 2025 • poster
#3666

3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning

Yuncong Yang, Han Yang, Jiachen Zhou et al.

CVPR 2025 • poster • arXiv:2411.17735
#3667

EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation

Md Mostafijur Rahman, Radu Marculescu

CVPR 2025 • highlight
#3668

The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion

Changan Chen, Juze Zhang, Shrinidhi Kowshika Lakshmikanth et al.

CVPR 2025 • poster • arXiv:2412.10523
#3669

Parallelized Autoregressive Visual Generation

Yuqing Wang, Shuhuai Ren, Zhijie Lin et al.

CVPR 2025 • highlight • arXiv:2412.15119
#3670

One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion

Chunyang Cheng, Tianyang Xu, Zhenhua Feng et al.

CVPR 2025 • poster • arXiv:2502.19854
#3671

Using Diffusion Priors for Video Amodal Segmentation

Kaihua Chen, Deva Ramanan, Tarasha Khurana

CVPR 2025 • poster • arXiv:2412.04623
#3672

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

Jiantao Lin, Xin Yang, Meixi Chen et al.

CVPR 2025 • poster • arXiv:2503.01370
#3673

ChatGarment: Garment Estimation, Generation and Editing via Large Language Models

Siyuan Bian, Chenghao Xu, Yuliang Xiu et al.

CVPR 2025 • poster • arXiv:2412.17811
#3674

SpecTRe-GS: Modeling Highly Specular Surfaces with Reflected Nearby Objects by Tracing Rays in 3D Gaussian Splatting

Jiajun Tang, Fan Fei, Zhihao Li et al.

CVPR 2025 • highlight
#3675

Self-supervised ControlNet with Spatio-Temporal Mamba for Real-world Video Super-resolution

Shijun Shi, Jing Xu, Lijing Lu et al.

CVPR 2025 • poster • arXiv:2506.01037
#3676

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Jierun Chen, Dongting Hu, Xijie Huang et al.

CVPR 2025 • highlight • arXiv:2412.09619
#3677

From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech

Jihoon Kim, Jeongsoo Choi, Jaehun Kim et al.

CVPR 2025 • highlight • arXiv:2503.16956
#3678

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang et al.

CVPR 2025 • poster • arXiv:2412.01822
#3679

MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Ho Kei Cheng, Masato Ishii, Akio Hayakawa et al.

CVPR 2025 • poster • arXiv:2412.15322
#3680

Consistency Posterior Sampling for Diverse Image Synthesis

Vishal Purohit, Matthew Repasky, Jianfeng Lu et al.

CVPR 2025 • poster
#3681

Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps

Jeeyung Kim, Erfan Esmaeili Fakhabi, Qiang Qiu

CVPR 2025 • poster • arXiv:2411.15236
#3682

Co-op: Correspondence-based Novel Object Pose Estimation

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2025 • poster • arXiv:2503.17731
#3683

Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space

Yi Liu, Wengen Li, Jihong Guan et al.

CVPR 2025 • poster • arXiv:2503.23717
#3684

StageDesigner: Artistic Stage Generation for Scenography via Theater Scripts

Zhaoxing Gan, Mengtian Li, Ruhua Chen et al.

CVPR 2025 • poster • arXiv:2503.02595
#3685

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Yuhao Dong, Zuyan Liu, Hai-Long Sun et al.

CVPR 2025 • highlight • arXiv:2411.14432
#3686

Scaling Mesh Generation via Compressive Tokenization

Haohan Weng, Zibo Zhao, Biwen Lei et al.

CVPR 2025 • poster • arXiv:2411.07025
#3687

3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes

Jan Held, Renaud Vandeghen, Abdullah J Hamdi et al.

CVPR 2025 • highlight • arXiv:2411.14974
#3688

Disco4D: Disentangled 4D Human Generation and Animation from a Single Image

Hui En Pang, Shuai Liu, Zhongang Cai et al.

CVPR 2025 • poster • arXiv:2409.17280
#3689

ShowMak3r: Compositional TV Show Reconstruction

Sangmin Kim, Seunguk Do, Jaesik Park

CVPR 2025 • poster • arXiv:2504.19584
#3690

Vision-Language Model IP Protection via Prompt-based Learning

Lianyu Wang, Meng Wang, Huazhu Fu et al.

CVPR 2025 • poster • arXiv:2503.02393
#3691

Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera

Zhengdi Yu, Stefanos Zafeiriou, Tolga Birdal

CVPR 2025 • highlight • arXiv:2412.12861
#3692

Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration

JUNSEONG KIM, GeonU Kim, Kim Yu-Ji et al.

CVPR 2025 • highlight • arXiv:2502.16652
#3693

GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection

Jeffri Erwin Murrugarra Llerena, José Henrique Marques, Claudio Jung

CVPR 2025 • poster • arXiv:2502.01565
#3694

Seeing the Abstract: Translating the Abstract Language for Vision Language Models

Davide Talon, Federico Girella, Ziyue Liu et al.

CVPR 2025 • poster • arXiv:2505.03242
#3695

V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts

Adnen Abdessaied, Anna Rohrbach, Marcus Rohrbach et al.

CVPR 2025 • poster
#3696

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

Xingyu Liu, Gu Wang, Ruida Zhang et al.

CVPR 2025 • poster • arXiv:2411.16106
#3697

Conformal Prediction for Zero-Shot Models

Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz

CVPR 2025 • poster • arXiv:2505.24693
#3698

PhysAnimator: Physics-Guided Generative Cartoon Animation

Tianyi Xie, Yiwei Zhao, Ying Jiang et al.

CVPR 2025 • poster • arXiv:2501.16550
#3699

Pathways on the Image Manifold: Image Editing via Video Generation

Noam Rotstein, Gal Yona, Daniel Silver et al.

CVPR 2025 • poster • arXiv:2411.16819
#3700

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Sangwon Jang, June Suk Choi, Jaehyeong Jo et al.

CVPR 2025 • poster • arXiv:2503.09669
#3701

Generative Omnimatte: Learning to Decompose Video into Layers

Yao-Chih Lee, Erika Lu, Sarah Rumbley et al.

CVPR 2025 • highlight • arXiv:2411.16683
#3702

SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking

Wenrui Cai, Qingjie Liu, Yunhong Wang

CVPR 2025 • poster • arXiv:2503.18338
#3703

Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing

Jiayi Fu, Siyu Liu, Zikun Liu et al.

CVPR 2025 • poster • arXiv:2503.13147
#3704

LT3SD: Latent Trees for 3D Scene Diffusion

Quan Meng, Lei Li, Matthias Nießner et al.

CVPR 2025 • poster • arXiv:2409.08215
#3705

Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency

Yikai Wang, Chenjie Cao, Junqiu Yu et al.

CVPR 2025 • highlight • arXiv:2312.04831
#3706

GenAssets: Generating in-the-wild 3D Assets in Latent Space

Ze Yang, Jingkang Wang, Haowei Zhang et al.

CVPR 2025 • poster
#3707

PerLA: Perceptive 3D Language Assistant

Guofeng Mei, Wei Lin, Luigi Riz et al.

CVPR 2025 • poster • arXiv:2411.19774
#3708

HyperNVD: Accelerating Neural Video Decomposition via Hypernetworks

Maria Pilligua, Danna Xue, Javier Vazquez-Corral

CVPR 2025 • poster • arXiv:2503.17276
#3709

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding

Yan Wang, Baoxiong Jia, Ziyu Zhu et al.

CVPR 2025 • poster • arXiv:2504.19500
#3710

GLane3D: Detecting Lanes with Graph of 3D Keypoints

Halil İbrahim Öztürk, Muhammet Esat Kalfaoglu, Ozsel Kilinc

CVPR 2025 • poster • arXiv:2503.23882
#3711

Hyperbolic Safety-Aware Vision-Language Models

Tobia Poppi, Tejaswi Kasarla, Pascal Mettes et al.

CVPR 2025 • highlight • arXiv:2503.12127
#3712

PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

Hongjia Zhai, Hai Li, Zhenzhe Li et al.

CVPR 2025 • poster • arXiv:2503.18107
#3713

Community Forensics: Using Thousands of Generators to Train Fake Image Detectors

Jeongsoo Park, Andrew Owens

CVPR 2025 • poster • arXiv:2411.04125
#3714

Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves?

Yuan-Hong Liao, Rafid Mahmood, Sanja Fidler et al.

CVPR 2025 • poster • arXiv:2404.06510
#3715

Universal Domain Adaptation for Semantic Segmentation

Seun-An Choe, Keon Hee Park, Jinwoo Choi et al.

CVPR 2025 • poster • arXiv:2505.22458
#3716

Simulator HC: Regression-based Online Simulation of Starting Problem-Solution Pairs for Homotopy Continuation in Geometric Vision

Xinyue Zhang, Zijia Dai, Wanting Xu et al.

CVPR 2025 • highlight • arXiv:2411.03745
#3717

ArtFormer: Controllable Generation of Diverse 3D Articulated Objects

Jiayi Su, Youhe Feng, Zheng Li et al.

CVPR 2025 • poster • arXiv:2412.07237
#3718

Faster Parameter-Efficient Tuning with Token Redundancy Reduction

Kwonyoung Kim, Jungin Park, Jin Kim et al.

CVPR 2025 • poster • arXiv:2503.20282
#3719

PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset

Jiazhen Liu, Yuhan Fu, Ruobing Xie et al.

CVPR 2025 • highlight • arXiv:2403.11116
#3720

A Bias-Free Training Paradigm for More General AI-generated Image Detection

Fabrizio Guillaro, Giada Zingarini, Ben Usman et al.

CVPR 2025 • poster • arXiv:2412.17671
#3721

MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images

Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang et al.

CVPR 2025 • poster • arXiv:2412.02601
#3722

Accurate Differential Operators for Hybrid Neural Fields

Aditya Chetan, Guandao Yang, Zichen Wang et al.

CVPR 2025 • poster • arXiv:2312.05984
#3723

Do Computer Vision Foundation Models Learn the Low-level Characteristics of the Human Visual System?

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025 • highlight • arXiv:2502.20256
#3724

DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders

Sizai Hou, Songze Li, Duanyi Yao

CVPR 2025 • poster • arXiv:2411.16154
#3725

FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation

Kefan Chen, Chaerin Min, Linguang Zhang et al.

CVPR 2025 • highlight • arXiv:2412.02690
#3726

RelationField: Relate Anything in Radiance Fields

Sebastian Koch, Johanna Wald, Mirco Colosi et al.

CVPR 2025 • poster • arXiv:2412.13652
#3727

Multitwine: Multi-Object Compositing with Text and Layout Control

Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.

CVPR 2025 • highlight • arXiv:2502.05165
#3728

DepthSplat: Connecting Gaussian Splatting and Depth

Haofei Xu, Songyou Peng, Fangjinhua Wang et al.

CVPR 2025 • poster • arXiv:2410.13862
#3729

Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking

Phuc Nguyen, Minh Luu, Anh Tran et al.

CVPR 2025 • poster • arXiv:2411.16183
#3730

MultiMorph: On-demand Atlas Construction

Mazdak Abulnaga, Andrew Hoopes, Neel Dey et al.

CVPR 2025 • poster • arXiv:2504.00247
#3731

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh et al.

CVPR 2025 • poster • arXiv:2411.18688
#3732

Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

Zining Wang, Tongkun Guan, Pei Fu et al.

CVPR 2025 • poster • arXiv:2503.14140
#3733

Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior

Haitao Wu, Qing Li, Changqing Zhang et al.

CVPR 2025 • poster • arXiv:2503.04207
#3734

Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents

Yunseok Jang, Yeda Song, Sungryull Sohn et al.

CVPR 2025 • poster • arXiv:2505.12632
#3735

Progressive Focused Transformer for Single Image Super-Resolution

Wei Long, Xingyu Zhou, Leheng Zhang et al.

CVPR 2025 • poster • arXiv:2503.20337
#3736

FreeCloth: Free-form Generation Enhances Challenging Clothed Human Modeling

Hang Ye, Xiaoxuan Ma, Hai Ci et al.

CVPR 2025 • highlight • arXiv:2411.19942
#3737

From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective

Chen Zhao, Zhizhou Chen, Yunzhe Xu et al.

CVPR 2025 • poster • arXiv:2503.13165
#3738

Volumetrically Consistent 3D Gaussian Rasterization

Chinmay Talegaonkar, Yash Belhe, Ravi Ramamoorthi et al.

CVPR 2025 • highlight • arXiv:2412.03378
#3739

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu et al.

CVPR 2025 • poster • arXiv:2412.09856
#3740

EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting

Dong In Lee, Hyeongcheol Park, Jiyoung Seo et al.

CVPR 2025 • poster • arXiv:2412.11520
#3741

Motion Prompting: Controlling Video Generation with Motion Trajectories

Daniel Geng, Charles Herrmann, Junhwa Hur et al.

CVPR 2025 • poster • arXiv:2412.02700
#3742

ProHOC: Probabilistic Hierarchical Out-of-Distribution Classification via Multi-Depth Networks

Erik Wallin, Fredrik Kahl, Lars Hammarstrand

CVPR 2025 • poster • arXiv:2503.21397
#3743

Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality

Liyan Chen, Gregory P. Meyer, Zaiwei Zhang et al.

CVPR 2025 • highlight • arXiv:2412.16481
#3744

Exploration-Driven Generative Interactive Environments

Nedko Savov, Naser Kazemi, Mohammad Mahdi et al.

CVPR 2025 • poster • arXiv:2504.02515
#3745

GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery

Enguang Wang, Zhimao Peng, Zhengyuan Xie et al.

CVPR 2025 • poster • arXiv:2403.09974
#3746

ActiveGAMER: Active GAussian Mapping through Efficient Rendering

Liyan Chen, Huangying Zhan, Kevin Chen et al.

CVPR 2025 • poster • arXiv:2501.06897
#3747

SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation

Dekai Zhu, Yan Di, Stefan Gavranovic et al.

CVPR 2025 • poster • arXiv:2505.17721
#3748

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Wenbo Hu, Xiangjun Gao, Xiaoyu Li et al.

CVPR 2025 • highlight • arXiv:2409.02095
#3749

Rethinking Epistemic and Aleatoric Uncertainty for Active Open-Set Annotation: An Energy-Based Approach

Chen-Chen Zong, Sheng-Jun Huang

CVPR 2025 • poster • arXiv:2502.19691
#3750

InteractVLM: 3D Interaction Reasoning from 2D Foundational Models

Sai Kumar Dwivedi, Dimitrije Antić, Shashank Tripathi et al.

CVPR 2025 • poster • arXiv:2504.05303
#3751

Reconstructing Animals and the Wild

Peter Kulits, Michael J. Black, Silvia Zuffi

CVPR 2025 • poster • arXiv:2411.18807
#3752

Controllable Human Image Generation with Personalized Multi-Garments

Yisol Choi, Sangkyung Kwak, Sihyun Yu et al.

CVPR 2025 • poster • arXiv:2411.16801
#3753

PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

Weijie Zhou, Manli Tao, Chaoyang Zhao et al.

CVPR 2025 • poster • arXiv:2503.08481
#3754

MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors

Fanqi Pu, Yifan Wang, Jiru Deng et al.

CVPR 2025 • poster • arXiv:2410.19590
#3755

Steepest Descent Density Control for Compact 3D Gaussian Splatting

Peihao Wang, Yuehao Wang, Dilin Wang et al.

CVPR 2025 • poster • arXiv:2505.05587
#3756

Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen et al.

CVPR 2025 • poster • arXiv:2503.06873
#3757

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Mingkun Lei, Xue Song, Beier Zhu et al.

CVPR 2025 • poster • arXiv:2412.08503
#3758

Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World

Bangyan Liao, Zhenjun Zhao, Haoang Li et al.

CVPR 2025 • poster • arXiv:2505.04788
#3759

Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

Henghui Du, Guangyao Li, Chang Zhou et al.

CVPR 2025 • poster • arXiv:2503.13068
#3760

AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

Yuheng Xu, Shijie Yang, Xin Liu et al.

CVPR 2025 • poster • arXiv:2503.01565
#3761

Arbitrary-steps Image Super-resolution via Diffusion Inversion

Zongsheng Yue, Kang Liao, Chen Change Loy

CVPR 2025 • poster • arXiv:2412.09013
#3762

MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM

Vladimir Yugay, Theo Gevers, Martin R. Oswald

CVPR 2025 • poster • arXiv:2411.16785
#3763

LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation

Vladan Stojnić, Yannis Kalantidis, Jiri Matas et al.

CVPR 2025 • poster • arXiv:2503.19777
#3764

EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion

Haotian Wang, Yuzhe Weng, Yueyan Li et al.

CVPR 2025 • poster • arXiv:2411.16726
#3765

Subnet-Aware Dynamic Supernet Training for Neural Architecture Search

Jeimin Jeon, Youngmin Oh, Junghyup Lee et al.

CVPR 2025 • poster • arXiv:2503.10740
#3766

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.

CVPR 2025 • poster • arXiv:2503.21747
#3767

Detecting Open World Objects via Partial Attribute Assignment

Muli Yang, Gabriel James Goenawan, Huaiyuan Qin et al.

CVPR 2025 • poster
#3768

Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Hyelin Nam, Jaemin Kim, Dohun Lee et al.

CVPR 2025 • poster • arXiv:2411.15540
#3769

Let's Verify and Reinforce Image Generation Step by Step

Renrui Zhang, Chengzhuo Tong, Zhizheng Zhao et al.

CVPR 2025 • poster
#3770

Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB

Nikhil Behari, Aaron Young, Siddharth Somasundaram et al.

CVPR 2025 • highlight • arXiv:2411.19474
#3771

LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding

Hongyu Li, Jinyu Chen, Ziyu Wei et al.

CVPR 2025 • poster • arXiv:2501.08282
#3772

Dynamic Neural Surfaces for Elastic 4D Shape Representation and Analysis

Awais Nizamani, Hamid Laga, Guanjin Wang et al.

CVPR 2025 • poster • arXiv:2503.03132
#3773

SGC-Net: Stratified Granular Comparison Network for Open-Vocabulary HOI Detection

Xin Lin, Chong Shi, Zuopeng Yang et al.

CVPR 2025 • poster • arXiv:2503.00414
#3774

FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training

Anjia Cao, Xing Wei, Zhiheng Ma

CVPR 2025 • poster • arXiv:2411.11927
#3775

ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding

Qihang Peng, Henry Zheng, Gao Huang

CVPR 2025 • poster • arXiv:2502.19247
#3776

Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression

Xiaoyi Qu, David Aponte, Colby Banbury et al.

CVPR 2025 • poster • arXiv:2502.16638
#3777

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Lei Li, Wei Yuancheng, Zhihui Xie et al.

CVPR 2025 • highlight • arXiv:2411.17451
#3778

Feature-Preserving Mesh Decimation for Normal Integration

Moritz Heep, Sven Behnke, Eduard Zell

CVPR 2025 • poster • arXiv:2504.00867
#3779

BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models

Taha Koleilat, Hojat Asgariandehkordi, Hassan Rivaz et al.

CVPR 2025 • poster • arXiv:2411.15232
#3780

Conical Visual Concentration for Efficient Large Vision-Language Models

Long Xing, Qidong Huang, Xiaoyi Dong et al.

CVPR 2025 • poster
#3781

Prior-free 3D Object Tracking

Xiuqiang Song, Li Jin, Zhengxian Zhang et al.

CVPR 2025 • highlight
#3782

OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels

Meng Lou, Yizhou Yu

CVPR 2025 • poster • arXiv:2502.20087
#3783

Cross-modal Information Flow in Multimodal Large Language Models

Zhi Zhang, Srishti Yadav, Fengze Han et al.

CVPR 2025 • poster • arXiv:2411.18620
#3784

Sparse Point Cloud Patches Rendering via Splitting 2D Gaussians

Changfeng Ma, Ran Bi, Jie Guo et al.

CVPR 2025 • poster • arXiv:2505.09413
#3785

Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset

Zhao Dong, Ka Chen, Zhaoyang Lv et al.

CVPR 2025 • highlight • arXiv:2504.08541
#3786

IEEE Computer Society

CVPR 2025
#3787

RORem: Training a Robust Object Remover with Human-in-the-Loop

Ruibin Li, Tao Yang, Song Guo et al.

CVPR 2025 • poster • arXiv:2501.00740
#3788

FG^2: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

Zimin Xia, Alex Alahi

CVPR 2025 • poster • arXiv:2503.18725
#3789

MatAnyone: Stable Video Matting with Consistent Memory Propagation

Peiqing Yang, Shangchen Zhou, Jixin Zhao et al.

CVPR 2025 • poster • arXiv:2501.14677
#3790

Hierarchical Compact Clustering Attention (COCA) for Unsupervised Object-Centric Learning

Can Küçüksözen, Yucel Yemez

CVPR 2025 • poster • arXiv:2505.02071
#3791

Pay Attention to the Foreground in Object-Centric Learning

Pinzhuo Tian, Shengjie Yang, Hang Yu et al.

CVPR 2025 • poster
#3792

MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis

Tianyu Wang, Jianming Zhang, Haitian Zheng et al.

CVPR 2025 • poster • arXiv:2412.02635
#3793

Population Normalization for Federated Learning

Zhuoyao Wang, Fan Yi, Peizhu Gong et al.

CVPR 2025 • poster
#3794

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

Qianlong Xiang, Miao Zhang, Yuzhang Shang et al.

CVPR 2025 • poster • arXiv:2409.03550
#3795

Satellite to GroundScape - Large-scale Consistent Ground View Generation from Satellite Views

Ningli Xu, Rongjun Qin

CVPR 2025 • poster • arXiv:2504.15786
#3796

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Xuan Shen, Weize Ma, Jing Liu et al.

CVPR 2025 • poster • arXiv:2503.16709
#3797

HUNet: Homotopy Unfolding Network for Image Compressive Sensing

Feiyang Shen, Hongping Gan

CVPR 2025 • poster
#3798

The Devil is in Low-Level Features for Cross-Domain Few-Shot Segmentation

Yuhan Liu, Yixiong Zou, Yuhua Li et al.

CVPR 2025 • poster • arXiv:2503.21150
#3799

A Physics-Informed Blur Learning Framework for Imaging Systems

Liqun Chen, Yuxuan Li, Jun Dai et al.

CVPR 2025 • poster
#3800

Hiding Images in Diffusion Models by Editing Learned Score Functions

Haoyu Chen, Yunqiao Yang, Nan Zhong et al.

CVPR 2025 • poster • arXiv:2503.18459