Most Cited 2024 "gui agents" Papers

12,324 papers found

#10601

Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes

Xiaotian Sun, Qingshan Xu, Xinjie Yang et al.

CVPR 2024 poster
#10602

Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

Zihan Wang, Xiangyang Li, Jiahao Yang et al.

CVPR 2024 highlight arXiv:2404.01943
#10603

Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis

Simon Niedermayr, Josef Stumpfegger, Rüdiger Westermann

CVPR 2024 poster arXiv:2401.02436
#10604

The STVchrono Dataset: Towards Continuous Change Recognition in Time

Yanjun Sun, Yue Qiu, Mariia Khan et al.

CVPR 2024 poster
#10605

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

CVPR 2024 poster
#10606

Motion Blur Decomposition with Cross-shutter Guidance

Xiang Ji, Haiyang Jiang, Yinqiang Zheng

CVPR 2024 poster arXiv:2404.01120
#10607

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

Gongwei Chen, Leyang Shen, Rui Shao et al.

CVPR 2024 poster arXiv:2311.11860
#10608

Pixel-Aligned Language Model

Jiarui Xu, Xingyi Zhou, Shen Yan et al.

CVPR 2024 poster
#10609

Eclipse: Disambiguating Illumination and Materials using Unintended Shadows

Dor Verbin, Ben Mildenhall, Peter Hedman et al.

CVPR 2024 poster arXiv:2305.16321
#10610

ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie et al.

CVPR 2024 poster arXiv:2403.17936
#10611

2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images

Junkai Deng, Fei Hou, Xuhui Chen et al.

CVPR 2024 poster arXiv:2303.15368
#10612

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024 poster arXiv:2403.01807
#10613

Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.

CVPR 2024 highlight arXiv:2404.07949
#10614

CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-driven Video Editing

Guiwei Zhang, Tianyu Zhang, Guanglin Niu et al.

CVPR 2024 poster
#10615

DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation

Yuanchen Wu, Xichen Ye, Kequan Yang et al.

CVPR 2024 poster arXiv:2403.11184
#10616

A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction

Jin Gong, Runzhao Yang, Weihang Zhang et al.

CVPR 2024 poster
#10617

NAPGuard: Towards Detecting Naturalistic Adversarial Patches

Siyang Wu, Jiakai Wang, Jiejie Zhao et al.

CVPR 2024 poster
#10618

Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning

Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

CVPR 2024 poster arXiv:2311.13612
#10619

A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

Xiaoyang Xu, Mengda Yang, Wenzhe Yi et al.

CVPR 2024 poster arXiv:2405.04115
#10620

Bootstrapping SparseFormers from Vision Foundation Models

Ziteng Gao, Zhan Tong, Kevin Qinghong Lin et al.

CVPR 2024 poster arXiv:2312.01987
#10621

Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity

Ruijie Quan, Wenguan Wang, Zhibo Tian et al.

CVPR 2024 poster arXiv:2403.20022
#10622

G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images

Zixiong Huang, Qi Chen, Libo Sun et al.

CVPR 2024 poster arXiv:2404.07474
#10623

Active Prompt Learning in Vision Language Models

Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee

CVPR 2024 poster arXiv:2311.11178
#10624

Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline

Yu Chen, Fei Gao, Yanguang Zhang et al.

CVPR 2024 poster
#10625

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.

CVPR 2024 poster arXiv:2404.08540
#10626

SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model

Inhwan Bae, Young-Jae Park, Hae-Gon Jeon

CVPR 2024 poster arXiv:2403.18452
#10627

Domain Separation Graph Neural Networks for Saliency Object Ranking

Zijian Wu, Jun Lu, Jing Han et al.

CVPR 2024 poster
#10628

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.

CVPR 2024 poster arXiv:2501.05272
#10629

Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

Yuqi Wang, Jiawei He, Lue Fan et al.

CVPR 2024 poster arXiv:2311.17918
#10630

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

Changhoon Kim, Kyle Min, Maitreya Patel et al.

CVPR 2024 poster arXiv:2306.04744
#10631

MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Junwen Huang, Hao Yu, Kuan-Ting Yu et al.

CVPR 2024 poster arXiv:2403.01517
#10632

Resource-Efficient Transformer Pruning for Finetuning of Large Models

Fatih Ilhan, Gong Su, Selim Tekin et al.

CVPR 2024 poster
#10633

Link-Context Learning for Multimodal LLMs

Yan Tai, Weichen Fan, Zhao Zhang et al.

CVPR 2024 poster arXiv:2308.07891
#10634

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Ragav Sachdeva, Andrew Zisserman

CVPR 2024 poster arXiv:2401.10224
#10635

Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack

Sabbir Ahmed, Ranyang Zhou, Shaahin Angizi et al.

CVPR 2024 poster
#10636

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.

CVPR 2024 highlight arXiv:2312.05247
#10637

Language-aware Visual Semantic Distillation for Video Question Answering

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2024 poster
#10638

3DInAction: Understanding Human Actions in 3D Point Clouds

Yizhak Ben-Shabat, Oren Shrout, Stephen Gould

CVPR 2024 highlight arXiv:2303.06346
#10639

DiLiGenRT: A Photometric Stereo Dataset with Quantified Roughness and Translucency

Heng Guo, Jieji Ren, Feishi Wang et al.

CVPR 2024 poster
#10640

StyLitGAN: Image-Based Relighting via Latent Control

Anand Bhattad, James Soole, David Forsyth

CVPR 2024 poster
#10641

Label-Efficient Group Robustness via Out-of-Distribution Concept Curation

Yiwei Yang, Anthony Liu, Robert Wolfe et al.

CVPR 2024 poster
#10642

Unsupervised Universal Image Segmentation

XuDong Wang, Dantong Niu, Xinyang Han et al.

CVPR 2024 poster arXiv:2312.17243
#10643

Batch Normalization Alleviates the Spectral Bias in Coordinate Networks

Zhicheng Cai, Hao Zhu, Qiu Shen et al.

CVPR 2024 poster
#10644

Not All Classes Stand on Same Embeddings: Calibrating a Semantic Distance with Metric Tensor

Jae Hyeon Park, Gyoomin Lee, Seunggi Park et al.

CVPR 2024 poster
#10645

CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras

Sachin Shah, Matthew Chan, Haoming Cai et al.

CVPR 2024 poster arXiv:2406.09409
#10646

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

Zhe Li, Zerong Zheng, Lizhen Wang et al.

CVPR 2024 poster
#10647

Retrieval-Augmented Open-Vocabulary Object Detection

Jooyeon Kim, Eulrang Cho, Sehyung Kim et al.

CVPR 2024 poster arXiv:2404.05687
#10648

NB-GTR: Narrow-Band Guided Turbulence Removal

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

CVPR 2024 poster
#10649

LangSplat: 3D Language Gaussian Splatting

Minghan Qin, Wanhua Li, Jiawei Zhou et al.

CVPR 2024 highlight arXiv:2312.16084
#10650

Positive-Unlabeled Learning by Latent Group-Aware Meta Disambiguation

Lin Long, Haobo Wang, Zhijie Jiang et al.

CVPR 2024 poster
#10651

Text-conditional Attribute Alignment across Latent Spaces for 3D Controllable Face Image Synthesis

FeiFan Xu, Rui Li, Si Wu et al.

CVPR 2024 poster
#10652

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Antoine Guédon, Vincent Lepetit

CVPR 2024 poster arXiv:2311.12775
#10653

DiffusionPoser: Real-time Human Motion Reconstruction From Arbitrary Sparse Sensors Using Autoregressive Diffusion

Tom Van Wouwe, Seunghwan Lee, Antoine Falisse et al.

CVPR 2024 poster arXiv:2308.16682
#10654

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Jingbo Zhang, Xiaoyu Li, Qi Zhang et al.

CVPR 2024 poster arXiv:2311.16961
#10655

CurveCloudNet: Processing Point Clouds with 1D Structure

Colton Stearns, Alex Fu, Jiateng Liu et al.

CVPR 2024 poster arXiv:2303.12050
#10656

Harnessing Meta-Learning for Improving Full-Frame Video Stabilization

Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim et al.

CVPR 2024 poster arXiv:2403.03662
#10657

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Junhao Zheng, Chenhao Lin, Jiahao Sun et al.

CVPR 2024 poster arXiv:2403.17301
#10658

SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects

Abhinav Kumar, Yuliang Guo, Xinyu Huang et al.

CVPR 2024 poster arXiv:2403.20318
#10659

MoML: Online Meta Adaptation for 3D Human Motion Prediction

Xiaoning Sun, Huaijiang Sun, Bin Li et al.

CVPR 2024 poster
#10660

Learning with Structural Labels for Learning with Noisy Labels

Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee

CVPR 2024 poster
#10661

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.

CVPR 2024 poster arXiv:2310.06627
#10662

Incremental Nuclei Segmentation from Histopathological Images via Future-class Awareness and Compatibility-inspired Distillation

Huyong Wang, Huisi Wu, Jing Qin

CVPR 2024 poster
#10663

Model Inversion Robustness: Can Transfer Learning Help?

Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran et al.

CVPR 2024 poster arXiv:2405.05588
#10664

Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection

Xiaowei Zhao, Xianglong Liu, Duorui Wang et al.

CVPR 2024 poster
#10665

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.

CVPR 2024 poster arXiv:2312.05849
#10666

MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

Boyang Peng, Sanqing Qu, Yong Wu et al.

CVPR 2024 poster arXiv:2403.04149
#10667

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

Xin Zhou, Dingkang Liang, Wei Xu et al.

CVPR 2024 poster arXiv:2403.01439
#10668

EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

CVPR 2024 poster arXiv:2405.06880
#10669

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024 poster arXiv:2311.18387
#10670

Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models

Bin Fu, Fanghua Yu, Anran Liu et al.

CVPR 2024 poster
#10671

A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals

Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.

CVPR 2024 poster arXiv:2404.04890
#10672

MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning

Mohamed Abdelfattah, Mariam Hassan, Alex Alahi

CVPR 2024 poster
#10673

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Dinh Phat Do, Taehoon Kim, Jaemin Na et al.

CVPR 2024 poster arXiv:2403.09359
#10674

MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying

Ryan Burgert, Brian Price, Jason Kuen et al.

CVPR 2024 poster
#10675

Intrinsic Image Diffusion for Indoor Single-view Material Estimation

Peter Kocsis, Vincent Sitzmann, Matthias Nießner

CVPR 2024 poster arXiv:2312.12274
#10676

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Yuechen Zhang, Shengju Qian, Bohao Peng et al.

CVPR 2024 poster arXiv:2312.04302
#10677

Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu et al.

CVPR 2024 poster arXiv:2312.00084
#10678

NetTrack: Tracking Highly Dynamic Objects with a Net

Guangze Zheng, Shijie Lin, Haobo Zuo et al.

CVPR 2024 poster arXiv:2403.11186
#10679

Scaling Up Video Summarization Pretraining with Large Language Models

Dawit Argaw Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.

CVPR 2024 poster arXiv:2404.03398
#10680

Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory

飞 叶, Adrian Bors

CVPR 2024 poster
#10681

FADES: Fair Disentanglement with Sensitive Relevance

Taeuk Jang, Xiaoqian Wang

CVPR 2024 poster
#10682

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

Gengyu Zhang, Hao Tang, Yan Yan

CVPR 2024 poster arXiv:2404.02176
#10683

Improving Depth Completion via Depth Feature Upsampling

Yufei Wang, Ge Zhang, Shaoqian Wang et al.

CVPR 2024 poster
#10684

Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption

Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii et al.

CVPR 2024 poster arXiv:2303.17166
#10685

MRFS: Mutually Reinforcing Image Fusion and Segmentation

Hao Zhang, Xuhui Zuo, Jie Jiang et al.

CVPR 2024 poster
#10686

Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning

Jaewoo Jeong, Daehee Park, Kuk-Jin Yoon

CVPR 2024 highlight arXiv:2404.05218
#10687

OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning

Noor Ahmed, Anna Kukleva, Bernt Schiele

CVPR 2024 highlight arXiv:2403.18550
#10688

3D-LFM: Lifting Foundation Model

Mosam Dabhi, László A. Jeni, Simon Lucey

CVPR 2024 poster arXiv:2312.11894
#10689

LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation

Ke Guo, Zhenwei Miao, Wei Jing et al.

CVPR 2024 poster arXiv:2403.17601
#10690

HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

Haithem Turki, Vasu Agrawal, Samuel Rota Bulò et al.

CVPR 2024 highlight arXiv:2312.03160
#10691

IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration

Tai Ma, Suwei Zhang, Jiafeng Li et al.

CVPR 2024 poster
#10692

SEED-Bench: Benchmarking Multimodal Large Language Models

Bohao Li, Yuying Ge, Yixiao Ge et al.

CVPR 2024 poster
#10693

Style Aligned Image Generation via Shared Attention

Amir Hertz, Andrey Voynov, Shlomi Fruchter et al.

CVPR 2024 poster arXiv:2312.02133
#10694

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows

Zhenggang Tang, Jason Ren, Xiaoming Zhao et al.

CVPR 2024 poster arXiv:2406.10543
#10695

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu et al.

CVPR 2024 poster arXiv:2312.02813
#10696

Active Domain Adaptation with False Negative Prediction for Object Detection

Yuzuru Nakamura, Yasunori Ishii, Takayoshi Yamashita

CVPR 2024 highlight
#10697

LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

Qihao Zhao, Yalun Dai, Hao Li et al.

CVPR 2024 poster arXiv:2403.05854
#10698

How to Train Neural Field Representations: A Comprehensive Study and Benchmark

Samuele Papa, Riccardo Valperga, David Knigge et al.

CVPR 2024 poster arXiv:2312.10531
#10699

Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring

Chengxu Liu, Xuan Wang, Xiangyu Xu et al.

CVPR 2024 poster arXiv:2404.13153
#10700

SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation

Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu et al.

CVPR 2024 poster arXiv:2401.08053
#10701

Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector

Yifu Ding, Weilun Feng, Chuyan Chen et al.

CVPR 2024 poster
#10702

FREE: Faster and Better Data-Free Meta-Learning

Yongxian Wei, Zixuan Hu, Zhenyi Wang et al.

CVPR 2024 poster arXiv:2405.00984
#10703

Open Vocabulary Semantic Scene Sketch Understanding

Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya

CVPR 2024 poster arXiv:2312.12463
#10704

You Only Need Less Attention at Each Stage in Vision Transformers

Shuoxi Zhang, Hanpeng Liu, Stephen Lin et al.

CVPR 2024 poster arXiv:2406.00427
#10705

Hierarchical Patch Diffusion Models for High-Resolution Video Generation

Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin et al.

CVPR 2024 poster arXiv:2406.07792
#10706

Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection

Chuangchuang Tan, Huan Liu, Yao Zhao et al.

CVPR 2024 poster arXiv:2312.10461
#10707

Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception

Haoming Chen, Zhizhong Zhang, Yanyun Qu et al.

CVPR 2024 poster arXiv:2405.07201
#10708

BoQ: A Place is Worth a Bag of Learnable Queries

Amar Ali-bey, Brahim Chaib-draa, Philippe Giguère

CVPR 2024 poster arXiv:2405.07364
#10709

UFC-Net: Unrolling Fixed-point Continuous Network for Deep Compressive Sensing

Xiaoyang Wang, Hongping Gan

CVPR 2024 poster
#10710

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

Haoyi Jiang, Tianheng Cheng, Naiyu Gao et al.

CVPR 2024 poster arXiv:2306.15670
#10711

CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

Sajid Javed, Arif Mahmood, Iyyakutti Iyappan Ganapathi et al.

CVPR 2024 poster arXiv:2406.05205
#10712

MaskPLAN: Masked Generative Layout Planning from Partial Input

Hang Zhang, Anton Savov, Benjamin Dillenburger

CVPR 2024 poster
#10713

Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers

Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire et al.

CVPR 2024 poster arXiv:2404.07292
#10714

Towards Memorization-Free Diffusion Models

Chen Chen, Daochang Liu, Chang Xu

CVPR 2024 poster arXiv:2404.00922
#10715

AV-RIR: Audio-Visual Room Impulse Response Estimation

Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar et al.

CVPR 2024 poster arXiv:2312.00834
#10716

Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Zhiyuan Min, Yawei Luo, Wei Yang et al.

CVPR 2024 poster arXiv:2311.11845
#10717

A-Teacher: Asymmetric Network for 3D Semi-Supervised Object Detection

Hanshi Wang, Zhipeng Zhang, Jin Gao et al.

CVPR 2024 poster
#10718

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen et al.

CVPR 2024 poster arXiv:2403.01693
#10719

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Bin Xiao, Haiping Wu, Weijian Xu et al.

CVPR 2024 poster arXiv:2311.06242
#10720

DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2024 poster
#10721

3D Feature Tracking via Event Camera

Siqi Li, Zhou Zhikuan, Zhou Xue et al.

CVPR 2024 poster
#10722

Frequency-aware Event-based Video Deblurring for Real-World Motion Blur

Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon

CVPR 2024 poster
#10723

FedHCA2: Towards Hetero-Client Federated Multi-Task Learning

Yuxiang Lu, Suizhi Huang, Yuwen Yang et al.

CVPR 2024 poster
#10724

Improving Unsupervised Hierarchical Representation with Reinforcement Learning

Ruyi An, Yewen Li, Xu He et al.

CVPR 2024 poster
#10725

Global Latent Neural Rendering

Thomas Tanay, Matteo Maggioni

CVPR 2024 poster arXiv:2312.08338
#10726

Data Poisoning based Backdoor Attacks to Contrastive Learning

Jinghuai Zhang, Hongbin Liu, Jinyuan Jia et al.

CVPR 2024 poster arXiv:2211.08229
#10727

SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

Tao Wu, Runyu He, Gangshan Wu et al.

CVPR 2024 poster arXiv:2404.04565
#10728

Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

Jiequan Cui, Beier Zhu, Xin Wen et al.

CVPR 2024 poster arXiv:2402.18133
#10729

ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object

Chenshuang Zhang, Fei Pan, Junmo Kim et al.

CVPR 2024 highlight arXiv:2403.18775
#10730

BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition

Yuxuan Zhou, Xudong Yan, Zhi-Qi Cheng et al.

CVPR 2024 poster
#10731

Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors

Yu Zhang, Songpengcheng Xia, Lei Chu et al.

CVPR 2024 poster arXiv:2312.02196
#10732

Person-in-WiFi 3D: End-to-End Multi-Person 3D Pose Estimation with Wi-Fi

Kangwei Yan, Fei Wang, Bo Qian et al.

CVPR 2024 poster
#10733

ERMVP: Communication-Efficient and Collaboration-Robust Multi-Vehicle Perception in Challenging Environments

Jingyu Zhang, Kun Yang, Yilei Wang et al.

CVPR 2024 poster
#10734

GRAM: Global Reasoning for Multi-Page VQA

Itshak Blau, Sharon Fogel, Roi Ronen et al.

CVPR 2024 poster arXiv:2401.03411
#10735

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

Qifan Yu, Juncheng Li, Longhui Wei et al.

CVPR 2024 poster arXiv:2311.13614
#10736

Tri-Perspective View Decomposition for Geometry-Aware Depth Completion

Zhiqiang Yan, Yuankai Lin, Kun Wang et al.

CVPR 2024 poster arXiv:2403.15008
#10737

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric

Haokun Lin, Haoli Bai, Zhili Liu et al.

CVPR 2024 poster arXiv:2403.07839
#10738

DiffusionRegPose: Enhancing Multi-Person Pose Estimation using a Diffusion-Based End-to-End Regression Approach

Dayi Tan, Hansheng Chen, Wei Tian et al.

CVPR 2024 poster
#10739

Tumor Micro-environment Interactions Guided Graph Learning for Survival Analysis of Human Cancers from Whole-slide Pathological Images

Wei Shao, YangYang Shi, Daoqiang Zhang et al.

CVPR 2024 poster
#10740

Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Guangyang Wu, Xin Tao, Changlin Li et al.

CVPR 2024 poster arXiv:2404.06692
#10741

Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation

Qi Yang, Xing Nie, Tong Li et al.

CVPR 2024 highlight arXiv:2312.06462
#10742

Exact Fusion via Feature Distribution Matching for Few-shot Image Generation

Yingbo Zhou, Yutong Ye, Pengyu Zhang et al.

CVPR 2024 poster
#10743

Fooling Polarization-Based Vision using Locally Controllable Polarizing Projection

Zhuoxiao Li, Zhihang Zhong, Shohei Nobuhara et al.

CVPR 2024 poster arXiv:2303.17890
#10744

Affine Equivariant Networks Based on Differential Invariants

Yikang Li, Yeqing Qiu, Yuxuan Chen et al.

CVPR 2024 poster
#10745

Diffusion-based Blind Text Image Super-Resolution

Yuzhe Zhang, Jiawei Zhang, Hao Li et al.

CVPR 2024 poster arXiv:2312.08886
#10746

Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names

Yapeng Li, Yong Luo, Zengmao Wang et al.

CVPR 2024 poster
#10747

Continual Learning for Motion Prediction Model via Meta-Representation Learning and Optimal Memory Buffer Retention Strategy

Dae Jun Kang, Dongsuk Kum, Sanmin Kim

CVPR 2024 poster
#10748

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

Ao Luo, Xin Li, Fan Yang et al.

CVPR 2024 highlight
#10749

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

Zhiyin Qian, Shaofei Wang, Marko Mihajlovic et al.

CVPR 2024 poster arXiv:2312.09228
#10750

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Haofeng Liu, Chenshu Xu, Yifei Yang et al.

CVPR 2024 poster arXiv:2404.01050
#10751

AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring

Xintian Mao, Xiwen Gao, Yan Wang

CVPR 2024 poster arXiv:2406.09135
#10752

Puff-Net: Efficient Style Transfer with Pure Content and Style Feature Fusion Network

Sizhe Zheng, Pan Gao, Peng Zhou et al.

CVPR 2024 poster arXiv:2405.19775
#10753

SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement

Tao Wang, Lei Jin, Zheng Wang et al.

CVPR 2024 poster
#10754

Building Vision-Language Models on Solid Foundations with Masked Distillation

Sepehr Sameni, Kushal Kafle, Hao Tan et al.

CVPR 2024 poster
#10755

MS-DETR: Efficient DETR Training with Mixed Supervision

Chuyang Zhao, Yifan Sun, Wenhao Wang et al.

CVPR 2024 poster arXiv:2401.03989
#10756

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

Pengchong Qiao, Lei Shang, Chang Liu et al.

CVPR 2024 poster arXiv:2403.06775
#10757

Towards High-fidelity Artistic Image Vectorization via Texture-Encapsulated Shape Parameterization

Ye Chen, Bingbing Ni, Jinfan Liu et al.

CVPR 2024 poster
#10758

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.

CVPR 2024 poster arXiv:2404.00678
#10759

Deformable One-shot Face Stylization via DINO Semantic Guidance

Yang Zhou, Zichong Chen, Hui Huang

CVPR 2024 poster arXiv:2403.00459
#10760

Density-Guided Semi-Supervised 3D Semantic Segmentation with Dual-Space Hardness Sampling

Jianan Li, Qiulei Dong

CVPR 2024 poster
#10761

Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework

Vu Minh Hieu Phan, Yutong Xie, Yuankai Qi et al.

CVPR 2024 poster arXiv:2403.07636
#10762

LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP

Yunshi Huang, Fereshteh Shakeri, Jose Dolz et al.

CVPR 2024 poster arXiv:2404.02285
#10763

1-Lipschitz Layers Compared: Memory, Speed, and Certifiable Robustness

Bernd Prach, Fabio Brau, Giorgio Buttazzo et al.

CVPR 2024 poster
#10764

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye

CVPR 2024 poster arXiv:2312.00845
#10765

PoNQ: a Neural QEM-based Mesh Representation

Nissim Maruani, Maks Ovsjanikov, Pierre Alliez et al.

CVPR 2024 poster arXiv:2403.12870
#10766

M3-UDA: A New Benchmark for Unsupervised Domain Adaptive Fetal Cardiac Structure Detection

Bin Pu, Liwen Wang, Jiewen Yang et al.

CVPR 2024 poster
#10767

Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning

Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye et al.

CVPR 2024 poster arXiv:2403.12030
#10768

Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

CVPR 2024 poster arXiv:2404.01692
#10769

Point-VOS: Pointing Up Video Object Segmentation

Sabarinath Mahadevan, Idil Esen Zulfikar, Paul Voigtlaender et al.

CVPR 2024 poster arXiv:2402.05917
#10770

3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow

Felix Taubner, Prashant Raina, Mathieu Tuli et al.

CVPR 2024 poster arXiv:2404.09819
#10771

HIT: Estimating Internal Human Implicit Tissues from the Body Surface

Marilyn Keller, Vaibhav Arora, Abdelmouttaleb Dakri et al.

CVPR 2024 poster
#10772

Authentic Hand Avatar from a Phone Scan via Universal Hand Model

Gyeongsik Moon, Weipeng Xu, Rohan Joshi et al.

CVPR 2024 poster arXiv:2405.07933
#10773

Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection

Jin Yang, Ping Wei, Huan Li et al.

CVPR 2024 poster arXiv:2404.09263
#10774

Multiway Point Cloud Mosaicking with Diffusion and Global Optimization

Shengze Jin, Iro Armeni, Marc Pollefeys et al.

CVPR 2024 poster arXiv:2404.00429
#10775

NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images

Yufei Han, Heng Guo, Koki Fukai et al.

CVPR 2024 poster arXiv:2406.07111
#10776

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions

Gangwei Xu, Yujin Wang, Jinwei Gu et al.

CVPR 2024 poster arXiv:2403.03447
#10777

Beyond Average: Individualized Visual Scanpath Prediction

Xianyu Chen, Ming Jiang, Qi Zhao

CVPR 2024 poster arXiv:2404.12235
#10778

Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

Lei Zhu, Fangyun Wei, Yanye Lu

CVPR 2024 poster arXiv:2403.07874
#10779

LEDITS++: Limitless Image Editing using Text-to-Image Models

Manuel Brack, Felix Friedrich, Katharina Kornmeier et al.

CVPR 2024 poster arXiv:2311.16711
#10780

CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective

Shunsuke Yasuki, Masato Taki

CVPR 2024 poster arXiv:2403.06676
#10781

Regularized Parameter Uncertainty for Improving Generalization in Reinforcement Learning

Pehuen Moure, Longbiao Cheng, Joachim Ott et al.

CVPR 2024 poster
#10782

Robust Noisy Correspondence Learning with Equivariant Similarity Consistency

Yuchen Yang, Erkun Yang, Likai Wang et al.

CVPR 2024 poster
#10783

Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2024 poster arXiv:2406.07544
#10784

Decentralized Directed Collaboration for Personalized Federated Learning

Yingqi Liu, Yifan Shi, Qinglun Li et al.

CVPR 2024 poster arXiv:2405.17876
#10785

Task-Driven Wavelets using Constrained Empirical Risk Minimization

Eric Marcus, Ray Sheombarsing, Jan-Jakob Sonke et al.

CVPR 2024 poster
#10786

SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction

Zechuan Zhang, Zongxin Yang, Yi Yang

CVPR 2024 highlight arXiv:2312.06704
#10787

OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning

Siddharth Srivastava, Gaurav Sharma

CVPR 2024 poster arXiv:2507.13364
#10788

Probing Synergistic High-Order Interaction in Infrared and Visible Image Fusion

Naishan Zheng, Man Zhou, Jie Huang et al.

CVPR 2024 poster
#10789

Scaling Up Dynamic Human-Scene Interaction Modeling

Nan Jiang, Zhiyuan Zhang, Hongjie Li et al.

CVPR 2024 highlight arXiv:2403.08629
#10790

Utility-Fairness Trade-Offs and How to Find Them

Sepehr Dehdashtian, Bashir Sadeghi, Vishnu Naresh Boddeti

CVPR 2024 poster arXiv:2404.09454
#10791

Data-Free Quantization via Pseudo-label Filtering

Chunxiao Fan, Ziqi Wang, Dan Guo et al.

CVPR 2024 poster
#10792

Fitting Flats to Flats

Gabriel Dogadov, Ugo Finnendahl, Marc Alexa

CVPR 2024 poster
#10793

HOIST-Former: Hand-held Objects Identification, Segmentation, and Tracking in the Wild

Supreeth Narasimhaswamy, Huy Anh Nguyen, Lihan Huang et al.

CVPR 2024 poster
#10794

Animating General Image with Large Visual Motion Model

Dengsheng Chen, Xiaoming Wei, Xiaolin Wei

CVPR 2024 poster
#10795

MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

Yixin Liu, Chenrui Fan, Yutong Dai et al.

CVPR 2024 poster arXiv:2311.13127
#10796

EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams

Christen Millerdurai, Hiroyasu Akada, Jian Wang et al.

CVPR 2024 poster arXiv:2404.08640
#10797

ModaVerse: Efficiently Transforming Modalities with LLMs

Xinyu Wang, Bohan Zhuang, Qi Wu

CVPR 2024 poster arXiv:2401.06395
#10798

Improving Generalization via Meta-Learning on Hard Samples

Nishant Jain, Arun Suggala, Pradeep Shenoy

CVPR 2024 poster arXiv:2403.12236
#10799

WaveFace: Authentic Face Restoration with Efficient Frequency Recovery

Yunqi Miao, Jiankang Deng, Jungong Han

CVPR 2024 poster arXiv:2403.12760
#10800

Hierarchical Histogram Threshold Segmentation – Auto-terminating High-detail Oversegmentation

Thomas Chang, Simon Seibt, Bartosz von Rymon Lipinski

CVPR 2024 poster