Most Cited 2024 "image stylization" Papers

12,324 papers found

#10601

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu, Sicheng Mo, Yin Li

CVPR 2024 • poster • arXiv:2404.02257
#10602

GLaMM: Pixel Grounding Large Multimodal Model

Hanoona Rasheed, Muhammad Maaz, Sahal Shaji Mullappilly et al.

CVPR 2024 • poster • arXiv:2311.03356
#10603

ManiFPT: Defining and Analyzing Fingerprints of Generative Models

Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed

CVPR 2024 • poster • arXiv:2402.10401
#10604

Self-Calibrating Vicinal Risk Minimisation for Model Calibration

Jiawei Liu, Changkun Ye, Ruikai Cui et al.

CVPR 2024 • poster
#10605

Cinematic Behavior Transfer via NeRF-based Differentiable Filming

Xuekun Jiang, Anyi Rao, Jingbo Wang et al.

CVPR 2024 • poster • arXiv:2311.17754
#10606

Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning

Leonardo Iurada, Marco Ciccone, Tatiana Tommasi

CVPR 2024 • poster • arXiv:2406.01820
#10607

CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image

Donggeun Yoon, Donghyeon Cho

CVPR 2024 • poster
#10608

ScoreHypo: Probabilistic Human Mesh Estimation with Hypothesis Scoring

Yuan Xu, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2024 • poster
#10609

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Nataniel Ruiz, Yuanzhen Li, Varun Jampani et al.

CVPR 2024 • poster • arXiv:2307.06949
#10610

Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

Chang Liu, Haoning Wu, Yujie Zhong et al.

CVPR 2024 • poster • arXiv:2306.00973
#10611

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

Haimei Zhao, Jing Zhang, Zhuo Chen et al.

CVPR 2024 • poster • arXiv:2404.05145
#10612

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma, Can Cui, Xu Cao et al.

CVPR 2024 • poster • arXiv:2312.04372
#10613

Dynamic Policy-Driven Adaptive Multi-Instance Learning for Whole Slide Image Classification

Tingting Zheng, Kui Jiang, Hongxun Yao

CVPR 2024 • highlight • arXiv:2403.07939
#10614

BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks

Shangqian Gao, Yanfu Zhang, Feihu Huang et al.

CVPR 2024 • poster
#10615

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Jiayi Guo, Xingqian Xu, Yifan Pu et al.

CVPR 2024 • poster • arXiv:2312.04410
#10616

Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents

Yuxi Wei, Zi Wang, Yifan Lu et al.

CVPR 2024 • highlight • arXiv:2402.05746
#10617

Learning Continuous 3D Words for Text-to-Image Generation

Ta-Ying Cheng, Matheus Gadelha, Thibault Groueix et al.

CVPR 2024 • poster • arXiv:2402.08654
#10618

Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening

Yule Duan, Xiao Wu, Haoyu Deng et al.

CVPR 2024 • poster • arXiv:2404.07543
#10619

A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling

Wentao Qu, Yuantian Shao, Lingwu Meng et al.

CVPR 2024 • poster • arXiv:2312.02719
#10620

APISR: Anime Production Inspired Real-World Anime Super-Resolution

Boyang Wang, Fengyu Yang, Xihang Yu et al.

CVPR 2024 • poster • arXiv:2403.01598
#10621

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

Weizhen He, Yiheng Deng, Shixiang Tang et al.

CVPR 2024 • poster • arXiv:2306.07520
#10622

Device-Wise Federated Network Pruning

Shangqian Gao, Junyi Li, Zeyu Zhang et al.

CVPR 2024 • poster
#10623

SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.

CVPR 2024 • highlight • arXiv:2312.00648
#10624

MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation

Yuelong Li, Yafei Mao, Raja Bala et al.

CVPR 2024 • poster • arXiv:2403.08019
#10625

Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2024 • poster
#10626

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

Chenyangguang Zhang, Guanlong Jiao, Yan Di et al.

CVPR 2024 • poster • arXiv:2310.11696
#10627

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Youngjoon Jang, Jihoon Kim, Junseok Ahn et al.

CVPR 2024 • poster • arXiv:2405.10272
#10628

Learning to Segment Referred Objects from Narrated Egocentric Videos

Yuhan Shen, Huiyu Wang, Xitong Yang et al.

CVPR 2024 • poster
#10629

EGTR: Extracting Graph from Transformer for Scene Graph Generation

Jinbae Im, JeongYeon Nam, Nokyung Park et al.

CVPR 2024 • poster • arXiv:2404.02072
#10630

Distributionally Generative Augmentation for Fair Facial Attribute Classification

Fengda Zhang, Qianpei He, Kun Kuang et al.

CVPR 2024 • poster • arXiv:2403.06606
#10631

PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks

Marina Neseem, Conor McCullough, Randy Hsin et al.

CVPR 2024 • poster • arXiv:2404.00103
#10632

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Prannay Kaul, Zhizhong Li, Hao Yang et al.

CVPR 2024 • poster • arXiv:2405.05256
#10633

UniVS: Unified and Universal Video Segmentation with Prompts as Queries

Minghan Li, Shuai Li, Xindong Zhang et al.

CVPR 2024 • poster • arXiv:2402.18115
#10634

Inlier Confidence Calibration for Point Cloud Registration

Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.

CVPR 2024 • poster
#10635

CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow

Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.

CVPR 2024 • poster • arXiv:2403.08919
#10636

ADFactory: An Effective Framework for Generalizing Optical Flow with NeRF

Han Ling, Quansen Sun, Yinghui Sun et al.

CVPR 2024 • poster • arXiv:2311.04246
#10637

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions

Weijia Li, Haote Yang, Zhenghao Hu et al.

CVPR 2024 • poster • arXiv:2404.04823
#10638

In Search of a Data Transformation That Accelerates Neural Field Training

Junwon Seo, Sangyoon Lee, Kwang In Kim et al.

CVPR 2024 • poster • arXiv:2311.17094
#10639

FastMAC: Stochastic Spectral Sampling of Correspondence Graph

Yifei Zhang, Hao Zhao, Hongyang Li et al.

CVPR 2024 • poster • arXiv:2403.08770
#10640

PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution

Honghao Chen, Xiangxiang Chu, Yongjian Ren et al.

CVPR 2024 • poster • arXiv:2403.07589
#10641

Towards Generalizing to Unseen Domains with Few Labels

Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana et al.

CVPR 2024 • poster • arXiv:2403.11674
#10642

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts

Jialin Wu, Xia Hu, Yaqing Wang et al.

CVPR 2024 • highlight • arXiv:2312.00968
#10643

Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

CVPR 2024 • poster • arXiv:2312.13216
#10644

Learning Degradation-Independent Representations for Camera ISP Pipelines

Yanhui Guo, Fangzhou Luo, Xiaolin Wu

CVPR 2024 • poster • arXiv:2307.00761
#10645

A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion

Feng Yu, Teng Zhang, Gilad Lerman

CVPR 2024 • poster • arXiv:2404.11590
#10646

Low-Resource Vision Challenges for Foundation Models

Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek

CVPR 2024 • poster • arXiv:2401.04716
#10647

Low-Latency Neural Stereo Streaming

Qiqi Hou, Farzad Farhadzadeh, Amir Said et al.

CVPR 2024 • poster • arXiv:2403.17879
#10648

Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning

Ziming Hong, Li Shen, Tongliang Liu

CVPR 2024 • highlight
#10649

ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe

Yifan Bai, Zeyang Zhao, Yihong Gong et al.

CVPR 2024 • poster • arXiv:2312.17133
#10650

DPHMs: Diffusion Parametric Head Models for Depth-based Tracking

Jiapeng Tang, Angela Dai, Yinyu Nie et al.

CVPR 2024 • poster • arXiv:2312.01068
#10651

MaxQ: Multi-Axis Query for N:M Sparsity Network

Jingyang Xiang, Siqi Li, Junhao Chen et al.

CVPR 2024 • poster
#10652

The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

Myeongseob Ko, Feiyang Kang, Weiyan Shi et al.

CVPR 2024 • poster • arXiv:2402.08922
#10653

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Haoning Wu, Zicheng Zhang, Erli Zhang et al.

CVPR 2024 • poster • arXiv:2311.06783
#10654

Efficient Scene Recovery Using Luminous Flux Prior

ZhongYu Li, Lei Zhang

CVPR 2024 • poster
#10655

Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training

Shizhan Gong, Qi Dou, Farzan Farnia

CVPR 2024 • poster • arXiv:2404.04647
#10656

Revisiting Global Translation Estimation with Feature Tracks

Peilin Tao, Hainan Cui, Mengqi Rong et al.

CVPR 2024 • poster
#10657

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection

Huan Liu, Zichang Tan, Chuangchuang Tan et al.

CVPR 2024 • poster • arXiv:2312.16649
#10658

MeaCap: Memory-Augmented Zero-shot Image Captioning

Zequn Zeng, Yan Xie, Hao Zhang et al.

CVPR 2024 • poster • arXiv:2403.03715
#10659

MuseChat: A Conversational Music Recommendation System for Videos

Zhikang Dong, Bin Chen, Xiulong Liu et al.

CVPR 2024 • highlight • arXiv:2310.06282
#10660

Novel View Synthesis with View-Dependent Effects from a Single Image

Juan Luis Gonzalez Bello, Munchurl Kim

CVPR 2024 • poster • arXiv:2312.08071
#10661

Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation

Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.

CVPR 2024 • poster • arXiv:2404.00417
#10662

DisCo: Disentangled Control for Realistic Human Dance Generation

Tan Wang, Linjie Li, Kevin Lin et al.

CVPR 2024 • poster • arXiv:2307.00040
#10663

Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, Rizhao Cai et al.

CVPR 2024 • highlight • arXiv:2402.19298
#10664

Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Lei Qi et al.

CVPR 2024 • poster • arXiv:2404.08951
#10665

LAMP: Learn A Motion Pattern for Few-Shot Video Generation

Rui-Qi Wu, Liangyu Chen, Tong Yang et al.

CVPR 2024 • poster
#10666

PixelLM: Pixel Reasoning with Large Multimodal Model

Zhongwei Ren, Zhicheng Huang, Yunchao Wei et al.

CVPR 2024 • poster • arXiv:2312.02228
#10667

Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency

Yuqi Zhang, Han Luo, Yinjie Lei

CVPR 2024 • poster
#10668

iKUN: Speak to Trackers without Retraining

Yunhao Du, Cheng Lei, Zhicheng Zhao et al.

CVPR 2024 • poster • arXiv:2312.16245
#10669

Neural Fields as Distributions: Signal Processing Beyond Euclidean Space

Daniel Rebain, Soroosh Yazdani, Kwang Moo Yi et al.

CVPR 2024 • poster
#10670

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

Xingqun Qi, Jiahao Pan, Peng Li et al.

CVPR 2024 • poster • arXiv:2311.17532
#10671

LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

Yunpeng Luo, Junlong Du, Ke Yan et al.

CVPR 2024 • poster • arXiv:2403.17465
#10672

Stratified Avatar Generation from Sparse Observations

Han Feng, Wenchao Ma, Quankai Gao et al.

CVPR 2024 • poster • arXiv:2405.20786
#10673

Few-shot Learner Parameterization by Diffusion Time-steps

Zhongqi Yue, Pan Zhou, Richang Hong et al.

CVPR 2024 • poster • arXiv:2403.02649
#10674

Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes

Xiaotian Sun, Qingshan Xu, Xinjie Yang et al.

CVPR 2024 • poster
#10675

Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

Zihan Wang, Xiangyang Li, Jiahao Yang et al.

CVPR 2024 • highlight • arXiv:2404.01943
#10676

Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis

Simon Niedermayr, Josef Stumpfegger, Rüdiger Westermann

CVPR 2024 • poster • arXiv:2401.02436
#10677

The STVchrono Dataset: Towards Continuous Change Recognition in Time

Yanjun Sun, Yue Qiu, Mariia Khan et al.

CVPR 2024 • poster
#10678

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

CVPR 2024 • poster
#10679

Motion Blur Decomposition with Cross-shutter Guidance

Xiang Ji, Haiyang Jiang, Yinqiang Zheng

CVPR 2024 • poster • arXiv:2404.01120
#10680

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

Gongwei Chen, Leyang Shen, Rui Shao et al.

CVPR 2024 • poster • arXiv:2311.11860
#10681

Pixel-Aligned Language Model

Jiarui Xu, Xingyi Zhou, Shen Yan et al.

CVPR 2024 • poster
#10682

Eclipse: Disambiguating Illumination and Materials using Unintended Shadows

Dor Verbin, Ben Mildenhall, Peter Hedman et al.

CVPR 2024 • poster • arXiv:2305.16321
#10683

ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie et al.

CVPR 2024 • poster • arXiv:2403.17936
#10684

2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images

Junkai Deng, Fei Hou, Xuhui Chen et al.

CVPR 2024 • poster • arXiv:2303.15368
#10685

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024 • poster • arXiv:2403.01807
#10686

Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.

CVPR 2024 • highlight • arXiv:2404.07949
#10687

CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-driven Video Editing

Guiwei Zhang, Tianyu Zhang, Guanglin Niu et al.

CVPR 2024 • poster
#10688

DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation

Yuanchen Wu, Xichen Ye, Kequan Yang et al.

CVPR 2024 • poster • arXiv:2403.11184
#10689

A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction

Jin Gong, Runzhao Yang, Weihang Zhang et al.

CVPR 2024 • poster
#10690

NAPGuard: Towards Detecting Naturalistic Adversarial Patches

Siyang Wu, Jiakai Wang, Jiejie Zhao et al.

CVPR 2024 • poster
#10691

Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning

Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

CVPR 2024 • poster • arXiv:2311.13612
#10692

A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

Xiaoyang Xu, Mengda Yang, Wenzhe Yi et al.

CVPR 2024 • poster • arXiv:2405.04115
#10693

Bootstrapping SparseFormers from Vision Foundation Models

Ziteng Gao, Zhan Tong, Kevin Qinghong Lin et al.

CVPR 2024 • poster • arXiv:2312.01987
#10694

Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity

Ruijie Quan, Wenguan Wang, Zhibo Tian et al.

CVPR 2024 • poster • arXiv:2403.20022
#10695

G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images

Zixiong Huang, Qi Chen, Libo Sun et al.

CVPR 2024 • poster • arXiv:2404.07474
#10696

Active Prompt Learning in Vision Language Models

Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee

CVPR 2024 • poster • arXiv:2311.11178
#10697

Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline

Yu Chen, Fei Gao, Yanguang Zhang et al.

CVPR 2024 • poster
#10698

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.

CVPR 2024 • poster • arXiv:2404.08540
#10699

SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model

Inhwan Bae, Young-Jae Park, Hae-Gon Jeon

CVPR 2024 • poster • arXiv:2403.18452
#10700

Domain Separation Graph Neural Networks for Saliency Object Ranking

Zijian Wu, Jun Lu, Jing Han et al.

CVPR 2024 • poster
#10701

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.

CVPR 2024 • poster • arXiv:2501.05272
#10702

Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

Yuqi Wang, Jiawei He, Lue Fan et al.

CVPR 2024 • poster • arXiv:2311.17918
#10703

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

Changhoon Kim, Kyle Min, Maitreya Patel et al.

CVPR 2024 • poster • arXiv:2306.04744
#10704

MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Junwen Huang, Hao Yu, Kuan-Ting Yu et al.

CVPR 2024 • poster • arXiv:2403.01517
#10705

Resource-Efficient Transformer Pruning for Finetuning of Large Models

Fatih Ilhan, Gong Su, Selim Tekin et al.

CVPR 2024 • poster
#10706

Link-Context Learning for Multimodal LLMs

Yan Tai, Weichen Fan, Zhao Zhang et al.

CVPR 2024 • poster • arXiv:2308.07891
#10707

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Ragav Sachdeva, Andrew Zisserman

CVPR 2024 • poster • arXiv:2401.10224
#10708

Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack

Sabbir Ahmed, Ranyang Zhou, Shaahin Angizi et al.

CVPR 2024 • poster
#10709

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.

CVPR 2024 • highlight • arXiv:2312.05247
#10710

Language-aware Visual Semantic Distillation for Video Question Answering

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2024 • poster
#10711

3DInAction: Understanding Human Actions in 3D Point Clouds

Yizhak Ben-Shabat, Oren Shrout, Stephen Gould

CVPR 2024 • highlight • arXiv:2303.06346
#10712

DiLiGenRT: A Photometric Stereo Dataset with Quantified Roughness and Translucency

Heng Guo, Jieji Ren, Feishi Wang et al.

CVPR 2024 • poster
#10713

StyLitGAN: Image-Based Relighting via Latent Control

Anand Bhattad, James Soole, David Forsyth

CVPR 2024 • poster
#10714

Label-Efficient Group Robustness via Out-of-Distribution Concept Curation

Yiwei Yang, Anthony Liu, Robert Wolfe et al.

CVPR 2024 • poster
#10715

Unsupervised Universal Image Segmentation

XuDong Wang, Dantong Niu, Xinyang Han et al.

CVPR 2024 • poster • arXiv:2312.17243
#10716

Batch Normalization Alleviates the Spectral Bias in Coordinate Networks

Zhicheng Cai, Hao Zhu, Qiu Shen et al.

CVPR 2024 • poster
#10717

Not All Classes Stand on Same Embeddings: Calibrating a Semantic Distance with Metric Tensor

Jae Hyeon Park, Gyoomin Lee, Seunggi Park et al.

CVPR 2024 • poster
#10718

CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras

Sachin Shah, Matthew Chan, Haoming Cai et al.

CVPR 2024 • poster • arXiv:2406.09409
#10719

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

Zhe Li, Zerong Zheng, Lizhen Wang et al.

CVPR 2024 • poster
#10720

Retrieval-Augmented Open-Vocabulary Object Detection

Jooyeon Kim, Eulrang Cho, Sehyung Kim et al.

CVPR 2024 • poster • arXiv:2404.05687
#10721

NB-GTR: Narrow-Band Guided Turbulence Removal

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

CVPR 2024 • poster
#10722

LangSplat: 3D Language Gaussian Splatting

Minghan Qin, Wanhua Li, Jiawei Zhou et al.

CVPR 2024 • highlight • arXiv:2312.16084
#10723

Positive-Unlabeled Learning by Latent Group-Aware Meta Disambiguation

Lin Long, Haobo Wang, Zhijie Jiang et al.

CVPR 2024 • poster
#10724

Text-conditional Attribute Alignment across Latent Spaces for 3D Controllable Face Image Synthesis

FeiFan Xu, Rui Li, Si Wu et al.

CVPR 2024 • poster
#10725

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Antoine Guédon, Vincent Lepetit

CVPR 2024 • poster • arXiv:2311.12775
#10726

DiffusionPoser: Real-time Human Motion Reconstruction From Arbitrary Sparse Sensors Using Autoregressive Diffusion

Tom Van Wouwe, Seunghwan Lee, Antoine Falisse et al.

CVPR 2024 • poster • arXiv:2308.16682
#10727

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Jingbo Zhang, Xiaoyu Li, Qi Zhang et al.

CVPR 2024 • poster • arXiv:2311.16961
#10728

CurveCloudNet: Processing Point Clouds with 1D Structure

Colton Stearns, Alex Fu, Jiateng Liu et al.

CVPR 2024 • poster • arXiv:2303.12050
#10729

Harnessing Meta-Learning for Improving Full-Frame Video Stabilization

Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim et al.

CVPR 2024 • poster • arXiv:2403.03662
#10730

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Junhao Zheng, Chenhao Lin, Jiahao Sun et al.

CVPR 2024 • poster • arXiv:2403.17301
#10731

SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects

Abhinav Kumar, Yuliang Guo, Xinyu Huang et al.

CVPR 2024 • poster • arXiv:2403.20318
#10732

MoML: Online Meta Adaptation for 3D Human Motion Prediction

Xiaoning Sun, Huaijiang Sun, Bin Li et al.

CVPR 2024 • poster
#10733

Learning with Structural Labels for Learning with Noisy Labels

Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee

CVPR 2024 • poster
#10734

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.

CVPR 2024 • poster • arXiv:2310.06627
#10735

Incremental Nuclei Segmentation from Histopathological Images via Future-class Awareness and Compatibility-inspired Distillation

Huyong Wang, Huisi Wu, Jing Qin

CVPR 2024 • poster
#10736

Model Inversion Robustness: Can Transfer Learning Help?

Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran et al.

CVPR 2024 • poster • arXiv:2405.05588
#10737

Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection

Xiaowei Zhao, Xianglong Liu, Duorui Wang et al.

CVPR 2024 • poster
#10738

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.

CVPR 2024 • poster • arXiv:2312.05849
#10739

MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

Boyang Peng, Sanqing Qu, Yong Wu et al.

CVPR 2024 • poster • arXiv:2403.04149
#10740

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

Xin Zhou, Dingkang Liang, Wei Xu et al.

CVPR 2024 • poster • arXiv:2403.01439
#10741

EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

CVPR 2024 • poster • arXiv:2405.06880
#10742

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024 • poster • arXiv:2311.18387
#10743

Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models

Bin Fu, Fanghua Yu, Anran Liu et al.

CVPR 2024 • poster
#10744

A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals

Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.

CVPR 2024 • poster • arXiv:2404.04890
#10745

MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning

Mohamed Abdelfattah, Mariam Hassan, Alex Alahi

CVPR 2024 • poster
#10746

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Dinh Phat Do, Taehoon Kim, Jaemin Na et al.

CVPR 2024 • poster • arXiv:2403.09359
#10747

MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying

Ryan Burgert, Brian Price, Jason Kuen et al.

CVPR 2024 • poster
#10748

Intrinsic Image Diffusion for Indoor Single-view Material Estimation

Peter Kocsis, Vincent Sitzmann, Matthias Nießner

CVPR 2024 • poster • arXiv:2312.12274
#10749

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Yuechen Zhang, Shengju Qian, Bohao Peng et al.

CVPR 2024 • poster • arXiv:2312.04302
#10750

Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu et al.

CVPR 2024 • poster • arXiv:2312.00084
#10751

NetTrack: Tracking Highly Dynamic Objects with a Net

Guangze Zheng, Shijie Lin, Haobo Zuo et al.

CVPR 2024 • poster • arXiv:2403.11186
#10752

Scaling Up Video Summarization Pretraining with Large Language Models

Dawit Mureja Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.

CVPR 2024 • poster • arXiv:2404.03398
#10753

Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory

Fei Ye, Adrian Bors

CVPR 2024 • poster
#10754

FADES: Fair Disentanglement with Sensitive Relevance

Taeuk Jang, Xiaoqian Wang

CVPR 2024 • poster
#10755

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

Gengyu Zhang, Hao Tang, Yan Yan

CVPR 2024 • poster • arXiv:2404.02176
#10756

Improving Depth Completion via Depth Feature Upsampling

Yufei Wang, Ge Zhang, Shaoqian Wang et al.

CVPR 2024 • poster
#10757

Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption

Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii et al.

CVPR 2024 • poster • arXiv:2303.17166
#10758

MRFS: Mutually Reinforcing Image Fusion and Segmentation

Hao Zhang, Xuhui Zuo, Jie Jiang et al.

CVPR 2024 • poster
#10759

Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning

Jaewoo Jeong, Daehee Park, Kuk-Jin Yoon

CVPR 2024 • highlight • arXiv:2404.05218
#10760

OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning

Noor Ahmed, Anna Kukleva, Bernt Schiele

CVPR 2024 • highlight • arXiv:2403.18550
#10761

3D-LFM: Lifting Foundation Model

Mosam Dabhi, László A. Jeni, Simon Lucey

CVPR 2024 • poster • arXiv:2312.11894
#10762

LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation

Ke Guo, Zhenwei Miao, Wei Jing et al.

CVPR 2024 • poster • arXiv:2403.17601
#10763

HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

Haithem Turki, Vasu Agrawal, Samuel Rota Bulò et al.

CVPR 2024 • highlight • arXiv:2312.03160
#10764

IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration

Tai Ma, Suwei Zhang, Jiafeng Li et al.

CVPR 2024 • poster
#10765

SEED-Bench: Benchmarking Multimodal Large Language Models

Bohao Li, Yuying Ge, Yixiao Ge et al.

CVPR 2024 • poster
#10766

Style Aligned Image Generation via Shared Attention

Amir Hertz, Andrey Voynov, Shlomi Fruchter et al.

CVPR 2024 • poster • arXiv:2312.02133
#10767

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows

Zhenggang Tang, Jason Ren, Xiaoming Zhao et al.

CVPR 2024 • poster • arXiv:2406.10543
#10768

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu et al.

CVPR 2024 • poster • arXiv:2312.02813
#10769

Active Domain Adaptation with False Negative Prediction for Object Detection

Yuzuru Nakamura, Yasunori Ishii, Takayoshi Yamashita

CVPR 2024 • highlight
#10770

LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

Qihao Zhao, Yalun Dai, Hao Li et al.

CVPR 2024 • poster • arXiv:2403.05854
#10771

How to Train Neural Field Representations: A Comprehensive Study and Benchmark

Samuele Papa, Riccardo Valperga, David Knigge et al.

CVPR 2024 • poster • arXiv:2312.10531
#10772

Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring

Chengxu Liu, Xuan Wang, Xiangyu Xu et al.

CVPR 2024 • poster • arXiv:2404.13153
#10773

SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation

Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu et al.

CVPR 2024 • poster • arXiv:2401.08053
#10774

Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector

Yifu Ding, Weilun Feng, Chuyan Chen et al.

CVPR 2024 • poster
#10775

FREE: Faster and Better Data-Free Meta-Learning

Yongxian Wei, Zixuan Hu, Zhenyi Wang et al.

CVPR 2024 • poster • arXiv:2405.00984
#10776

Open Vocabulary Semantic Scene Sketch Understanding

Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya

CVPR 2024 • poster • arXiv:2312.12463
#10777

You Only Need Less Attention at Each Stage in Vision Transformers

Shuoxi Zhang, Hanpeng Liu, Stephen Lin et al.

CVPR 2024 • poster • arXiv:2406.00427
#10778

Hierarchical Patch Diffusion Models for High-Resolution Video Generation

Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin et al.

CVPR 2024 • poster • arXiv:2406.07792
#10779

Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection

Chuangchuang Tan, Huan Liu, Yao Zhao et al.

CVPR 2024 • poster • arXiv:2312.10461
#10780

Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception

Haoming Chen, Zhizhong Zhang, Yanyun Qu et al.

CVPR 2024 • poster • arXiv:2405.07201
#10781

BoQ: A Place is Worth a Bag of Learnable Queries

Amar Ali-bey, Brahim Chaib-draa, Philippe Giguère

CVPR 2024 • poster • arXiv:2405.07364
#10782

UFC-Net: Unrolling Fixed-point Continuous Network for Deep Compressive Sensing

Xiaoyang Wang, Hongping Gan

CVPR 2024 • poster
#10783

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

Haoyi Jiang, Tianheng Cheng, Naiyu Gao et al.

CVPR 2024 • poster • arXiv:2306.15670
#10784

CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

Sajid Javed, Arif Mahmood, Iyyakutti Iyappan Ganapathi et al.

CVPR 2024 • poster • arXiv:2406.05205
#10785

MaskPLAN: Masked Generative Layout Planning from Partial Input

Hang Zhang, Anton Savov, Benjamin Dillenburger

CVPR 2024 • poster
#10786

Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers

Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire et al.

CVPR 2024 • poster • arXiv:2404.07292
#10787

Towards Memorization-Free Diffusion Models

Chen Chen, Daochang Liu, Chang Xu

CVPR 2024 • poster • arXiv:2404.00922
#10788

AV-RIR: Audio-Visual Room Impulse Response Estimation

Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar et al.

CVPR 2024 • poster • arXiv:2312.00834
#10789

Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Zhiyuan Min, Yawei Luo, Wei Yang et al.

CVPR 2024 • poster • arXiv:2311.11845
#10790

A-Teacher: Asymmetric Network for 3D Semi-Supervised Object Detection

Hanshi Wang, Zhipeng Zhang, Jin Gao et al.

CVPR 2024 • poster
#10791

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen et al.

CVPR 2024 • poster • arXiv:2403.01693
#10792

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Bin Xiao, Haiping Wu, Weijian Xu et al.

CVPR 2024 • poster • arXiv:2311.06242
#10793

DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2024 • poster
#10794

3D Feature Tracking via Event Camera

Siqi Li, Zhou Zhikuan, Zhou Xue et al.

CVPR 2024 • poster
#10795

Frequency-aware Event-based Video Deblurring for Real-World Motion Blur

Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon

CVPR 2024 • poster
#10796

FedHCA2: Towards Hetero-Client Federated Multi-Task Learning

Yuxiang Lu, Suizhi Huang, Yuwen Yang et al.

CVPR 2024 • poster
#10797

Improving Unsupervised Hierarchical Representation with Reinforcement Learning

Ruyi An, Yewen Li, Xu He et al.

CVPR 2024 • poster
#10798

Global Latent Neural Rendering

Thomas Tanay, Matteo Maggioni

CVPR 2024 • poster • arXiv:2312.08338
#10799

Data Poisoning based Backdoor Attacks to Contrastive Learning

Jinghuai Zhang, Hongbin Liu, Jinyuan Jia et al.

CVPR 2024 • poster • arXiv:2211.08229
#10800

RoHM: Robust Human Motion Reconstruction via Diffusion

Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu et al.

CVPR 2024 • poster • arXiv:2401.08570