Most Cited 2024 "force-aware contact representation" Papers

12,324 papers found • Page 5 of 62

#801

Posterior Distillation Sampling

Juil Koo, Chanho Park, Minhyuk Sung

CVPR 2024posterarXiv:2311.13831
43
citations
#802

Boosting Object Detection with Zero-Shot Day-Night Domain Adaptation

Zhipeng Du, Miaojing Shi, Jiankang Deng

CVPR 2024posterarXiv:2312.01220
43
citations
#803

AffineQuant: Affine Transformation Quantization for Large Language Models

Yuexiao Ma, Huixia Li, Xiawu Zheng et al.

ICLR 2024posterarXiv:2403.12544
43
citations
#804

AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation

Haonan Wang, Qixiang ZHANG, Yi Li et al.

CVPR 2024posterarXiv:2403.01818
42
citations
#805

BAMM: Bidirectional Autoregressive Motion Model

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang et al.

ECCV 2024posterarXiv:2403.19435
42
citations
#806

Towards Diverse Behaviors: A Benchmark for Imitation Learning with Human Demonstrations

Xiaogang Jia, Denis Blessing, Xinkai Jiang et al.

ICLR 2024posterarXiv:2402.14606
42
citations
#807

Two-stage LLM Fine-tuning with Less Specialization and More Generalization

Yihan Wang, Si Si, Daliang Li et al.

ICLR 2024posterarXiv:2211.00635
42
citations
#808

Learning the 3D Fauna of the Web

Zizhang Li, Dor Litvak, Ruining Li et al.

CVPR 2024posterarXiv:2401.02400
42
citations
#809

Curriculum reinforcement learning for quantum architecture search under hardware errors

Yash J. Patel, Akash Kundu, Mateusz Ostaszewski et al.

ICLR 2024posterarXiv:2402.03500
42
citations
#810

UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All

Yuanhuiyi Lyu, Xu Zheng, Jiazhou Zhou et al.

CVPR 2024posterarXiv:2403.12532
42
citations
#811

Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

Kiran Chhatre, Radek Danecek, Nikos Athanasiou et al.

CVPR 2024posterarXiv:2312.04466
42
citations
#812

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

Yiwen Ye, Yutong Xie, Jianpeng Zhang et al.

CVPR 2024highlightarXiv:2311.17597
42
citations
#813

Exploiting Diffusion Prior for Generalizable Dense Prediction

Hsin-Ying Lee, Hung-Yu Tseng, Hsin-Ying Lee et al.

CVPR 2024posterarXiv:2311.18832
42
citations
#814

A Watermark-Conditioned Diffusion Model for IP Protection

Rui Min, Sen Li, Hongyang Chen et al.

ECCV 2024posterarXiv:2403.10893
42
citations
#815

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models

Lin Li, Haoyan Guan, Jianing Qiu et al.

CVPR 2024posterarXiv:2403.01849
42
citations
#816

SemCity: Semantic Scene Generation with Triplane Diffusion

Jumin Lee, Sebin Lee, Changho Jo et al.

CVPR 2024posterarXiv:2403.07773
42
citations
#817

HouseCat6D - A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios

HyunJun Jung, Shun-Cheng Wu, Patrick Ruhkamp et al.

CVPR 2024highlightarXiv:2212.10428
42
citations
#818

Towards Robust Event-guided Low-Light Image Enhancement: A Large-Scale Real-World Event-Image Dataset and Novel Approach

Guoqiang Liang, Kanghao Chen, Hangyu Li et al.

CVPR 2024posterarXiv:2404.00834
41
citations
#819

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion

Linlan Huang, Xusheng Cao, Haori Lu et al.

ECCV 2024posterarXiv:2407.14143
41
citations
#820

Vision-and-Language Navigation via Causal Learning

Liuyi Wang, Zongtao He, Ronghao Dang et al.

CVPR 2024posterarXiv:2404.10241
41
citations
#821

PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine

Chenrui Zhang, Lin Liu, Chuyuan Wang et al.

AAAI 2024paperarXiv:2308.12033
41
citations
#822

Benchmarking and Improving Generator-Validator Consistency of Language Models

XIANG LI, Vaishnavi Shrivastava, Siyan Li et al.

ICLR 2024posterarXiv:2310.01846
41
citations
#823

Generative Proxemics: A Prior for 3D Social Interaction from Images

Vickie Ye, Vickie Ye, Georgios Pavlakos et al.

CVPR 2024posterarXiv:2306.09337
41
citations
#824

FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization

Shuai Tan, Bin Ji, Ye Pan

CVPR 2024posterarXiv:2403.06375
41
citations
#825

DiffAvatar: Simulation-Ready Garment Optimization with Differentiable Simulation

Yifei Li, Hsiaoyu Chen, Egor Larionov et al.

CVPR 2024posterarXiv:2311.12194
41
citations
#826

Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios

Jie Xu, Yazhou Ren, Xiaolong Wang et al.

CVPR 2024posterarXiv:2303.17245
41
citations
#827

T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

Pratyush Maini, Sachin Goyal, Zachary Lipton et al.

ICLR 2024posterarXiv:2307.03132
41
citations
#828

MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding

Lirong Wu, Yijun Tian, Yufei Huang et al.

ICLR 2024spotlightarXiv:2402.14391
41
citations
#829

TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling

Shimin Zhang, Qu Yang, Chenxiang Ma et al.

AAAI 2024paperarXiv:2308.13250
41
citations
#830

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ECCV 2024posterarXiv:2311.12897
41
citations
#831

On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Letian Huang, Jiayang Bai, Jie Guo et al.

ECCV 2024posterarXiv:2402.00752
41
citations
#832

Few-Shot Detection of Machine-Generated Text using Style Representations

Rafael Rivera Soto, Kailin Koch, Aleem Khan et al.

ICLR 2024posterarXiv:2401.06712
41
citations
#833

DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning

Zhengxiang Shi, Aldo Lipani

ICLR 2024posterarXiv:2309.05173
41
citations
#834

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Bu Jin, Yupeng Zheng, Pengfei Li et al.

ECCV 2024posterarXiv:2403.19589
40
citations
#835

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing

Tian-Xing Xu, WENBO HU, Yu-Kun Lai et al.

ECCV 2024posterarXiv:2403.10050
40
citations
#836

TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ECCV 2024posterarXiv:2311.09999
40
citations
#837

UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes

David Rozenberszki, Or Litany, Angela Dai

CVPR 2024posterarXiv:2303.14541
40
citations
#838

WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series

Irina Rish, Kartik Ahuja, Mohammad Javad Darvishi Bayazi et al.

ICLR 2024poster
40
citations
#839

ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection

Yichen Bai, Zongbo Han, Bing Cao et al.

CVPR 2024posterarXiv:2311.15243
40
citations
#840

A Diffusion-Based Framework for Multi-Class Anomaly Detection

Haoyang He, Jiangning Zhang, Hongxu Chen et al.

AAAI 2024paperarXiv:2312.06607
40
citations
#841

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

ECCV 2024posterarXiv:2403.18118
40
citations
#842

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen et al.

ECCV 2024posterarXiv:2401.06825
40
citations
#843

SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors

Dave Zhenyu Chen, Haoxuan Li, Hsin-Ying Lee et al.

CVPR 2024highlightarXiv:2311.17261
40
citations
#844

Prompt Learning via Meta-Regularization

Jinyoung Park, Juyeon Ko, Hyunwoo J. Kim

CVPR 2024posterarXiv:2404.00851
40
citations
#845

Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval

Zhihang Liu, Jun Li, Hongtao Xie et al.

AAAI 2024paperarXiv:2312.12155
40
citations
#846

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Hallee E. Wong, Marianne Rakic, John Guttag et al.

ECCV 2024posterarXiv:2312.07381
40
citations
#847

Scene Adaptive Sparse Transformer for Event-based Object Detection

Yansong Peng, Li Hebei, Yueyi Zhang et al.

CVPR 2024posterarXiv:2404.01882
40
citations
#848

A Vision Check-up for Language Models

Pratyusha Sharma, Tamar Rott Shaham, Manel Baradad et al.

CVPR 2024highlightarXiv:2401.01862
40
citations
#849

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Donghyun Kim, Byeongho Heo, Dongyoon Han

ECCV 2024posterarXiv:2403.19588
40
citations
#850

Attribute-Missing Graph Clustering Network

Wenxuan Tu, Renxiang Guan, Sihang Zhou et al.

AAAI 2024paper
40
citations
#851

Devignet: High-Resolution Vignetting Removal via a Dual Aggregated Fusion Transformer with Adaptive Channel Expansion

Shenghong Luo, Xuhang Chen, Weiwen Chen et al.

AAAI 2024paperarXiv:2308.13739
40
citations
#852

NOPE: Novel Object Pose Estimation from a Single Image

Van Nguyen Nguyen, Thibault Groueix, Georgy Ponimatkin et al.

CVPR 2024posterarXiv:2303.13612
40
citations
#853

Does CLIP’s generalization performance mainly stem from high train-test similarity?

Prasanna Mayilvahanan, Thaddäus Wiedemer, Evgenia Rusak et al.

ICLR 2024posterarXiv:2310.09562
40
citations
#854

Stream Query Denoising for Vectorized HD-Map Construction

Shuo Wang, Fan Jia, Weixin Mao et al.

ECCV 2024posterarXiv:2401.09112
40
citations
#855

GalLop: Learning global and local prompts for vision-language models

Marc Lafon, Elias Ramzi, Clément Rambour et al.

ECCV 2024posterarXiv:2407.01400
39
citations
#856

Multi-Architecture Multi-Expert Diffusion Models

Yunsung Lee, Jin-Young Kim, Hyojun Go et al.

AAAI 2024paperarXiv:2306.04990
39
citations
#857

STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

Yu-Hsuan Wu, Jerry Hu, Weijian Li et al.

ICLR 2024oralarXiv:2312.17346
39
citations
#858

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

Yibo Wang, Ruiyuan Gao, Kai Chen et al.

CVPR 2024posterarXiv:2403.13304
39
citations
#859

Dual RL: Unification and New Methods for Reinforcement and Imitation Learning

Harshit Sikchi, Qinqing Zheng, Amy Zhang et al.

ICLR 2024spotlightarXiv:2302.08560
39
citations
#860

PAD: Patch-Agnostic Defense against Adversarial Patch Attacks

Lihua Jing, Rui Wang, Wenqi Ren et al.

CVPR 2024posterarXiv:2404.16452
39
citations
#861

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views

Chao Xu, Ang Li, Linghao Chen et al.

ECCV 2024posterarXiv:2408.10195
39
citations
#862

No Prejudice! Fair Federated Graph Neural Networks for Personalized Recommendation

Nimesh Agrawal, Anuj Sirohi, Sandeep Kumar et al.

AAAI 2024paperarXiv:2312.10080
39
citations
#863

A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames

Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak et al.

CVPR 2024posterarXiv:2312.07395
39
citations
#864

Provable Offline Preference-Based Reinforcement Learning

Wenhao Zhan, Masatoshi Uehara, Nathan Kallus et al.

ICLR 2024spotlightarXiv:2305.14816
39
citations
#865

AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

Qingping SUN, Yanjun Wang, Ailing Zeng et al.

CVPR 2024posterarXiv:2403.17934
39
citations
#866

Gradient Reweighting: Towards Imbalanced Class-Incremental Learning

Jiangpeng He

CVPR 2024posterarXiv:2402.18528
39
citations
#867

GLACE: Global Local Accelerated Coordinate Encoding

Fangjinhua Wang, Xudong Jiang, Silvano Galliani et al.

CVPR 2024posterarXiv:2406.04340
39
citations
#868

ZeroShape: Regression-based Zero-shot Shape Reconstruction

Zixuan Huang, Stefan Stojanov, Anh Thai et al.

CVPR 2024posterarXiv:2312.14198
39
citations
#869

Controllable Mind Visual Diffusion Model

Bohan Zeng, Shanglin Li, Xuhui Liu et al.

AAAI 2024paperarXiv:2305.10135
39
citations
#870

PolyGCL: GRAPH CONTRASTIVE LEARNING via Learnable Spectral Polynomial Filters

Jingyu Chen, Runlin Lei, Zhewei Wei

ICLR 2024spotlight
39
citations
#871

Towards Continual Knowledge Graph Embedding via Incremental Distillation

Jiajun Liu, Ke Wenjun, Peng Wang et al.

AAAI 2024paperarXiv:2405.04453
39
citations
#872

NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models

Yusuf Dalva, Pinar Yanardag

CVPR 2024posterarXiv:2312.05390
39
citations
#873

Learning Diffusion Texture Priors for Image Restoration

Tian Ye, Sixiang Chen, Wenhao Chai et al.

CVPR 2024highlight
39
citations
#874

Quality-Diversity through AI Feedback

Herbie Bradley, Andrew Dai, Hannah Teufel et al.

ICLR 2024posterarXiv:2310.13032
38
citations
#875

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

Yaoting Wang, Liu Weisong, Guangyao Li et al.

AAAI 2024paperarXiv:2309.07929
38
citations
#876

AirPhyNet: Harnessing Physics-Guided Neural Networks for Air Quality Prediction

Kethmi Hirushini Hettige, Jiahao Ji, Shili Xiang et al.

ICLR 2024oralarXiv:2402.03784
38
citations
#877

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Mengting Chen, Xi Chen, Zhonghua Zhai et al.

ECCV 2024posterarXiv:2403.12965
38
citations
#878

TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts

Hyunwook Lee, Sungahn Ko

ICLR 2024oralarXiv:2403.02600
38
citations
#879

Stable Neural Stochastic Differential Equations in Analyzing Irregular Time Series Data

YongKyung Oh, Dongyoung Lim, Sungil Kim

ICLR 2024spotlightarXiv:2402.14989
38
citations
#880

Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection

Soopil Kim, Sion An, Philip Chikontwe et al.

AAAI 2024paperarXiv:2312.13783
38
citations
#881

Test-Time Domain Generalization for Face Anti-Spoofing

Qianyu Zhou, Ke-Yue Zhang, Taiping Yao et al.

CVPR 2024posterarXiv:2403.19334
38
citations
#882

EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning

Hongxia Xie, Chu-Jun Peng, Yu-Wen Tseng et al.

CVPR 2024posterarXiv:2404.16670
38
citations
#883

Latent Space Editing in Transformer-Based Flow Matching

Vincent Tao Hu, Wei Zhang, Meng Tang et al.

AAAI 2024paperarXiv:2312.10825
38
citations
#884

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024posterarXiv:2403.13129
38
citations
#885

Text-Guided Molecule Generation with Diffusion Language Model

Haisong Gong, Qiang Liu, Shu Wu et al.

AAAI 2024paperarXiv:2402.13040
38
citations
#886

Multi-view Aggregation Network for Dichotomous Image Segmentation

Qian Yu, Xiaoqi Zhao, Youwei Pang et al.

CVPR 2024highlightarXiv:2404.07445
38
citations
#887

XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning

Pritam Sarkar, Ali Etemad

AAAI 2024paperarXiv:2211.13929
38
citations
#888

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

Mengqi Huang, Zhendong Mao, Mingcong Liu et al.

CVPR 2024posterarXiv:2403.00483
38
citations
#889

Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector

An Lao, Qi Zhang, Chongyang Shi et al.

AAAI 2024paperarXiv:2312.11023
38
citations
#890

SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction

Yang Zhou, Hao Shao, Letian Wang et al.

CVPR 2024posterarXiv:2403.11492
38
citations
#891

When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations

Aleksandar Petrov, Philip Torr, Adel Bibi

ICLR 2024posterarXiv:2310.19698
38
citations
#892

PINNACLE: PINN Adaptive ColLocation and Experimental points selection

Gregory Kang Ruey Lau, Apivich Hemachandra, See-Kiong Ng et al.

ICLR 2024spotlightarXiv:2404.07662
38
citations
#893

Revisiting Single Image Reflection Removal In the Wild

Yurui Zhu, Bo Li, Xueyang Fu et al.

CVPR 2024posterarXiv:2311.17320
37
citations
#894

Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning

Shangchao Su, Mingzhao Yang, Bin Li et al.

AAAI 2024paperarXiv:2211.07864
37
citations
#895

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model

Matteo Bortolon, Theodoros Tsesmelis, Stuart James et al.

ECCV 2024posterarXiv:2407.15484
37
citations
#896

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

Xu Yang, Changxing Ding, Zhibin Hong et al.

CVPR 2024posterarXiv:2404.01089
37
citations
#897

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models

Xin Li, Yunfei Wu, Xinghua Jiang et al.

CVPR 2024posterarXiv:2402.19014
37
citations
#898

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024posterarXiv:2312.00937
37
citations
#899

SMooDi: Stylized Motion Diffusion Model

Lei Zhong, Yiming Xie, Varun Jampani et al.

ECCV 2024posterarXiv:2407.12783
37
citations
#900

Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

Zhicai Wang, Longhui Wei, Tan Wang et al.

CVPR 2024posterarXiv:2403.19600
37
citations
#901

SimAC: A Simple Anti-Customization Method for Protecting Face Privacy against Text-to-Image Synthesis of Diffusion Models

Feifei Wang, Zhentao Tan, Tianyi Wei et al.

CVPR 2024posterarXiv:2312.07865
37
citations
#902

HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models

Li Pang, Xiangyu Rui, Long Cui et al.

CVPR 2024posterarXiv:2402.15865
37
citations
#903

MathAttack: Attacking Large Language Models towards Math Solving Ability

Zihao Zhou, Qiufeng Wang, Mingyu Jin et al.

AAAI 2024paperarXiv:2309.01686
37
citations
#904

SlowTrack: Increasing the Latency of Camera-Based Perception in Autonomous Driving Using Adversarial Examples

Chen Ma, Ningfei Wang, Qi Alfred Chen et al.

AAAI 2024paperarXiv:2312.09520
37
citations
#905

Making RL with Preference-based Feedback Efficient via Randomization

Runzhe Wu, Wen Sun

ICLR 2024posterarXiv:2310.14554
37
citations
#906

FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models

Jinglin Xu, Yijie Guo, Yuxin Peng

CVPR 2024highlightarXiv:2405.05216
37
citations
#907

Disentangled Prompt Representation for Domain Generalization

De Cheng, Zhipeng Xu, XINYANG JIANG et al.

CVPR 2024poster
37
citations
#908

Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving

JunDa Cheng, Wei Yin, Kaixuan Wang et al.

CVPR 2024posterarXiv:2403.07535
37
citations
#909

SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He, Henghui Ding, Xudong Jiang et al.

ECCV 2024posterarXiv:2407.13761
37
citations
#910

STEM: Unleashing the Power of Embeddings for Multi-Task Recommendation

Liangcai Su, Junwei Pan, Ximei Wang et al.

AAAI 2024paperarXiv:2308.13537
37
citations
#911

PEM: Prototype-based Efficient MaskFormer for Image Segmentation

Niccolò Cavagnero, Gabriele Rosi, Claudia Cuttano et al.

CVPR 2024posterarXiv:2402.19422
37
citations
#912

ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation

Moayed Haji Ali, Guha Balakrishnan, Vicente Ordonez

CVPR 2024posterarXiv:2311.18822
37
citations
#913

Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior

Fangfu Liu, Diankun Wu, Yi Wei et al.

CVPR 2024posterarXiv:2312.06655
37
citations
#914

A Unified and General Framework for Continual Learning

Zhenyi Wang, Yan Li, Li Shen et al.

ICLR 2024posterarXiv:2403.13249
37
citations
#915

Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

Yajing Liu, Shijun Zhou, Xiyao Liu et al.

CVPR 2024highlightarXiv:2405.15225
37
citations
#916

Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion

Lucas Nunes, Rodrigo Marcuzzi, Benedikt Mersch et al.

CVPR 2024posterarXiv:2403.13470
37
citations
#917

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Zhao Tianchen, Xuefei Ning, Tongcheng Fang et al.

ECCV 2024posterarXiv:2405.17873
36
citations
#918

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.

ECCV 2024posterarXiv:2403.18274
36
citations
#919

DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection

Yuhao Sun, Lingyun Yu, Hongtao Xie et al.

CVPR 2024posterarXiv:2405.09882
36
citations
#920

VicTR: Video-conditioned Text Representations for Activity Recognition

Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani et al.

CVPR 2024posterarXiv:2304.02560
36
citations
#921

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation

Shuzhao Xie, Weixiang Zhang, Chen Tang et al.

ECCV 2024posterarXiv:2409.09756
36
citations
#922

Control4D: Efficient 4D Portrait Editing with Text

Ruizhi Shao, Jingxiang Sun, Cheng Peng et al.

CVPR 2024posterarXiv:2305.20082
36
citations
#923

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

Ziyang Chen, Israel D. Gebru, Christian Richardt et al.

CVPR 2024highlightarXiv:2403.18821
36
citations
#924

Pyramid Diffusion for Fine 3D Large Scene Generation

Yuheng Liu, Xinke Li, Xueting Li et al.

ECCV 2024posterarXiv:2311.12085
36
citations
#925

Mitigating Motion Blur in Neural Radiance Fields with Events and Frames

Marco Cannici, Davide Scaramuzza

CVPR 2024posterarXiv:2403.19780
36
citations
#926

Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning

Mohamed Elsayed, A. Rupam Mahmood

ICLR 2024posterarXiv:2404.00781
36
citations
#927

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Lukas Hoyer, David Tan, Muhammad Ferjad Naeem et al.

ECCV 2024posterarXiv:2311.16241
36
citations
#928

Distilling Semantic Priors from SAM to Efficient Image Restoration Models

Quan Zhang, Xiaoyu Liu, Wei Li et al.

CVPR 2024posterarXiv:2403.16368
36
citations
#929

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024posterarXiv:2311.13627
36
citations
#930

Score-Guided Diffusion for 3D Human Recovery

Anastasis Stathopoulos, Ligong Han, Dimitris N. Metaxas

CVPR 2024posterarXiv:2403.09623
36
citations
#931

OpenTab: Advancing Large Language Models as Open-domain Table Reasoners

Kezhi Kong, Jiani Zhang, Zhengyuan Shen et al.

ICLR 2024posterarXiv:2402.14361
36
citations
#932

DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion

Junjie Guo, Chenqiang Gao, Fangcen liu et al.

ECCV 2024posterarXiv:2403.00326
36
citations
#933

How to Configure Good In-Context Sequence for Visual Question Answering

Li Li, Jiawei Peng, huiyi chen et al.

CVPR 2024posterarXiv:2312.01571
36
citations
#934

Towards Efficient Replay in Federated Incremental Learning

Yichen Li, Qunwei Li, Haozhao Wang et al.

CVPR 2024posterarXiv:2403.05890
36
citations
#935

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

XuDong Wang, Ishan Misra, Ziyun Zeng et al.

CVPR 2024posterarXiv:2308.14710
36
citations
#936

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Yuchen Hu, CHEN CHEN, Chao-Han Huck Yang et al.

ICLR 2024spotlightarXiv:2401.10446
36
citations
#937

U-mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting

Xiang Ma, Xuemei Li, Lexin Fang et al.

AAAI 2024paperarXiv:2401.02236
36
citations
#938

Question Aware Vision Transformer for Multimodal Reasoning

Roy Ganz, Yair Kittenplon, Aviad Aberdam et al.

CVPR 2024highlightarXiv:2402.05472
36
citations
#939

FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning

Rishub Tamirisa, Chulin Xie, Wenxuan Bao et al.

CVPR 2024posterarXiv:2404.02478
36
citations
#940

Amodal Completion via Progressive Mixed Context Diffusion

Katherine Xu, Lingzhi Zhang, Jianbo Shi

CVPR 2024highlightarXiv:2312.15540
36
citations
#941

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

Jiatong Shi, Hirofumi Inaguma, Xutai Ma et al.

ICLR 2024spotlightarXiv:2310.02720
36
citations
#942

DiffBEV: Conditional Diffusion Model for Bird’s Eye View Perception

Jiayu Zou, Kun Tian, Zheng Zhu et al.

AAAI 2024paperarXiv:2303.08333
36
citations
#943

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection

Zheyuan Zhou, Wang Le, Naiyu Fang et al.

ECCV 2024posterarXiv:2407.10862
36
citations
#944

GenZI: Zero-Shot 3D Human-Scene Interaction Generation

Lei Li, Angela Dai

CVPR 2024posterarXiv:2311.17737
36
citations
#945

Communication-Efficient Federated Learning with Accelerated Client Gradient

Geeho Kim, Jinkyu Kim, Bohyung Han

CVPR 2024posterarXiv:2201.03172
36
citations
#946

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Kirolos Ataallah, Xiaoqian Shen, Eslam mohamed abdelrahman et al.

ECCV 2024posterarXiv:2407.12679
36
citations
#947

Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

Xunjiang Gu, Guanyu Song, Igor Gilitschenski et al.

CVPR 2024posterarXiv:2403.16439
36
citations
#948

DragVideo: Interactive Drag-style Video Editing

Yufan Deng, Ruida Wang, Yuhao ZHANG et al.

ECCV 2024posterarXiv:2312.02216
36
citations
#949

MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods

Mara Finkelstein, Markus Freitag

ICLR 2024posterarXiv:2309.10966
36
citations
#950

Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models

Shuang Li, Jiangjie Chen, Siyu Yuan et al.

AAAI 2024paperarXiv:2308.13961
35
citations
#951

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan YANG, Runyu Ding, Ellis L Brown et al.

ECCV 2024posterarXiv:2402.03310
35
citations
#952

NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views

Han Huang, Yulun Wu, Junsheng Zhou et al.

AAAI 2024paperarXiv:2312.13977
35
citations
#953

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024posterarXiv:2311.16567
35
citations
#954

RobustSAM: Segment Anything Robustly on Degraded Images

Wei-Ting Chen, Yu Jiet Vong, Sy-Yen Kuo et al.

CVPR 2024highlightarXiv:2406.09627
35
citations
#955

Mono3DVG: 3D Visual Grounding in Monocular Images

Yangfan Zhan, Yuan Yuan, Zhitong Xiong

AAAI 2024paperarXiv:2312.08022
35
citations
#956

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

Ke Fan, Junshu Tang, Weijian Cao et al.

ECCV 2024posterarXiv:2405.15763
35
citations
#957

Exploiting Label Skews in Federated Learning with Model Concatenation

Yiqun Diao, Qinbin Li, Bingsheng He

AAAI 2024paperarXiv:2312.06290
35
citations
#958

Tokenize Anything via Prompting

Ting Pan, Lulu Tang, Xinlong Wang et al.

ECCV 2024posterarXiv:2312.09128
35
citations
#959

LION: Implicit Vision Prompt Tuning

Haixin Wang, Jianlong Chang, Yihang Zhai et al.

AAAI 2024paperarXiv:2303.09992
35
citations
#960

Interactive Continual Learning: Fast and Slow Thinking

Biqing Qi, Xinquan Chen, Junqi Gao et al.

CVPR 2024posterarXiv:2403.02628
35
citations
#961

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

Dingkang Yang, Mingcheng Li, Dongling Xiao et al.

ECCV 2024posterarXiv:2403.05023
35
citations
#962

ICP-Flow: LiDAR Scene Flow Estimation with ICP

Yancong Lin, Holger Caesar

CVPR 2024posterarXiv:2402.17351
35
citations
#963

UnO: Unsupervised Occupancy Fields for Perception and Forecasting

Ben Agro, Quinlan Sykora, Sergio Casas et al.

CVPR 2024posterarXiv:2406.08691
35
citations
#964

How to Fine-Tune Vision Models with SGD

Ananya Kumar, Ruoqi Shen, Sebastien Bubeck et al.

ICLR 2024posterarXiv:2211.09359
35
citations
#965

Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture

Fei Wang, Dan Guo, Kun Li et al.

CVPR 2024posterarXiv:2403.07347
35
citations
#966

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

ECCV 2024posterarXiv:2311.16917
35
citations
#967

SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization

Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.

AAAI 2024paperarXiv:2401.06385
35
citations
#968

Multi-granularity Correspondence Learning from Long-term Noisy Videos

Yijie Lin, Jie Zhang, Zhenyu Huang et al.

ICLR 2024oralarXiv:2401.16702
35
citations
#969

Do Generated Data Always Help Contrastive Learning?

Yifei Wang, Jizhe Zhang, Yisen Wang

ICLR 2024posterarXiv:2403.12448
35
citations
#970

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation

Zhen Zhao, Zicheng Wang, Dian Yu et al.

ECCV 2024posterarXiv:2311.17325
35
citations
#971

Making Large Language Models Better Planners with Reasoning-Decision Alignment

Zhijian Huang, Tao Tang, Shaoxiang Chen et al.

ECCV 2024posterarXiv:2408.13890
35
citations
#972

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.

ECCV 2024posterarXiv:2407.13519
35
citations
#973

Alchemist: Parametric Control of Material Properties with Diffusion Models

Prafull Sharma, Varun Jampani, Yuanzhen Li et al.

CVPR 2024posterarXiv:2312.02970
35
citations
#974

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance

Dazhong Shen, Guanglu Song, Zeyue Xue et al.

CVPR 2024posterarXiv:2404.05384
35
citations
#975

Towards Energy Efficient Spiking Neural Networks: An Unstructured Pruning Framework

Xinyu Shi, Jianhao Ding, Zecheng Hao et al.

ICLR 2024spotlight
35
citations
#976

MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation

Hanzhe Hu, Zhizhuo Zhou, Varun Jampani et al.

CVPR 2024posterarXiv:2404.03656
35
citations
#977

Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models

Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi et al.

CVPR 2024posterarXiv:2312.12416
35
citations
#978

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang et al.

ECCV 2024posterarXiv:2404.03590
34
citations
#979

Disentangled Clothed Avatar Generation from Text Descriptions

Jionghao Wang, Yuan Liu, Zhiyang Dou et al.

ECCV 2024posterarXiv:2312.05295
34
citations
#980

Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning

Jinsong Shi, Pan Gao, Jie Qin

AAAI 2024paperarXiv:2312.06995
34
citations
#981

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

Xiao Chen, Quanyi Li, Tai Wang et al.

CVPR 2024posterarXiv:2402.16174
34
citations
#982

ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction

Zhicheng Zhang, Junyao Hu, Wentao Cheng et al.

CVPR 2024poster
34
citations
#983

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Yi Wu, Ziqiang Li, Heliang Zheng et al.

ECCV 2024posterarXiv:2403.11781
34
citations
#984

Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning

Shiming Chen, Wenjin Hou, Salman Khan et al.

CVPR 2024posterarXiv:2404.07713
34
citations
#985

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Yuzhen Lin, Wentang Song, Bin Li et al.

ECCV 2024posterarXiv:2409.14444
34
citations
#986

Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models

Liqi He, Zuchao Li, Xiantao Cai et al.

AAAI 2024paperarXiv:2312.08762
34
citations
#987

Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation

Yuanhong Chen, Yuyuan Liu, Hu Wang et al.

CVPR 2024posterarXiv:2304.02970
34
citations
#988

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024posterarXiv:2310.06744
34
citations
#989

Neural Redshift: Random Networks are not Random Functions

Damien Teney, Armand Nicolicioiu, Valentin Hartmann et al.

CVPR 2024posterarXiv:2403.02241
34
citations
#990

MonoNPHM: Dynamic Head Reconstruction from Monocular Videos

Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos et al.

CVPR 2024highlightarXiv:2312.06740
34
citations
#991

Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model

Decheng Liu, Xijun Wang, Chunlei Peng et al.

AAAI 2024paperarXiv:2312.11285
34
citations
#992

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024posterarXiv:2407.12777
34
citations
#993

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

Wentao Bao, Lichang Chen, Heng Huang et al.

ECCV 2024posterarXiv:2305.14428
34
citations
#994

Active Generalized Category Discovery

Shijie Ma, Fei Zhu, Zhun Zhong et al.

CVPR 2024posterarXiv:2403.04272
34
citations
#995

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024poster
34
citations
#996

The Consensus Game: Language Model Generation via Equilibrium Search

Athul Jacob, Yikang Shen, Gabriele Farina et al.

ICLR 2024spotlightarXiv:2310.09139
34
citations
#997

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Qingwen Zhang, Yi Yang, Peizheng Li et al.

ECCV 2024posterarXiv:2407.01702
34
citations
#998

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024posterarXiv:2311.07152
34
citations
#999

SafeDreamer: Safe Reinforcement Learning with World Models

Weidong Huang, Jiaming Ji, Chunhe Xia et al.

ICLR 2024posterarXiv:2307.07176
34
citations
#1000

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka

CVPR 2024posterarXiv:2404.09401
34
citations