Most Cited 2024 "transfer reinforcement learning" Papers

12,324 papers found • Page 13 of 62

#2401

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.

ECCV 2024 • arXiv:2407.13519
36 citations
#2402

When Model Meets New Normals: Test-Time Adaptation for Unsupervised Time-Series Anomaly Detection

AAAI 2024 • arXiv:2312.11976
36 citations
#2403

Equivariant Plug-and-Play Image Reconstruction

Matthieu Terris, Thomas Moreau, Nelly Pustelnik et al.

CVPR 2024 • arXiv:2312.01831
36 citations
#2404

Equivariant Frames and the Impossibility of Continuous Canonicalization

Nadav Dym, Hannah Lawrence, Jonathan Siegel

ICML 2024 • arXiv:2402.16077
36 citations
#2405

Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds

Tianrui Lou, Xiaojun Jia, Jindong Gu et al.

CVPR 2024 • arXiv:2403.05247
36 citations
#2406

Robust Emotion Recognition in Context Debiasing

Dingkang Yang, Kun Yang, Mingcheng Li et al.

CVPR 2024 • arXiv:2403.05963
36 citations
#2407

Privacy-Preserving Instructions for Aligning Large Language Models

Da Yu, Peter Kairouz, Sewoong Oh et al.

ICML 2024 • arXiv:2402.13659
36 citations
#2408

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Yuzhen Lin, Wentang Song, Bin Li et al.

ECCV 2024 • arXiv:2409.14444
36 citations
#2409

Reasoning with Latent Diffusion in Offline Reinforcement Learning

Siddarth Venkatraman, Shivesh Khaitan, Ravi Tej Akella et al.

ICLR 2024 (oral) • arXiv:2309.06599
36 citations
#2410

OpenTab: Advancing Large Language Models as Open-domain Table Reasoners

Kezhi Kong, Jiani Zhang, Zhengyuan Shen et al.

ICLR 2024 • arXiv:2402.14361
36 citations
#2411

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

Ziyang Chen, Israel D. Gebru, Christian Richardt et al.

CVPR 2024 (highlight) • arXiv:2403.18821
36 citations
#2412

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Yuchen Hu, Chen Chen, Chao-Han Huck Yang et al.

ICLR 2024 (spotlight) • arXiv:2401.10446
36 citations
#2413

Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2024 • arXiv:2406.07544
36 citations
#2414

VLM2Scene: Self-Supervised Image-Text-LiDAR Learning with Foundation Models for Autonomous Driving Scene Understanding

Guibiao Liao, Jiankun Li, Xiaoqing Ye

AAAI 2024
36 citations
#2415

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.

ICML 2024 • arXiv:2402.02439
36 citations
#2416

NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views

Han Huang, Yulun Wu, Junsheng Zhou et al.

AAAI 2024 • arXiv:2312.13977
36 citations
#2417

Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

Xunjiang Gu, Guanyu Song, Igor Gilitschenski et al.

CVPR 2024 • arXiv:2403.16439
36 citations
#2418

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

Jiatong Shi, Hirofumi Inaguma, Xutai Ma et al.

ICLR 2024 (spotlight) • arXiv:2310.02720
36 citations
#2419

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

Seung Hyun Lee, Yinxiao Li, Junjie Ke et al.

ECCV 2024 • arXiv:2401.05675
36 citations
#2420

Training Generative Image Super-Resolution Models by Wavelet-Domain Losses Enables Better Control of Artifacts

Cansu Korkmaz, Ahmet Murat Tekalp, Zafer Dogan

CVPR 2024 • arXiv:2402.19215
36 citations
#2421

Mono3DVG: 3D Visual Grounding in Monocular Images

Yangfan Zhan, Yuan Yuan, Zhitong Xiong

AAAI 2024 • arXiv:2312.08022
36 citations
#2422

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Marko Mihajlovic, Sergey Prokudin, Siyu Tang et al.

ECCV 2024 • arXiv:2409.11211
36 citations
#2423

FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models

Jingwei Sun, Ziyue Xu, Hongxu Yin et al.

ICML 2024 • arXiv:2310.01467
36 citations
#2424

When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming

Hussein Mozannar, Gagan Bansal, Adam Fourney et al.

AAAI 2024 • arXiv:2306.04930
36 citations
#2425

Tactile-Augmented Radiance Fields

Yiming Dou, Fengyu Yang, Yi Liu et al.

CVPR 2024 • arXiv:2405.04534
36 citations
#2426

Provable Compositional Generalization for Object-Centric Learning

Thaddäus Wiedemer, Jack Brady, Alexander Panfilov et al.

ICLR 2024 • arXiv:2310.05327
36 citations
#2427

How to Fine-Tune Vision Models with SGD

Ananya Kumar, Ruoqi Shen, Sebastien Bubeck et al.

ICLR 2024 • arXiv:2211.09359
36 citations
#2428

Neural Redshift: Random Networks are not Random Functions

Damien Teney, Armand Nicolicioiu, Valentin Hartmann et al.

CVPR 2024 • arXiv:2403.02241
36 citations
#2429

View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network

Quan Zhang, Lei Wang, Vishal M. Patel et al.

CVPR 2024 • arXiv:2403.14513
36 citations
#2430

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024 • arXiv:2311.16567
36 citations
#2431

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology

Yuxuan Sun, Hao Wu, Chenglu Zhu et al.

ECCV 2024 • arXiv:2401.16355
36 citations
#2432

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

Muhammed Emrullah Ildiz, Yixiao Huang, Yingcong Li et al.

ICML 2024 • arXiv:2402.13512
36 citations
#2433

On the Generalization of Stochastic Gradient Descent with Momentum

Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher et al.

ICML 2024 • arXiv:1809.04564
36 citations
#2434

m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

Zixian Ma, Weikai Huang, Jieyu Zhang et al.

ECCV 2024 • arXiv:2403.11085
36 citations
#2435

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang et al.

CVPR 2024 • arXiv:2312.13834
36 citations
#2436

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan Yang, Runyu Ding, Ellis L Brown et al.

ECCV 2024 • arXiv:2402.03310
36 citations
#2437

CosmicMan: A Text-to-Image Foundation Model for Humans

Shikai Li, Jianglin Fu, Kaiyuan Liu et al.

CVPR 2024 (highlight) • arXiv:2404.01294
36 citations
#2438

Interactive Continual Learning: Fast and Slow Thinking

Biqing Qi, Xinquan Chen, Junqi Gao et al.

CVPR 2024 • arXiv:2403.02628
36 citations
#2439

Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner et al.

ICML 2024 (oral) • arXiv:2405.12211
36 citations
#2440

Distilling Semantic Priors from SAM to Efficient Image Restoration Models

Quan Zhang, Xiaoyu Liu, Wei Li et al.

CVPR 2024 • arXiv:2403.16368
36 citations
#2441

Momentum Benefits Non-iid Federated Learning Simply and Provably

Ziheng Cheng, Xinmeng Huang, Pengfei Wu et al.

ICLR 2024 • arXiv:2306.16504
36 citations
#2442

Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement

Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.

AAAI 2024 • arXiv:2308.11932
36 citations
#2443

LION: Implicit Vision Prompt Tuning

Haixin Wang, Jianlong Chang, Yihang Zhai et al.

AAAI 2024 • arXiv:2303.09992
36 citations
#2444

LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses

Xin Liu, Muhammad Khalifa, Lu Wang

ICLR 2024 • arXiv:2310.19208
36 citations
#2445

Gradient Alignment for Cross-Domain Face Anti-Spoofing

Minh Binh Le, Simon Woo

CVPR 2024 • arXiv:2402.18817
36 citations
#2446

AMSP-UOD: When Vortex Convolution and Stochastic Perturbation Meet Underwater Object Detection

Jingchun Zhou, Zongxin He, Kin-Man Lam et al.

AAAI 2024 • arXiv:2308.11918
36 citations
#2447

Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning

Shiming Chen, Wenjin Hou, Salman Khan et al.

CVPR 2024 • arXiv:2404.07713
36 citations
#2448

Control4D: Efficient 4D Portrait Editing with Text

Ruizhi Shao, Jingxiang Sun, Cheng Peng et al.

CVPR 2024 • arXiv:2305.20082
36 citations
#2449

Federated Recommendation with Additive Personalization

Zhiwei Li, Guodong Long, Tianyi Zhou

ICLR 2024 • arXiv:2301.09109
36 citations
#2450

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024 • arXiv:2311.13627
36 citations
#2451

Knowledge Distillation Based on Transformed Teacher Matching

Kaixiang Zheng, En-Hui Yang

ICLR 2024 • arXiv:2402.11148
36 citations
#2452

Efficient Spiking Neural Networks with Sparse Selective Activation for Continual Learning

Jiangrong Shen, Wenyao Ni, Qi Xu et al.

AAAI 2024
35 citations
#2453

Dynamic Sparse Training with Structured Sparsity

Mike Lasby, Anna Golubeva, Utku Evci et al.

ICLR 2024 • arXiv:2305.02299
35 citations
#2454

Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models

Fangzhao Zhang, Mert Pilanci

ICML 2024 • arXiv:2402.02347
35 citations
#2455

Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition

Qianrui Zhou, Hua Xu, Hao Li et al.

AAAI 2024 • arXiv:2312.14667
35 citations
#2456

SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.

CVPR 2024 (highlight) • arXiv:2312.00648
35 citations
#2457

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

Wentao Bao, Lichang Chen, Heng Huang et al.

ECCV 2024 • arXiv:2305.14428
35 citations
#2458

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024 • arXiv:2406.14556
35 citations
#2459

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

Xiao Chen, Quanyi Li, Tai Wang et al.

CVPR 2024 • arXiv:2402.16174
35 citations
#2460

Revisiting Link Prediction: a data perspective

Haitao Mao, Juanhui Li, Harry Shomer et al.

ICLR 2024 • arXiv:2310.00793
35 citations
#2461

Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues

Antonio Orvieto, Soham De, Caglar Gulcehre et al.

ICML 2024 • arXiv:2307.11888
35 citations
#2462

Bespoke Solvers for Generative Flow Models

Neta Shaul, Juan Perez, Ricky T. Q. Chen et al.

ICLR 2024 (spotlight) • arXiv:2310.19075
35 citations
#2463

Towards Energy Efficient Spiking Neural Networks: An Unstructured Pruning Framework

Xinyu Shi, Jianhao Ding, Zecheng Hao et al.

ICLR 2024 (spotlight)
35 citations
#2464

Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning

Zhuoyan Xu, Zhenmei Shi, Junyi Wei et al.

ICLR 2024 • arXiv:2402.15017
35 citations
#2465

Collaborating Foundation Models for Domain Generalized Semantic Segmentation

Yasser Benigmim, Subhankar Roy, Slim Essid et al.

CVPR 2024 • arXiv:2312.09788
35 citations
#2466

LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis

Zehan Zheng, Fan Lu, Weiyi Xue et al.

CVPR 2024 • arXiv:2404.02742
35 citations
#2467

Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences

Axel Barroso-Laguna, Sowmya Munukutla, Victor Adrian Prisacariu et al.

CVPR 2024 • arXiv:2404.06337
35 citations
#2468

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Trung Dao, Thuan Nguyen, Thanh Van Le et al.

ECCV 2024 • arXiv:2408.14176
35 citations
#2469

Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching

Yuchen Zhang, Tianle Zhang, Kai Wang et al.

ICML 2024 • arXiv:2402.05011
35 citations
#2470

HRVDA: High-Resolution Visual Document Assistant

Chaohu Liu, Kun Yin, Haoyu Cao et al.

CVPR 2024 • arXiv:2404.06918
35 citations
#2471

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

Behrad Moniri, Donghwan Lee, Hamed Hassani et al.

ICML 2024 • arXiv:2310.07891
35 citations
#2472

Larimar: Large Language Models with Episodic Memory Control

Payel Das, Subhajit Chaudhury, Elliot Nelson et al.

ICML 2024 • arXiv:2403.11901
35 citations
#2473

FedMut: Generalized Federated Learning via Stochastic Mutation

Ming Hu, Cao Yue, Anran Li et al.

AAAI 2024
35 citations
#2474

Full-Atom Peptide Design based on Multi-modal Flow Matching

Jiahan Li, Chaoran Cheng, Zuofan Wu et al.

ICML 2024 • arXiv:2406.00735
35 citations
#2475

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024 • arXiv:2310.06744
35 citations
#2476

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Yiming Zhang, Zhening Xing, Yanhong Zeng et al.

CVPR 2024 • arXiv:2312.13964
35 citations
#2477

RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences

Jie Cheng, Gang Xiong, Xingyuan Dai et al.

ICML 2024 (spotlight) • arXiv:2402.17257
35 citations
#2478

LaWa: Using Latent Space for In-Generation Image Watermarking

Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.

ECCV 2024 • arXiv:2408.05868
35 citations
#2479

How I Warped Your Noise: a Temporally-Correlated Noise Prior for Diffusion Models

Pascal Chang, Jingwei Tang, Markus Gross et al.

ICLR 2024 (oral) • arXiv:2504.03072
35 citations
#2480

Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture

Fei Wang, Dan Guo, Kun Li et al.

CVPR 2024 • arXiv:2403.07347
35 citations
#2481

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

ECCV 2024 • arXiv:2311.16917
35 citations
#2482

Social-Transmotion: Promptable Human Trajectory Prediction

Saeed Saadatnejad, Yang Gao, Kaouther Messaoud et al.

ICLR 2024 (oral) • arXiv:2312.16168
35 citations
#2483

Do Generated Data Always Help Contrastive Learning?

Yifei Wang, Jizhe Zhang, Yisen Wang

ICLR 2024 • arXiv:2403.12448
35 citations
#2484

Open-World Human-Object Interaction Detection via Multi-modal Prompts

Jie Yang, Bingliang Li, Ailing Zeng et al.

CVPR 2024 • arXiv:2406.07221
35 citations
#2485

Training-Free Quantum Architecture Search

Zhimin He, Maijie Deng, Shenggen Zheng et al.

AAAI 2024
35 citations
#2486

ReMasker: Imputing Tabular Data with Masked Autoencoding

Tianyu Du, Luca Melis, Ting Wang

ICLR 2024 • arXiv:2309.13793
35 citations
#2487

Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval

Hailang Huang, Zhijie Nie, Ziqiao Wang et al.

AAAI 2024 • arXiv:2403.05261
35 citations
#2488

The Consensus Game: Language Model Generation via Equilibrium Search

Athul Jacob, Yikang Shen, Gabriele Farina et al.

ICLR 2024 (spotlight) • arXiv:2310.09139
35 citations
#2489

Offline Training of Language Model Agents with Functions as Learnable Weights

Shaokun Zhang, Jieyu Zhang, Jiale Liu et al.

ICML 2024 • arXiv:2402.11359
35 citations
#2490

SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization

Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.

AAAI 2024 • arXiv:2401.06385
35 citations
#2491

Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning Approach for Rumor Detection

AAAI 2024 • arXiv:2508.07201
35 citations
#2492

LoRA Training in the NTK Regime has No Spurious Local Minima

Uijeong Jang, Jason Lee, Ernest Ryu

ICML 2024 • arXiv:2402.11867
35 citations
#2493

Generative Multi-Modal Knowledge Retrieval with Large Language Models

Xinwei Long, Jiali Zeng, Fandong Meng et al.

AAAI 2024 • arXiv:2401.08206
35 citations
#2494

Offline Actor-Critic Reinforcement Learning Scales to Large Models

Jost Tobias Springenberg, Abbas Abdolmaleki, Jingwei Zhang et al.

ICML 2024 (oral) • arXiv:2402.05546
35 citations
#2495

Subtractive Mixture Models via Squaring: Representation and Learning

Lorenzo Loconte, Aleksanteri Sladek, Stefan Mengel et al.

ICLR 2024 (spotlight) • arXiv:2310.00724
35 citations
#2496

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Qingwen Zhang, Yi Yang, Peizheng Li et al.

ECCV 2024 • arXiv:2407.01702
35 citations
#2497

Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction

Hao Li, Ying Chen, Yifei Chen et al.

CVPR 2024 • arXiv:2402.19326
35 citations
#2498

ICP-Flow: LiDAR Scene Flow Estimation with ICP

Yancong Lin, Holger Caesar

CVPR 2024 • arXiv:2402.17351
35 citations
#2499

Alchemist: Parametric Control of Material Properties with Diffusion Models

Prafull Sharma, Varun Jampani, Yuanzhen Li et al.

CVPR 2024 • arXiv:2312.02970
35 citations
#2500

Learning Continuous Implicit Field with Local Distance Indicator for Arbitrary-Scale Point Cloud Upsampling

Shujuan Li, Junsheng Zhou, Baorui Ma et al.

AAAI 2024 • arXiv:2312.15133
35 citations
#2501

Do Vision and Language Encoders Represent the World Similarly?

Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali et al.

CVPR 2024 • arXiv:2401.05224
35 citations
#2502

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

Dingkang Yang, Mingcheng Li, Dongling Xiao et al.

ECCV 2024 • arXiv:2403.05023
35 citations
#2503

PTaRL: Prototype-based Tabular Representation Learning via Space Calibration

Hangting Ye, Wei Fan, Xiaozhuang Song et al.

ICLR 2024 (spotlight) • arXiv:2407.05364
35 citations
#2504

Fair Resource Allocation in Multi-Task Learning

Hao Ban, Kaiyi Ji

ICML 2024 • arXiv:2402.15638
35 citations
#2505

LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

Qihao Zhao, Yalun Dai, Hao Li et al.

CVPR 2024 • arXiv:2403.05854
35 citations
#2506

MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

Chenlu Zhan, Gaoang Wang, Yu Lin et al.

CVPR 2024 • arXiv:2403.04290
35 citations
#2507

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.

CVPR 2024 • arXiv:2404.19752
35 citations
#2508

CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks

Yulong Huang, Xiaopeng Lin, Hongwei Ren et al.

ICML 2024 (oral) • arXiv:2402.04663
35 citations
#2509

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

Lingjun Zhao, Jingyu Song, Katherine Skinner

CVPR 2024 • arXiv:2403.19104
35 citations
#2510

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts

Jialin Wu, Xia Hu, Yaqing Wang et al.

CVPR 2024 (highlight) • arXiv:2312.00968
34 citations
#2511

Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions

Kaihong Zhang, Heqi Yin, Feng Liang et al.

ICML 2024 (spotlight) • arXiv:2402.15602
34 citations
#2512

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis et al.

CVPR 2024 • arXiv:2404.18873
34 citations
#2513

DGCLUSTER: A Neural Framework for Attributed Graph Clustering via Modularity Maximization

Aritra Bhowmick, Mert Kosan, Zexi Huang et al.

AAAI 2024 • arXiv:2312.12697
34 citations
#2514

How Does Unlabeled Data Provably Help Out-of-Distribution Detection?

Xuefeng Du, Zhen Fang, Ilias Diakonikolas et al.

ICLR 2024 • arXiv:2402.03502
34 citations
#2515

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang et al.

ECCV 2024 • arXiv:2404.03590
34 citations
#2516

Towards Effective and General Graph Unlearning via Mutual Evolution

Xunkai Li, Yulin Zhao, Zhengyu Wu et al.

AAAI 2024 • arXiv:2401.11760
34 citations
#2517

AutoAD III: The Prequel – Back to the Pixels

Tengda Han, Max Bain, Arsha Nagrani et al.

CVPR 2024 • arXiv:2404.14412
34 citations
#2518

CoGS: Controllable Gaussian Splatting

Heng Yu, Joel Julin, Zoltán Á. Milacski et al.

CVPR 2024 • arXiv:2312.05664
34 citations
#2519

Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

ECCV 2024 • arXiv:2405.10589
34 citations
#2520

Early Stopping Against Label Noise Without Validation Data

Suqin Yuan, Lei Feng, Tongliang Liu

ICLR 2024 • arXiv:2502.07551
34 citations
#2521

Disentangled Clothed Avatar Generation from Text Descriptions

Jionghao Wang, Yuan Liu, Zhiyang Dou et al.

ECCV 2024 • arXiv:2312.05295
34 citations
#2522

Three Pillars Improving Vision Foundation Model Distillation for Lidar

Gilles Puy, Spyros Gidaris, Alexandre Boulch et al.

CVPR 2024 • arXiv:2310.17504
34 citations
#2523

AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving

Mingfu Liang, Jong-Chyi Su, Samuel Schulter et al.

CVPR 2024 • arXiv:2403.17373
34 citations
#2524

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Yi Wu, Ziqiang Li, Heliang Zheng et al.

ECCV 2024 • arXiv:2403.11781
34 citations
#2525

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu et al.

ECCV 2024 • arXiv:2403.04697
34 citations
#2526

HexGen: Generative Inference of Large Language Model over Heterogeneous Environment

Youhe Jiang, Ran Yan, Xiaozhe Yao et al.

ICML 2024 • arXiv:2311.11514
34 citations
#2527

How Far Can We Compress Instant-NGP-Based NeRF?

Yihang Chen, Qianyi Wu, Mehrtash Harandi et al.

CVPR 2024 • arXiv:2406.04101
34 citations
#2528

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model

Haisheng Fu, Jie Liang, Zhenman Fang et al.

ECCV 2024 • arXiv:2407.09983
34 citations
#2529

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024 • arXiv:2403.07508
34 citations
#2530

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

Haimei Zhao, Jing Zhang, Zhuo Chen et al.

CVPR 2024 • arXiv:2404.05145
34 citations
#2531

GIN-SD: Source Detection in Graphs with Incomplete Nodes via Positional Encoding and Attentive Fusion

Le Cheng, Peican Zhu, Keke Tang et al.

AAAI 2024 • arXiv:2403.00014
34 citations
#2532

Concept-Guided Prompt Learning for Generalization in Vision-Language Models

Yi Zhang, Ce Zhang, Ke Yu et al.

AAAI 2024 • arXiv:2401.07457
34 citations
#2533

Active Generalized Category Discovery

Shijie Ma, Fei Zhu, Zhun Zhong et al.

CVPR 2024 • arXiv:2403.04272
34 citations
#2534

PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition

Ziyang Zhang, Qizhen Zhang, Jakob Foerster

ICML 2024 • arXiv:2405.07932
34 citations
#2535

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Chengjian Feng, Yujie Zhong, Zequn Jie et al.

CVPR 2024 • arXiv:2402.05937
34 citations
#2536

NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction

Haofan Lu, Christopher Vattheuer, Baharan Mirzasoleiman et al.

ICML 2024 • arXiv:2403.03241
34 citations
#2537

Probabilities of Causation with Nonbinary Treatment and Effect

Ang Li, Judea Pearl

AAAI 2024 • arXiv:2208.09568
34 citations
#2538

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024 • arXiv:2407.12777
34 citations
#2539

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024
34 citations
#2540

Adversarial Prompt Tuning for Vision-Language Models

Jiaming Zhang, Xingjun Ma, Xin Wang et al.

ECCV 2024 • arXiv:2311.11261
34 citations
#2541

EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction

Yang Zhang, Zhewei Wei, Ye Yuan et al.

ICML 2024 • arXiv:2302.12177
34 citations
#2542

Revisiting Adversarial Training at Scale

Zeyu Wang, Xianhang Li, Hongru Zhu et al.

CVPR 2024 • arXiv:2401.04727
34 citations
#2543

Simple Semantic-Aided Few-Shot Learning

Hai Zhang, Junzhe Xu, Shanlin Jiang et al.

CVPR 2024 • arXiv:2311.18649
34 citations
#2544

An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction

Urchade Zaratiana, Nadi Tomeh, Pierre Holat et al.

AAAI 2024 • arXiv:2401.01326
34 citations
#2545

Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation

Lujie Yang, Hongkai Dai, Zhouxing Shi et al.

ICML 2024 • arXiv:2404.07956
34 citations
#2546

Assessing Large Language Models on Climate Information

Jannis Bulian, Mike Schäfer, Afra Amini et al.

ICML 2024 • arXiv:2310.02932
34 citations
#2547

Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models

Yiwen Tang, Ray Zhang, Zoey Guo et al.

AAAI 2024 • arXiv:2310.03059
34 citations
#2548

Jointly Training Large Autoregressive Multimodal Models

Emanuele Aiello, Lili Yu, Yixin Nie et al.

ICLR 2024 • arXiv:2309.15564
34 citations
#2549

Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models

Jan van den Brand, Zhao Song, Tianyi Zhou

ICML 2024 • arXiv:2304.02207
34 citations
#2550

Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations

Chenyu You, Yifei Min, Weicheng Dai et al.

CVPR 2024 • arXiv:2403.07241
34 citations
#2551

Hierarchical Multi-Marginal Optimal Transport for Network Alignment

Zhichen Zeng, Boxin Du, Si Zhang et al.

AAAI 2024 • arXiv:2310.04470
34 citations
#2552

Audio-Synchronized Visual Animation

Lin Zhang, Shentong Mo, Yijing Zhang et al.

ECCV 2024 • arXiv:2403.05659
34 citations
#2553

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

Yunpeng Qu, Kun Yuan, Kai Zhao et al.

ECCV 2024 • arXiv:2403.05049
34 citations
#2554

Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time

Yuzhou Gu, Zhao Song, Junze Yin et al.

ICLR 2024 • arXiv:2302.11068
34 citations
#2555

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Ruizhe Shi, Yuyao Liu, Yanjie Ze et al.

ICLR 2024 • arXiv:2310.20587
34 citations
#2556

Scalable Language Model with Generalized Continual Learning

Bohao Peng, Zhuotao Tian, Shu Liu et al.

ICLR 2024 • arXiv:2404.07470
34 citations
#2557

MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models

Yan Cai, Linlin Wang, Ye Wang et al.

AAAI 2024 • arXiv:2312.12806
34 citations
#2558

Let Models Speak Ciphers: Multiagent Debate through Embeddings

Chau Pham, Boyi Liu, Yingxiang Yang et al.

ICLR 2024 • arXiv:2310.06272
34 citations
#2559

Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention

Xingyu Zhou, Leheng Zhang, Xiaorui Zhao et al.

CVPR 2024 • arXiv:2401.06312
34 citations
#2560

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Yanwei Fu et al.

ECCV 2024 • arXiv:2403.18211
34 citations
#2561

AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation

Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult et al.

ICLR 2024 (oral) • arXiv:2306.00977
34 citations
#2562

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Jialong Guo, Xinghao Chen, Yehui Tang et al.

ICML 2024 • arXiv:2405.11582
34 citations
#2563

MonoNPHM: Dynamic Head Reconstruction from Monocular Videos

Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos et al.

CVPR 2024 (highlight) • arXiv:2312.06740
34 citations
#2564

MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning

Zhe Li, Laurence Yang, Bocheng Ren et al.

CVPR 2024 • arXiv:2402.02045
34 citations
#2565

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka

CVPR 2024 • arXiv:2404.09401
34 citations
#2566

How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?

Hongkang Li, Meng Wang, Songtao Lu et al.

ICML 2024 • arXiv:2402.15607
34 citations
#2567

Online Boosting Adaptive Learning under Concept Drift for Multistream Classification

En Yu, Jie Lu, Bin Zhang et al.

AAAI 2024 • arXiv:2312.10841
34 citations
#2568

Flextron: Many-in-One Flexible Large Language Model

Ruisi Cai, Saurav Muralidharan, Greg Heinrich et al.

ICML 2024 • arXiv:2406.10260
34 citations
#2569

Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction

Senqiao Yang, Jiarui Wu, Jiaming Liu et al.

AAAI 2024 • arXiv:2303.09792
34 citations
#2570

Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain

Marcus J. Min, Yangruibo Ding, Luca Buratti et al.

ICLR 2024 • arXiv:2310.14053
34 citations
#2571

Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs

Woomin Song, Seunghyuk Oh, Sangwoo Mo et al.

ICLR 2024 • arXiv:2404.10308
34 citations
#2572

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Fei Deng, Qifei Wang, Wei Wei et al.

CVPR 2024 • arXiv:2402.08714
34 citations
#2573

SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals

Rahul Thapa, Bryan He, Magnus Ruud Kjaer et al.

ICML 2024 • arXiv:2405.17766
34 citations
#2574

Relightable and Animatable Neural Avatar from Sparse-View Video

Zhen Xu, Sida Peng, Chen Geng et al.

CVPR 2024 (highlight) • arXiv:2308.07903
34 citations
#2575

High-fidelity Person-centric Subject-to-Image Synthesis

Yibin Wang, Weizhong Zhang, Jianwei Zheng et al.

CVPR 2024 • arXiv:2311.10329
34 citations
#2576

On the Scalability of Diffusion-based Text-to-Image Generation

Hao Li, Yang Zou, Ying Wang et al.

CVPR 2024 • arXiv:2404.02883
34 citations
#2577

SpikePoint: An Efficient Point-based Spiking Neural Network for Event Cameras Action Recognition

Hongwei Ren, Yue Zhou, Xiaopeng Lin et al.

ICLR 2024 (spotlight) • arXiv:2310.07189
34 citations
#2578

Exploration and Anti-Exploration with Distributional Random Network Distillation

Kai Yang, Jian Tao, Jiafei Lyu et al.

ICML 2024 • arXiv:2401.09750
34 citations
#2579

On the Content Bias in Fréchet Video Distance

Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.

CVPR 2024 • arXiv:2404.12391
34 citations
#2580

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024 • arXiv:2303.15230
34 citations
#2581

Learning Object State Changes in Videos: An Open-World Perspective

Zihui Xue, Kumar Ashutosh, Kristen Grauman

CVPR 2024 • arXiv:2312.11782
34 citations
#2582

Improved Implicit Neural Representation with Fourier Reparameterized Training

Kexuan Shi, Xingyu Zhou, Shuhang Gu

CVPR 2024 • arXiv:2401.07402
34 citations
#2583

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024 • arXiv:2311.07152
34 citations
#2584

Spurious Feature Diversification Improves Out-of-distribution Generalization

Lin Yong, Lu Tan, Yifan Hao et al.

ICLR 2024 • arXiv:2309.17230
34 citations
#2585

Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation

Yuanhong Chen, Yuyuan Liu, Hu Wang et al.

CVPR 2024 • arXiv:2304.02970
34 citations
#2586

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

Feng Liu, Tengteng Huang, Qianjing Zhang et al.

ECCV 2024 • arXiv:2402.03634
34 citations
#2587

FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment

Jinglin Xu, Sibo Yin, Guohao Zhao et al.

CVPR 2024 • arXiv:2405.06887
34 citations
#2588

ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction

Zhicheng Zhang, Junyao Hu, Wentao Cheng et al.

CVPR 2024
34 citations
#2589

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Sensen Gao, Xiaojun Jia, Xuhong Ren et al.

ECCV 2024 • arXiv:2403.12445
34 citations
#2590

REACTO: Reconstructing Articulated Objects from a Single Video

Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo et al.

CVPR 2024 • arXiv:2404.11151
34 citations
#2591

CoralSCOP: Segment any COral Image on this Planet

Zheng Ziqiang, Liang Haixin, Binh-Son Hua et al.

CVPR 2024 (highlight)
34 citations
#2592

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024 • arXiv:2407.02077
33 citations
#2593

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

Yuxin Wang, Qianyi Wu, Guofeng Zhang et al.

ECCV 2024 • arXiv:2404.13679
33 citations
#2594

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Nina Shvetsova, Anna Kukleva, Xudong Hong et al.

ECCV 2024 • arXiv:2310.04900
33 citations
#2595

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

Atli Kosson, Bettina Messmer, Martin Jaggi

ICML 2024 • arXiv:2305.17212
33 citations
#2596

Don't Play Favorites: Minority Guidance for Diffusion Models

Soobin Um, Suhyeon Lee, Jong Chul Ye

ICLR 2024 • arXiv:2301.12334
33 citations
#2597

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Chen Chen, Ruizhe Li, Yuchen Hu et al.

ICLR 2024 • arXiv:2402.05457
33 citations
#2598

G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model

Pan Xie, Qipeng Zhang, Peng Taiying et al.

AAAI 2024 • arXiv:2208.09141
33 citations
#2599

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

Linyuan Gong, Mostafa Elhoushi, Alvin Cheung

ICML 2024 • arXiv:2401.03003
33 citations
#2600

Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles

Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer et al.

AAAI 2024 • arXiv:2401.12069
33 citations