Most Cited 2024 "action transformer" Papers

CVPR 2024 arXiv:2404.08958

#3402

AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning

Yuwei Tang, ZhenYi Lin, Qilong Wang et al.

CVPR 2024 arXiv:2312.01531

#3403

SANeRF-HQ: Segment Anything for NeRF in High Quality

Yichen Liu, Benran Hu, Chi-Keung Tang et al.

ICML 2024 oral arXiv:2405.03140

#3404

TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning

Xiwen Chen, Peijie Qiu, Wenhui Zhu et al.

ICML 2024 arXiv:2304.00776

#3405

Chain-of-Thought Predictive Control

Zhiwei Jia, Vineet Thumuluri, Fangchen Liu et al.

AAAI 2024 paper arXiv:2308.12535

#3406

SCP: Spherical-Coordinate-Based Learned Point Cloud Compression

Ao Luo, Linxin Song, Keisuke Nonaka et al.

ICML 2024 arXiv:2407.13237

#3407

LLM-Empowered State Representation for Reinforcement Learning

Boyuan Wang, Yun Qu, Yuhang Jiang et al.

ICML 2024 arXiv:2402.01869

#3408

InferCept: Efficient Intercept Support for Augmented Large Language Model Inference

Reyna Abhyankar, Zijian He, Vikranth Srivatsa et al.

ECCV 2024 arXiv:2312.14223

#3409

Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation

Nina Weng, Paraskevas Pegios, Eike Petersen et al.

ECCV 2024 arXiv:2312.11897

#3410

Text-Conditioned Resampler For Long Form Video Understanding

Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.

ICML 2024 arXiv:2407.01606

#3411

On Discrete Prompt Optimization for Diffusion Models

Ruochen Wang, Ting Liu, Cho-Jui Hsieh et al.

CVPR 2024 arXiv:2310.00258

#3412

NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

ECCV 2024 arXiv:2407.08476

#3413

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

AAAI 2024 paper arXiv:2401.12564

#3414

Graph Contrastive Invariant Learning from the Causal Perspective

Yanhu Mo, Xiao Wang, Shaohua Fan et al.

CVPR 2024 arXiv:2311.10356

#3415

Garment Recovery with Shape and Deformation Priors

Ren Li, Corentin Dumery, Benoît Guillard et al.

ECCV 2024 arXiv:2408.05083

#3416

PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control

Rishubh Parihar, Sachidanand VS, Sabariswaran Mani et al.

ICLR 2024 arXiv:2502.14205

#3417

Accurate Forgetting for Heterogeneous Federated Continual Learning

Abudukelimu Wuerkaixi, Sen Cui, Jingfeng Zhang et al.

AAAI 2024 paper arXiv:2312.10290

#3418

Runtime Analysis of the SMS-EMOA for Many-Objective Optimization

Weijie Zheng, Benjamin Doerr

AAAI 2024 paper arXiv:2312.10758

#3419

SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation

Xiaoqi An, Lin Zhao, Chen Gong et al.

ICML 2024 arXiv:2405.02200

#3420

Position: Why We Must Rethink Empirical Research in Machine Learning

Moritz Herrmann, F. Julian D. Lange, Katharina Eggensperger et al.

ICML 2024 spotlight arXiv:2306.03928

#3421

Designing Decision Support Systems using Counterfactual Prediction Sets

Eleni Straitouri, Manuel Gomez-Rodriguez

ICLR 2024 arXiv:2310.10971

#3422

Context-Aware Meta-Learning

Christopher Fifty, Dennis Duan, Ronald Junkins et al.

AAAI 2024 paper arXiv:2401.00271

#3423

HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations

Yilan Dong, Chunlin Yu, Ruiyang Ha et al.

CVPR 2024 arXiv:2401.08739

#3424

EgoGen: An Egocentric Synthetic Data Generator

Gen Li, Kaifeng Zhao, Siwei Zhang et al.

ICML 2024 arXiv:2401.03892

#3425

Sampling in Unit Time with Kernel Fisher-Rao Flow

Aimee Maurais, Youssef Marzouk

ICML 2024 arXiv:2512.07289

#3426

Equivariant Diffusion for Crystal Structure Prediction

Peijia Lin, Pin Chen, Rui Jiao et al.

ICML 2024 arXiv:2402.07087

#3427

Self-Correcting Self-Consuming Loops for Generative Model Training

Nate Gillman, Michael Freeman, Daksh Aggarwal et al.

ICML 2024 arXiv:2308.07120

#3428

Position: Key Claims in LLM Research Have a Long Tail of Footnotes

Anna Rogers, Sasha Luccioni

#3429

VkD: Improving Knowledge Distillation using Orthogonal Projections

Roy Miles, Ismail Elezi, Jiankang Deng

ICML 2024 arXiv:2405.17381

#3430

Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention

Zhen Qin, Weigao Sun, Dong Li et al.

CVPR 2024 arXiv:2306.08736

#3431

LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation

Linfeng Yuan, Miaojing Shi, Zijie Yue et al.

ICLR 2024 arXiv:2306.00824

#3432

Zero and Few-shot Semantic Parsing with Ambiguous Inputs

Elias Stengel-Eskin, Kyle Rawlins, Benjamin Van Durme

CVPR 2024 arXiv:2404.00710

#3433

Unknown Prompt the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization

Mainak Singha, Ankit Jha, Shirsha Bose et al.

AAAI 2024 paper arXiv:2305.06594

#3434

V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

Kun Su, Judith Li, Qingqing Huang et al.

CVPR 2024 arXiv:2406.00480

#3435

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

Duojun Huang, Xinyu Xiong, Jie Ma et al.

AAAI 2024 paper arXiv:2310.05195

#3436

GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval

Yuting Wang, Jinpeng Wang, Bin Chen et al.

CVPR 2024 arXiv:2403.05094

#3437

Face2Diffusion for Fast and Editable Face Personalization

Kaede Shiohara, Toshihiko Yamasaki

ICML 2024 arXiv:2305.17326

#3438

Matrix Information Theory for Self-Supervised Learning

Yifan Zhang, Zhiquan Tan, Jingqin Yang et al.

AAAI 2024 paper arXiv:2205.13340

#3439

Deep Active Learning with Noise Stability

Xingjian Li, Pengkun Yang, Yangcheng Gu et al.

AAAI 2024 paper arXiv:2403.06151

#3440

Decoupled Contrastive Learning for Long-Tailed Recognition

Shiyu Xuan, Shiliang Zhang

AAAI 2024 paper arXiv:2402.04672

#3441

G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection

Fan Wu, Jinling Gao, Lanqing Hong et al.

ICML 2024 arXiv:2401.18070

#3442

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

Andreas Opedal, Alessandro Stolfo, Haruki Shirakami et al.

AAAI 2024 paper arXiv:2308.09517

#3443

Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-Based Similarity

Van Thuy Hoang, O-Joun Lee

CVPR 2024 highlight arXiv:2312.00690

#3444

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

AAAI 2024 paper arXiv:2401.01377

#3445

Does Few-Shot Learning Suffer from Backdoor Attacks?

Xinwei Liu, Xiaojun Jia, Jindong Gu et al.

ICLR 2024 arXiv:2310.17463

#3446

Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation

Konstantin Hess, Valentyn Melnychuk, Dennis Frauen et al.

ICLR 2024 spotlight arXiv:2404.19596

#3447

Debiased Collaborative Filtering with Kernel-Based Causal Balancing

Haoxuan Li, Chunyuan Zheng, Yanghao Xiao et al.

ICLR 2024 arXiv:2403.13829

#3448

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

Xiangxin Zhou, Xiwei Cheng, Yuwei Yang et al.

ICML 2024 arXiv:2402.14730

#3449

Clifford-Steerable Convolutional Neural Networks

Maksim Zhdanov, David Ruhe, Maurice Weiler et al.

AAAI 2024 paper arXiv:2312.08098

#3450

Adversarial Socialbots Modeling Based on Structural Information Principles

Xianghua Zeng, Hao Peng, Angsheng Li

AAAI 2024 paper arXiv:2303.13269

#3451

Disguise without Disruption: Utility-Preserving Face De-identification

Zikui Cai, Zhongpai Gao, Benjamin Planche et al.

ICML 2024 arXiv:2402.14202

#3452

Comparing Graph Transformers via Positional Encodings

Mitchell Black, Zhengchao Wan, Gal Mishne et al.

ECCV 2024 arXiv:2407.17850

#3453

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong et al.

ICLR 2024 arXiv:2401.16352

#3454

Adversarial Training on Purification (AToP): Advancing Both Robustness and Generalization

Guang Lin, Chao Li, Jianhai Zhang et al.

ECCV 2024 arXiv:2406.08431

#3455

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Benjamin J Biggs, Arjun Seshadri, Yang Zou et al.

ECCV 2024 arXiv:2408.09702

#3456

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Ruofan Liang, Zan Gojcic, Merlin Nimier-David et al.

AAAI 2024 paper arXiv:2312.13032

#3457

NodeMixup: Tackling Under-Reaching for Graph Neural Networks

Weigang Lu, Ziyu Guan, Wei Zhao et al.

AAAI 2024 paper arXiv:2312.06179

#3458

Dynamic Weighted Combiner for Mixed-Modal Image Retrieval

Fuxiang Huang, Lei Zhang, Xiaowei Fu et al.

CVPR 2024 arXiv:2402.03312

#3459

Test-Time Adaptation for Depth Completion

Hyoungseob Park, Anjali W Gupta, Alex Wong

CVPR 2024 arXiv:2312.02813

#3460

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu et al.

ICLR 2024 arXiv:2304.07063

#3461

Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors

Hang Yin, Zihao Wang, Yangqiu Song

ICLR 2024 arXiv:2311.04157

#3462

A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis

Dipanjyoti Paul, Arpita Chowdhury, Xinqi Xiong et al.

CVPR 2024 highlight arXiv:2401.13650

#3463

Tyche: Stochastic In-Context Learning for Medical Image Segmentation

Marianne Rakic, Hallee Wong, Jose Javier Gonzalez Ortiz et al.

ICLR 2024 arXiv:2306.03258

#3464

LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading

Yochai Yemini, Aviv Shamsian, Lior Bracha et al.

CVPR 2024 arXiv:2404.01628

#3465

Learning Equi-angular Representations for Online Continual Learning

Minhyuk Seo, Hyunseo Koh, Wonje Jeung et al.

ICML 2024 arXiv:2405.17784

#3466

Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation

Ignat Georgiev, Krishnan Srinivasan, Jie Xu et al.

AAAI 2024 paper arXiv:2303.10976

#3467

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification

Jiaer Xia, Lei Tan, Pingyang Dai et al.

ECCV 2024 arXiv:2407.03197

#3468

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

Le Yang, Ziwei Zheng, Yizeng Han et al.

ECCV 2024 arXiv:2403.06870

#3469

Semantic Residual Prompts for Continual Learning

Martin Menabue, Emanuele Frascaroli, Matteo Boschini et al.

ECCV 2024 arXiv:2404.15014

#3470

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

Guoqing Wang, Zhongdao Wang, Pin Tang et al.

ECCV 2024 arXiv:2403.12574

#3471

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

Ziming Wang, Ziling Wang, Huaning Li et al.

CVPR 2024 arXiv:2404.03398

#3472

Scaling Up Video Summarization Pretraining with Large Language Models

Dawit Argaw Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.

CVPR 2024 arXiv:2404.17825

#3473

ODCR: Orthogonal Decoupling Contrastive Regularization for Unpaired Image Dehazing

Zhongze Wang, Haitao Zhao, Jingchao Peng et al.

#3474

Catalyst for Clustering-Based Unsupervised Object Re-identification: Feature Calibration

Huafeng Li, Qingsong Hu, Zhanxuan Hu

ICML 2024 oral arXiv:2402.01533

#3475

Efficient and Effective Time-Series Forecasting with Spiking Neural Networks

Changze Lv, Yansen Wang, Dongqi Han et al.

ICML 2024 arXiv:2309.00079

#3476

On the Implicit Bias of Adam

Matias Cattaneo, Jason Klusowski, Boris Shigida

ICLR 2024 arXiv:2406.13864

#3477

Evaluating Representation Learning on the Protein Structure Universe

Arian Jamasb, Alex Morehead, Chaitanya Joshi et al.

CVPR 2024 arXiv:2311.11600

#3478

Deep Equilibrium Diffusion Restoration with Parallel Sampling

Jiezhang Cao, Yue Shi, Kai Zhang et al.

CVPR 2024 arXiv:2406.13327

#3479

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Anqi Zhu, Qiuhong Ke, Mingming Gong et al.

CVPR 2024 arXiv:2405.04356

#3480

Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation

Jihyun Kim, Changjae Oh, Hoseok Do et al.

ECCV 2024 arXiv:2409.19405

#3481

G3R: Gradient Guided Generalizable Reconstruction

Yun Chen, Jingkang Wang, Ze Yang et al.

ICLR 2024 spotlight arXiv:2401.16164

#3482

Constrained Bi-Level Optimization: Proximal Lagrangian Value Function Approach and Hessian-free Algorithm

Wei Yao, Chengming Yu, Shangzhi Zeng et al.

ICLR 2024 arXiv:2310.06771

#3483

Correlated Noise Provably Beats Independent Noise for Differentially Private Learning

Christopher Choquette-Choo, Krishnamurthy Dvijotham, Krishna Pillutla et al.

#3484

Contrastive Learning for DeepFake Classification and Localization via Multi-Label Ranking

Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu

ICLR 2024 arXiv:2309.08045

#3485

Traveling Waves Encode The Recent Past and Enhance Sequence Learning

T. Anderson Keller, Lyle Muller, Terrence Sejnowski et al.

ECCV 2024 arXiv:2401.12978

#3486

Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models

Hyeonwoo Kim, Sookwan Han, Patrick Kwon et al.

ICML 2024 arXiv:2402.18512

#3487

Log Neural Controlled Differential Equations: The Lie Brackets Make A Difference

Benjamin Walker, Andrew McLeod, Tiexin Qin et al.

ICML 2024 arXiv:2405.01008

#3488

On Mechanistic Knowledge Localization in Text-to-Image Generative Models

Samyadeep Basu, Keivan Rezaei, Priyatham Kattakinda et al.

CVPR 2024 arXiv:2406.09936

#3489

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi, Svetlana Orlova, Daan de Geus et al.

ICLR 2024 arXiv:2402.06244

#3490

Quantifying and Enhancing Multi-modal Robustness with Modality Preference

Zequn Yang, Yake Wei, Ce Liang et al.

ECCV 2024 arXiv:2407.07582

#3491

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ICLR 2024 spotlight arXiv:2309.15111

#3492

SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem

Margalit Glasgow

ICLR 2024 arXiv:2310.07805

#3493

Generative Modeling with Phase Stochastic Bridge

Tianrong Chen, Jiatao Gu, Laurent Dinh et al.

ECCV 2024 arXiv:2404.05052

#3494

Facial Affective Behavior Analysis with Instruction Tuning

Yifan Li, Anh Dao, Wentao Bao et al.

AAAI 2024 paper arXiv:2312.10687

#3495

MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis

Wenhao Guan, Yishuang Li, Tao Li et al.

CVPR 2024 arXiv:2312.10998

#3496

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

ICLR 2024 arXiv:2305.13404

#3497

Improving Convergence and Generalization Using Parameter Symmetries

Bo Zhao, Robert M. Gower, Robin Walters et al.

ICLR 2024 arXiv:2401.08501

#3498

ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation

Kim-Celine Kahl, Carsten Lüth, Maximilian Zenk et al.

CVPR 2024 arXiv:2401.06395

#3499

ModaVerse: Efficiently Transforming Modalities with LLMs

Xinyu Wang, Bohan Zhuang, Qi Wu

ICLR 2024 arXiv:2305.12872

#3500

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game

Simin Li, Jun Guo, Jingqiao Xiu et al.

ICML 2024 arXiv:2312.11111

#3501

The Good, The Bad, and Why: Unveiling Emotions in Generative AI

Cheng Li, Jindong Wang, Yixuan Zhang et al.

ICLR 2024 arXiv:2310.01753

#3502

CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery

Yuxiao Cheng, Ziqian Wang, Tingxiong Xiao et al.

ECCV 2024 arXiv:2403.00628

#3503

Region-Adaptive Transform with Segmentation Prior for Image Compression

Yuxi Liu, Wenhan Yang, Huihui Bai et al.

ECCV 2024 arXiv:2407.13133

#3504

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

Jianwei Zhao, Xin Li, Fan Yang et al.

CVPR 2024 highlight arXiv:2312.05247

#3505

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.

ECCV 2024 arXiv:2403.09176

#3506

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024 arXiv:2409.02543

#3507

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Wen Li, Muyuan Fang, Cheng Zou et al.

ICLR 2024 arXiv:2306.04539

#3508

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

Paul Liang, Chun Kai Ling, Yun Cheng et al.

#3509

Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank

Zihan Wang, Arthur Jacot

ICLR 2024 spotlight

CVPR 2024 arXiv:2405.03178

#3510

POPDG: Popular 3D Dance Generation with PopDanceSet

Zhenye Luo, Min Ren, Xuecai Hu et al.

ICML 2024 arXiv:2402.07440

#3511

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Jon Saad-Falcon, Daniel Y Fu, Simran Arora et al.

CVPR 2024 arXiv:2401.10224

#3512

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Ragav Sachdeva, Andrew Zisserman

ECCV 2024 arXiv:2409.15739

#3513

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

Sixiang Chen, Tian Ye, Kai Zhang et al.

CVPR 2024 arXiv:2403.11530

#3514

Continual Forgetting for Pre-trained Vision Models

Hongbo Zhao, Bolin Ni, Junsong Fan et al.

CVPR 2024 arXiv:2404.08640

#3515

EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams

Christen Millerdurai, Hiroyasu Akada, Jian Wang et al.

ICLR 2024 arXiv:2306.10426

#3516

Understanding Certified Training with Interval Bound Propagation

Yuhao Mao, Mark N Müller, Marc Fischer et al.

CVPR 2024 arXiv:2404.09736

#3517

FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance Head-pose and Facial Expression Features

Andre Rochow, Max Schwarz, Sven Behnke

CVPR 2024 arXiv:2311.15841

#3518

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

Siteng Huang, Biao Gong, Yutong Feng et al.

ECCV 2024 arXiv:2311.17528

#3519

HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models

Shen Zhang, Zhaowei Chen, Zhenyu Zhao et al.

CVPR 2024 arXiv:2312.09069

#3520

PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion

Ying-Tian Liu, Yuan-Chen Guo, Guan Luo et al.

CVPR 2024 arXiv:2403.01795

#3521

RankED: Addressing Imbalance and Uncertainty in Edge Detection Using Ranking-based Losses

Bedrettin Cetinkaya, Sinan Kalkan, Emre Akbas

CVPR 2024 arXiv:2501.05272

#3522

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.

AAAI 2024 paper arXiv:2303.12332

#3523

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

Wu Yun, Mengshi Qi, Chuanming Wang et al.

ECCV 2024 arXiv:2312.06573

#3524

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

CVPR 2024 arXiv:2308.06412

#3525

Taming Self-Training for Open-Vocabulary Object Detection

Shiyu Zhao, Samuel Schulter, Long Zhao et al.

ICLR 2024 oral arXiv:2401.08552

#3526

Explaining Time Series via Contrastive and Locally Sparse Perturbations

Zichuan Liu, Yingying Zhang, Tianchun Wang et al.

ICLR 2024 arXiv:2401.15652

#3527

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

Shaofeng Zhang, Jinfa Huang, Qiang Zhou et al.

#3528

Retrieval-Augmented Primitive Representations for Compositional Zero-Shot Learning

Chenchen Jing, Yukun Li, Hao Chen et al.

#3529

Pre-Training Goal-based Models for Sample-Efficient Reinforcement Learning

Haoqi Yuan, Zhancun Mu, Feiyang Xie et al.

ICLR 2024 oral

#3530

Multi-Domain Incremental Learning for Face Presentation Attack Detection

Keyao Wang, Guosheng Zhang, Haixiao Yue et al.

AAAI 2024 paper arXiv:2401.00729

#3531

NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction

Beibei Lin, Yeying Jin, Wending Yan et al.

AAAI 2024 paper arXiv:2305.06671

#3532

WeditGAN: Few-Shot Image Generation via Latent Space Relocation

Yuxuan Duan, Li Niu, Yan Hong et al.

ECCV 2024 arXiv:2310.17316

#3533

Defect Spectrum: A Granular Look of Large-scale Defect Datasets with Rich Semantics

Shuai Yang, ZhiFei Chen, Pengguang Chen et al.

AAAI 2024 paper arXiv:2312.12670

#3534

On the Role of Server Momentum in Federated Learning

Jianhui Sun, Xidong Wu, Heng Huang et al.

AAAI 2024 paper arXiv:2302.02589

#3535

$z$-SignFedAvg: A Unified Stochastic Sign-Based Compression for Federated Learning

Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang

CVPR 2024 highlight arXiv:2403.01852

#3536

PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis

Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo et al.

CVPR 2024 highlight arXiv:2402.19298

#3537

Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, Rizhao Cai et al.

AAAI 2024 paper arXiv:2312.07398

#3538

LLMEval: A Preliminary Study on How to Evaluate Large Language Models

Yue Zhang, Ming Zhang, HaiPeng Yuan et al.

CVPR 2024 arXiv:2403.06973

#3539

Bayesian Diffusion Models for 3D Shape Reconstruction

Haiyang Xu, Yu Lei, Zeyuan Chen et al.

CVPR 2024 arXiv:2404.00417

#3540

Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation

Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.

ICLR 2024 arXiv:2402.07011

#3541

FedImpro: Measuring and Improving Client Update in Federated Learning

Zhenheng Tang, Yonggang Zhang, Shaohuai Shi et al.

CVPR 2024 arXiv:2403.14852

#3542

KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim, Feng Liu, Yiyang Su et al.

ICLR 2024 arXiv:2310.05801

#3543

An operator preconditioning perspective on training in physics-informed machine learning

Tim De Ryck, Florent Bonnet, Siddhartha Mishra et al.

CVPR 2024 arXiv:2403.00567

#3544

Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning

Yixiong Zou, Yicong Liu, Yiman Hu et al.

ECCV 2024 arXiv:2311.17944

#3545

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian et al.

ICML 2024 arXiv:2401.16421

#3546

Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

Zhenyu He, Guhao Feng, Shengjie Luo et al.

CVPR 2024 arXiv:2404.19722

#3547

PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios

Jingbo Wang, Zhengyi Luo, Ye Yuan et al.

CVPR 2024 arXiv:2403.16370

#3548

GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation

Weiming Zhang, Yexin Liu, Xu Zheng et al.

CVPR 2024 highlight arXiv:2312.08878

#3549

Domain Prompt Learning with Quaternion Networks

Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.

ICML 2024 arXiv:2312.02546

#3550

Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning

Zhuo Huang, Chang Liu, Yinpeng Dong et al.

ICLR 2024 spotlight arXiv:2310.11971

#3551

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

Rui Zheng, Wei Shen, Yuan Hua et al.

AAAI 2024 paper arXiv:2303.10891

#3552

Non-exemplar Online Class-Incremental Continual Learning via Dual-Prototype Self-Augment and Refinement

Fushuo Huo, Wenchao Xu, Jingcai Guo et al.

AAAI 2024 paper arXiv:2312.07871

#3553

MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation

Yanzuo Lu, Meng Shen, Andy J Ma et al.

AAAI 2024 paper arXiv:2304.09595

#3554

AdapterGNN: Parameter-Efficient Fine-Tuning Improves Generalization in GNNs

Shengrui Li, Xueting Han, Jing Bai

#3555

VAREN: Very Accurate and Realistic Equine Network

Silvia Zuffi, Ylva Mellbin, Ci Li et al.

#3556

Towards Robust 3D Object Detection with LiDAR and 4D Radar Fusion in Various Weather Conditions

Yujeong Chae, Hyeonseong Kim, Kuk-Jin Yoon

AAAI 2024 paper arXiv:2403.12100

#3557

Learning Time Slot Preferences via Mobility Tree for Next POI Recommendation

Tianhao Huang, Xuan Pan, Xiangrui Cai et al.

ICLR 2024 arXiv:2211.10936

#3558

Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop Scheduling

Cong Zhang, Zhiguang Cao, Wen Song et al.

ICML 2024 arXiv:2405.03188

#3559

Hyperbolic Geometric Latent Diffusion Model for Graph Generation

Xingcheng Fu, Yisen Gao, Yuecen Wei et al.

#3560

Beyond Mimicking Under-Represented Emotions: Deep Data Augmentation with Emotional Subspace Constraints for EEG-Based Emotion Recognition

Zhi Zhang, Sheng-hua Zhong, Yan Liu

AAAI 2024 paper arXiv:2312.10439

#3561

Simple Image-Level Classification Improves Open-Vocabulary Object Detection

Ruohuan Fang, Guansong Pang, Xiao Bai

ICLR 2024 arXiv:2306.11251

#3562

Lipschitz Singularities in Diffusion Models

Zhantao Yang, Ruili Feng, Han Zhang et al.

ICML 2024 arXiv:2406.06248

#3563

Compute Better Spent: Replacing Dense Layers with Structured Matrices

Shikai Qiu, Andres Potapczynski, Marc Finzi et al.

ICLR 2024 arXiv:2310.07999

#3564

LEMON: Lossless model expansion

Yite Wang, Jiahao Su, Hanlin Lu et al.

AAAI 2024 paper arXiv:2302.05428

#3565

Sterling: Synergistic Representation Learning on Bipartite Graphs

Baoyu Jing, Yuchen Yan, Kaize Ding et al.

ECCV 2024 arXiv:2403.10153

#3566

Improving Medical Multi-modal Contrastive Learning with Expert Annotations

Yogesh Kumar, Pekka Marttinen

CVPR 2024 arXiv:2308.12462

#3567

Overcoming Generic Knowledge Loss with Selective Parameter Update

Wenxuan Zhang, Paul Janson, Rahaf Aljundi et al.

CVPR 2024 arXiv:2406.11820

#3568

Composing Object Relations and Attributes for Image-Text Matching

Khoi Pham, Chuong Huynh, Ser-Nam Lim et al.

AAAI 2024 paper arXiv:2308.10079

#3569

MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance

Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin et al.

CVPR 2024 arXiv:2403.08919

#3570

CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow

Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.

CVPR 2024 arXiv:2311.18113

#3571

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

ICLR 2024 spotlight arXiv:2402.00348

#3572

ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update

Liyuan Mao, Haoran Xu, Weinan Zhang et al.

CVPR 2024 arXiv:2402.04476

#3573

Dual-View Visual Contextualization for Web Navigation

Jihyung Kil, Chan Hee Song, Boyuan Zheng et al.

AAAI 2024 paper arXiv:2312.11119

#3574

Hyperspectral Image Reconstruction via Combinatorial Embedding of Cross-Channel Spatio-Spectral Clues

Xingxing Yang, Jie Chen, Zaifeng Yang

ICLR 2024 arXiv:2310.02003

#3575

L2MAC: Large Language Model Automatic Computer for Extensive Code Generation

Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar

AAAI 2024 paper arXiv:2312.14990

#3576

Learning to Prompt Knowledge Transfer for Open-World Continual Learning

Yujie Li, Xin Yang, Hao Wang et al.

ECCV 2024 arXiv:2407.10910

#3577

DataDream: Few-shot Guided Dataset Generation

Jae Myung Kim, Jessica Bader, Stephan Alaniz et al.

CVPR 2024 arXiv:2403.01598

#3578

APISR: Anime Production Inspired Real-World Anime Super-Resolution

Boyang Wang, Fengyu Yang, Xihang Yu et al.

CVPR 2024 arXiv:2403.09093

#3579

Desigen: A Pipeline for Controllable Design Template Generation

Haohan Weng, Danqing Huang, Yu Qiao et al.

AAAI 2024 paper arXiv:2309.03548

#3580

Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

Xiaohan Cui, Long Ma, Tengyu Ma et al.

AAAI 2024 paper arXiv:2401.05363

#3581

Generalizable Sleep Staging via Multi-Level Domain Alignment

Jiquan Wang, Sha Zhao, Haiteng Jiang et al.

ECCV 2024 arXiv:2409.18049

#3582

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.

CVPR 2024 arXiv:2404.09389

#3583

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024 arXiv:2404.00874

#3584

DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF

Jie Long Lee, Chen Li, Gim Hee Lee

ICLR 2024 arXiv:2305.19094

#3585

Diffusion Model for Dense Matching

Jisu Nam, Gyuseong Lee, Seonwoo Kim et al.

ICLR 2024 arXiv:2306.00974

#3586

Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search

Qihao Liu, Adam Kortylewski, Yutong Bai et al.

ICLR 2024 arXiv:2309.13598

#3587

On the Posterior Distribution in Denoising: Application to Uncertainty Quantification

Hila Manor, Tomer Michaeli

ICLR 2024 arXiv:2311.02805

#3588

Tailoring Self-Rationalizers with Multi-Reward Distillation

Sahana Ramnath, Brihi Joshi, Skyler Hallinan et al.

ICML 2024 arXiv:2402.02952

#3589

On Least Square Estimation in Softmax Gating Mixture of Experts

Huy Nguyen, Nhat Ho, Alessandro Rinaldo

CVPR 2024 arXiv:2312.16837

#3590

DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors

Biwen Lei, Kai Yu, Mengyang Feng et al.

#3591

Learning to Reweight for Generalizable Graph Neural Network

Zhengyu Chen, Teng Xiao, Kun Kuang et al.

CVPR 2024 arXiv:2312.04076

#3592

Large Language Models are Good Prompt Learners for Low-Shot Image Classification

Zhaoheng Zheng, Jingmin Wei, Xuefeng Hu et al.

ECCV 2024 arXiv:2407.01851

#3593

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.

AAAI 2024 paper arXiv:2401.10840

#3594

Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

Junhao Shen, Hong Qian, Wei Zhang et al.

ICML 2024 arXiv:2310.01655

#3595

PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels

Praneeth Kacham, Vahab Mirrokni, Peilin Zhong

ICLR 2024 arXiv:2401.13837

#3596

Democratizing Fine-grained Visual Recognition with Large Language Models

Mingxuan Liu, Subhankar Roy, Wenjing Li et al.

AAAI 2024 paper arXiv:2305.16645

#3597

Summarizing Stream Data for Memory-Constrained Online Continual Learning

Jianyang Gu, Kai Wang, Wei Jiang et al.

ICML 2024 arXiv:2405.03875

#3598

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

Jiachen Wang, Tianji Yang, James Zou et al.

ICML 2024 arXiv:2402.04467

#3599

DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems

Yair Schiff, Zhong Yi Wan, Jeffrey Parker et al.

CVPR 2024 arXiv:2305.12497

#3600

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer

Yuan Dong, Chuan Fang, Liefeng Bo et al.