Most Cited 2024 "action transformer" Papers

12,324 papers found • Page 18 of 62

#3401

Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification

Bohan Li, Xiao Xu, Xinghao Wang et al.

AAAI 2024 • paper • arXiv:2302.02070
24
citations
#3402

AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning

Yuwei Tang, ZhenYi Lin, Qilong Wang et al.

CVPR 2024 • arXiv:2404.08958
24
citations
#3403

SANeRF-HQ: Segment Anything for NeRF in High Quality

Yichen Liu, Benran Hu, Chi-Keung Tang et al.

CVPR 2024 • arXiv:2312.01531
24
citations
#3404

TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning

Xiwen Chen, Peijie Qiu, Wenhui Zhu et al.

ICML 2024 • oral • arXiv:2405.03140
24
citations
#3405

Chain-of-Thought Predictive Control

Zhiwei Jia, Vineet Thumuluri, Fangchen Liu et al.

ICML 2024 • arXiv:2304.00776
24
citations
#3406

SCP: Spherical-Coordinate-Based Learned Point Cloud Compression

Ao Luo, Linxin Song, Keisuke Nonaka et al.

AAAI 2024 • paper • arXiv:2308.12535
24
citations
#3407

LLM-Empowered State Representation for Reinforcement Learning

Boyuan Wang, Yun Qu, Yuhang Jiang et al.

ICML 2024 • arXiv:2407.13237
24
citations
#3408

InferCept: Efficient Intercept Support for Augmented Large Language Model Inference

Reyna Abhyankar, Zijian He, Vikranth Srivatsa et al.

ICML 2024 • arXiv:2402.01869
24
citations
#3409

Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation

Nina Weng, Paraskevas Pegios, Eike Petersen et al.

ECCV 2024 • arXiv:2312.14223
24
citations
#3410

Text-Conditioned Resampler For Long Form Video Understanding

Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.

ECCV 2024 • arXiv:2312.11897
24
citations
#3411

On Discrete Prompt Optimization for Diffusion Models

Ruochen Wang, Ting Liu, Cho-Jui Hsieh et al.

ICML 2024 • arXiv:2407.01606
24
citations
#3412

NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

CVPR 2024 • arXiv:2310.00258
24
citations
#3413

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024 • arXiv:2407.08476
24
citations
#3414

Graph Contrastive Invariant Learning from the Causal Perspective

Yanhu Mo, Xiao Wang, Shaohua Fan et al.

AAAI 2024 • paper • arXiv:2401.12564
24
citations
#3415

Garment Recovery with Shape and Deformation Priors

Ren Li, Corentin Dumery, Benoît Guillard et al.

CVPR 2024 • arXiv:2311.10356
24
citations
#3416

PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control

Rishubh Parihar, Sachidanand VS, Sabariswaran Mani et al.

ECCV 2024 • arXiv:2408.05083
24
citations
#3417

Accurate Forgetting for Heterogeneous Federated Continual Learning

Abudukelimu Wuerkaixi, Sen Cui, Jingfeng Zhang et al.

ICLR 2024 • arXiv:2502.14205
24
citations
#3418

Runtime Analysis of the SMS-EMOA for Many-Objective Optimization

Weijie Zheng, Benjamin Doerr

AAAI 2024 • paper • arXiv:2312.10290
24
citations
#3419

SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation

Xiaoqi An, Lin Zhao, Chen Gong et al.

AAAI 2024 • paper • arXiv:2312.10758
24
citations
#3420

Position: Why We Must Rethink Empirical Research in Machine Learning

Moritz Herrmann, F. Julian D. Lange, Katharina Eggensperger et al.

ICML 2024 • arXiv:2405.02200
24
citations
#3421

Designing Decision Support Systems using Counterfactual Prediction Sets

Eleni Straitouri, Manuel Gomez-Rodriguez

ICML 2024 • spotlight • arXiv:2306.03928
24
citations
#3422

Context-Aware Meta-Learning

Christopher Fifty, Dennis Duan, Ronald Junkins et al.

ICLR 2024 • arXiv:2310.10971
24
citations
#3423

HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations

Yilan Dong, Chunlin Yu, Ruiyang Ha et al.

AAAI 2024 • paper • arXiv:2401.00271
24
citations
#3424

EgoGen: An Egocentric Synthetic Data Generator

Gen Li, Kaifeng Zhao, Siwei Zhang et al.

CVPR 2024 • arXiv:2401.08739
24
citations
#3425

Sampling in Unit Time with Kernel Fisher-Rao Flow

Aimee Maurais, Youssef Marzouk

ICML 2024 • arXiv:2401.03892
24
citations
#3426

Equivariant Diffusion for Crystal Structure Prediction

Peijia Lin, Pin Chen, Rui Jiao et al.

ICML 2024 • arXiv:2512.07289
24
citations
#3427

Self-Correcting Self-Consuming Loops for Generative Model Training

Nate Gillman, Michael Freeman, Daksh Aggarwal et al.

ICML 2024 • arXiv:2402.07087
24
citations
#3428

Position: Key Claims in LLM Research Have a Long Tail of Footnotes

Anna Rogers, Sasha Luccioni

ICML 2024 • arXiv:2308.07120
24
citations
#3429

VkD: Improving Knowledge Distillation using Orthogonal Projections

Roy Miles, Ismail Elezi, Jiankang Deng

CVPR 2024
24
citations
#3430

Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention

Zhen Qin, Weigao Sun, Dong Li et al.

ICML 2024 • arXiv:2405.17381
24
citations
#3431

LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation

Linfeng Yuan, Miaojing Shi, Zijie Yue et al.

CVPR 2024 • arXiv:2306.08736
24
citations
#3432

Zero and Few-shot Semantic Parsing with Ambiguous Inputs

Elias Stengel-Eskin, Kyle Rawlins, Benjamin Van Durme

ICLR 2024 • arXiv:2306.00824
24
citations
#3433

Unknown Prompt the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization

Mainak Singha, Ankit Jha, Shirsha Bose et al.

CVPR 2024 • arXiv:2404.00710
24
citations
#3434

V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

Kun Su, Judith Li, Qingqing Huang et al.

AAAI 2024 • paper • arXiv:2305.06594
24
citations
#3435

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

Duojun Huang, Xinyu Xiong, Jie Ma et al.

CVPR 2024 • arXiv:2406.00480
24
citations
#3436

GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval

Yuting Wang, Jinpeng Wang, Bin Chen et al.

AAAI 2024 • paper • arXiv:2310.05195
24
citations
#3437

Face2Diffusion for Fast and Editable Face Personalization

Kaede Shiohara, Toshihiko Yamasaki

CVPR 2024 • arXiv:2403.05094
24
citations
#3438

Matrix Information Theory for Self-Supervised Learning

Yifan Zhang, Zhiquan Tan, Jingqin Yang et al.

ICML 2024 • arXiv:2305.17326
24
citations
#3439

Deep Active Learning with Noise Stability

Xingjian Li, Pengkun Yang, Yangcheng Gu et al.

AAAI 2024 • paper • arXiv:2205.13340
24
citations
#3440

Decoupled Contrastive Learning for Long-Tailed Recognition

Shiyu Xuan, Shiliang Zhang

AAAI 2024 • paper • arXiv:2403.06151
24
citations
#3441

G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection

Fan Wu, Jinling Gao, Lanqing Hong et al.

AAAI 2024 • paper • arXiv:2402.04672
24
citations
#3442

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

Andreas Opedal, Alessandro Stolfo, Haruki Shirakami et al.

ICML 2024 • arXiv:2401.18070
24
citations
#3443

Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-Based Similarity

Van Thuy Hoang, O-Joun Lee

AAAI 2024 • paper • arXiv:2308.09517
24
citations
#3444

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024 • highlight • arXiv:2312.00690
24
citations
#3445

Does Few-Shot Learning Suffer from Backdoor Attacks?

Xinwei Liu, Xiaojun Jia, Jindong Gu et al.

AAAI 2024 • paper • arXiv:2401.01377
24
citations
#3446

Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation

Konstantin Hess, Valentyn Melnychuk, Dennis Frauen et al.

ICLR 2024 • arXiv:2310.17463
24
citations
#3447

Debiased Collaborative Filtering with Kernel-Based Causal Balancing

Haoxuan Li, Chunyuan Zheng, Yanghao Xiao et al.

ICLR 2024 • spotlight • arXiv:2404.19596
24
citations
#3448

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

Xiangxin Zhou, Xiwei Cheng, Yuwei Yang et al.

ICLR 2024 • arXiv:2403.13829
24
citations
#3449

Clifford-Steerable Convolutional Neural Networks

Maksim Zhdanov, David Ruhe, Maurice Weiler et al.

ICML 2024 • arXiv:2402.14730
24
citations
#3450

Adversarial Socialbots Modeling Based on Structural Information Principles

Xianghua Zeng, Hao Peng, Angsheng Li

AAAI 2024 • paper • arXiv:2312.08098
24
citations
#3451

Disguise without Disruption: Utility-Preserving Face De-identification

Zikui Cai, Zhongpai Gao, Benjamin Planche et al.

AAAI 2024 • paper • arXiv:2303.13269
24
citations
#3452

Comparing Graph Transformers via Positional Encodings

Mitchell Black, Zhengchao Wan, Gal Mishne et al.

ICML 2024 • arXiv:2402.14202
24
citations
#3453

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong et al.

ECCV 2024 • arXiv:2407.17850
24
citations
#3454

Adversarial Training on Purification (AToP): Advancing Both Robustness and Generalization

Guang Lin, Chao Li, Jianhai Zhang et al.

ICLR 2024 • arXiv:2401.16352
24
citations
#3455

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Benjamin J Biggs, Arjun Seshadri, Yang Zou et al.

ECCV 2024 • arXiv:2406.08431
24
citations
#3456

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Ruofan Liang, Zan Gojcic, Merlin Nimier-David et al.

ECCV 2024 • arXiv:2408.09702
24
citations
#3457

NodeMixup: Tackling Under-Reaching for Graph Neural Networks

Weigang Lu, Ziyu Guan, Wei Zhao et al.

AAAI 2024 • paper • arXiv:2312.13032
24
citations
#3458

Dynamic Weighted Combiner for Mixed-Modal Image Retrieval

Fuxiang Huang, Lei Zhang, Xiaowei Fu et al.

AAAI 2024 • paper • arXiv:2312.06179
24
citations
#3459

Test-Time Adaptation for Depth Completion

Hyoungseob Park, Anjali W Gupta, Alex Wong

CVPR 2024 • arXiv:2402.03312
24
citations
#3460

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu et al.

CVPR 2024 • arXiv:2312.02813
24
citations
#3461

Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors

Hang Yin, Zihao Wang, Yangqiu Song

ICLR 2024 • arXiv:2304.07063
24
citations
#3462

A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis

Dipanjyoti Paul, Arpita Chowdhury, Xinqi Xiong et al.

ICLR 2024 • arXiv:2311.04157
24
citations
#3463

Tyche: Stochastic In-Context Learning for Medical Image Segmentation

Marianne Rakic, Hallee Wong, Jose Javier Gonzalez Ortiz et al.

CVPR 2024 • highlight • arXiv:2401.13650
24
citations
#3464

LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading

Yochai Yemini, Aviv Shamsian, Lior Bracha et al.

ICLR 2024 • arXiv:2306.03258
24
citations
#3465

Learning Equi-angular Representations for Online Continual Learning

Minhyuk Seo, Hyunseo Koh, Wonje Jeung et al.

CVPR 2024 • arXiv:2404.01628
24
citations
#3466

Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation

Ignat Georgiev, Krishnan Srinivasan, Jie Xu et al.

ICML 2024 • arXiv:2405.17784
24
citations
#3467

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification

Jiaer Xia, Lei Tan, Pingyang Dai et al.

AAAI 2024 • paper • arXiv:2303.10976
24
citations
#3468

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

Le Yang, Ziwei Zheng, Yizeng Han et al.

ECCV 2024 • arXiv:2407.03197
24
citations
#3469

Semantic Residual Prompts for Continual Learning

Martin Menabue, Emanuele Frascaroli, Matteo Boschini et al.

ECCV 2024 • arXiv:2403.06870
24
citations
#3470

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

Guoqing Wang, Zhongdao Wang, Pin Tang et al.

ECCV 2024 • arXiv:2404.15014
24
citations
#3471

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

Ziming Wang, Ziling Wang, Huaning Li et al.

ECCV 2024 • arXiv:2403.12574
24
citations
#3472

Scaling Up Video Summarization Pretraining with Large Language Models

Dawit Argaw Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.

CVPR 2024 • arXiv:2404.03398
24
citations
#3473

ODCR: Orthogonal Decoupling Contrastive Regularization for Unpaired Image Dehazing

Zhongze Wang, Haitao Zhao, Jingchao Peng et al.

CVPR 2024 • arXiv:2404.17825
24
citations
#3474

Catalyst for Clustering-Based Unsupervised Object Re-identification: Feature Calibration

Huafeng Li, Qingsong Hu, Zhanxuan Hu

AAAI 2024 • paper
24
citations
#3475

Efficient and Effective Time-Series Forecasting with Spiking Neural Networks

Changze Lv, Yansen Wang, Dongqi Han et al.

ICML 2024 • oral • arXiv:2402.01533
24
citations
#3476

On the Implicit Bias of Adam

Matias Cattaneo, Jason Klusowski, Boris Shigida

ICML 2024 • arXiv:2309.00079
24
citations
#3477

Evaluating Representation Learning on the Protein Structure Universe

Arian Jamasb, Alex Morehead, Chaitanya Joshi et al.

ICLR 2024 • arXiv:2406.13864
24
citations
#3478

Deep Equilibrium Diffusion Restoration with Parallel Sampling

Jiezhang Cao, Yue Shi, Kai Zhang et al.

CVPR 2024 • arXiv:2311.11600
24
citations
#3479

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Anqi Zhu, Qiuhong Ke, Mingming Gong et al.

CVPR 2024 • arXiv:2406.13327
24
citations
#3480

Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation

Jihyun Kim, Changjae Oh, Hoseok Do et al.

CVPR 2024 • arXiv:2405.04356
24
citations
#3481

G3R: Gradient Guided Generalizable Reconstruction

Yun Chen, Jingkang Wang, Ze Yang et al.

ECCV 2024 • arXiv:2409.19405
24
citations
#3482

Constrained Bi-Level Optimization: Proximal Lagrangian Value Function Approach and Hessian-free Algorithm

Wei Yao, Chengming Yu, Shangzhi Zeng et al.

ICLR 2024 • spotlight • arXiv:2401.16164
24
citations
#3483

Correlated Noise Provably Beats Independent Noise for Differentially Private Learning

Christopher Choquette-Choo, Krishnamurthy Dvijotham, Krishna Pillutla et al.

ICLR 2024 • arXiv:2310.06771
24
citations
#3484

Contrastive Learning for DeepFake Classification and Localization via Multi-Label Ranking

Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu

CVPR 2024
24
citations
#3485

Traveling Waves Encode The Recent Past and Enhance Sequence Learning

T. Anderson Keller, Lyle Muller, Terrence Sejnowski et al.

ICLR 2024 • arXiv:2309.08045
24
citations
#3486

Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models

Hyeonwoo Kim, Sookwan Han, Patrick Kwon et al.

ECCV 2024 • arXiv:2401.12978
24
citations
#3487

Log Neural Controlled Differential Equations: The Lie Brackets Make A Difference

Benjamin Walker, Andrew McLeod, Tiexin Qin et al.

ICML 2024 • arXiv:2402.18512
24
citations
#3488

On Mechanistic Knowledge Localization in Text-to-Image Generative Models

Samyadeep Basu, Keivan Rezaei, Priyatham Kattakinda et al.

ICML 2024 • arXiv:2405.01008
24
citations
#3489

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi, Svetlana Orlova, Daan de Geus et al.

CVPR 2024 • arXiv:2406.09936
24
citations
#3490

Quantifying and Enhancing Multi-modal Robustness with Modality Preference

Zequn Yang, Yake Wei, Ce Liang et al.

ICLR 2024 • arXiv:2402.06244
24
citations
#3491

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ECCV 2024 • arXiv:2407.07582
24
citations
#3492

SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem

Margalit Glasgow

ICLR 2024 • spotlight • arXiv:2309.15111
24
citations
#3493

Generative Modeling with Phase Stochastic Bridge

Tianrong Chen, Jiatao Gu, Laurent Dinh et al.

ICLR 2024 • arXiv:2310.07805
24
citations
#3494

Facial Affective Behavior Analysis with Instruction Tuning

Yifan Li, Anh Dao, Wentao Bao et al.

ECCV 2024 • arXiv:2404.05052
24
citations
#3495

MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis

Wenhao Guan, Yishuang Li, Tao Li et al.

AAAI 2024 • paper • arXiv:2312.10687
24
citations
#3496

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024 • arXiv:2312.10998
24
citations
#3497

Improving Convergence and Generalization Using Parameter Symmetries

Bo Zhao, Robert M. Gower, Robin Walters et al.

ICLR 2024 • arXiv:2305.13404
23
citations
#3498

ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation

Kim-Celine Kahl, Carsten Lüth, Maximilian Zenk et al.

ICLR 2024 • arXiv:2401.08501
23
citations
#3499

ModaVerse: Efficiently Transforming Modalities with LLMs

Xinyu Wang, Bohan Zhuang, Qi Wu

CVPR 2024 • arXiv:2401.06395
23
citations
#3500

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game

Simin Li, Jun Guo, Jingqiao Xiu et al.

ICLR 2024 • arXiv:2305.12872
23
citations
#3501

The Good, The Bad, and Why: Unveiling Emotions in Generative AI

Cheng Li, Jindong Wang, Yixuan Zhang et al.

ICML 2024 • arXiv:2312.11111
23
citations
#3502

CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery

Yuxiao Cheng, Ziqian Wang, Tingxiong Xiao et al.

ICLR 2024 • arXiv:2310.01753
23
citations
#3503

Region-Adaptive Transform with Segmentation Prior for Image Compression

Yuxi Liu, Wenhan Yang, Huihui Bai et al.

ECCV 2024 • arXiv:2403.00628
23
citations
#3504

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

Jianwei Zhao, Xin Li, Fan Yang et al.

ECCV 2024 • arXiv:2407.13133
23
citations
#3505

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.

CVPR 2024 • highlight • arXiv:2312.05247
23
citations
#3506

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024 • arXiv:2403.09176
23
citations
#3507

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Wen Li, Muyuan Fang, Cheng Zou et al.

ECCV 2024 • arXiv:2409.02543
23
citations
#3508

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

Paul Liang, Chun Kai Ling, Yun Cheng et al.

ICLR 2024 • arXiv:2306.04539
23
citations
#3509

Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank

Zihan Wang, Arthur Jacot

ICLR 2024 • spotlight
23
citations
#3510

POPDG: Popular 3D Dance Generation with PopDanceSet

Zhenye Luo, Min Ren, Xuecai Hu et al.

CVPR 2024 • arXiv:2405.03178
23
citations
#3511

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Jon Saad-Falcon, Daniel Y Fu, Simran Arora et al.

ICML 2024 • arXiv:2402.07440
23
citations
#3512

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Ragav Sachdeva, Andrew Zisserman

CVPR 2024 • arXiv:2401.10224
23
citations
#3513

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

Sixiang Chen, Tian Ye, Kai Zhang et al.

ECCV 2024 • arXiv:2409.15739
23
citations
#3514

Continual Forgetting for Pre-trained Vision Models

Hongbo Zhao, Bolin Ni, Junsong Fan et al.

CVPR 2024 • arXiv:2403.11530
23
citations
#3515

EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams

Christen Millerdurai, Hiroyasu Akada, Jian Wang et al.

CVPR 2024 • arXiv:2404.08640
23
citations
#3516

Understanding Certified Training with Interval Bound Propagation

Yuhao Mao, Mark N Müller, Marc Fischer et al.

ICLR 2024 • arXiv:2306.10426
23
citations
#3517

FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance Head-pose and Facial Expression Features

Andre Rochow, Max Schwarz, Sven Behnke

CVPR 2024 • arXiv:2404.09736
23
citations
#3518

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024 • arXiv:2311.15841
23
citations
#3519

HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models

Shen Zhang, Zhaowei Chen, Zhenyu Zhao et al.

ECCV 2024 • arXiv:2311.17528
23
citations
#3520

PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion

Ying-Tian Liu, Yuan-Chen Guo, Guan Luo et al.

CVPR 2024 • arXiv:2312.09069
23
citations
#3521

RankED: Addressing Imbalance and Uncertainty in Edge Detection Using Ranking-based Losses

Bedrettin Cetinkaya, Sinan Kalkan, Emre Akbas

CVPR 2024 • arXiv:2403.01795
23
citations
#3522

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.

CVPR 2024 • arXiv:2501.05272
23
citations
#3523

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

Wu Yun, Mengshi Qi, Chuanming Wang et al.

AAAI 2024 • paper • arXiv:2303.12332
23
citations
#3524

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

ECCV 2024 • arXiv:2312.06573
23
citations
#3525

Taming Self-Training for Open-Vocabulary Object Detection

Shiyu Zhao, Samuel Schulter, Long Zhao et al.

CVPR 2024 • arXiv:2308.06412
23
citations
#3526

Explaining Time Series via Contrastive and Locally Sparse Perturbations

Zichuan Liu, Yingying Zhang, Tianchun Wang et al.

ICLR 2024 • oral • arXiv:2401.08552
23
citations
#3527

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

Shaofeng Zhang, Jinfa Huang, Qiang Zhou et al.

ICLR 2024 • arXiv:2401.15652
23
citations
#3528

Retrieval-Augmented Primitive Representations for Compositional Zero-Shot Learning

Chenchen Jing, Yukun Li, Hao Chen et al.

AAAI 2024 • paper
23
citations
#3529

Pre-Training Goal-based Models for Sample-Efficient Reinforcement Learning

Haoqi Yuan, Zhancun Mu, Feiyang Xie et al.

ICLR 2024 • oral
23
citations
#3530

Multi-Domain Incremental Learning for Face Presentation Attack Detection

Keyao Wang, Guosheng Zhang, Haixiao Yue et al.

AAAI 2024 • paper
23
citations
#3531

NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction

Beibei Lin, Yeying Jin, Wending Yan et al.

AAAI 2024 • paper • arXiv:2401.00729
23
citations
#3532

WeditGAN: Few-Shot Image Generation via Latent Space Relocation

Yuxuan Duan, Li Niu, Yan Hong et al.

AAAI 2024 • paper • arXiv:2305.06671
23
citations
#3533

Defect Spectrum: A Granular Look of Large-scale Defect Datasets with Rich Semantics

Shuai Yang, ZhiFei Chen, Pengguang Chen et al.

ECCV 2024 • arXiv:2310.17316
23
citations
#3534

On the Role of Server Momentum in Federated Learning

Jianhui Sun, Xidong Wu, Heng Huang et al.

AAAI 2024 • paper • arXiv:2312.12670
23
citations
#3535

$z$-SignFedAvg: A Unified Stochastic Sign-Based Compression for Federated Learning

Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang

AAAI 2024 • paper • arXiv:2302.02589
23
citations
#3536

PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis

Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo et al.

CVPR 2024 • highlight • arXiv:2403.01852
23
citations
#3537

Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, Rizhao Cai et al.

CVPR 2024 • highlight • arXiv:2402.19298
23
citations
#3538

LLMEval: A Preliminary Study on How to Evaluate Large Language Models

Yue Zhang, Ming Zhang, HaiPeng Yuan et al.

AAAI 2024 • paper • arXiv:2312.07398
23
citations
#3539

Bayesian Diffusion Models for 3D Shape Reconstruction

Haiyang Xu, Yu Lei, Zeyuan Chen et al.

CVPR 2024 • arXiv:2403.06973
23
citations
#3540

Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation

Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.

CVPR 2024 • arXiv:2404.00417
23
citations
#3541

FedImpro: Measuring and Improving Client Update in Federated Learning

Zhenheng Tang, Yonggang Zhang, Shaohuai Shi et al.

ICLR 2024 • arXiv:2402.07011
23
citations
#3542

KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim, Feng Liu, Yiyang Su et al.

CVPR 2024 • arXiv:2403.14852
23
citations
#3543

An operator preconditioning perspective on training in physics-informed machine learning

Tim De Ryck, Florent Bonnet, Siddhartha Mishra et al.

ICLR 2024 • arXiv:2310.05801
23
citations
#3544

Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning

Yixiong Zou, Yicong Liu, Yiman Hu et al.

CVPR 2024 • arXiv:2403.00567
23
citations
#3545

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian et al.

ECCV 2024 • arXiv:2311.17944
23
citations
#3546

Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

Zhenyu He, Guhao Feng, Shengjie Luo et al.

ICML 2024 • arXiv:2401.16421
23
citations
#3547

PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios

Jingbo Wang, Zhengyi Luo, Ye Yuan et al.

CVPR 2024 • arXiv:2404.19722
23
citations
#3548

GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation

Weiming Zhang, Yexin Liu, Xu Zheng et al.

CVPR 2024 • arXiv:2403.16370
23
citations
#3549

Domain Prompt Learning with Quaternion Networks

Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.

CVPR 2024 • highlight • arXiv:2312.08878
23
citations
#3550

Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning

Zhuo Huang, Chang Liu, Yinpeng Dong et al.

ICML 2024 • arXiv:2312.02546
23
citations
#3551

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

Rui Zheng, Wei Shen, Yuan Hua et al.

ICLR 2024 • spotlight • arXiv:2310.11971
23
citations
#3552

Non-exemplar Online Class-Incremental Continual Learning via Dual-Prototype Self-Augment and Refinement

Fushuo Huo, Wenchao Xu, Jingcai Guo et al.

AAAI 2024 • paper • arXiv:2303.10891
23
citations
#3553

MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation

Yanzuo Lu, Meng Shen, Andy J Ma et al.

AAAI 2024 • paper • arXiv:2312.07871
23
citations
#3554

AdapterGNN: Parameter-Efficient Fine-Tuning Improves Generalization in GNNs

Shengrui Li, Xueting Han, Jing Bai

AAAI 2024 • paper • arXiv:2304.09595
23
citations
#3555

VAREN: Very Accurate and Realistic Equine Network

Silvia Zuffi, Ylva Mellbin, Ci Li et al.

CVPR 2024
23
citations
#3556

Towards Robust 3D Object Detection with LiDAR and 4D Radar Fusion in Various Weather Conditions

Yujeong Chae, Hyeonseong Kim, Kuk-Jin Yoon

CVPR 2024
23
citations
#3557

Learning Time Slot Preferences via Mobility Tree for Next POI Recommendation

Tianhao Huang, Xuan Pan, Xiangrui Cai et al.

AAAI 2024 • paper • arXiv:2403.12100
23
citations
#3558

Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop Scheduling

Cong Zhang, Zhiguang Cao, Wen Song et al.

ICLR 2024 • arXiv:2211.10936
23
citations
#3559

Hyperbolic Geometric Latent Diffusion Model for Graph Generation

Xingcheng Fu, Yisen Gao, Yuecen Wei et al.

ICML 2024 • arXiv:2405.03188
23
citations
#3560

Beyond Mimicking Under-Represented Emotions: Deep Data Augmentation with Emotional Subspace Constraints for EEG-Based Emotion Recognition

Zhi Zhang, Sheng-hua Zhong, Yan Liu

AAAI 2024 • paper
23
citations
#3561

Simple Image-Level Classification Improves Open-Vocabulary Object Detection

Ruohuan Fang, Guansong Pang, Xiao Bai

AAAI 2024 • paper • arXiv:2312.10439
23
citations
#3562

Lipschitz Singularities in Diffusion Models

Zhantao Yang, Ruili Feng, Han Zhang et al.

ICLR 2024 • arXiv:2306.11251
23
citations
#3563

Compute Better Spent: Replacing Dense Layers with Structured Matrices

Shikai Qiu, Andres Potapczynski, Marc Finzi et al.

ICML 2024 • arXiv:2406.06248
23
citations
#3564

LEMON: Lossless model expansion

Yite Wang, Jiahao Su, Hanlin Lu et al.

ICLR 2024 • arXiv:2310.07999
23
citations
#3565

Sterling: Synergistic Representation Learning on Bipartite Graphs

Baoyu Jing, Yuchen Yan, Kaize Ding et al.

AAAI 2024 • paper • arXiv:2302.05428
23
citations
#3566

Improving Medical Multi-modal Contrastive Learning with Expert Annotations

Yogesh Kumar, Pekka Marttinen

ECCV 2024 • arXiv:2403.10153
23
citations
#3567

Overcoming Generic Knowledge Loss with Selective Parameter Update

Wenxuan Zhang, Paul Janson, Rahaf Aljundi et al.

CVPR 2024 • arXiv:2308.12462
23
citations
#3568

Composing Object Relations and Attributes for Image-Text Matching

Khoi Pham, Chuong Huynh, Ser-Nam Lim et al.

CVPR 2024 • arXiv:2406.11820
23
citations
#3569

MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance

Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin et al.

AAAI 2024 • paper • arXiv:2308.10079
23
citations
#3570

CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow

Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.

CVPR 2024 • arXiv:2403.08919
23
citations
#3571

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

CVPR 2024 • arXiv:2311.18113
23
citations
#3572

ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update

Liyuan Mao, Haoran Xu, Weinan Zhang et al.

ICLR 2024 • spotlight • arXiv:2402.00348
23
citations
#3573

Dual-View Visual Contextualization for Web Navigation

Jihyung Kil, Chan Hee Song, Boyuan Zheng et al.

CVPR 2024 • arXiv:2402.04476
23
citations
#3574

Hyperspectral Image Reconstruction via Combinatorial Embedding of Cross-Channel Spatio-Spectral Clues

Xingxing Yang, Jie Chen, Zaifeng Yang

AAAI 2024 • paper • arXiv:2312.11119
23
citations
#3575

L2MAC: Large Language Model Automatic Computer for Extensive Code Generation

Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar

ICLR 2024 • arXiv:2310.02003
23
citations
#3576

Learning to Prompt Knowledge Transfer for Open-World Continual Learning

Yujie Li, Xin Yang, Hao Wang et al.

AAAI 2024 • paper • arXiv:2312.14990
23
citations
#3577

DataDream: Few-shot Guided Dataset Generation

Jae Myung Kim, Jessica Bader, Stephan Alaniz et al.

ECCV 2024 • arXiv:2407.10910
23
citations
#3578

APISR: Anime Production Inspired Real-World Anime Super-Resolution

Boyang Wang, Fengyu Yang, Xihang Yu et al.

CVPR 2024 • arXiv:2403.01598
23
citations
#3579

Desigen: A Pipeline for Controllable Design Template Generation

Haohan Weng, Danqing Huang, Yu Qiao et al.

CVPR 2024 • arXiv:2403.09093
23
citations
#3580

Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

Xiaohan Cui, Long Ma, Tengyu Ma et al.

AAAI 2024 • paper • arXiv:2309.03548
23
citations
#3581

Generalizable Sleep Staging via Multi-Level Domain Alignment

Jiquan Wang, Sha Zhao, Haiteng Jiang et al.

AAAI 2024 • paper • arXiv:2401.05363
23
citations
#3582

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.

ECCV 2024 • arXiv:2409.18049
23
citations
#3583

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024 • arXiv:2404.09389
23
citations
#3584

DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF

Jie Long Lee, Chen Li, Gim Hee Lee

CVPR 2024 • arXiv:2404.00874
23
citations
#3585

Diffusion Model for Dense Matching

Jisu Nam, Gyuseong Lee, Seonwoo Kim et al.

ICLR 2024 • arXiv:2305.19094
23
citations
#3586

Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search

Qihao Liu, Adam Kortylewski, Yutong Bai et al.

ICLR 2024 • arXiv:2306.00974
23
citations
#3587

On the Posterior Distribution in Denoising: Application to Uncertainty Quantification

Hila Manor, Tomer Michaeli

ICLR 2024 • arXiv:2309.13598
23
citations
#3588

Tailoring Self-Rationalizers with Multi-Reward Distillation

Sahana Ramnath, Brihi Joshi, Skyler Hallinan et al.

ICLR 2024 • arXiv:2311.02805
23
citations
#3589

On Least Square Estimation in Softmax Gating Mixture of Experts

Huy Nguyen, Nhat Ho, Alessandro Rinaldo

ICML 2024 • arXiv:2402.02952
23
citations
#3590

DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors

Biwen Lei, Kai Yu, Mengyang Feng et al.

CVPR 2024 • arXiv:2312.16837
23
citations
#3591

Learning to Reweight for Generalizable Graph Neural Network

Zhengyu Chen, Teng Xiao, Kun Kuang et al.

AAAI 2024 • paper
23
citations
#3592

Large Language Models are Good Prompt Learners for Low-Shot Image Classification

Zhaoheng Zheng, Jingmin Wei, Xuefeng Hu et al.

CVPR 2024 • arXiv:2312.04076
23
citations
#3593

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.

ECCV 2024 • arXiv:2407.01851
23
citations
#3594

Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

Junhao Shen, Hong Qian, Wei Zhang et al.

AAAI 2024 • paper • arXiv:2401.10840
23
citations
#3595

PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels

Praneeth Kacham, Vahab Mirrokni, Peilin Zhong

ICML 2024 • arXiv:2310.01655
23
citations
#3596

Democratizing Fine-grained Visual Recognition with Large Language Models

Mingxuan Liu, Subhankar Roy, Wenjing Li et al.

ICLR 2024 • arXiv:2401.13837
23
citations
#3597

Summarizing Stream Data for Memory-Constrained Online Continual Learning

Jianyang Gu, Kai Wang, Wei Jiang et al.

AAAI 2024 • paper • arXiv:2305.16645
23
citations
#3598

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

Jiachen Wang, Tianji Yang, James Zou et al.

ICML 2024 • arXiv:2405.03875
23
citations
#3599

DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems

Yair Schiff, Zhong Yi Wan, Jeffrey Parker et al.

ICML 2024 • arXiv:2402.04467
23
citations
#3600

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer

Yuan Dong, Chuan Fang, Liefeng Bo et al.

CVPR 2024 • arXiv:2305.12497
23
citations