Most Cited AAAI Oral "msa profile modeling" Papers

5,317 papers found • Page 1 of 27

#1

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion

Chong Mou, Xintao Wang, Liangbin Xie et al.

AAAI 2024paperarXiv:2302.08453
1460
citations
#2

Graph of Thoughts: Solving Elaborate Problems with Large Language Models

Maciej Besta, Nils Blach, Ales Kubicek et al.

AAAI 2024paperarXiv:2308.09687
1116
citations
#3

Benchmarking Large Language Models in Retrieval-Augmented Generation

Jiawei Chen, Hongyu Lin, Xianpei Han et al.

AAAI 2024paperarXiv:2309.01431
475
citations
#4

ExpeL: LLM Agents Are Experiential Learners

Andrew Zhao, Daniel Huang, Quentin Xu et al.

AAAI 2024paperarXiv:2308.10144
376
citations
#5

U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation

Chenxin Li, Xinyu Liu, Wuyang Li et al.

AAAI 2025paperarXiv:2406.02918
356
citations
#6

Preference Ranking Optimization for Human Alignment

Feifan Song, Bowen Yu, Minghao Li et al.

AAAI 2024paperarXiv:2306.17492
337
citations
#7

FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts

Yichen Gong, Delong Ran, Jinyuan Liu et al.

AAAI 2025paperarXiv:2311.05608
302
citations
#8

MemoryBank: Enhancing Large Language Models with Long-Term Memory

Wanjun Zhong, Lianghong Guo, Qiqi Gao et al.

AAAI 2024paperarXiv:2305.10250
290
citations
#9

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos

Yue Ma, Yingqing He, Xiaodong Cun et al.

AAAI 2024paperarXiv:2304.01186
284
citations
#10

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

Gengze Zhou, Yicong Hong, Qi Wu

AAAI 2024paperarXiv:2305.16986
283
citations
#11

MedSegDiff-V2: Diffusion-based Medical Image Segmentation with Transformer

Junde Wu, Wei Ji, Huazhu Fu et al.

AAAI 2024paperarXiv:2301.11798
274
citations
#12

NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving

Tianwen Qian, Jingjing Chen, Linhai Zhuo et al.

AAAI 2024paperarXiv:2305.14836
271
citations
#13

Detecting and Preventing Hallucinations in Large Vision Language Models

Anisha Gunjal, Jihan Yin, Erhan Bas

AAAI 2024paperarXiv:2308.06394
264
citations
#14

AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

Zhaopeng Gu, Bingke Zhu, Guibo Zhu et al.

AAAI 2024paperarXiv:2308.15366
252
citations
#15

Knowledge Graph Prompting for Multi-Document Question Answering

Yu Wang, Nedim Lipka, Ryan A. Rossi et al.

AAAI 2024paperarXiv:2308.11730
240
citations
#16

Omni-Kernel Network for Image Restoration

Yuning Cui, Wenqi Ren, Alois Knoll

AAAI 2024paper
235
citations
#17

Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-World Multi-Turn Dialogue

Songhua Yang, Hanjie Zhao, Senbin Zhu et al.

AAAI 2024paperarXiv:2308.03549
210
citations
#18

EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba

Xiaohuan Pei, Tao Huang, Chang Xu

AAAI 2025paperarXiv:2403.09977
192
citations
#19

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

Wenbo Hu, Yifan Xu, Yi Li et al.

AAAI 2024paperarXiv:2308.09936
192
citations
#20

ODTrack: Online Dense Temporal Token Learning for Visual Tracking

Yaozong Zheng, Bineng Zhong, Qihua Liang et al.

AAAI 2024paperarXiv:2401.01686
188
citations
#21

PMET: Precise Model Editing in a Transformer

Xiaopeng Li, Shasha Li, Shezheng Song et al.

AAAI 2024paperarXiv:2308.08742
187
citations
#22

MSGNet: Learning Multi-Scale Inter-series Correlations for Multivariate Time Series Forecasting

Wanlin Cai, Yuxuan Liang, Xianggen Liu et al.

AAAI 2024paperarXiv:2401.00423
185
citations
#23

Generalized Planning in PDDL Domains with Pretrained Large Language Models

Tom Silver, Soham Dan, Kavitha Srinivas et al.

AAAI 2024paperarXiv:2305.11014
178
citations
#24

Fast Machine Unlearning without Retraining through Selective Synaptic Dampening

Jack Foster, Stefan Schoepf, Alexandra Brintrup

AAAI 2024paperarXiv:2308.07707
176
citations
#25

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions

Zhiyuan Chen, Jiajiong Cao, Zhiquan Chen et al.

AAAI 2025paperarXiv:2407.08136
171
citations
#26

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

Peng Wu, Xuerong Zhou, Guansong Pang et al.

AAAI 2024paperarXiv:2308.11681
163
citations
#27

DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation

Guosheng Zhao, Xiaofeng Wang, Zheng Zhu et al.

AAAI 2025paperarXiv:2403.06845
146
citations
#28

Segment Any 3D Gaussians

Jiazhong Cen, Jiemin Fang, Chen Yang et al.

AAAI 2025paperarXiv:2312.00860
145
citations
#29

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Teng Hu, Jiangning Zhang, Ran Yi et al.

AAAI 2024paperarXiv:2312.05767
144
citations
#30

Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

Mingzhan Yang, Guangxin Han, Bin Yan et al.

AAAI 2024paperarXiv:2308.00783
143
citations
#31

SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery

Konstantin Klemmer, Esther Rolf, Caleb Robinson et al.

AAAI 2025paperarXiv:2311.17179
141
citations
#32

SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

Zhecheng Wang, Rajanie Prabha, Tianyuan Huang et al.

AAAI 2024paperarXiv:2312.12856
140
citations
#33

ResDiff: Combining CNN and Diffusion Model for Image Super-resolution

Shuyao Shang, Zhengyang Shan, Guangxing Liu et al.

AAAI 2024paperarXiv:2303.08714
140
citations
#34

Language Prompt for Autonomous Driving

Dongming Wu, Wencheng Han, Yingfei Liu et al.

AAAI 2025paperarXiv:2309.04379
138
citations
#35

OOTDiffusion: Outfitting Fusion Based Latent Diffusion for Controllable Virtual Try-On

Yuhao Xu, Tao Gu, Weifeng Chen et al.

AAAI 2025paperarXiv:2403.01779
138
citations
#36

PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation

Haibo Jin, Haoxuan Che, Yi Lin et al.

AAAI 2024paperarXiv:2308.12604
137
citations
#37

C3oT: Generating Shorter Chain-of-Thought Without Compromising Effectiveness

Yu Kang, Xianghui Sun, Liangyu Chen et al.

AAAI 2025paperarXiv:2412.11664
136
citations
#38

Task Contamination: Language Models May Not Be Few-Shot Anymore

Changmao Li, Jeffrey Flanigan

AAAI 2024paperarXiv:2312.16337
132
citations
#39

SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research

Liangtai Sun, Yang Han, Zihan Zhao et al.

AAAI 2024paperarXiv:2308.13149
132
citations
#40

SCTNet: Single Branch CNN with Transformer Semantic Information for Real-Time Segmentation

Zhengze Xu, Dongyue Wu, Changqian Yu et al.

AAAI 2024paperarXiv:2312.17071
130
citations
#41

Pinwheel-shaped Convolution and Scale-based Dynamic Loss for Infrared Small Target Detection

Jiangnan Yang, Shuangli Liu, Jingjun Wu et al.

AAAI 2025paperarXiv:2412.16986
129
citations
#42

FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

Zhenyu Li, Sunqi Fan, Yu Gu et al.

AAAI 2024paperarXiv:2308.12060
125
citations
#43

ProAgent: Building Proactive Cooperative Agents with Large Language Models

Ceyao Zhang, Kaijie Yang, Siyi Hu et al.

AAAI 2024paperarXiv:2308.11339
120
citations
#44

LGMRec: Local and Global Graph Learning for Multimodal Recommendation

Zhiqiang Guo, Jianjun Li, Guohui Li et al.

AAAI 2024paperarXiv:2312.16400
120
citations
#45

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

Senqiao Yang, Jiaming Liu, Renrui Zhang et al.

AAAI 2025paperarXiv:2312.14074
116
citations
#46

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

Wenxi Yue, Jing Zhang, Kun Hu et al.

AAAI 2024paperarXiv:2308.08746
114
citations
#47

Incomplete Contrastive Multi-View Clustering with High-Confidence Guiding

AAAI 2024paperarXiv:2312.08697
114
citations
#48

Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting

Xinyan Guan, Yanjiang Liu, Hongyu Lin et al.

AAAI 2024paperarXiv:2311.13314
112
citations
#49

Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference

Han Zhao, Min Zhang, Wei Zhao et al.

AAAI 2025paperarXiv:2403.14520
110
citations
#50

VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View

Raphael Schumann, Wanrong Zhu, Weixi Feng et al.

AAAI 2024paperarXiv:2307.06082
108
citations
#51

Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations

Likang Wu, Zhaopeng Qiu, Zhi Zheng et al.

AAAI 2024paperarXiv:2307.05722
108
citations
#52

IMAGDressing-v1: Customizable Virtual Dressing

Fei Shen, Xin Jiang, Xin He et al.

AAAI 2025paperarXiv:2407.12705
107
citations
#53

Fluctuation-Based Adaptive Structured Pruning for Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

AAAI 2024paperarXiv:2312.11983
106
citations
#54

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Xianjie Wu, Jian Yang, Linzheng Chai et al.

AAAI 2025paperarXiv:2408.09174
105
citations
#55

TimesURL: Self-Supervised Contrastive Learning for Universal Time Series Representation Learning

Jiexi Liu, Songcan Chen

AAAI 2024paperarXiv:2312.15709
105
citations
#56

OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models

Changhun Lee, Jungyu Jin, Taesu Kim et al.

AAAI 2024paperarXiv:2306.02272
105
citations
#57

Can Large Language Models Serve as Rational Players in Game Theory? A Systematic Analysis

Caoyun Fan, Jindou Chen, Yaohui Jin et al.

AAAI 2024paperarXiv:2312.05488
104
citations
#58

Fully-Connected Spatial-Temporal Graph for Multivariate Time-Series Data

Yucheng Wang, Yuecong Xu, Jianfei Yang et al.

AAAI 2024paperarXiv:2309.05305
104
citations
#59

An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention

Yehjin Shin, Jeongwhan Choi, Hyowon Wi et al.

AAAI 2024paperarXiv:2312.10325
104
citations
#60

UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation

Kefu Yi, Kai Luo, Xiaolei Luo et al.

AAAI 2024paperarXiv:2312.08952
101
citations
#61

TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment

Chenxi Liu, Qianxiong Xu, Hao Miao et al.

AAAI 2025paperarXiv:2406.01638
100
citations
#62

Rolling-Unet: Revitalizing MLP’s Ability to Efficiently Extract Long-Distance Dependencies for Medical Image Segmentation

Yutong Liu, Haijiang Zhu, Mengting Liu et al.

AAAI 2024paper
98
citations
#63

An Empirical Study of CLIP for Text-Based Person Search

Cao Min, Yang Bai, Ziyin Zeng et al.

AAAI 2024paperarXiv:2308.10045
98
citations
#64

LDMVFI: Video Frame Interpolation with Latent Diffusion Models

Duolikun Danier, Fan Zhang, David Bull

AAAI 2024paperarXiv:2303.09508
97
citations
#65

Reliable Conflictive Multi-View Learning

Cai Xu, Jiajun Si, Ziyu Guan et al.

AAAI 2024paperarXiv:2402.16897
96
citations
#66

Explicit Visual Prompts for Visual Object Tracking

Liangtao Shi, Bineng Zhong, Qihua Liang et al.

AAAI 2024paperarXiv:2401.03142
96
citations
#67

CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning

Peiyuan Liu, Hang Guo, Tao Dai et al.

AAAI 2025paperarXiv:2403.07300
95
citations
#68

Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models

Ruichen Wang, Zekang Chen, Chen Chen et al.

AAAI 2024paperarXiv:2305.13921
93
citations
#69

Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

Taylor Sorensen, Liwei Jiang, Jena Hwang et al.

AAAI 2024paperarXiv:2309.00779
93
citations
#70

PointAttN: You Only Need Attention for Point Cloud Completion

Jun Wang, Ying Cui, Dongyan Guo et al.

AAAI 2024paper
92
citations
#71

Decoupled Contrastive Multi-View Clustering with High-Order Random Walks

Yiding Lu, Yijie Lin, Mouxing Yang et al.

AAAI 2024paperarXiv:2308.11164
92
citations
#72

Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference

Zhihang Lin, Mingbao Lin, Luxi Lin et al.

AAAI 2025paperarXiv:2405.05803
90
citations
#73

MmAP: Multi-Modal Alignment Prompt for Cross-Domain Multi-Task Learning

Yi Xin, Junlong Du, Qiang Wang et al.

AAAI 2024paperarXiv:2312.08636
88
citations
#74

VIGC: Visual Instruction Generation and Correction

Bin Wang, Fan Wu, Xiao Han et al.

AAAI 2024paperarXiv:2308.12714
88
citations
#75

AnalogCoder: Analog Circuit Design via Training-Free Code Generation

Yao Lai, Sungyoung Lee, Guojin Chen et al.

AAAI 2025paperarXiv:2405.14918
87
citations
#76

FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning

Haokun Chen, Yao Zhang, Denis Krompass et al.

AAAI 2024paperarXiv:2308.12305
86
citations
#77

FocalDreamer: Text-Driven 3D Editing via Focal-Fusion Assembly

Yuhan Li, Yishun Dou, Yue Shi et al.

AAAI 2024paperarXiv:2308.10608
85
citations
#78

VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding

Yi Xin, Junlong Du, Qiang Wang et al.

AAAI 2024paperarXiv:2312.08733
85
citations
#79

Prompt-Based Distribution Alignment for Unsupervised Domain Adaptation

Shuanghao Bai, Min Zhang, Wanqi Zhou et al.

AAAI 2024paperarXiv:2312.09553
85
citations
#80

GLOP: Learning Global Partition and Local Construction for Solving Large-Scale Routing Problems in Real-Time

Haoran Ye, Jiarui Wang, Helan Liang et al.

AAAI 2024paperarXiv:2312.08224
85
citations
#81

Point Cloud Mamba: Point Cloud Learning via State Space Model

Tao Zhang, Haobo Yuan, Lu Qi et al.

AAAI 2025paperarXiv:2403.00762
84
citations
#82

WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration

Yao Zhang, Zijian Ma, Yunpu Ma et al.

AAAI 2025paperarXiv:2408.15978
83
citations
#83

Directed Diffusion: Direct Control of Object Placement through Attention Guidance

Wan-Duo Ma, Avisek Lahiri, J. P. Lewis et al.

AAAI 2024paperarXiv:2302.13153
83
citations
#84

KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning

Debjyoti Mondal, Suraj Modi, Subhadarshi Panda et al.

AAAI 2024paperarXiv:2401.12863
82
citations
#85

DepthFM: Fast Generative Monocular Depth Estimation with Flow Matching

Ming Gui, Johannes Schusterbauer, Ulrich Prestel et al.

AAAI 2025paper
82
citations
#86

Enhance Vision-Language Alignment with Noise

Sida Huang, Hongyuan Zhang, Xuelong Li

AAAI 2025paperarXiv:2412.10817
82
citations
#87

AVSegFormer: Audio-Visual Segmentation with Transformer

Shengyi Gao, Zhe Chen, Guo Chen et al.

AAAI 2024paperarXiv:2307.01146
82
citations
#88

Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection

Zhongjie Ba, Qingyu Liu, Zhenguang Liu et al.

AAAI 2024paperarXiv:2403.01786
82
citations
#89

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models

Wenbin Wang, Liang Ding, Minyan Zeng et al.

AAAI 2025paperarXiv:2408.15556
81
citations
#90

DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection

Yunfan Ye, Yuhang Huang, Renjiao Yi et al.

AAAI 2024paperarXiv:2401.02032
81
citations
#91

MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA

Lang Yu, Qin Chen, Jie Zhou et al.

AAAI 2024paperarXiv:2312.11795
80
citations
#92

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

Zeyu Wang, Chen Li, Huiying Xu et al.

AAAI 2025paperarXiv:2406.05835
80
citations
#93

MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning

Baoquan Zhang, Chuyao Luo, Demin Yu et al.

AAAI 2024paperarXiv:2307.16424
79
citations
#94

PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology

Yuxuan Sun, Chenglu Zhu, Sunyi Zheng et al.

AAAI 2024paperarXiv:2305.15072
79
citations
#95

EcomGPT: Instruction-Tuning Large Language Models with Chain-of-Task Tasks for E-commerce

Li Yangning, Shirong Ma, Xiaobin Wang et al.

AAAI 2024paperarXiv:2308.06966
79
citations
#96

SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking

Wang Yu Hsiang, Jun-Wei Hsieh, Ping-Yang Chen et al.

AAAI 2024paperarXiv:2211.08824
78
citations
#97

RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting

Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.

AAAI 2024paperarXiv:2305.15685
78
citations
#98

VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool

Chia-Tung Ho, Haoxing Ren, Brucek Khailany

AAAI 2025paperarXiv:2408.08927
78
citations
#99

ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data

Chengsen Wang, Qi Qi, Jingyu Wang et al.

AAAI 2025paperarXiv:2412.11376
78
citations
#100

Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye et al.

AAAI 2024paperarXiv:2306.05783
77
citations
#101

Graph Neural Prompting with Large Language Models

Yijun Tian, Huan Song, Zichen Wang et al.

AAAI 2024paperarXiv:2309.15427
77
citations
#102

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

Heng Wang, Jianbo Ma, Santiago Pascual et al.

AAAI 2024paperarXiv:2308.09300
75
citations
#103

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Zhenhua Yang, Dezhi Peng, Yuxin Kong et al.

AAAI 2024paperarXiv:2312.12142
75
citations
#104

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Yiwen Chen, Chi Zhang, Xiaofeng Yang et al.

AAAI 2024paperarXiv:2308.11473
75
citations
#105

Temporal Adaptive RGBT Tracking with Modality Prompt

Hongyu Wang, Xiaotao Liu, Yifan Li et al.

AAAI 2024paperarXiv:2401.01244
75
citations
#106

FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-Aware Model Update

Ji Liu, Juncheng Jia, Tianshi Che et al.

AAAI 2024paperarXiv:2312.05770
75
citations
#107

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Zhen Ye, Peiwen Sun, Jiahe Lei et al.

AAAI 2025paperarXiv:2408.17175
75
citations
#108

SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation

Malyaban Bal, Abhronil Sengupta

AAAI 2024paperarXiv:2308.10873
73
citations
#109

Teaching Large Language Models to Translate with Comparison

Jiali Zeng, Fandong Meng, Yongjing Yin et al.

AAAI 2024paperarXiv:2307.04408
73
citations
#110

DiT4Edit: Diffusion Transformer for Image Editing

Kunyu Feng, Yue Ma, Bingyuan Wang et al.

AAAI 2025paperarXiv:2411.03286
73
citations
#111

Enhancing Job Recommendation through LLM-Based Generative Adversarial Networks

Yingpeng Du, Di Luo, Rui Yan et al.

AAAI 2024paperarXiv:2307.10747
72
citations
#112

Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Guy Yariv, Itai Gat, Sagie Benaim et al.

AAAI 2024paperarXiv:2309.16429
72
citations
#113

Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Zhengliang Shi, Shen Gao, Minghang Zhu et al.

AAAI 2024paperarXiv:2308.14034
72
citations
#114

HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

Xiao Wang, Zongzhen Wu, Bo Jiang et al.

AAAI 2024paperarXiv:2211.09648
72
citations
#115

SkeletonGait: Gait Recognition Using Skeleton Maps

Chao Fan, Jingzhe Ma, Dongyang Jin et al.

AAAI 2024paperarXiv:2311.13444
72
citations
#116

Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation

Zhewei Yao, Xiaoxia Wu, Cheng Li et al.

AAAI 2024paperarXiv:2303.08302
71
citations
#117

HDMixer: Hierarchical Dependency with Extendable Patch for Multivariate Time Series Forecasting

Qihe Huang, Lei Shen, Ruixin Zhang et al.

AAAI 2024paper
71
citations
#118

Plug-In Diffusion Model for Sequential Recommendation

Haokai Ma, Ruobing Xie, Lei Meng et al.

AAAI 2024paperarXiv:2401.02913
71
citations
#119

TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection

Hao Sun, Mingyao Zhou, Wenjing Chen et al.

AAAI 2024paperarXiv:2401.02309
71
citations
#120

NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields

Junge Zhang, Feihu Zhang, Shaochen Kuang et al.

AAAI 2024paperarXiv:2304.14811
69
citations
#121

Learning to Rank in Generative Retrieval

Yongqi Li, Nan Yang, Liang Wang et al.

AAAI 2024paperarXiv:2306.15222
69
citations
#122

Generating Images of Rare Concepts Using Pre-trained Diffusion Models

Dvir Samuel, Rami Ben-Ari, Simon Raviv et al.

AAAI 2024paperarXiv:2304.14530
69
citations
#123

Generative-Based Fusion Mechanism for Multi-Modal Tracking

Zhangyong Tang, Tianyang Xu, Xiaojun Wu et al.

AAAI 2024paperarXiv:2309.01728
69
citations
#124

Augmenting Math Word Problems via Iterative Question Composing

Haoxiong Liu, Yifan Zhang, Yifan Luo et al.

AAAI 2025paperarXiv:2401.09003
69
citations
#125

Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Sungmin Cha, Sungjun Cho, Dasol Hwang et al.

AAAI 2024paperarXiv:2301.11578
69
citations
#126

Learning Content-Enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation

Qi Bi, Shaodi You, Theo Gevers

AAAI 2024paperarXiv:2307.00371
69
citations
#127

BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving

Haicheng Liao, Zhenning Li, Huanming Shen et al.

AAAI 2024paperarXiv:2312.06371
67
citations
#128

ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering

Yakun Song, Zhuo Chen, Xiaofei Wang et al.

AAAI 2025paperarXiv:2401.07333
66
citations
#129

DiffusionTrack: Diffusion Model for Multi-Object Tracking

Run Luo, Zikai Song, Lintao Ma et al.

AAAI 2024paperarXiv:2308.09905
66
citations
#130

XCOT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

Linzheng Chai, Jian Yang, Tao Sun et al.

AAAI 2025paperarXiv:2401.07037
66
citations
#131

Make RepVGG Greater Again: A Quantization-Aware Approach

Xuesong Nie, Yunfeng Yan, Siyuan Li et al.

AAAI 2024paperarXiv:2212.01593
66
citations
#132

Key-Point-Driven Data Synthesis with Its Enhancement on Mathematical Reasoning

Yiming Huang, Xiao Liu, Yeyun Gong et al.

AAAI 2025paperarXiv:2403.02333
65
citations
#133

Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons

Yuheng Chen, Pengfei Cao, Yubo Chen et al.

AAAI 2024paperarXiv:2308.13198
64
citations
#134

Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models

Weihao Ye, Qiong Wu, Wenhao Lin et al.

AAAI 2025paperarXiv:2409.10197
64
citations
#135

Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models

Fei Shen, Hu Ye, Sibo Liu et al.

AAAI 2025paperarXiv:2407.02482
64
citations
#136

FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection

Yao Xiao, Tingfa Xu, Yu Xin et al.

AAAI 2025paperarXiv:2504.20670
62
citations
#137

Gated Attention Coding for Training High-Performance and Efficient Spiking Neural Networks

Xuerui Qiu, Rui-Jie Zhu, Yuhong Chou et al.

AAAI 2024paperarXiv:2308.06582
62
citations
#138

Large Language Models Are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales

Taeyoon Kwon, Kai Ong, Dongjin Kang et al.

AAAI 2024paperarXiv:2312.07399
62
citations
#139

HGPrompt: Bridging Homogeneous and Heterogeneous Graphs for Few-Shot Prompt Learning

Xingtong Yu, Yuan Fang, Zemin Liu et al.

AAAI 2024paperarXiv:2312.01878
61
citations
#140

Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders

Yaohua Zha, Huizhen Ji, Jinmin Li et al.

AAAI 2024paperarXiv:2312.10726
61
citations
#141

C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection

Chuangchuang Tan, Renshuai Tao, Huan Liu et al.

AAAI 2025paperarXiv:2408.09647
61
citations
#142

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Wenyi Xiao, Ziwei Huang, Leilei Gan et al.

AAAI 2025paperarXiv:2404.14233
61
citations
#143

Ternary Spike: Learning Ternary Spikes for Spiking Neural Networks

Yufei Guo, Yuanpei Chen, Xiaode Liu et al.

AAAI 2024paperarXiv:2312.06372
60
citations
#144

Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval

Yuanmin Tang, Jing Yu, Keke Gai et al.

AAAI 2024paperarXiv:2309.16137
60
citations
#145

Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning

Wenwen Zhuang, Xin Huang, Xiantao Zhang et al.

AAAI 2025paperarXiv:2408.08640
60
citations
#146

MASTER: Market-Guided Stock Transformer for Stock Price Forecasting

Tong Li, Zhaoyang Liu, Yanyan Shen et al.

AAAI 2024paperarXiv:2312.15235
59
citations
#147

Unlocking the Power of LSTM for Long Term Time Series Forecasting

Yaxuan Kong, Zepu Wang, Yuqi Nie et al.

AAAI 2025paperarXiv:2408.10006
59
citations
#148

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

Chenpeng Du, Yiwei Guo, Feiyu Shen et al.

AAAI 2024paperarXiv:2306.07547
59
citations
#149

Delving into Multimodal Prompting for Fine-Grained Visual Classification

Xin Jiang, Hao Tang, Junyao Gao et al.

AAAI 2024paperarXiv:2309.08912
59
citations
#150

Correlation Matching Transformation Transformers for UHD Image Restoration

Cong Wang, Jinshan Pan, Wei Wang et al.

AAAI 2024paperarXiv:2406.00629
59
citations
#151

VLCounter: Text-Aware Visual Representation for Zero-Shot Object Counting

Seunggu Kang, WonJun Moon, Euiyeon Kim et al.

AAAI 2024paperarXiv:2312.16580
59
citations
#152

FFT-Based Dynamic Token Mixer for Vision

Yuki Tatsunami, Masato Taki

AAAI 2024paperarXiv:2303.03932
59
citations
#153

DocFormerv2: Local Features for Document Understanding

Srikar Appalaraju, Peng Tang, Qi Dong et al.

AAAI 2024paperarXiv:2306.01733
58
citations
#154

SECap: Speech Emotion Captioning with Large Language Model

Yaoxun Xu, Hangting Chen, Jianwei Yu et al.

AAAI 2024paperarXiv:2312.10381
58
citations
#155

Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation

Shilin Yan, Renrui Zhang, Ziyu Guo et al.

AAAI 2024paperarXiv:2305.16318
58
citations
#156

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

Junxian Li, Di Zhang, Xunzhi Wang et al.

AAAI 2025paperarXiv:2408.07246
58
citations
#157

Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models

Yuqi Zhu, Jia Li, Ge Li et al.

AAAI 2024paperarXiv:2309.02772
58
citations
#158

Revisiting Graph-Based Fraud Detection in Sight of Heterophily and Spectrum

Fan Xu, Nan Wang, Hao Wu et al.

AAAI 2024paperarXiv:2312.06441
58
citations
#159

Editing Language Model-Based Knowledge Graph Embeddings

AAAI 2024paperarXiv:2305.14908
57
citations
#160

PC-Conv: Unifying Homophily and Heterophily with Two-Fold Filtering

Bingheng Li, Erlin Pan, Zhao Kang

AAAI 2024paperarXiv:2312.14438
57
citations
#161

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Yongxin Guo, Jingyu Liu, Mingda Li et al.

AAAI 2025paperarXiv:2405.13382
57
citations
#162

SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency

Feiyu Zhu, Reid Simmons

AAAI 2024paperarXiv:2303.07033
57
citations
#163

DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency

Wenfang Yao, Kejing Yin, William Cheung et al.

AAAI 2024paperarXiv:2403.06197
56
citations
#164

TEILP: Time Prediction over Knowledge Graphs via Logical Reasoning

Siheng Xiong, Yuan Yang, Ali Payani et al.

AAAI 2024paperarXiv:2312.15816
56
citations
#165

DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving

Wencheng Han, Dongqian Guo, Cheng-Zhong Xu et al.

AAAI 2025paperarXiv:2401.03641
56
citations
#166

FlowPolicy: Enabling Fast and Robust 3D Flow-Based Policy via Consistency Flow Matching for Robot Manipulation

Qinglun Zhang, Zhen Liu, Haoqiang Fan et al.

AAAI 2025paperarXiv:2412.04987
56
citations
#167

SwitchTab: Switched Autoencoders Are Effective Tabular Learners

Jing Wu, Suiyao Chen, Qi Zhao et al.

AAAI 2024paperarXiv:2401.02013
56
citations
#168

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

Tao Wu, Yong Zhang, Xintao Wang et al.

AAAI 2025paperarXiv:2408.13239
55
citations
#169

Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy

Yu Fu, Deyi Xiong, Yue Dong

AAAI 2024paperarXiv:2307.13808
55
citations
#170

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models

Zhongxi Chen, Ke Sun, Xianming Lin

AAAI 2024paperarXiv:2305.17932
55
citations
#171

Data Roaming and Quality Assessment for Composed Image Retrieval

Matan Levy, Rami Ben-Ari, Nir Darshan et al.

AAAI 2024paperarXiv:2303.09429
55
citations
#172

LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction

Yucheng Li, Frank Guerin, Chenghua Lin

AAAI 2024paperarXiv:2312.12343
54
citations
#173

Panoptic Scene Graph Generation with Semantics-Prototype Learning

Li Li, Wei Ji, Yiming Wu et al.

AAAI 2024paperarXiv:2307.15567
54
citations
#174

Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects

Jian Hu, Jiayi Lin, Shaogang Gong et al.

AAAI 2024paperarXiv:2312.07374
54
citations
#175

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis

Chao Pang, Xingxing Weng, Jiang Wu et al.

AAAI 2025paperarXiv:2403.20213
54
citations
#176

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations

Yufeng Huang, Jiji Tang, Zhuo Chen et al.

AAAI 2024paperarXiv:2305.06152
53
citations
#177

Visual Instruction Tuning with Polite Flamingo

Delong Chen, Jianfeng Liu, Wenliang Dai et al.

AAAI 2024paperarXiv:2307.01003
53
citations
#178

Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark

Fangjun Li, David C. Hogg, Anthony G. Cohn

AAAI 2024paperarXiv:2401.03991
53
citations
#179

SGNet: Structure Guided Network via Gradient-Frequency Awareness for Depth Map Super-resolution

Zhengxue Wang, Zhiqiang Yan, Jian Yang

AAAI 2024paperarXiv:2312.05799
53
citations
#180

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

Xuan Shen, Peiyan Dong, Lei Lu et al.

AAAI 2024paperarXiv:2312.05693
53
citations
#181

EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering

Junjue Wang, Zhuo Zheng, Zihang Chen et al.

AAAI 2024paperarXiv:2312.12222
53
citations
#182

Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers

Hadi Abdine, Michail Chatzianastasis, Costas Bouyioukos et al.

AAAI 2024paperarXiv:2307.14367
53
citations
#183

SQLdepth: Generalizable Self-Supervised Fine-Structured Monocular Depth Estimation

Dong Wu, Mingmin Chi, Xuan Zang et al.

AAAI 2024paperarXiv:2309.00526
53
citations
#184

GAMC: An Unsupervised Method for Fake News Detection Using Graph Autoencoder with Masking

Shu Yin, Peican Zhu, Lianwei Wu et al.

AAAI 2024paperarXiv:2312.05739
53
citations
#185

Spatial Transform Decoupling for Oriented Object Detection

Hongtian Yu, Yunjie Tian, Qixiang Ye et al.

AAAI 2024paperarXiv:2308.10561
52
citations
#186

M3D: Dataset Condensation by Minimizing Maximum Mean Discrepancy

Hansong Zhang, Shikun Li, Pengju Wang et al.

AAAI 2024paperarXiv:2312.15927
52
citations
#187

Calibrating Large Language Models with Sample Consistency

Qing Lyu, Kumar Shridhar, Chaitanya Malaviya et al.

AAAI 2025paperarXiv:2402.13904
52
citations
#188

Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning

Longchao Da, Minquan Gao, Hua Wei et al.

AAAI 2024paperarXiv:2308.14284
52
citations
#189

S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention

Chiyu Zhang, Xiaogang Xu, Lei Wang et al.

AAAI 2024paperarXiv:2210.12381
52
citations
#190

Understanding the Role of the Projector in Knowledge Distillation

AAAI 2024paperarXiv:2303.11098
52
citations
#191

Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption

Ziteng Cui, Lin Gu, Xiao Sun et al.

AAAI 2024paperarXiv:2312.09093
52
citations
#192

SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation

AAAI 2024paperarXiv:2401.11719
52
citations
#193

Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

Yongliang Wu, Shiji Zhou, Mingzhuo Yang et al.

AAAI 2025paperarXiv:2405.15304
51
citations
#194

LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

Hongcheng Guo, Jian Yang, Jiaheng Liu et al.

AAAI 2024paperarXiv:2401.04749
51
citations
#195

High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-identification

Liuxiang Qiu, Si Chen, Yan Yan et al.

AAAI 2024paperarXiv:2312.07853
51
citations
#196

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

Li Xiang, Junbo Yin, Wei Li et al.

AAAI 2024paperarXiv:2312.15742
51
citations
#197

Language Model Can Listen While Speaking

Ziyang Ma, Yakun Song, Chenpeng Du et al.

AAAI 2025paperarXiv:2408.02622
51
citations
#198

TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers

Chuanrui Zhang, Yingshuang Zou, Zhuoling Li et al.

AAAI 2025paperarXiv:2408.13770
51
citations
#199

Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning

Samyadeep Basu, Shell Hu, Daniela Massiceti et al.

AAAI 2024paperarXiv:2304.01917
50
citations
#200

Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition

Jianyang Xie, Yanda Meng, Yitian Zhao et al.

AAAI 2024paper
50
citations