Most Cited AAAI 2024 "reasoning grounding" Papers

2,289 papers found

#1

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion

Chong Mou, Xintao Wang, Liangbin Xie et al.

AAAI 2024paperarXiv:2302.08453
1460
citations
#2

Graph of Thoughts: Solving Elaborate Problems with Large Language Models

Maciej Besta, Nils Blach, Ales Kubicek et al.

AAAI 2024paperarXiv:2308.09687
1116
citations
#3

Benchmarking Large Language Models in Retrieval-Augmented Generation

Jiawei Chen, Hongyu Lin, Xianpei Han et al.

AAAI 2024paperarXiv:2309.01431
475
citations
#4

ExpeL: LLM Agents Are Experiential Learners

Andrew Zhao, Daniel Huang, Quentin Xu et al.

AAAI 2024paperarXiv:2308.10144
376
citations
#5

Preference Ranking Optimization for Human Alignment

Feifan Song, Bowen Yu, Minghao Li et al.

AAAI 2024paperarXiv:2306.17492
337
citations
#6

MemoryBank: Enhancing Large Language Models with Long-Term Memory

Wanjun Zhong, Lianghong Guo, Qiqi Gao et al.

AAAI 2024paperarXiv:2305.10250
290
citations
#7

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos

Yue Ma, Yingqing He, Xiaodong Cun et al.

AAAI 2024paperarXiv:2304.01186
284
citations
#8

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

Gengze Zhou, Yicong Hong, Qi Wu

AAAI 2024paperarXiv:2305.16986
283
citations
#9

MedSegDiff-V2: Diffusion-based Medical Image Segmentation with Transformer

Junde Wu, Wei Ji, Huazhu Fu et al.

AAAI 2024paperarXiv:2301.11798
274
citations
#10

NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving

Tianwen Qian, Jingjing Chen, Linhai Zhuo et al.

AAAI 2024paperarXiv:2305.14836
271
citations
#11

Detecting and Preventing Hallucinations in Large Vision Language Models

Anisha Gunjal, Jihan Yin, Erhan Bas

AAAI 2024paperarXiv:2308.06394
264
citations
#12

AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

Zhaopeng Gu, Bingke Zhu, Guibo Zhu et al.

AAAI 2024paperarXiv:2308.15366
252
citations
#13

Knowledge Graph Prompting for Multi-Document Question Answering

Yu Wang, Nedim Lipka, Ryan A. Rossi et al.

AAAI 2024paperarXiv:2308.11730
240
citations
#14

Omni-Kernel Network for Image Restoration

Yuning Cui, Wenqi Ren, Alois Knoll

AAAI 2024paper
235
citations
#15

Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-World Multi-Turn Dialogue

Songhua Yang, Hanjie Zhao, Senbin Zhu et al.

AAAI 2024paperarXiv:2308.03549
210
citations
#16

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

Wenbo Hu, Yifan Xu, Yi Li et al.

AAAI 2024paperarXiv:2308.09936
192
citations
#17

ODTrack: Online Dense Temporal Token Learning for Visual Tracking

Yaozong Zheng, Bineng Zhong, Qihua Liang et al.

AAAI 2024paperarXiv:2401.01686
188
citations
#18

PMET: Precise Model Editing in a Transformer

Xiaopeng Li, Shasha Li, Shezheng Song et al.

AAAI 2024paperarXiv:2308.08742
187
citations
#19

MSGNet: Learning Multi-Scale Inter-series Correlations for Multivariate Time Series Forecasting

Wanlin Cai, Yuxuan Liang, Xianggen Liu et al.

AAAI 2024paperarXiv:2401.00423
185
citations
#20

Generalized Planning in PDDL Domains with Pretrained Large Language Models

Tom Silver, Soham Dan, Kavitha Srinivas et al.

AAAI 2024paperarXiv:2305.11014
178
citations
#21

Fast Machine Unlearning without Retraining through Selective Synaptic Dampening

Jack Foster, Stefan Schoepf, Alexandra Brintrup

AAAI 2024paperarXiv:2308.07707
176
citations
#22

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

Peng Wu, Xuerong Zhou, Guansong Pang et al.

AAAI 2024paperarXiv:2308.11681
163
citations
#23

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Teng Hu, Jiangning Zhang, Ran Yi et al.

AAAI 2024paperarXiv:2312.05767
144
citations
#24

Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

Mingzhan Yang, Guangxin Han, Bin Yan et al.

AAAI 2024paperarXiv:2308.00783
143
citations
#25

SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

Zhecheng Wang, Rajanie Prabha, Tianyuan Huang et al.

AAAI 2024paperarXiv:2312.12856
140
citations
#26

ResDiff: Combining CNN and Diffusion Model for Image Super-resolution

Shuyao Shang, Zhengyang Shan, Guangxing Liu et al.

AAAI 2024paperarXiv:2303.08714
140
citations
#27

PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation

Haibo Jin, Haoxuan Che, Yi Lin et al.

AAAI 2024paperarXiv:2308.12604
137
citations
#28

Task Contamination: Language Models May Not Be Few-Shot Anymore

Changmao Li, Jeffrey Flanigan

AAAI 2024paperarXiv:2312.16337
132
citations
#29

SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research

Liangtai Sun, Yang Han, Zihan Zhao et al.

AAAI 2024paperarXiv:2308.13149
132
citations
#30

SCTNet: Single Branch CNN with Transformer Semantic Information for Real-Time Segmentation

Zhengze Xu, Dongyue Wu, Changqian Yu et al.

AAAI 2024paperarXiv:2312.17071
130
citations
#31

FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

Zhenyu Li, Sunqi Fan, Yu Gu et al.

AAAI 2024paperarXiv:2308.12060
125
citations
#32

LGMRec: Local and Global Graph Learning for Multimodal Recommendation

Zhiqiang Guo, Jianjun Li, Guohui Li et al.

AAAI 2024paperarXiv:2312.16400
120
citations
#33

ProAgent: Building Proactive Cooperative Agents with Large Language Models

Ceyao Zhang, Kaijie Yang, Siyi Hu et al.

AAAI 2024paperarXiv:2308.11339
120
citations
#34

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

Wenxi Yue, Jing Zhang, Kun Hu et al.

AAAI 2024paperarXiv:2308.08746
114
citations
#35

Incomplete Contrastive Multi-View Clustering with High-Confidence Guiding

AAAI 2024paperarXiv:2312.08697
114
citations
#36

Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting

Xinyan Guan, Yanjiang Liu, Hongyu Lin et al.

AAAI 2024paperarXiv:2311.13314
112
citations
#37

VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View

Raphael Schumann, Wanrong Zhu, Weixi Feng et al.

AAAI 2024paperarXiv:2307.06082
108
citations
#38

Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations

Likang Wu, Zhaopeng Qiu, Zhi Zheng et al.

AAAI 2024paperarXiv:2307.05722
108
citations
#39

Fluctuation-Based Adaptive Structured Pruning for Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

AAAI 2024paperarXiv:2312.11983
106
citations
#40

OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models

Changhun Lee, Jungyu Jin, Taesu Kim et al.

AAAI 2024paperarXiv:2306.02272
105
citations
#41

TimesURL: Self-Supervised Contrastive Learning for Universal Time Series Representation Learning

Jiexi Liu, Songcan Chen

AAAI 2024paperarXiv:2312.15709
105
citations
#42

Fully-Connected Spatial-Temporal Graph for Multivariate Time-Series Data

Yucheng Wang, Yuecong Xu, Jianfei Yang et al.

AAAI 2024paperarXiv:2309.05305
104
citations
#43

Can Large Language Models Serve as Rational Players in Game Theory? A Systematic Analysis

Caoyun Fan, Jindou Chen, Yaohui Jin et al.

AAAI 2024paperarXiv:2312.05488
104
citations
#44

An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention

Yehjin Shin, Jeongwhan Choi, Hyowon Wi et al.

AAAI 2024paperarXiv:2312.10325
104
citations
#45

UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation

Kefu Yi, Kai Luo, Xiaolei Luo et al.

AAAI 2024paperarXiv:2312.08952
101
citations
#46

An Empirical Study of CLIP for Text-Based Person Search

Cao Min, Yang Bai, Ziyin Zeng et al.

AAAI 2024paperarXiv:2308.10045
98
citations
#47

Rolling-Unet: Revitalizing MLP’s Ability to Efficiently Extract Long-Distance Dependencies for Medical Image Segmentation

Yutong Liu, Haijiang Zhu, Mengting Liu et al.

AAAI 2024paper
98
citations
#48

LDMVFI: Video Frame Interpolation with Latent Diffusion Models

Duolikun Danier, Fan Zhang, David Bull

AAAI 2024paperarXiv:2303.09508
97
citations
#49

Explicit Visual Prompts for Visual Object Tracking

Liangtao Shi, Bineng Zhong, Qihua Liang et al.

AAAI 2024paperarXiv:2401.03142
96
citations
#50

Reliable Conflictive Multi-View Learning

Cai Xu, Jiajun Si, Ziyu Guan et al.

AAAI 2024paperarXiv:2402.16897
96
citations
#51

Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models

Ruichen Wang, Zekang Chen, Chen Chen et al.

AAAI 2024paperarXiv:2305.13921
93
citations
#52

Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

Taylor Sorensen, Liwei Jiang, Jena Hwang et al.

AAAI 2024paperarXiv:2309.00779
93
citations
#53

Decoupled Contrastive Multi-View Clustering with High-Order Random Walks

Yiding Lu, Yijie Lin, Mouxing Yang et al.

AAAI 2024paperarXiv:2308.11164
92
citations
#54

PointAttN: You Only Need Attention for Point Cloud Completion

Jun Wang, Ying Cui, Dongyan Guo et al.

AAAI 2024paper
92
citations
#55

VIGC: Visual Instruction Generation and Correction

Bin Wang, Fan Wu, Xiao Han et al.

AAAI 2024paperarXiv:2308.12714
88
citations
#56

MmAP: Multi-Modal Alignment Prompt for Cross-Domain Multi-Task Learning

Yi Xin, Junlong Du, Qiang Wang et al.

AAAI 2024paperarXiv:2312.08636
88
citations
#57

FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning

Haokun Chen, Yao Zhang, Denis Krompass et al.

AAAI 2024paperarXiv:2308.12305
86
citations
#58

FocalDreamer: Text-Driven 3D Editing via Focal-Fusion Assembly

Yuhan Li, Yishun Dou, Yue Shi et al.

AAAI 2024paperarXiv:2308.10608
85
citations
#59

Prompt-Based Distribution Alignment for Unsupervised Domain Adaptation

Shuanghao Bai, Min Zhang, Wanqi Zhou et al.

AAAI 2024paperarXiv:2312.09553
85
citations
#60

VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding

Yi Xin, Junlong Du, Qiang Wang et al.

AAAI 2024paperarXiv:2312.08733
85
citations
#61

GLOP: Learning Global Partition and Local Construction for Solving Large-Scale Routing Problems in Real-Time

Haoran Ye, Jiarui Wang, Helan Liang et al.

AAAI 2024paperarXiv:2312.08224
85
citations
#62

Directed Diffusion: Direct Control of Object Placement through Attention Guidance

Wan-Duo Ma, Avisek Lahiri, J. P. Lewis et al.

AAAI 2024paperarXiv:2302.13153
83
citations
#63

KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning

Debjyoti Mondal, Suraj Modi, Subhadarshi Panda et al.

AAAI 2024paperarXiv:2401.12863
82
citations
#64

Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection

Zhongjie Ba, Qingyu Liu, Zhenguang Liu et al.

AAAI 2024paperarXiv:2403.01786
82
citations
#65

AVSegFormer: Audio-Visual Segmentation with Transformer

Shengyi Gao, Zhe Chen, Guo Chen et al.

AAAI 2024paperarXiv:2307.01146
82
citations
#66

DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection

Yunfan Ye, Yuhang Huang, Renjiao Yi et al.

AAAI 2024paperarXiv:2401.02032
81
citations
#67

MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA

Lang Yu, Qin Chen, Jie Zhou et al.

AAAI 2024paperarXiv:2312.11795
80
citations
#68

EcomGPT: Instruction-Tuning Large Language Models with Chain-of-Task Tasks for E-commerce

Li Yangning, Shirong Ma, Xiaobin Wang et al.

AAAI 2024paperarXiv:2308.06966
79
citations
#69

PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology

Yuxuan Sun, Chenglu Zhu, Sunyi Zheng et al.

AAAI 2024paperarXiv:2305.15072
79
citations
#70

MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning

Baoquan Zhang, Chuyao Luo, Demin Yu et al.

AAAI 2024paperarXiv:2307.16424
79
citations
#71

SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking

Wang Yu Hsiang, Jun-Wei Hsieh, Ping-Yang Chen et al.

AAAI 2024paperarXiv:2211.08824
78
citations
#72

RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting

Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.

AAAI 2024paperarXiv:2305.15685
78
citations
#73

Graph Neural Prompting with Large Language Models

Yijun Tian, Huan Song, Zichen Wang et al.

AAAI 2024paperarXiv:2309.15427
77
citations
#74

Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye et al.

AAAI 2024paperarXiv:2306.05783
77
citations
#75

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Yiwen Chen, Chi Zhang, Xiaofeng Yang et al.

AAAI 2024paperarXiv:2308.11473
75
citations
#76

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

Heng Wang, Jianbo Ma, Santiago Pascual et al.

AAAI 2024paperarXiv:2308.09300
75
citations
#77

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Zhenhua Yang, Dezhi Peng, Yuxin Kong et al.

AAAI 2024paperarXiv:2312.12142
75
citations
#78

FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-Aware Model Update

Ji Liu, Juncheng Jia, Tianshi Che et al.

AAAI 2024paperarXiv:2312.05770
75
citations
#79

Temporal Adaptive RGBT Tracking with Modality Prompt

Hongyu Wang, Xiaotao Liu, Yifan Li et al.

AAAI 2024paperarXiv:2401.01244
75
citations
#80

SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation

Malyaban Bal, Abhronil Sengupta

AAAI 2024paperarXiv:2308.10873
73
citations
#81

Teaching Large Language Models to Translate with Comparison

Jiali Zeng, Fandong Meng, Yongjing Yin et al.

AAAI 2024paperarXiv:2307.04408
73
citations
#82

Enhancing Job Recommendation through LLM-Based Generative Adversarial Networks

Yingpeng Du, Di Luo, Rui Yan et al.

AAAI 2024paperarXiv:2307.10747
72
citations
#83

HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

Xiao Wang, Zongzhen Wu, Bo Jiang et al.

AAAI 2024paperarXiv:2211.09648
72
citations
#84

Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Zhengliang Shi, Shen Gao, Minghang Zhu et al.

AAAI 2024paperarXiv:2308.14034
72
citations
#85

Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Guy Yariv, Itai Gat, Sagie Benaim et al.

AAAI 2024paperarXiv:2309.16429
72
citations
#86

SkeletonGait: Gait Recognition Using Skeleton Maps

Chao Fan, Jingzhe Ma, Dongyang Jin et al.

AAAI 2024paperarXiv:2311.13444
72
citations
#87

TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection

Hao Sun, Mingyao Zhou, Wenjing Chen et al.

AAAI 2024paperarXiv:2401.02309
71
citations
#88

Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation

Zhewei Yao, Xiaoxia Wu, Cheng Li et al.

AAAI 2024paperarXiv:2303.08302
71
citations
#89

HDMixer: Hierarchical Dependency with Extendable Patch for Multivariate Time Series Forecasting

Qihe Huang, Lei Shen, Ruixin Zhang et al.

AAAI 2024paper
71
citations
#90

Plug-In Diffusion Model for Sequential Recommendation

Haokai Ma, Ruobing Xie, Lei Meng et al.

AAAI 2024paperarXiv:2401.02913
71
citations
#91

Learning Content-Enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation

Qi Bi, Shaodi You, Theo Gevers

AAAI 2024paperarXiv:2307.00371
69
citations
#92

Learning to Rank in Generative Retrieval

Yongqi Li, Nan Yang, Liang Wang et al.

AAAI 2024paperarXiv:2306.15222
69
citations
#93

Generative-Based Fusion Mechanism for Multi-Modal Tracking

Zhangyong Tang, Tianyang Xu, Xiaojun Wu et al.

AAAI 2024paperarXiv:2309.01728
69
citations
#94

Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Sungmin Cha, Sungjun Cho, Dasol Hwang et al.

AAAI 2024paperarXiv:2301.11578
69
citations
#95

Generating Images of Rare Concepts Using Pre-trained Diffusion Models

Dvir Samuel, Rami Ben-Ari, Simon Raviv et al.

AAAI 2024paperarXiv:2304.14530
69
citations
#96

NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields

Junge Zhang, Feihu Zhang, Shaochen Kuang et al.

AAAI 2024paperarXiv:2304.14811
69
citations
#97

BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving

Haicheng Liao, Zhenning Li, Huanming Shen et al.

AAAI 2024paperarXiv:2312.06371
67
citations
#98

DiffusionTrack: Diffusion Model for Multi-Object Tracking

Run Luo, Zikai Song, Lintao Ma et al.

AAAI 2024paperarXiv:2308.09905
66
citations
#99

Make RepVGG Greater Again: A Quantization-Aware Approach

Xuesong Nie, Yunfeng Yan, Siyuan Li et al.

AAAI 2024paperarXiv:2212.01593
66
citations
#100

Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons

Yuheng Chen, Pengfei Cao, Yubo Chen et al.

AAAI 2024paperarXiv:2308.13198
64
citations
#101

Gated Attention Coding for Training High-Performance and Efficient Spiking Neural Networks

Xuerui Qiu, Rui-Jie Zhu, Yuhong Chou et al.

AAAI 2024paperarXiv:2308.06582
62
citations
#102

Large Language Models Are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales

Taeyoon Kwon, Kai Ong, Dongjin Kang et al.

AAAI 2024paperarXiv:2312.07399
62
citations
#103

Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders

Yaohua Zha, Huizhen Ji, Jinmin Li et al.

AAAI 2024paperarXiv:2312.10726
61
citations
#104

HGPrompt: Bridging Homogeneous and Heterogeneous Graphs for Few-Shot Prompt Learning

Xingtong Yu, Yuan Fang, Zemin Liu et al.

AAAI 2024paperarXiv:2312.01878
61
citations
#105

Ternary Spike: Learning Ternary Spikes for Spiking Neural Networks

Yufei Guo, Yuanpei Chen, Xiaode Liu et al.

AAAI 2024paperarXiv:2312.06372
60
citations
#106

Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval

Yuanmin Tang, Jing Yu, Keke Gai et al.

AAAI 2024paperarXiv:2309.16137
60
citations
#107

Delving into Multimodal Prompting for Fine-Grained Visual Classification

Xin Jiang, Hao Tang, Junyao Gao et al.

AAAI 2024paperarXiv:2309.08912
59
citations
#108

MASTER: Market-Guided Stock Transformer for Stock Price Forecasting

Tong Li, Zhaoyang Liu, Yanyan Shen et al.

AAAI 2024paperarXiv:2312.15235
59
citations
#109

Correlation Matching Transformation Transformers for UHD Image Restoration

Cong Wang, Jinshan Pan, Wei Wang et al.

AAAI 2024paperarXiv:2406.00629
59
citations
#110

VLCounter: Text-Aware Visual Representation for Zero-Shot Object Counting

Seunggu Kang, WonJun Moon, Euiyeon Kim et al.

AAAI 2024paperarXiv:2312.16580
59
citations
#111

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

Chenpeng Du, Yiwei Guo, Feiyu Shen et al.

AAAI 2024paperarXiv:2306.07547
59
citations
#112

FFT-Based Dynamic Token Mixer for Vision

Yuki Tatsunami, Masato Taki

AAAI 2024paperarXiv:2303.03932
59
citations
#113

Revisiting Graph-Based Fraud Detection in Sight of Heterophily and Spectrum

Fan Xu, Nan Wang, Hao Wu et al.

AAAI 2024paperarXiv:2312.06441
58
citations
#114

SECap: Speech Emotion Captioning with Large Language Model

Yaoxun Xu, Hangting Chen, Jianwei Yu et al.

AAAI 2024paperarXiv:2312.10381
58
citations
#115

DocFormerv2: Local Features for Document Understanding

Srikar Appalaraju, Peng Tang, Qi Dong et al.

AAAI 2024paperarXiv:2306.01733
58
citations
#116

Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models

Yuqi Zhu, Jia Li, Ge Li et al.

AAAI 2024paperarXiv:2309.02772
58
citations
#117

Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation

Shilin Yan, Renrui Zhang, Ziyu Guo et al.

AAAI 2024paperarXiv:2305.16318
58
citations
#118

PC-Conv: Unifying Homophily and Heterophily with Two-Fold Filtering

Bingheng Li, Erlin Pan, Zhao Kang

AAAI 2024paperarXiv:2312.14438
57
citations
#119

Editing Language Model-Based Knowledge Graph Embeddings

AAAI 2024paperarXiv:2305.14908
57
citations
#120

SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency

Cong Wang, Jinshan Pan, Wanyu Lin et al.

AAAI 2024paperarXiv:2303.07033
57
citations
#121

DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency

Wenfang Yao, Kejing Yin, William Cheung et al.

AAAI 2024paperarXiv:2403.06197
56
citations
#122

SwitchTab: Switched Autoencoders Are Effective Tabular Learners

Jing Wu, Suiyao Chen, Qi Zhao et al.

AAAI 2024paperarXiv:2401.02013
56
citations
#123

TEILP: Time Prediction over Knowledge Graphs via Logical Reasoning

Siheng Xiong, Yuan Yang, Ali Payani et al.

AAAI 2024paperarXiv:2312.15816
56
citations
#124

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models

Zhongxi Chen, Ke Sun, Xianming Lin

AAAI 2024paperarXiv:2305.17932
55
citations
#125

Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy

Yu Fu, Deyi Xiong, Yue Dong

AAAI 2024paperarXiv:2307.13808
55
citations
#126

Data Roaming and Quality Assessment for Composed Image Retrieval

Matan Levy, Rami Ben-Ari, Nir Darshan et al.

AAAI 2024paperarXiv:2303.09429
55
citations
#127

Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects

Jian Hu, Jiayi Lin, Shaogang Gong et al.

AAAI 2024paperarXiv:2312.07374
54
citations
#128

LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction

Yucheng Li, Frank Guerin, Chenghua Lin

AAAI 2024paperarXiv:2312.12343
54
citations
#129

Panoptic Scene Graph Generation with Semantics-Prototype Learning

Li Li, Wei Ji, Yiming Wu et al.

AAAI 2024paperarXiv:2307.15567
54
citations
#130

Visual Instruction Tuning with Polite Flamingo

Delong Chen, Jianfeng Liu, Wenliang Dai et al.

AAAI 2024paperarXiv:2307.01003
53
citations
#131

SQLdepth: Generalizable Self-Supervised Fine-Structured Monocular Depth Estimation

Dong Wu, Mingmin Chi, Xuan Zang et al.

AAAI 2024paperarXiv:2309.00526
53
citations
#132

SGNet: Structure Guided Network via Gradient-Frequency Awareness for Depth Map Super-resolution

Zhengxue Wang, Zhiqiang Yan, Jian Yang

AAAI 2024paperarXiv:2312.05799
53
citations
#133

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations

Yufeng Huang, Jiji Tang, Zhuo Chen et al.

AAAI 2024paperarXiv:2305.06152
53
citations
#134

Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark

Fangjun Li, David C. Hogg, Anthony G. Cohn

AAAI 2024paperarXiv:2401.03991
53
citations
#135

Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers

Hadi Abdine, Michail Chatzianastasis, Costas Bouyioukos et al.

AAAI 2024paperarXiv:2307.14367
53
citations
#136

EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering

Junjue Wang, Zhuo Zheng, Zihang Chen et al.

AAAI 2024paperarXiv:2312.12222
53
citations
#137

GAMC: An Unsupervised Method for Fake News Detection Using Graph Autoencoder with Masking

Shu Yin, Peican Zhu, Lianwei Wu et al.

AAAI 2024paperarXiv:2312.05739
53
citations
#138

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

Xuan Shen, Peiyan Dong, Lei Lu et al.

AAAI 2024paperarXiv:2312.05693
53
citations
#139

S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention

Chiyu Zhang, Xiaogang Xu, Lei Wang et al.

AAAI 2024paperarXiv:2210.12381
52
citations
#140

SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation

AAAI 2024paperarXiv:2401.11719
52
citations
#141

Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption

Ziteng Cui, Lin Gu, Xiao Sun et al.

AAAI 2024paperarXiv:2312.09093
52
citations
#142

Understanding the Role of the Projector in Knowledge Distillation

AAAI 2024paperarXiv:2303.11098
52
citations
#143

Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning

Longchao Da, Minquan Gao, Hua Wei et al.

AAAI 2024paperarXiv:2308.14284
52
citations
#144

Spatial Transform Decoupling for Oriented Object Detection

Hongtian Yu, Yunjie Tian, Qixiang Ye et al.

AAAI 2024paperarXiv:2308.10561
52
citations
#145

M3D: Dataset Condensation by Minimizing Maximum Mean Discrepancy

Hansong Zhang, Shikun Li, Pengju Wang et al.

AAAI 2024paperarXiv:2312.15927
52
citations
#146

High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-identification

Liuxiang Qiu, Si Chen, Yan Yan et al.

AAAI 2024paperarXiv:2312.07853
51
citations
#147

LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

Hongcheng Guo, Jian Yang, Jiaheng Liu et al.

AAAI 2024paperarXiv:2401.04749
51
citations
#148

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

Li Xiang, Junbo Yin, Wei Li et al.

AAAI 2024paperarXiv:2312.15742
51
citations
#149

Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition

Jianyang Xie, Yanda Meng, Yitian Zhao et al.

AAAI 2024paper
50
citations
#150

Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model

Lingjun Zhang, Xinyuan Chen, Yaohui Wang et al.

AAAI 2024paperarXiv:2312.12232
50
citations
#151

Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning

Samyadeep Basu, Shell Hu, Daniela Massiceti et al.

AAAI 2024paperarXiv:2304.01917
50
citations
#152

CUTS+: High-Dimensional Causal Discovery from Irregular Time-Series

Yuxiao Cheng, Lianglong Li, Tingxiong Xiao et al.

AAAI 2024paperarXiv:2305.05890
50
citations
#153

ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank

Zhanjie Zhang, Quanwei Zhang, Wei Xing et al.

AAAI 2024paperarXiv:2312.06135
49
citations
#154

Feature Fusion from Head to Tail for Long-Tailed Visual Recognition

Mengke Li, Zhikai Hu, Yang Lu et al.

AAAI 2024paperarXiv:2306.06963
49
citations
#155

Gramformer: Learning Crowd Counting via Graph-Modulated Transformer

Hui Lin, Zhiheng Ma, Xiaopeng Hong et al.

AAAI 2024paperarXiv:2401.03870
49
citations
#156

A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators

Chen Zhang, L. F. D’Haro, Yiming Chen et al.

AAAI 2024paperarXiv:2312.15407
49
citations
#157

Improving Audio-Visual Segmentation with Bidirectional Generation

Dawei Hao, Yuxin Mao, Bowen He et al.

AAAI 2024paperarXiv:2308.08288
49
citations
#158

Improving Automatic VQA Evaluation Using Large Language Models

Oscar Mañas, Benno Krojer, Aishwarya Agrawal

AAAI 2024paperarXiv:2310.02567
48
citations
#159

Reinforced Adaptive Knowledge Learning for Multimodal Fake News Detection

Litian Zhang, Xiaoming Zhang, Chaozhuo Li et al.

AAAI 2024paper
48
citations
#160

DeS3: Adaptive Attention-Driven Self and Soft Shadow Removal Using ViT Similarity

Yeying Jin, Wenhan Yang, W. Ye et al.

AAAI 2024paperarXiv:2211.08089
48
citations
#161

Unifying Visual and Vision-Language Tracking via Contrastive Learning

AAAI 2024paperarXiv:2401.11228
47
citations
#162

AltDiffusion: A Multilingual Text-to-Image Diffusion Model

Fulong Ye, Guang Liu, Xinya Wu et al.

AAAI 2024paperarXiv:2308.09991
47
citations
#163

Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following

Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang et al.

AAAI 2024paperarXiv:2302.14691
46
citations
#164

DS-AL: A Dual-Stream Analytic Learning for Exemplar-Free Class-Incremental Learning

Huiping Zhuang, Run He, Kai Tong et al.

AAAI 2024paperarXiv:2403.17503
46
citations
#165

DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

Namhyuk Ahn, Junsoo Lee, Chunggi Lee et al.

AAAI 2024paperarXiv:2309.06933
46
citations
#166

Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning

Alexey Skrynnik, Anton Andreychuk, Maria Nesterova et al.

AAAI 2024paperarXiv:2310.01207
45
citations
#167

DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval

Xiangpeng Yang, Linchao Zhu, Xiaohan Wang et al.

AAAI 2024paperarXiv:2401.10588
45
citations
#168

TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation

Yuhao Wang, Xuehu Liu, Pingping Zhang et al.

AAAI 2024paperarXiv:2312.09612
45
citations
#169

Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification

Zhen-Xiang Ma, Zhen-Duo Chen, Li-Jun Zhao et al.

AAAI 2024paper
45
citations
#170

Fine-Grained Prototypes Distillation for Few-Shot Object Detection

Zichen Wang, Bo Yang, Haonan Yue et al.

AAAI 2024paperarXiv:2401.07629
44
citations
#171

LLMRG: Improving Recommendations through Large Language Model Reasoning Graphs

Yan Wang, Zhixuan Chu, Xin Ouyang et al.

AAAI 2024paper
44
citations
#172

Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries

Xinyi He, Mengyu Zhou, Xinrun Xu et al.

AAAI 2024paperarXiv:2312.13671
44
citations
#173

Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style

Shuai Tan, Bin Ji, Ye Pan

AAAI 2024paperarXiv:2403.06365
44
citations
#174

Unsupervised Continual Anomaly Detection with Contrastively-Learned Prompt

Jiaqi Liu, Kai Wu, Qiang Nie et al.

AAAI 2024paperarXiv:2401.01010
44
citations
#175

Towards Real-World Test-Time Adaptation: Tri-net Self-Training with Balanced Normalization

Yongyi Su, Xun Xu, Kui Jia

AAAI 2024paperarXiv:2309.14949
43
citations
#176

What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection

Xiaohui Zhang, Jiangyan Yi, Chenglong Wang et al.

AAAI 2024paperarXiv:2312.09651
43
citations
#177

TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling

Shimin Zhang, Qu Yang, Chenxiang Ma et al.

AAAI 2024paperarXiv:2308.13250
43
citations
#178

PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine

Chenrui Zhang, Lin Liu, Chuyuan Wang et al.

AAAI 2024paperarXiv:2308.12033
43
citations
#179

Debiasing Multimodal Sarcasm Detection with Contrastive Learning

Mengzhao Jia, Can Xie, Liqiang Jing

AAAI 2024paperarXiv:2312.10493
43
citations
#180

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift

Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang et al.

AAAI 2024paperarXiv:2312.00050
43
citations
#181

Fine-Grained Distillation for Long Document Retrieval

Yucheng Zhou, Tao Shen, Xiubo Geng et al.

AAAI 2024paperarXiv:2212.10423
42
citations
#182

Object-Aware Domain Generalization for Object Detection

WooJu Lee, Dasol Hong, Hyungtae Lim et al.

AAAI 2024paperarXiv:2312.12133
42
citations
#183

EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer

Fei Wang, Dan Guo, Kun Li et al.

AAAI 2024paperarXiv:2312.04152
42
citations
#184

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models

Yubin Wang, Xinyang Jiang, De Cheng et al.

AAAI 2024paperarXiv:2312.06323
42
citations
#185

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

Qingping Zheng, Yuanfan Guo, Jiankang Deng et al.

AAAI 2024paperarXiv:2308.16582
42
citations
#186

Large Language Models Are Neurosymbolic Reasoners

Meng Fang, Shilong Deng, Yudi Zhang et al.

AAAI 2024paperarXiv:2401.09334
41
citations
#187

Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval

Zhihang Liu, Jun Li, Hongtao Xie et al.

AAAI 2024paperarXiv:2312.12155
41
citations
#188

A Diffusion-Based Framework for Multi-Class Anomaly Detection

Haoyang He, Jiangning Zhang, Hongxu Chen et al.

AAAI 2024paperarXiv:2312.06607
41
citations
#189

Attribute-Missing Graph Clustering Network

Wenxuan Tu, Renxiang Guan, Sihang Zhou et al.

AAAI 2024paper
41
citations
#190

Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models

Liang Li, Qingyuan Li, Bo Zhang et al.

AAAI 2024paperarXiv:2309.02784
41
citations
#191

Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Understanding

Taolin Zhang, Sunan He, Tao Dai et al.

AAAI 2024paperarXiv:2305.10714
41
citations
#192

Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector

An Lao, Qi Zhang, Chongyang Shi et al.

AAAI 2024paperarXiv:2312.11023
41
citations
#193

Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models

Shuang Li, Jiangjie Chen, Siyu Yuan et al.

AAAI 2024paperarXiv:2308.13961
40
citations
#194

Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation

Xinshuo Hu, Dongfang Li, Zihao Zheng et al.

AAAI 2024paperarXiv:2308.08090
40
citations
#195

XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning

Pritam Sarkar, Ali Etemad

AAAI 2024paperarXiv:2211.13929
40
citations
#196

DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing

Conglong Li, Zhewei Yao, Xiaoxia Wu et al.

AAAI 2024paperarXiv:2212.03597
40
citations
#197

Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection

Soopil Kim, Sion An, Philip Chikontwe et al.

AAAI 2024paperarXiv:2312.13783
40
citations
#198

How to Protect Copyright Data in Optimization of Large Language Models?

Timothy Chu, Zhao Song, Chiwun Yang

AAAI 2024paperarXiv:2308.12247
40
citations
#199

Rethinking Propagation for Unsupervised Graph Domain Adaptation

Meihan Liu, Zeyu Fang, Zhen Zhang et al.

AAAI 2024paperarXiv:2402.05660
40
citations
#200

Devignet: High-Resolution Vignetting Removal via a Dual Aggregated Fusion Transformer with Adaptive Channel Expansion

Shenghong Luo, Xuhang Chen, Weiwen Chen et al.

AAAI 2024paperarXiv:2308.13739
40
citations