Most Cited AAAI 2024 "aggregate metrics" Papers
2,289 papers found
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion
Chong Mou, Xintao Wang, Liangbin Xie et al.
Benchmarking Large Language Models in Retrieval-Augmented Generation
Jiawei Chen, Hongyu Lin, Xianpei Han et al.
Preference Ranking Optimization for Human Alignment
Feifan Song, Bowen Yu, Minghao Li et al.
Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos
Yue Ma, Yingqing He, Xiaodong Cun et al.
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou, Yicong Hong, Qi Wu
NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving
Tianwen Qian, Jingjing Chen, Linhai Zhuo et al.
MedSegDiff-V2: Diffusion-based Medical Image Segmentation with Transformer
Junde Wu, Wei Ji, Huazhu Fu et al.
Detecting and Preventing Hallucinations in Large Vision Language Models
Anisha Gunjal, Jihan Yin, Erhan Bas
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Zhaopeng Gu, Bingke Zhu, Guibo Zhu et al.
Omni-Kernel Network for Image Restoration
Yuning Cui, Wenqi Ren, Alois Knoll
Knowledge Graph Prompting for Multi-Document Question Answering
Yu Wang, Nedim Lipka, Ryan A. Rossi et al.
Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-World Multi-Turn Dialogue
Songhua Yang, Hanjie Zhao, Senbin Zhu et al.
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu, Yifan Xu, Yi Li et al.
MSGNet: Learning Multi-Scale Inter-series Correlations for Multivariate Time Series Forecasting
Wanlin Cai, Yuxuan Liang, Xianggen Liu et al.
ODTrack: Online Dense Temporal Token Learning for Visual Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
Fast Machine Unlearning without Retraining through Selective Synaptic Dampening
Jack Foster, Stefan Schoepf, Alexandra Brintrup
VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection
Peng Wu, Xuerong Zhou, Guansong Pang et al.
ResDiff: Combining CNN and Diffusion Model for Image Super-resolution
Shuyao Shang, Zhengyang Shan, Guangxing Liu et al.
Task Contamination: Language Models May Not Be Few-Shot Anymore
Changmao Li, Jeffrey Flanigan
SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research
Liangtai Sun, Yang Han, Zihan Zhao et al.
SCTNet: Single Branch CNN with Transformer Semantic Information for Real-Time Segmentation
Zhengze Xu, Dongyue Wu, Changqian Yu et al.
FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering
Zhenyu Li, Sunqi Fan, Yu Gu et al.
SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation
Wenxi Yue, Jing Zhang, Kun Hu et al.
Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting
Xinyan Guan, Yanjiang Liu, Hongyu Lin et al.
Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations
Likang Wu, Zhaopeng Qiu, Zhi Zheng et al.
TimesURL: Self-Supervised Contrastive Learning for Universal Time Series Representation Learning
Jiexi Liu, Songcan Chen
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee, Jungyu Jin, Taesu Kim et al.
Fully-Connected Spatial-Temporal Graph for Multivariate Time-Series Data
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention
Yehjin Shin, Jeongwhan Choi, Hyowon Wi et al.
Rolling-Unet: Revitalizing MLP’s Ability to Efficiently Extract Long-Distance Dependencies for Medical Image Segmentation
Yutong Liu, Haijiang Zhu, Mengting Liu et al.
UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation
Kefu Yi, Kai Luo, Xiaolei Luo et al.
Fluctuation-Based Adaptive Structured Pruning for Large Language Models
Yongqi An, Xu Zhao, Tao Yu et al.
An Empirical Study of CLIP for Text-Based Person Search
Cao Min, Yang Bai, Ziyin Zeng et al.
Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Ruichen Wang, Zekang Chen, Chen Chen et al.
PointAttN: You Only Need Attention for Point Cloud Completion
Jun Wang, Ying Cui, Dongyan Guo et al.
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Taylor Sorensen, Liwei Jiang, Jena Hwang et al.
Decoupled Contrastive Multi-View Clustering with High-Order Random Walks
Yiding Lu, Yijie Lin, Mouxing Yang et al.
Reliable Conflictive Multi-View Learning
Cai Xu, Jiajun Si, Ziyu Guan et al.
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning
Haokun Chen, Yao Zhang, Denis Krompass et al.
VIGC: Visual Instruction Generation and Correction
Bin Wang, Fan Wu, Xiao Han et al.
Prompt-Based Distribution Alignment for Unsupervised Domain Adaptation
Shuanghao Bai, Min Zhang, Wanqi Zhou et al.
VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding
Yi Xin, Junlong Du, Qiang Wang et al.
DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection
Yunfan Ye, Yuhang Huang, Renjiao Yi et al.
KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning
Debjyoti Mondal, Suraj Modi, Subhadarshi Panda et al.
GLOP: Learning Global Partition and Local Construction for Solving Large-Scale Routing Problems in Real-Time
Haoran Ye, Jiarui Wang, Helan Liang et al.
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.
MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
Baoquan Zhang, Chuyao Luo, Demin Yu et al.
FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
Zhenhua Yang, Dezhi Peng, Yuxin Kong et al.
V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
Heng Wang, Jianbo Ma, Santiago Pascual et al.
Graph Neural Prompting with Large Language Models
Yijun Tian, Huan Song, Zichen Wang et al.
Enhancing Job Recommendation through LLM-Based Generative Adversarial Networks
Yingpeng Du, Di Luo, Rui Yan et al.
Temporal Adaptive RGBT Tracking with Modality Prompt
Hongyu Wang, Xiaotao Liu, Yifan Li et al.
FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-Aware Model Update
Ji Liu, Juncheng Jia, Tianshi Che et al.
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
Zhewei Yao, Xiaoxia Wu, Cheng Li et al.
SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation
Malyaban Bal, Abhronil Sengupta
Plug-In Diffusion Model for Sequential Recommendation
Haokai Ma, Ruobing Xie, Lei Meng et al.
Learning to Rank in Generative Retrieval
Yongqi Li, Nan Yang, Liang Wang et al.
NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields
Junge Zhang, Feihu Zhang, Shaochen Kuang et al.
Make RepVGG Greater Again: A Quantization-Aware Approach
Xiangxiang Chu, Liang Li, Bo Zhang
DiffusionTrack: Diffusion Model for Multi-Object Tracking
Run Luo, Zikai Song, Lintao Ma et al.
HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors
Xiao Wang, Zongzhen Wu, Bo Jiang et al.
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders
Yaohua Zha, Huizhen Ji, Jinmin Li et al.
Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
Yuheng Chen, Pengfei Cao, Yubo Chen et al.
HGPrompt: Bridging Homogeneous and Heterogeneous Graphs for Few-Shot Prompt Learning
Xingtong Yu, Yuan Fang, Zemin Liu et al.
Ternary Spike: Learning Ternary Spikes for Spiking Neural Networks
Yufei Guo, Yuanpei Chen, Xiaode Liu et al.
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan, Renrui Zhang, Ziyu Guo et al.
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju, Peng Tang, Qi Dong et al.
Large Language Models Are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales
Taeyoon Kwon, Kai Ong, Dongjin Kang et al.
PC-Conv: Unifying Homophily and Heterophily with Two-Fold Filtering
Bingheng Li, Erlin Pan, Zhao Kang
MASTER: Market-Guided Stock Transformer for Stock Price Forecasting
Tong Li, Zhaoyang Liu, Yanyan Shen et al.
Correlation Matching Transformation Transformers for UHD Image Restoration
Cong Wang, Jinshan Pan, Wei Wang et al.
Editing Language Model-Based Knowledge Graph Embeddings
Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval
Yuanmin Tang, Jing Yu, Keke Gai et al.
SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency
Cong Wang, Jinshan Pan, Wanyu Lin et al.
SECap: Speech Emotion Captioning with Large Language Model
Yaoxun Xu, Hangting Chen, Jianwei Yu et al.
Delving into Multimodal Prompting for Fine-Grained Visual Classification
Xin Jiang, Hao Tang, Junyao Gao et al.
DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency
Wenfang Yao, Kejing Yin, William Cheung et al.
Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy
Yu Fu, Deyi Xiong, Yue Dong
TEILP: Time Prediction over Knowledge Graphs via Logical Reasoning
Siheng Xiong, Yuan Yang, Ali Payani et al.
VLCounter: Text-Aware Visual Representation for Zero-Shot Object Counting
Seunggu Kang, WonJun Moon, Euiyeon Kim et al.
GAMC: An Unsupervised Method for Fake News Detection Using Graph Autoencoder with Masking
Shu Yin, Peican Zhu, Lianwei Wu et al.
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction
Yucheng Li, Frank Guerin, Chenghua Lin
SQLdepth: Generalizable Self-Supervised Fine-Structured Monocular Depth Estimation
Dong Wu, Mingmin Chi, Xuan Zang et al.
Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark
Fangjun Li, David C. Hogg, Anthony G. Cohn
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations
Yufeng Huang, Jiji Tang, Zhuo Chen et al.
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators
Chen Zhang, L. F. D’Haro, Yiming Chen et al.
Feature Fusion from Head to Tail for Long-Tailed Visual Recognition
Mengke Li, Zhikai Hu, Yang Lu et al.
EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering
Junjue Wang, Zhuo Zheng, Zihang Chen et al.
LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection
Hongcheng Guo, Jian Yang, Jiaheng Liu et al.
S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention
Chiyu Zhang, Xiaogang Xu, Lei Wang et al.
TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation
Yuhao Wang, Xuehu Liu, Pingping Zhang et al.
Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification
Zhen-Xiang Ma, Zhen-Duo Chen, Li-Jun Zhao et al.
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
Namhyuk Ahn, Junsoo Lee, Chunggi Lee et al.
Fine-Grained Prototypes Distillation for Few-Shot Object Detection
Zichen Wang, Bo Yang, Haonan Yue et al.
Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style
Shuai Tan, Bin Ji, Ye Pan
Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries
Xinyi He, Mengyu Zhou, Xinrun Xu et al.
Debiasing Multimodal Sarcasm Detection with Contrastive Learning
Mengzhao Jia, Can Xie, Liqiang Jing
PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine
Chenrui Zhang, Lin Liu, Chuyuan Wang et al.
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling
Shimin Zhang, Qu Yang, Chenxiang Ma et al.
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
Zhihang Liu, Jun Li, Hongtao Xie et al.
Devignet: High-Resolution Vignetting Removal via a Dual Aggregated Fusion Transformer with Adaptive Channel Expansion
Shenghong Luo, Xuhang Chen, Weiwen Chen et al.
A Diffusion-Based Framework for Multi-Class Anomaly Detection
Haoyang He, Jiangning Zhang, Hongxu Chen et al.
Attribute-Missing Graph Clustering Network
Wenxuan Tu, Renxiang Guan, Sihang Zhou et al.
Controllable Mind Visual Diffusion Model
Bohan Zeng, Shanglin Li, Xuhui Liu et al.
Towards Continual Knowledge Graph Embedding via Incremental Distillation
Jiajun Liu, Ke Wenjun, Peng Wang et al.
Multi-Architecture Multi-Expert Diffusion Models
Yunsung Lee, Jin-Young Kim, Hyojun Go et al.
No Prejudice! Fair Federated Graph Neural Networks for Personalized Recommendation
Nimesh Agrawal, Anuj Sirohi, Sandeep Kumar et al.
Text-Guided Molecule Generation with Diffusion Language Model
Haisong Gong, Qiang Liu, Shu Wu et al.
Latent Space Editing in Transformer-Based Flow Matching
Vincent Tao Hu, Wei Zhang, Meng Tang et al.
Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector
An Lao, Qi Zhang, Chongyang Shi et al.
Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection
Soopil Kim, Sion An, Philip Chikontwe et al.
XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning
Pritam Sarkar, Ali Etemad
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
Yaoting Wang, Liu Weisong, Guangyao Li et al.
Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning
Shangchao Su, Mingzhao Yang, Bin Li et al.
STEM: Unleashing the Power of Embeddings for Multi-Task Recommendation
Liangcai Su, Junwei Pan, Ximei Wang et al.
SlowTrack: Increasing the Latency of Camera-Based Perception in Autonomous Driving Using Adversarial Examples
Chen Ma, Ningfei Wang, Qi Alfred Chen et al.
MathAttack: Attacking Large Language Models towards Math Solving Ability
Zihao Zhou, Qiufeng Wang, Mingyu Jin et al.
DiffBEV: Conditional Diffusion Model for Bird’s Eye View Perception
Jiayu Zou, Kun Tian, Zheng Zhu et al.
U-mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting
Xiang Ma, Xuemei Li, Lexin Fang et al.
Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models
Shuang Li, Jiangjie Chen, Siyu Yuan et al.
Exploiting Label Skews in Federated Learning with Model Concatenation
Yiqun Diao, Qinbin Li, Bingsheng He
SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization
Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan, Yuan Yuan, Zhitong Xiong
NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views
Han Huang, Yulun Wu, Junsheng Zhou et al.
LION: Implicit Vision Prompt Tuning
Haixin Wang, Jianlong Chang, Yihang Zhai et al.
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model
Decheng Liu, Xijun Wang, Chunlei Peng et al.
Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
Jinsong Shi, Pan Gao, Jie Qin
Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models
Liqi He, Zuchao Li, Xiantao Cai et al.
Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification
Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang, Ray Zhang, Zoey Guo et al.
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
Yi Zhang, Ce Zhang, Ke Yu et al.
FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization
Cheng Yang, Jixi Liu, Yunhe Yan et al.
Explaining Generalization Power of a DNN Using Interactive Concepts
Huilin Zhou, Hao Zhang, Huiqi Deng et al.
Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement
Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.
Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Hao Li et al.
Provably Powerful Graph Neural Networks for Directed Multigraphs
Beni Egressy, Luc von Niederhäusern, Jovan Blanuša et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
Rethinking Graph Masked Autoencoders through Alignment and Uniformity
Liang Wang, Xiang Tao, Qiang Liu et al.
CFR-ICL: Cascade-Forward Refinement with Iterative Click Loss for Interactive Image Segmentation
Shoukun Sun, Min Xian, Fei Xu et al.
Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction
Senqiao Yang, Jiarui Wu, Jiaming Liu et al.
Graph-Aware Contrasting for Multivariate Time-Series Classification
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization
Tianrui Jia, Haoyang Li, Cheng Yang et al.
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
Tianxiang Chen, Zhentao Tan, Qi Chu et al.
Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed
Yubin Xiao, Di Wang, Boyang Li et al.
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
Pan Xie, Qipeng Zhang, Peng Taiying et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
Saebom Leem, Hyunseok Seo
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo, Jianguo Mao, Tao Rui et al.
G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks
Anchun Gui, Jinqiang Ye, Han Xiao
Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.
Deep Contrastive Graph Learning with Clustering-Oriented Guidance
Mulin Chen, Bocheng Wang, Xuelong Li
Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation
Xiaoyi Bao, Jie Qin, Siyang Sun et al.
TopoGCL: Topological Graph Contrastive Learning
Yuzhou Chen, Jose Frias, Yulia Gel
Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification
Zhiwei Zhao, Bin Liu, Yan Lu et al.
Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles
Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer et al.
Root Cause Analysis in Microservice Using Neural Granger Causal Discovery
Cheng-Ming Lin, Ching Chang, Wei-Yao Wang et al.
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
Lin Sun, Kai Zhang, Qingyuan Li et al.
Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives
Weibo Gao, Qi Liu, Hao Wang et al.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu, Yifan Hu, Yi Ren et al.
Chinese Spelling Correction as Rephrasing Language Model
Linfeng Liu, Hongqiu Wu, Hai Zhao
Entropic Open-Set Active Learning
Bardia Safaei, Vibashan VS, Celso de Melo et al.
DC-NAS: Divide-and-Conquer Neural Architecture Search for Multi-Modal Classification
Xinyan Liang, Pinhan Fu, Qian Guo et al.
LAMM: Label Alignment for Multi-Modal Prompt Learning
Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang et al.
Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting
Zhicheng Wang, Liwen Xiao, Zhiguo Cao et al.
Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery
Zimian Wei, Peijie Dong, Zheng Hui et al.
Region-Disentangled Diffusion Model for High-Fidelity PPG-to-ECG Translation
Debaditya Shome, Pritam Sarkar, Ali Etemad
A Generalized Neural Diffusion Framework on Graphs
Yibo Li, Xiao Wang, Hongrui Liu et al.
Cooper: Coordinating Specialized Agents towards a Complex Dialogue Goal
Yi Cheng, Wenge Liu, Jian Wang et al.
Higher-Order Graph Convolutional Network with Flower-Petals Laplacians on Simplicial Complexes
Yiming Huang, Yujie Zeng, Qiang Wu et al.
eTag: Class-Incremental Learning via Embedding Distillation and Task-Oriented Generation
Libo Huang, Yan Zeng, Chuanguang Yang et al.
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
Wentao Mo, Yang Liu
Motif-Aware Riemannian Graph Neural Network with Generative-Contrastive Learning
Li Sun, Zhenhao Huang, Zixi Wang et al.
Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
Xinyi Wu, Wentao Ma, Dan Guo et al.
Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure
Xinying Zou, Samir Perlaza, Inaki Esnaola et al.
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Chanho Lee, Jinsu Son, Hyounguk Shon et al.
Navigating Open Set Scenarios for Skeleton-Based Action Recognition
Kunyu Peng, Cheng Yin, Junwei Zheng et al.
HyperFast: Instant Classification for Tabular Data
David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation
Chengyou Jia, Minnan Luo, Zhuohang Dang et al.
Automatic Radiology Reports Generation via Memory Alignment Network
Hongyu Shen, Mingtao Pei, Juncai Liu et al.
Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning
Jinxin Liu, Ziqi Zhang, Zhenyu Wei et al.
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
Jing Cui, Yufei Han, Yuzhe Ma et al.
Small Model Can Self-Correct
Haixia Han, Jiaqing Liang, Jie Shi et al.
Offline and Online Optical Flow Enhancement for Deep Video Compression
Chuanbo Tang, Xihua Sheng, Zhuoyuan Li et al.
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao et al.
Multi-Class Support Vector Machine with Maximizing Minimum Margin
Feiping Nie, Zhezheng Hao, Rong Wang
Out-of-Distribution Detection in Long-Tailed Recognition with Calibrated Outlier Class Learning
Wenjun Miao, Guansong Pang, Xiao Bai et al.
DTL: Disentangled Transfer Learning for Visual Recognition
Minghao Fu, Ke Zhu, Jianxin Wu
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Yinmin Zhang, Jie Liu, Chuming Li et al.
Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-Wise Linear Modulation
Rongyu Zhang, Yulin Luo, Jiaming Liu et al.
EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading
Molei Qin, Shuo Sun, Wentao Zhang et al.
Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification
Bohan Li, Xiao Xu, Xinghao Wang et al.
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.
AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer
CLIM: Contrastive Language-Image Mosaic for Region Representation
Size Wu, Wenwei Zhang, Lumin Xu et al.
SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-Supervised Skeleton-Based Action Recognition
Cong Wu, Xiao-Jun Wu, Josef Kittler et al.
NodeMixup: Tackling Under-Reaching for Graph Neural Networks
Weigang Lu, Ziyu Guan, Wei Zhao et al.
Runtime Analysis of the SMS-EMOA for Many-Objective Optimization
Weijie Zheng, Benjamin Doerr
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
Zhangbin Li, Jinxing Zhou, Dan Guo et al.
Does Few-Shot Learning Suffer from Backdoor Attacks?
Xinwei Liu, Xiaojun Jia, Jindong Gu et al.