Most Cited AAAI "multi-view reasoning pyramid" Papers
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers
Xinyu Tang, Xiaolei Wang, Wayne Xin Zhao et al.
LAMM: Label Alignment for Multi-Modal Prompt Learning
Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang et al.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu, Yifan Hu, Yi Ren et al.
TopoGCL: Topological Graph Contrastive Learning
Yuzhou Chen, Jose Frias, Yulia Gel
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
Qinliang Lin, Cheng Luo, Zenghao Niu et al.
DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer
Entropic Open-Set Active Learning
Bardia Safaei, Vibashan VS, Celso de Melo et al.
ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing
Zhi Jin, Sheng Xu, Xiang Zhang et al.
Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems
Weibo Gao, Qi Liu, Linan Yue et al.
Higher-Order Graph Convolutional Network with Flower-Petals Laplacians on Simplicial Complexes
Yiming Huang, Yujie Zeng, Qiang Wu et al.
Conditional Variational Autoencoder for Sign Language Translation with Cross-Modal Alignment
Rui Zhao, Liang Zhang, Biao Fu et al.
Chinese Spelling Correction as Rephrasing Language Model
Linfeng Liu, Hongqiu Wu, Hai Zhao
Cross-Covariate Gait Recognition: A Benchmark
Shinan Zou, Chao Fan, Jianbo Xiong et al.
Enriching Multimodal Sentiment Analysis Through Textual Emotional Descriptions of Visual-Audio Content
Sheng Wu, Dongxiao He, Xiaobao Wang et al.
A Generalized Neural Diffusion Framework on Graphs
Yibo Li, Xiao Wang, Hongrui Liu et al.
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
Lin Sun, Kai Zhang, Qingyuan Li et al.
Graphic Design with Large Multimodal Model
Yutao Cheng, Zhao Zhang, Maoke Yang et al.
Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting
Zhicheng Wang, Liwen Xiao, Zhiguo Cao et al.
Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives
Weibo Gao, Qi Liu, Hao Wang et al.
Offline and Online Optical Flow Enhancement for Deep Video Compression
Chuanbo Tang, Xihua Sheng, Zhuoyuan Li et al.
SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding
Tianyu Yu, Chengyue Jiang, Chao Lou et al.
AMD: Autoregressive Motion Diffusion
Bo Han, Hao Peng, Minjing Dong et al.
Learning Domain-Independent Heuristics for Grounded and Lifted Planning
BEV-MAE: Bird’s Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios
ZhiWei Lin, Yongtao Wang, Shengxiang Qi et al.
Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery
Zimian Wei, Peijie Dong, Zheng Hui et al.
NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning
Xin Yi, Shunfan Zheng, Linlin Wang et al.
Frozen CLIP Transformer Is an Efficient Point Cloud Encoder
Xiaoshui Huang, Zhou Huang, Sheng Li et al.
Navigating Open Set Scenarios for Skeleton-Based Action Recognition
Kunyu Peng, Cheng Yin, Junwei Zheng et al.
TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP without Training
Yuqi Lin, Minghao Chen, Kaipeng Zhang et al.
DC-NAS: Divide-and-Conquer Neural Architecture Search for Multi-Modal Classification
Xinyan Liang, Pinhan Fu, Qian Guo et al.
TrackGo: A Flexible and Efficient Method for Controllable Video Generation
Haitao Zhou, Chuang Wang, Rui Nie et al.
Critic-Guided Decision Transformer for Offline Reinforcement Learning
Yuanfu Wang, Chao Yang, Ying Wen et al.
Cooper: Coordinating Specialized Agents towards a Complex Dialogue Goal
Yi Cheng, Wenge Liu, Jian Wang et al.
Say Anything with Any Style
Shuai Tan, Bin Ji, Yu Ding et al.
Automatic Radiology Reports Generation via Memory Alignment Network
Hongyu Shen, Mingtao Pei, Juncai Liu et al.
Polyper: Boundary Sensitive Polyp Segmentation
Hao Shao, Yang Zhang, Qibin Hou
SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks
Meng Lou, Yunxiang Fu, Yizhou Yu
NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement
Marcos Conde, Javier Vazquez-Corral, Michael Brown et al.
Graph Disentangled Contrastive Learning with Personalized Transfer for Cross-Domain Recommendation
Jing Liu, Lele Sun, Wei-zhi Nie et al.
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
Qirui Chen, Shangzhe Di, Weidi Xie
Small Model Can Self-Correct
Haixia Han, Jiaqing Liang, Jie Shi et al.
Out-of-Distribution Detection in Long-Tailed Recognition with Calibrated Outlier Class Learning
Wenjun Miao, Guansong Pang, Xiao Bai et al.
When Hypergraph Meets Heterophily: New Benchmark Datasets and Baseline
Ming Li, Yongchun Gu, Yi Wang et al.
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Yuhao Wang, Yang Liu, Aihua Zheng et al.
Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Spare-Coding Transformer
Lei Su, Xiaochen Ma, Xuekang Zhu et al.
HyperFast: Instant Classification for Tabular Data
David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.
A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu, Yichen Zhu, Ning Liu et al.
Dual Self-Paced Cross-Modal Hashing
Yuan Sun, Jian Dai, Zhenwen Ren et al.
Why Does Dropping Edges Usually Outperform Adding Edges in Graph Contrastive Learning?
Yanchen Xu, Siqi Huang, Hongyuan Zhang et al.
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs
Siyu Wang, Cailian Chen, Xinyi Le et al.
Working Memory Capacity of ChatGPT: An Empirical Study
Dongyu Gong, Xingchen Wan, Dingmin Wang
Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure
Xinying Zou, Samir Perlaza, Inaki Esnaola et al.
FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels
Jichang Li, Guanbin Li, Hui Cheng et al.
Motif-Aware Riemannian Graph Neural Network with Generative-Contrastive Learning
Li Sun, Zhenhao Huang, Zixi Wang et al.
eTag: Class-Incremental Learning via Embedding Distillation and Task-Oriented Generation
Libo Huang, Yan Zeng, Chuanguang Yang et al.
Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning
Jinxin Liu, Ziqi Zhang, Zhenyu Wei et al.
Perception-Guided Jailbreak Against Text-to-Image Models
Yihao Huang, Le Liang, Tianlin Li et al.
Mean Teacher DETR with Masked Feature Alignment: A Robust Domain Adaptive Detection Transformer Framework
Weixi Weng, Chun Yuan
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
Zhangbin Li, Jinxing Zhou, Dan Guo et al.
A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation
Yongkang Wang, Xuan Liu, Feng Huang et al.
DisCo: Graph-Based Disentangled Contrastive Learning for Cold-Start Cross-Domain Recommendation
Hourun Li, Yifan Wang, Zhiping Xiao et al.
Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
Xinyi Wu, Wentao Ma, Dan Guo et al.
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Chanho Lee, Jinsu Son, Hyounguk Shon et al.
DiffuseHigh: Training-Free Progressive High-Resolution Image Synthesis Through Structure Guidance
Younghyun Kim, Geunmin Hwang, Junyu Zhang et al.
SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-Supervised Skeleton-Based Action Recognition
Cong Wu, Xiao-Jun Wu, Josef Kittler et al.
Reinforcement Learning and Data-Generation for Syntax-Guided Synthesis
Equity-Transformer: Solving NP-Hard Min-Max Routing Problems as Sequential Generation with Equity Context
Jiwoo Son, Minsu Kim, Sanghyeok Choi et al.
Recasting Regional Lighting for Shadow Removal
Yuhao Liu, Zhanghan Ke, Ke Xu et al.
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation
Chengyou Jia, Minnan Luo, Zhuohang Dang et al.
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
Jing Cui, Yufei Han, Yuzhe Ma et al.
DTL: Disentangled Transfer Learning for Visual Recognition
Minghao Fu, Ke Zhu, Jianxin Wu
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
Wentao Mo, Yang Liu
Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing
Xinghe Fu, Zhiyuan Yan, Taiping Yao et al.
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
Baichuan Zhou, Haote Yang, Dairong Chen et al.
Open Models, Closed Minds? On Agents Capabilities in Mimicking Human Personalities through Open Large Language Models
Lucio La Cava, Andrea Tagarelli
Multi-Class Support Vector Machine with Maximizing Minimum Margin
Feiping Nie, Zhezheng Hao, Rong Wang
Debate on Graph: A Flexible and Reliable Reasoning Framework for Large Language Models
Jie Ma, Zhitao Gao, Qi Chai et al.
Robust Tracking via Mamba-based Context-aware Token Learning
Jinxia Xie, Bineng Zhong, Qihua Liang et al.
BLADE: Enhancing Black-Box Large Language Models with Small Domain-Specific Models
Haitao Li, Qingyao Ai, Jia Chen et al.
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao et al.
GroundVLP: Harnessing Zero-Shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen, Tiancheng Zhao, Mingwei Zhu et al.
SGFormer: Semantic Graph Transformer for Point Cloud-Based 3D Scene Graph Generation
Changsheng Lv, Mengshi Qi, Xia Li et al.
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Yinmin Zhang, Jie Liu, Chuming Li et al.
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.
ParGo: Bridging Vision-Language with Partial and Global Views
An-Lan Wang, Bin Shan, Wei Shi et al.
AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding
Yunlong Tang, Daiki Shimada, Jing Bi et al.
CLIM: Contrastive Language-Image Mosaic for Region Representation
Size Wu, Wenwei Zhang, Lumin XU et al.
Task-Agnostic Privacy-Preserving Representation Learning for Federated Learning against Attribute Inference Attacks
Caridad Arroyo Arevalo, Sayedeh Leila Noorbakhsh, Yun Dong et al.
SasWOT: Real-Time Semantic Segmentation Architecture Search WithOut Training
Chendi Zhu, Lujun Li, Yuli Wu et al.
A Dual Stealthy Backdoor: From Both Spatial and Frequency Perspectives
Yudong Gao, Honglong Chen, Peng Sun et al.
MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction
Zixuan Gong, Qi Zhang, Guangyin Bao et al.
Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine
Xiaoshuang Huang, Lingdong Shen, Jia Liu et al.
EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading
Molei Qin, Shuo Sun, Wentao Zhang et al.
Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-Wise Linear Modulation
Rongyu Zhang, Yulin Luo, Jiaming Liu et al.
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models
Kazi Hasan Ibn Arif, JinYi Yoon, Dimitrios S. Nikolopoulos et al.
MagiCapture: High-Resolution Multi-Concept Portrait Customization
Junha Hyung, Jaeyo Shin, Jaegul Choo
VITA: ‘Carefully Chosen and Weighted Less’ Is Better in Medication Recommendation
VQCNIR: Clearer Night Image Restoration with Vector-Quantized Codebook
Wenbin Zou, Hongxia Gao, Tian Ye et al.
Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models
Hongbang Yuan, Zhuoran Jin, Pengfei Cao et al.
BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion
Huafeng Li, Dayong Su, Qing Cai et al.
LLMEmb: Large Language Model Can Be a Good Embedding Generator for Sequential Recommendation
Qidong Liu, Xian Wu, Wanyu Wang et al.
Disguise without Disruption: Utility-Preserving Face De-identification
Zikui Cai, Zhongpai Gao, Benjamin Planche et al.
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models
Xiao Cui, Mo Zhu, Yulei Qin et al.
Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training
Yunwei Lan, Zhigao Cui, Chang Liu et al.
Catalyst for Clustering-Based Unsupervised Object Re-identification: Feature Calibration
Huafeng Li, Qingsong Hu, Zhanxuan Hu
Text-to-Image Generation for Abstract Concepts
Jiayi Liao, Xu Chen, Qiang Fu et al.
Runtime Analysis of the SMS-EMOA for Many-Objective Optimization
Weijie Zheng, Benjamin Doerr
Decoupled Contrastive Learning for Long-Tailed Recognition
Shiyu Xuan, Shiliang Zhang
NodeMixup: Tackling Under-Reaching for Graph Neural Networks
Weigang Lu, Ziyu Guan, Wei Zhao et al.
G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection
Fan Wu, Jinling Gao, Lanqing Hong et al.
FastLGS: Speeding Up Language Embedded Gaussians with Feature Grid Mapping
Yuzhou Ji, He Zhu, Junshu Tang et al.
MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt
Yuhao Wang, Xuehu Liu, Tianyu Yan et al.
EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding
Muye Huang, Han Lai, Xinyu Zhang et al.
SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation
Xiaoqi An, Lin Zhao, Chen Gong et al.
Adversarial Socialbots Modeling Based on Structural Information Principles
Xianghua Zeng, Hao Peng, Angsheng Li
OpenVIS: Open-vocabulary Video Instance Segmentation
Pinxue Guo, Hao Huang, Peiyang He et al.
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination
Kaicheng Yang, Tiancheng Gu, Xiang An et al.
Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification
Bohan Li, Xiao Xu, Xinghao Wang et al.
Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-Based Similarity
Van Thuy Hoang, O-Joun Lee
HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations
Yilan Dong, Chunlin Yu, Ruiyang Ha et al.
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
Yun Qu, Yuhang Jiang, Boyuan Wang et al.
Dynamic Weighted Combiner for Mixed-Modal Image Retrieval
Fuxiang Huang, Lei Zhang, Xiaowei Fu et al.
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su, Judith Li, Qingqing Huang et al.
Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
Does Few-Shot Learning Suffer from Backdoor Attacks?
Xinwei Liu, Xiaojun Jia, Jindong Gu et al.
NightHaze: Nighttime Image Dehazing via Self-Prior Learning
Beibei Lin, Yeying Jin, Wending Yan et al.
Leveraging Large Language Models for Node Generation in Few-Shot Learning on Text-Attributed Graphs
Jianxiang Yu, Yuxiang Ren, Chenghua Gong et al.
Graph Contrastive Invariant Learning from the Causal Perspective
Yanhu Mo, Xiao Wang, Shaohua Fan et al.
MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis
Wenhao Guan, Yishuang Li, Tao Li et al.
WPMixer: Efficient Multi-Resolution Mixing for Long-Term Time Series Forecasting
Md Mahmuddun Nabi Murad, Mehmet Aktukmak, Yasin Yilmaz
GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
Yuting Wang, Jinpeng Wang, Bin Chen et al.
SCP: Spherical-Coordinate-Based Learned Point Cloud Compression
Ao Luo, Linxin Song, Keisuke Nonaka et al.
Deep Active Learning with Noise Stability
Xingjian Li, Pengkun Yang, Yangcheng Gu et al.
Non-exemplar Online Class-Incremental Continual Learning via Dual-Prototype Self-Augment and Refinement
Fushuo Huo, Wenchao Xu, Jingcai Guo et al.
Generalizable Sleep Staging via Multi-Level Domain Alignment
Jiquan Wang, Sha Zhao, Haiteng Jiang et al.
Learning to Reweight for Generalizable Graph Neural Network
Zhengyu Chen, Teng Xiao, Kun Kuang et al.
MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation
Yanzuo Lu, Meng Shen, Andy J Ma et al.
NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction
Beibei Lin, Yeying Jin, Wending Yan et al.
AdapterGNN: Parameter-Efficient Fine-Tuning Improves Generalization in GNNs
Shengrui Li, Xueting Han, Jing Bai
Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models
Angela Castillo, Jonas Kohler, Juan C. Pérez et al.
SCANS: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering
Zouying Cao, Yifei Yang, Hai Zhao
MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance
Ernie Chu, Tzuhsuan Huang, Shuo-Yen LIN et al.
L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection
Xun Huang, Ziyu Xu, Hai Wu et al.
$z$-SignFedAvg: A Unified Stochastic Sign-Based Compression for Federated Learning
Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang
CARAT: Contrastive Feature Reconstruction and Aggregation for Multi-Modal Multi-Label Emotion Recognition
Cheng Peng, Ke Chen, Lidan Shou et al.
Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation
Xiaohan Cui, Long Ma, Tengyu Ma et al.
PNeRFLoc: Visual Localization with Point-Based Neural Radiance Fields
Boming Zhao, Luwei Yang, Mao Mao et al.
Learning Time Slot Preferences via Mobility Tree for Next POI Recommendation
Tianhao Huang, Xuan Pan, Xiangrui Cai et al.
AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis
Dongze Li, Kang Zhao, Wei Wang et al.
Sterling: Synergistic Representation Learning on Bipartite Graphs
Baoyu Jing, Yuchen Yan, Kaize Ding et al.
WeditGAN: Few-Shot Image Generation via Latent Space Relocation
Yuxuan Duan, Li Niu, Yan Hong et al.
LLMEval: A Preliminary Study on How to Evaluate Large Language Models
Yue Zhang, Ming Zhang, HaiPeng Yuan et al.
Hyperspectral Image Reconstruction via Combinatorial Embedding of Cross-Channel Spatio-Spectral Clues
Xingxing Yang, Jie Chen, Zaifeng Yang
Numerical Pruning for Efficient Autoregressive Models
Xuan Shen, Zhao Song, Yufa Zhou et al.
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
Jian Ma, Yonglin Deng, Chen Chen et al.
Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems
Junhao Shen, Hong Qian, Wei Zhang et al.
Retrieval-Augmented Primitive Representations for Compositional Zero-Shot Learning
Chenchen Jing, Yukun Li, Hao Chen et al.
Effective Diffusion Transformer Architecture for Image Super-Resolution
Kun Cheng, Lei Yu, Zhijun Tu et al.
ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models
Jiaxiang Cheng, Pan Xie, Xin Xia et al.
Multi-Domain Incremental Learning for Face Presentation Attack Detection
Keyao Wang, Guosheng Zhang, Haixiao Yue et al.
Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature
Wu Yun, Mengshi Qi, Chuanming Wang et al.
On the Role of Server Momentum in Federated Learning
Jianhui Sun, Xidong Wu, Heng Huang et al.
Beyond Mimicking Under-Represented Emotions: Deep Data Augmentation with Emotional Subspace Constraints for EEG-Based Emotion Recognition
Zhi ZHANG, Sheng-hua Zhong, Yan Liu
Summarizing Stream Data for Memory-Constrained Online Continual Learning
Jianyang Gu, Kai Wang, Wei Jiang et al.
Learning to Prompt Knowledge Transfer for Open-World Continual Learning
Yujie Li, Xin Yang, Hao Wang et al.
DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation
Qiming Zhu, Jialun Cao, Yaojie Lu et al.
Simple Image-Level Classification Improves Open-Vocabulary Object Detection
Ruohuan Fang, Guansong Pang, Xiao Bai
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
Soham Deshmukh, Shuo Han, Hazim Bukhari et al.
A Diffusion-Based Pre-training Framework for Crystal Property Prediction
Zixing Song, Ziqiao Meng, Irwin King
Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval
Weihang Su, Qingyao Ai, Xiangsheng Li et al.
B2Opt: Learning to Optimize Black-box Optimization with Little Budget
Xiaobin Li, Kai Wu, Xiaoyu Zhang et al.
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation
Hui Fu, Zeqing Wang, Ke Gong et al.
Game4Loc: A UAV Geo-Localization Benchmark from Game Data
Yuxiang Ji, Boyong He, Zhuoyue Tan et al.
Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment
Feature Transportation Improves Graph Neural Networks
Moshe Eliasof, Eldad Haber, Eran Treister
How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?
Soumya Suvra Ghosal, Yiyou Sun, Yixuan Li
Sketch and Refine: Towards Fast and Accurate Lane Detection
Chao Chen, Jie Liu, Chang Zhou et al.
Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-Free Multi-Exposure Image Fusion
Guanyao Wu, Hongming Fu, Jinyuan Liu et al.
FoldToken: Learning Protein Language via Vector Quantization and Beyond
Zhangyang Gao, Cheng Tan, Jue Wang et al.
Bi-ViT: Pushing the Limit of Vision Transformer Quantization
Yanjing Li, Sheng Xu, Mingbao Lin et al.
CLIP-Gaze: Towards General Gaze Estimation via Visual-Linguistic Model
Pengwei Yin, Guanzhong Zeng, Jingjing Wang et al.
GCNext: Towards the Unity of Graph Convolutions for Human Motion Prediction
Xinshun Wang, Qiongjie Cui, Chen Chen et al.
SAMFlow: Eliminating Any Fragmentation in Optical Flow with Segment Anything Model
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
Dongmei Zhang, Chang Li, Renrui Zhang et al.
GridFormer: Point-Grid Transformer for Surface Reconstruction
Shengtao Li, Ge Gao, Yudong Liu et al.
Convolutional Channel-Wise Competitive Learning for the Forward-Forward Algorithm
Andreas Papachristodoulou, Christos Kyrkou, Stelios Timotheou et al.
DiffAIL: Diffusion Adversarial Imitation Learning
Bingzheng Wang, Guoqiang Wu, Teng Pang et al.
Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models
Yuanzhao Zhai, Tingkai Yang, Kele Xu et al.
Towards Explainable Joint Models via Information Theory for Multiple Intent Detection and Slot Filling
Xianwei Zhuang, Xuxin Cheng, Yuexian Zou
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.
Decoding AI’s Nudge: A Unified Framework to Predict Human Behavior in AI-Assisted Decision Making
Zhuoyan Li, Zhuoran Lu, Ming Yin
MV-VTON: Multi-View Virtual Try-On with Diffusion Models
Haoyu Wang, Zhilu Zhang, Donglin Di et al.
UniGen: A Unified Generative Framework for Retrieval and Question Answering with Large Language Models
Xiaoxi Li, Yujia Zhou, Zhicheng Dou
EDA: Evolving and Distinct Anchors for Multimodal Motion Prediction
Longzhong Lin, Xuewu Lin, Tianwei Lin et al.
Is Sarcasm Detection a Step-by-Step Reasoning Process in Large Language Models?
Ben Yao, Yazhou Zhang, Qiuchi Li et al.
T2MAC: Targeted and Trusted Multi-Agent Communication through Selective Engagement and Evidence-Driven Integration
Chuxiong Sun, Zehua Zang, Jiabao Li et al.
Aligning Geometric Spatial Layout in Cross-View Geo-Localization via Feature Recombination
Qingwang Zhang, Yingying Zhu
Question Calibration and Multi-Hop Modeling for Temporal Question Answering
Chao Xue, Di Liang, Pengfei Wang et al.