Most Cited AAAI "span mask pre-training" Papers
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Mushui Liu, Yuhang Ma, Zhen Yang et al.
MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL
Arian Askari, Christian Poelitz, Xinye Tang
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models
Lingzhi Wang, Xingshan Zeng, Jinsong Guo et al.
Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement
Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.
FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization
Cheng Yang, Jixi Liu, Yunhe Yan et al.
SCALM: Detecting Bad Practices in Smart Contracts Through LLMs
Zongwei Li, Xiaoqi Li, Wenkai Li et al.
Explaining Generalization Power of a DNN Using Interactive Concepts
Huilin Zhou, Hao Zhang, Huiqi Deng et al.
Stable-Hair: Real-World Hair Transfer via Diffusion Model
Yuxuan Zhang, Qing Zhang, Yiren Song et al.
Provably Powerful Graph Neural Networks for Directed Multigraphs
Beni Egressy, Luc von Niederhäusern, Jovan Blanuša et al.
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
Yi Zhang, Ce Zhang, Ke Yu et al.
Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Hao Li et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
Pan Xie, Qipeng Zhang, Peng Taiying et al.
Rethinking Graph Masked Autoencoders through Alignment and Uniformity
Liang Wang, Xiang Tao, Qiang Liu et al.
Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation
Xiaoyi Bao, Jie Qin, Siyang Sun et al.
CFR-ICL: Cascade-Forward Refinement with Iterative Click Loss for Interactive Image Segmentation
Shoukun Sun, Min Xian, Fei Xu et al.
Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction
Senqiao Yang, Jiarui Wu, Jiaming Liu et al.
Test-Time Domain Adaptation by Learning Domain-Aware Batch Normalization
Yanan Wu, Zhixiang Chi, Yang Wang et al.
Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization
Tianrui Jia, Haoyang Li, Cheng Yang et al.
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo, Jianguo Mao, Tao Rui et al.
Graph-Aware Contrasting for Multivariate Time-Series Classification
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation
Derong Xu, Xinhang Li, Ziheng Zhang et al.
Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.
PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning
Qingdong He, Jiangning Zhang, Jinlong Peng et al.
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
Tianxiang Chen, Zhentao Tan, Qi Chu et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks
Anchun Gui, Jinqiang Ye, Han Xiao
Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed
Yubin Xiao, Di Wang, Boyang Li et al.
Guided Real Image Dehazing Using YCbCr Color Space
Wenxuan Fang, Junkai Fan, Yu Zheng et al.
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
Barys Liskavets, Maxim Ushakov, Shuvendu Roy et al.
Root Cause Analysis in Microservice Using Neural Granger Causal Discovery
Cheng-Ming Lin, Ching Chang, Wei-Yao Wang et al.
Learning Generalized Medical Image Segmentation from Decoupled Feature Queries
Qi Bi, Jingjun Yi, Hao Zheng et al.
Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
Saebom Leem, Hyunseok Seo
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models
Zihui Cheng, Qiguang Chen, Jin Zhang et al.
Deep Contrastive Graph Learning with Clustering-Oriented Guidance
Mulin Chen, Bocheng Wang, Xuelong Li
Evolutionary Large Language Model for Automated Feature Transformation
Nanxu Gong, Chandan K Reddy, Wangyang Ying et al.
Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification
Zhiwei Zhao, Bin Liu, Yan Lu et al.
Region-Disentangled Diffusion Model for High-Fidelity PPG-to-ECG Translation
Debaditya Shome, Pritam Sarkar, Ali Etemad
Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning
Xinlu Zhang, Zhiyu Zoey Chen, Xi Ye et al.
ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement
Mengqi Lei, Haochen Wu, Xinhua Lv et al.
TopoGCL: Topological Graph Contrastive Learning
Yuzhou Chen, Jose Frias, Yulia Gel
Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles
Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer et al.
Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting
Zhicheng Wang, Liwen Xiao, Zhiguo Cao et al.
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
Lin Sun, Kai Zhang, Qingyuan Li et al.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu, Yifan Hu, Yi Ren et al.
Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems
Weibo Gao, Qi Liu, Linan Yue et al.
Chinese Spelling Correction as Rephrasing Language Model
Linfeng Liu, Hongqiu Wu, Hai Zhao
Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives
Weibo Gao, Qi Liu, Hao Wang et al.
Entropic Open-Set Active Learning
Bardia Safaei, Vibashan VS, Celso de Melo et al.
Cooper: Coordinating Specialized Agents towards a Complex Dialogue Goal
Yi Cheng, Wenge Liu, Jian Wang et al.
Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery
Zimian Wei, Peijie Dong, Zheng Hui et al.
Offline and Online Optical Flow Enhancement for Deep Video Compression
Chuanbo Tang, Xihua Sheng, Zhuoyuan Li et al.
Enriching Multimodal Sentiment Analysis Through Textual Emotional Descriptions of Visual-Audio Content
Sheng Wu, Dongxiao He, Xiaobao Wang et al.
DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer
LAMM: Label Alignment for Multi-Modal Prompt Learning
Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang et al.
DC-NAS: Divide-and-Conquer Neural Architecture Search for Multi-Modal Classification
Xinyan Liang, Pinhan Fu, Qian Guo et al.
Exploring Enhanced Contextual Information for Video-Level Object Tracking
Ben Kang, Xin Chen, Simiao Lai et al.
Graphic Design with Large Multimodal Model
Yutao Cheng, Zhao Zhang, Maoke Yang et al.
Dual Self-Paced Cross-Modal Hashing
Yuan Sun, Jian Dai, Zhenwen Ren et al.
Higher-Order Graph Convolutional Network with Flower-Petals Laplacians on Simplicial Complexes
Yiming Huang, Yujie Zeng, Qiang Wu et al.
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Yuhao Wang, Yang Liu, Aihua Zheng et al.
A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation
Yongkang Wang, Xuan Liu, Feng Huang et al.
A Generalized Neural Diffusion Framework on Graphs
Yibo Li, Xiao Wang, Hongrui Liu et al.
Perception-Guided Jailbreak Against Text-to-Image Models
Yihao Huang, Le Liang, Tianlin Li et al.
A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu, Yichen Zhu, Ning Liu et al.
When Hypergraph Meets Heterophily: New Benchmark Datasets and Baseline
Ming Li, Yongchun Gu, Yi Wang et al.
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
Baichuan Zhou, Haote Yang, Dairong Chen et al.
Navigating Open Set Scenarios for Skeleton-Based Action Recognition
Kunyu Peng, Cheng Yin, Junwei Zheng et al.
Automatic Radiology Reports Generation via Memory Alignment Network
Hongyu Shen, Mingtao Pei, Juncai Liu et al.
Working Memory Capacity of ChatGPT: An Empirical Study
Dongyu Gong, Xingchen Wan, Dingmin Wang
eTag: Class-Incremental Learning via Embedding Distillation and Task-Oriented Generation
Libo Huang, Yan Zeng, Chuanguang Yang et al.
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
Jing Cui, Yufei Han, Yuzhe Ma et al.
GroundVLP: Harnessing Zero-Shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen, Tiancheng Zhao, Mingwei Zhu et al.
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation
Chengyou Jia, Minnan Luo, Zhuohang Dang et al.
Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure
Xinying Zou, Samir Perlaza, Inaki Esnaola et al.
HyperFast: Instant Classification for Tabular Data
David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Chanho Lee, Jinsu Son, Hyounguk Shon et al.
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs
Siyu Wang, Cailian Chen, Xinyi Le et al.
Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
Xinyi Wu, Wentao Ma, Dan Guo et al.
LEARN: Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application
Jian Jia, Yipei Wang, Yan Li et al.
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
Wentao Mo, Yang Liu
Small Model Can Self-Correct
Haixia Han, Jiaqing Liang, Jie Shi et al.
DiffuseHigh: Training-Free Progressive High-Resolution Image Synthesis Through Structure Guidance
Younghyun Kim, Geunmin Hwang, Junyu Zhang et al.
Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning
Jinxin Liu, Ziqi Zhang, Zhenyu Wei et al.
Motif-Aware Riemannian Graph Neural Network with Generative-Contrastive Learning
Li Sun, Zhenhao Huang, Zixi Wang et al.
CLIM: Contrastive Language-Image Mosaic for Region Representation
Size Wu, Wenwei Zhang, Lumin Xu et al.
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
Qirui Chen, Shangzhe Di, Weidi Xie
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding
Yunlong Tang, Daiki Shimada, Jing Bi et al.
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao et al.
NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning
Xin Yi, Shunfan Zheng, Linlin Wang et al.
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Yinmin Zhang, Jie Liu, Chuming Li et al.
Out-of-Distribution Detection in Long-Tailed Recognition with Calibrated Outlier Class Learning
Wenjun Miao, Guansong Pang, Xiao Bai et al.
DTL: Disentangled Transfer Learning for Visual Recognition
Minghao Fu, Ke Zhu, Jianxin Wu
Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-Wise Linear Modulation
Rongyu Zhang, Yulin Luo, Jiaming Liu et al.
Open Models, Closed Minds? On Agents Capabilities in Mimicking Human Personalities through Open Large Language Models
Lucio La Cava, Andrea Tagarelli
Multi-Class Support Vector Machine with Maximizing Minimum Margin
Feiping Nie, Zhezheng Hao, Rong Wang
SasWOT: Real-Time Semantic Segmentation Architecture Search WithOut Training
Chendi Zhu, Lujun Li, Yuli Wu et al.
NodeMixup: Tackling Under-Reaching for Graph Neural Networks
Weigang Lu, Ziyu Guan, Wei Zhao et al.
Runtime Analysis of the SMS-EMOA for Many-Objective Optimization
Weijie Zheng, Benjamin Doerr
Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-Supervised Skeleton-Based Action Recognition
Cong Wu, Xiao-Jun Wu, Josef Kittler et al.
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
Zhangbin Li, Jinxing Zhou, Dan Guo et al.
AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer
Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing
Xinghe Fu, Zhiyuan Yan, Taiping Yao et al.
Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification
Bohan Li, Xiao Xu, Xinghao Wang et al.
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.
SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks
Meng Lou, Yunxiang Fu, Yizhou Yu
Catalyst for Clustering-Based Unsupervised Object Re-identification: Feature Calibration
Huafeng Li, Qingsong Hu, Zhanxuan Hu
FastLGS: Speeding Up Language Embedded Gaussians with Feature Grid Mapping
Yuzhou Ji, He Zhu, Junshu Tang et al.
EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading
Molei Qin, Shuo Sun, Wentao Zhang et al.
Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models
Hongbang Yuan, Zhuoran Jin, Pengfei Cao et al.
Retrieval-Augmented Primitive Representations for Compositional Zero-Shot Learning
Chenchen Jing, Yukun Li, Hao Chen et al.
Robust Tracking via Mamba-based Context-aware Token Learning
Jinxia Xie, Bineng Zhong, Qihua Liang et al.
MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction
Zixuan Gong, Qi Zhang, Guangyin Bao et al.
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su, Judith Li, Qingqing Huang et al.
WeditGAN: Few-Shot Image Generation via Latent Space Relocation
Yuxuan Duan, Li Niu, Yan Hong et al.
Does Few-Shot Learning Suffer from Backdoor Attacks?
Xinwei Liu, Xiaojun Jia, Jindong Gu et al.
Multi-Domain Incremental Learning for Face Presentation Attack Detection
Keyao Wang, Guosheng Zhang, Haixiao Yue et al.
MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance
Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin et al.
Generalizable Sleep Staging via Multi-Level Domain Alignment
Jiquan Wang, Sha Zhao, Haiteng Jiang et al.
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
Yun Qu, Yuhang Jiang, Boyuan Wang et al.
Non-exemplar Online Class-Incremental Continual Learning via Dual-Prototype Self-Augment and Refinement
Fushuo Huo, Wenchao Xu, Jingcai Guo et al.
Beyond Mimicking Under-Represented Emotions: Deep Data Augmentation with Emotional Subspace Constraints for EEG-Based Emotion Recognition
Zhi Zhang, Sheng-hua Zhong, Yan Liu
Learning Time Slot Preferences via Mobility Tree for Next POI Recommendation
Tianhao Huang, Xuan Pan, Xiangrui Cai et al.
HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations
Yilan Dong, Chunlin Yu, Ruiyang Ha et al.
MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation
Yanzuo Lu, Meng Shen, Andy J Ma et al.
G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection
Fan Wu, Jinling Gao, Lanqing Hong et al.
SGFormer: Semantic Graph Transformer for Point Cloud-Based 3D Scene Graph Generation
Changsheng Lv, Mengshi Qi, Xia Li et al.
A Diffusion-Based Pre-training Framework for Crystal Property Prediction
Zixing Song, Ziqiao Meng, Irwin King
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
Dongmei Zhang, Chang Li, Renrui Zhang et al.
Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval
Weihang Su, Qingyao Ai, Xiangsheng Li et al.
Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models
Angela Castillo, Jonas Kohler, Juan C. Pérez et al.
Summarizing Stream Data for Memory-Constrained Online Continual Learning
Jianyang Gu, Kai Wang, Wei Jiang et al.
Learning to Prompt Knowledge Transfer for Open-World Continual Learning
Yujie Li, Xin Yang, Hao Wang et al.
Hierarchical Classification Auxiliary Network for Time Series Forecasting
Yanru Sun, Zongxia Xie, Dongyue Chen et al.
NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction
Beibei Lin, Yeying Jin, Wending Yan et al.
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models
Xiao Cui, Mo Zhu, Yulei Qin et al.
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
Jian Ma, Yonglin Deng, Chen Chen et al.
PNeRFLoc: Visual Localization with Point-Based Neural Radiance Fields
Boming Zhao, Luwei Yang, Mao Mao et al.
Towards Explainable Joint Models via Information Theory for Multiple Intent Detection and Slot Filling
Xianwei Zhuang, Xuxin Cheng, Yuexian Zou
Numerical Pruning for Efficient Autoregressive Models
Xuan Shen, Zhao Song, Yufa Zhou et al.
Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature
Wu Yun, Mengshi Qi, Chuanming Wang et al.
WPMixer: Efficient Multi-Resolution Mixing for Long-Term Time Series Forecasting
Md Mahmuddun Nabi Murad, Mehmet Aktukmak, Yasin Yilmaz
NightHaze: Nighttime Image Dehazing via Self-Prior Learning
Beibei Lin, Yeying Jin, Wending Yan et al.
On the Role of Server Momentum in Federated Learning
Jianhui Sun, Xidong Wu, Heng Huang et al.
Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-Free Multi-Exposure Image Fusion
Guanyao Wu, Hongming Fu, Jinyuan Liu et al.
DiffAIL: Diffusion Adversarial Imitation Learning
Bingzheng Wang, Guoqiang Wu, Teng Pang et al.
Graph Contrastive Invariant Learning from the Causal Perspective
Yanhu Mo, Xiao Wang, Shaohua Fan et al.
How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?
Soumya Suvra Ghosal, Yiyou Sun, Yixuan Li
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
Soham Deshmukh, Shuo Han, Hazim Bukhari et al.
Simple Image-Level Classification Improves Open-Vocabulary Object Detection
Ruohuan Fang, Guansong Pang, Xiao Bai
B2Opt: Learning to Optimize Black-box Optimization with Little Budget
Xiaobin Li, Kai Wu, Xiaoyu Zhang et al.
3DMambaIPF: A State Space Model for Iterative Point Cloud Filtering via Differentiable Rendering
Qingyuan Zhou, Weidong Yang, Ben Fei et al.
Text-to-Image Generation for Abstract Concepts
Jiayi Liao, Xu Chen, Qiang Fu et al.
SCANS: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering
Zouying Cao, Yifei Yang, Hai Zhao
Hyperspectral Image Reconstruction via Combinatorial Embedding of Cross-Channel Spatio-Spectral Clues
Xingxing Yang, Jie Chen, Zaifeng Yang
ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance
Shuwei Shi, Wenbo Li, Yuechen Zhang et al.
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization
Shuoran Jiang, Qingcai Chen, Yang Xiang et al.
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.
FoldToken: Learning Protein Language via Vector Quantization and Beyond
Zhangyang Gao, Cheng Tan, Jue Wang et al.
Is Sarcasm Detection a Step-by-Step Reasoning Process in Large Language Models?
Ben Yao, Yazhou Zhang, Qiuchi Li et al.
SAVSR: Arbitrary-Scale Video Super-resolution via a Learned Scale-Adaptive Network
Zekun Li, Hongying Liu, Fanhua Shang et al.
GCNext: Towards the Unity of Graph Convolutions for Human Motion Prediction
Xinshun Wang, Qiongjie Cui, Chen Chen et al.
AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis
Dongze Li, Kang Zhao, Wei Wang et al.
Training on the Benchmark Is Not All You Need
Shiwen Ni, Xiangtao Kong, Chengming Li et al.
Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models
Yuanzhao Zhai, Tingkai Yang, Kele Xu et al.
Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement
Jing Wang, Jiangyun Li, Chen Chen et al.
Question Calibration and Multi-Hop Modeling for Temporal Question Answering
Chao Xue, Di Liang, Pengfei Wang et al.
Boosting Neural Cognitive Diagnosis with Student’s Affective State Modeling
Shanshan Wang, Zhen Zeng, Xun Yang et al.
ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models
Jiaxiang Cheng, Pan Xie, Xin Xia et al.
Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation
Jiaqi Huang, Zunnan Xu, Ting Liu et al.
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection
Songmin Dai, Yifan Wu, Xiaoqiang Li et al.
Controlling Large Language Models Through Concept Activation Vectors
Hanyu Zhang, Xiting Wang, Chengao Li et al.
Exploring Diverse Representations for Open Set Recognition
Yu Wang, Junxian Mu, Pengfei Zhu et al.
Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning
Zhiyue Liu, Jinyuan Liu, Fanrong Ma
Occlusion-Embedded Hybrid Transformer for Light Field Super-Resolution
Zeyu Xiao, Zhuoyuan Li, Wei Jia
StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
Jixun Yao, Yang Yuguang, Yu Pan et al.
Trusted Unified Feature-Neighborhood Dynamics for Multi-View Classification
Haojian Huang, Chuanyu Qin, Zhe Liu et al.
Leaving the Nest: Going beyond Local Loss Functions for Predict-Then-Optimize
Sanket Shah, Bryan Wilder, Andrew Perrault et al.
Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Ziyin Zhang, Ning Lu, Minghui Liao et al.
Underwater Organism Color Fine-Tuning via Decomposition and Guidance
Xiaofeng Cong, Jie Gui, Junming Hou
AdaDiff: Adaptive Step Selection for Fast Diffusion Models
Hui Zhang, Zuxuan Wu, Zhen Xing et al.
Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering
Xiaowei Qian, Bingheng Li, Zhao Kang
GOODAT: Towards Test-Time Graph Out-of-Distribution Detection
Luzhi Wang, Di Jin, He Zhang et al.
Conformal Autoregressive Generation: Beam Search with Coverage Guarantees
Nicolas Deutschmann, Marvin Alberts, María Rodríguez Martínez
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang, Yuezhou Hu, Guohao Jian et al.
Federated Learning with Extremely Noisy Clients via Negative Distillation
Yang Lu, Lin Chen, Yonggang Zhang et al.
BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion
Huafeng Li, Dayong Su, Qing Cai et al.
Aligning Geometric Spatial Layout in Cross-View Geo-Localization via Feature Recombination
Qingwang Zhang, Yingying Zhu
Sketch and Refine: Towards Fast and Accurate Lane Detection
Chao Chen, Jie Liu, Chang Zhou et al.
Instrumental Variable Estimation for Causal Inference in Longitudinal Data with Time-Dependent Latent Confounders
Debo Cheng, Ziqi Xu, Jiuyong Li et al.
LogicAD: Explainable Anomaly Detection via VLM-based Text Feature Extraction
Er Jin, Qihui Feng, Yongli Mou et al.
Enhancing Chain of Thought Prompting in Large Language Models via Reasoning Patterns
Yufeng Zhang, Xuepeng Wang, Lingxiang Wu et al.
MV-VTON: Multi-View Virtual Try-On with Diffusion Models
Haoyu Wang, Zhilu Zhang, Donglin Di et al.
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen, Longteng Guo, Jia Sun et al.
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung et al.
Weighted Envy-Freeness for Submodular Valuations
Luisa Montanari, Ulrike Schmidt-Kraepelin, Warut Suksompong et al.
Diffusion Language-Shapelets for Semi-supervised Time-Series Classification
Zhen Liu, Wenbin Pei, Disen Lan et al.
FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning
Xinyuan Ji, Zhaowei Zhu, Wei Xi et al.