Most Cited AAAI "hard-label queries" Papers

5,317 papers found

#3001

BIG-FUSION: Brain-Inspired Global-Local Context Fusion Framework for Multimodal Emotion Recognition in Conversations

Yusong Wang, Xuanye Fang, Huifeng Yin et al.

AAAI 2025
#3002

Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier

Zachary Wojtowicz, Simon DeDeo

AAAI 2025 • arXiv:2407.14452
#3003

DepMGNN: Matrixial Graph Neural Network for Video-based Automatic Depression Assessment

Zijian Wu, Leijing Zhou, Shuanglin Li et al.

AAAI 2025
#3004

Leveraging Asynchronous Spiking Neural Networks for Ultra Efficient Event-Based Visual Processing

DingYi Zeng, Yuchen Wang, Honglin Cao et al.

AAAI 2025
#3005

Learning Concept Prerequisite Relation via Global Knowledge Relation Optimization

Miao Zhang, Jiawei Wang, Kui Xiao et al.

AAAI 2025
#3006

SalM²: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention

Chunyu Zhao, Wentao Mu, Xian Zhou et al.

AAAI 2025
#3007

Look Around Before Locating: Considering Content and Structure Information for Visual Grounding

Shiyi Zheng, Peizhi Zhao, Zhilong Zheng et al.

AAAI 2025
#3008

PerReactor: Offline Personalised Multiple Appropriate Facial Reaction Generation

Hengde Zhu, Xiangyu Kong, Weicheng Xie et al.

AAAI 2025
#3009

Bridge Then Begin Anew: Generating Target-Relevant Intermediate Model for Source-Free Visual Emotion Adaptation

Jiankun Zhu, Sicheng Zhao, Jing Jiang et al.

AAAI 2025 • arXiv:2412.13577
#3010

Aspect Enhancement and Text Simplification in Multimodal Aspect-Based Sentiment Analysis for Multi-Aspect and Multi-Sentiment Scenarios

Linlin Zhu, Heli Sun, Qunshu Gao et al.

AAAI 2025
#3011

Progressive Self-Learning for Domain Adaptation on Symbolic Regression of Integer Sequences

Yaohui Zhu, Kaiming Sun, Zhengdong Luo et al.

AAAI 2025
#3012

HSRDiff: A Hierarchical Self-Regulation Diffusion Model for Stochastic Semantic Segmentation

Han Yang, Chuanguang Yang, Zhulin An et al.

AAAI 2025
#3013

AQUAFace: Age-Invariant Quality Adaptive Face Recognition for Unconstrained Selfie vs ID Verification

Shivang Agarwal, Jyoti Chaudhary, Sadiq Siraj Ebrahim et al.

AAAI 2025
#3014

AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation

Jingkun An, Yinghao Zhu, Zongjian Li et al.

AAAI 2025 • arXiv:2403.13352
#3015

CA-MLIF: Cross-Attention and Multimodal Low-Rank Interaction Fusion Framework for Tumor Prognostic Prediction

Yajun An, Jiale Chen, Huan Lin et al.

AAAI 2025
#3016

HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models

Kazi Hasan Ibn Arif, JinYi Yoon, Dimitrios S. Nikolopoulos et al.

AAAI 2025 • arXiv:2408.10945
#3017

Can Generative Models Improve Self-Supervised Representation Learning?

Sana Ayromlou, Vahid Reza Khazaie, Fereshteh Forghani et al.

AAAI 2025 • arXiv:2403.05966
#3018

The Master Key Filters Hypothesis: Deep Filters Are General

Zahra Babaiee, Peyman M. Kiasari, Daniela Rus et al.

AAAI 2025 • arXiv:2412.16751
#3019

Frozen Language Models Are Gradient Coherence Rectifiers in Vision Transformers

Lichen Bai, Zixuan Xiong, Hai Lin et al.

AAAI 2025
#3020

Plug-and-Play Tri-Branch Invertible Block for Image Rescaling

Jingwei Bao, Jinhua Hao, Pengcheng Xu et al.

AAAI 2025 • arXiv:2412.13508
#3021

Dual Manifold Regularization Steered Robust Representation Learning for Point Cloud Analysis

Jian Bi, Qianliang Wu, Jianjun Qian et al.

AAAI 2025
#3022

Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination

Qi Bi, Jingjun Yi, Haolan Zhan et al.

AAAI 2025 • arXiv:2504.08020
#3023

CustomTTT: Motion and Appearance Customized Video Generation via Test-Time Training

Xiuli Bi, Jian Lu, Bo Liu et al.

AAAI 2025 • arXiv:2412.15646
#3024

FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing

Lingling Cai, Kang Zhao, Hangjie Yuan et al.

AAAI 2025 • arXiv:2409.20500
#3025

Dynamic Adapter with Semantics Disentangling for Cross-lingual Cross-modal Retrieval

Rui Cai, Zhiyu Dong, Jianfeng Dong et al.

AAAI 2025 • arXiv:2412.13510
#3026

Divide-and-Conquer: Tree-structured Strategy with Answer Distribution Estimator for Goal-Oriented Visual Dialogue

Shuo Cai, Xinzhe Han, Shuhui Wang

AAAI 2025 • arXiv:2502.05806
#3027

Object-level Geometric Structure Preserving for Natural Image Stitching

Wenxiao Cai, Wankou Yang

AAAI 2025 • arXiv:2402.12677
#3028

Zero-shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model

Cong Cao, Huanjing Yue, Xin Liu et al.

AAAI 2025 • arXiv:2407.01960
#3029

ObjVariantEnsemble: Advancing Point Cloud LLM Evaluation in Challenging Scenes with Subtly Distinguished Objects

Qihang Cao, Huangxun Chen

AAAI 2025 • arXiv:2412.14837
#3030

Deep Graph Online Hashing for Multi-Label Image Retrieval

Yuan Cao, Xiangru Chen, Zifan Liu et al.

AAAI 2025
#3031

Segment Any 3D Gaussians

Jiazhong Cen, Jiemin Fang, Chen Yang et al.

AAAI 2025 • arXiv:2312.00860
#3032

Text2Relight: Creative Portrait Relighting with Text Guidance

Junuk Cha, Mengwei Ren, Krishna Kumar Singh et al.

AAAI 2025 • arXiv:2412.13734
#3033

KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences

Keng-Wei Chang, Zi-Ming Wang, Shang-Hong Lai

AAAI 2025 • arXiv:2412.20767
#3034

RFL: Simplifying Chemical Structure Recognition with Ring-Free Language

Qikai Chang, Mingjun Chen, Changpeng Pi et al.

AAAI 2025 • arXiv:2412.07594
#3035

Sharpening Neural Implicit Functions with Frequency Consolidation Priors

Chao Chen, Yu-Shen Liu, Zhizhong Han

AAAI 2025 • arXiv:2412.19720
#3036

MaskPrompt: Open-Vocabulary Affordance Segmentation with Object Shape Mask Prompts

Dongpan Chen, Dehui Kong, Jinghua Li et al.

AAAI 2025
#3037

Skeleton-based Action Recognition with Non-linear Dependency Modeling and Hilbert-Schmidt Independence Criterion

Haipeng Chen, Yuheng Yang, Yingda Lyu

AAAI 2025
#3038

Causal-Inspired Multitask Learning for Video-Based Human Pose Estimation

Haipeng Chen, Sifan Wu, Zhigang Wang et al.

AAAI 2025 • arXiv:2501.14356
#3039

Adversarial Learning Under Hybrid Perturbations for Robust Acute Lymphoblastic Leukemia Classification

Jie Chen, Xinyuan Liu, Xintong Liu et al.

AAAI 2025
#3040

Dual-Level Precision Edges Guided Multi-View Stereo with Accurate Planarization

Kehua Chen, Zhenlong Yuan, Tianlu Mao et al.

AAAI 2025 • arXiv:2412.20328
#3041

Contrasting Adversarial Perturbations: The Space of Harmless Perturbations

Lu Chen, Shaofeng Li, Benhao Huang et al.

AAAI 2025 • arXiv:2402.02095
#3042

CustomContrast: A Multilevel Contrastive Perspective for Subject-Driven Text-to-Image Customization

Nan Chen, Mengqi Huang, Zhuowei Chen et al.

AAAI 2025 • arXiv:2409.05606
#3043

Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation

Qihua Chen, Yue Ma, Hongfa Wang et al.

AAAI 2025
#3044

Unsupervised Degradation Representation Aware Transform for Real-World Blind Image Super-Resolution

Sen Chen, Hongying Liu, Chaowei Fang et al.

AAAI 2025
#3045

Mixture-of-Attack-Experts with Class Regularization for Unified Physical-Digital Face Attack Detection

Shunxin Chen, Ajian Liu, Junze Zheng et al.

AAAI 2025 • arXiv:2504.00458
#3046

Cross-View Referring Multi-Object Tracking

Sijia Chen, En Yu, Wenbing Tao

AAAI 2025 • arXiv:2412.17807
#3047

DiffDVC: Accurate Event Detection for Dense Video Captioning via Diffusion Models

Wei Chen, Jianwei Niu, Xuefeng Liu et al.

AAAI 2025
#3048

Ultra-High-Definition Dynamic Multi-Exposure Image Fusion via Infinite Pixel Learning

Xingchi Chen, Zhuoran Zheng, Xuerui Li et al.

AAAI 2025 • arXiv:2412.11685
#3049

M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving

Xuesong Chen, Shaoshuai Shi, Tao Ma et al.

AAAI 2025 • arXiv:2503.18100
#3050

Dr. Tongue: Sign-Oriented Multi-label Detection for Remote Tongue Diagnosis

Yiliang Chen, Steven SC Ho, Cheng Xu et al.

AAAI 2025 • arXiv:2501.03053
#3051

Comprehensive Multi-Modal Prototypes Are Simple and Effective Classifiers for Vast-Vocabulary Object Detection

Yitong Chen, Wenhao Yao, Lingchen Meng et al.

AAAI 2025 • arXiv:2412.17800
#3052

3D Measurement of Complex Textured Objects Based on Bidirectional Fringe Projection

Yuchong Chen, Jian Yu, Shaoyan Gai et al.

AAAI 2025
#3053

Unsupervised Diffusion-Based Degradation Modeling for Real-World Super-Resolution

Yuying Chen, Mingde Yao, Wenbo Li et al.

AAAI 2025
#3054

EvHDR-GS: Event-guided HDR Video Reconstruction with 3D Gaussian Splatting

Zehao Chen, Zhan Lu, De Ma et al.

AAAI 2025
#3055

VFM-Adapter: Adapting Visual Foundation Models for Dense Prediction with Dynamic Hybrid Operation Mapping

Zheng Chen, Yu Zeng, Zehui Chen et al.

AAAI 2025
#3056

VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis

Zhipeng Chen, Lan Yang, Yonggang Qi et al.

AAAI 2025 • arXiv:2412.11594
#3057

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions

Zhiyuan Chen, Jiajiong Cao, Zhiquan Chen et al.

AAAI 2025 • arXiv:2407.08136
#3058

Gradient Alignment Improves Test-Time Adaptation for Medical Image Segmentation

Ziyang Chen, Yiwen Ye, Yongsheng Pan et al.

AAAI 2025 • arXiv:2408.07343
#3059

3DPGS: 3D Probabilistic Graph Search for Archaeological Piece Grouping

Junfeng Cheng, Yingkai Yang, Tania Stathaki

AAAI 2025
#3060

Effective Diffusion Transformer Architecture for Image Super-Resolution

Kun Cheng, Lei Yu, Zhijun Tu et al.

AAAI 2025 • arXiv:2409.19589
#3061

Aligning Instance Brownian Bridge with Texts for Open-Vocabulary Video Instance Segmentation

Zesen Cheng, Kehan Li, Li Hao et al.

AAAI 2025
#3062

Bridge 2D-3D: Uncertainty-aware Hierarchical Registration Network with Domain Alignment

Zhixin Cheng, Jiacheng Deng, Xinjun Li et al.

AAAI 2025 • arXiv:2504.01641
#3063

Zero-Shot Scene Change Detection

Kyusik Cho, Dong Yeop Kim, Euntai Kim

AAAI 2025 • arXiv:2406.11210
#3064

Distribution-Level Feature Distancing for Machine Unlearning: Towards a Better Trade-off Between Model Utility and Forgetting

Dasol Choi, Dongbin Na

AAAI 2025 • arXiv:2409.14747
#3065

SIDL: A Real-World Dataset for Restoring Smartphone Images with Dirty Lenses

Sooyoung Choi, Sungyong Park, Heewon Kim

AAAI 2025
#3066

Intrinsic Image Decomposition for Robust Self-supervised Monocular Depth Estimation on Reflective Surfaces

Wonhyeok Choi, Kyumin Hwang, Minwoo Choi et al.

AAAI 2025 • arXiv:2503.22209
#3067

MASS: Overcoming Language Bias in Image-Text Matching

Jiwan Chung, Seungwon Lim, Sangkyu Lee et al.

AAAI 2025 • arXiv:2501.11469
#3068

AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples

Antonio Emanuele Cinà, Jérôme Rony, Maura Pintor et al.

AAAI 2025 • arXiv:2404.19460
#3069

GCD-Sampling: A General Cross-scale Decoupled Sampling for Point Cloud

Tao Dai, Yanzi Wang, Jianyu Xiong et al.

AAAI 2025
#3070

Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Image Generation

Quan Dao, Hao Phung, Trung Tuan Dao et al.

AAAI 2025
#3071

PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery

Shristi Das Biswas, Matthew Shreve, Xuelu Li et al.

AAAI 2025 • arXiv:2501.09826
#3072

Single Exposure Quantitative Phase Imaging with a Conventional Microscope Using Diffusion Models

Gabriel della Maggiora, Luis Alberto Croquevielle, Harry Horsley et al.

AAAI 2025 • arXiv:2406.04388
#3073

Deep Non-Rigid Structure-from-Motion Revisited: Canonicalization and Sequence Modeling

Hui Deng, Jiawei Shi, Zhen Qin et al.

AAAI 2025 • arXiv:2412.07230
#3074

DiffCorr: Conditional Diffusion Model with Reliable Pseudo-Label Guidance for Unsupervised Point Cloud Shape Correspondence

Jiacheng Deng, Jiahao Lu, Zhixin Cheng et al.

AAAI 2025
#3075

Adaptive Siamese Masked Autoencoder with Global Optimization for Unsupervised Point Cloud Shape Correspondence

Jiacheng Deng, Jiahao Lu

AAAI 2025
#3076

OTIAS: OcTree Implicit Adaptive Sampling for Multispectral and Hyperspectral Image Fusion

Shangqi Deng, Jun Ma, Liang-Jian Deng et al.

AAAI 2025
#3077

Boundary-Aware Temporal Dynamic Pseudo-Supervision Pairs Generation for Zero-Shot Natural Language Video Localization

Xiongwen Deng, Haoyu Tang, Han Jiang et al.

AAAI 2025
#3078

Occlusion-Insensitive Talking Head Video Generation via Facelet Compensation

Yuhui Deng, Yuqin Lu, Yangyang Xu et al.

AAAI 2025
#3079

Dis²Booth: Learning Image Distribution with Disentangled Features for Text-to-Image Diffusion Models

Guanqi Ding, Chengyu Yang, Shuhui Wang et al.

AAAI 2025
#3080

Muses: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration

Yanbo Ding, Shaobin Zhuang, Kunchang Li et al.

AAAI 2025 • arXiv:2408.10605
#3081

AS-Det: Active Sampling for Adaptive 3D Object Detection in Point Clouds

Ziheng Ding, Xiaze Zhang, Qi Jing et al.

AAAI 2025
#3082

GarFast: Realistic and Fast Garment Transfer with a Simplified Parser-Free Approach

Chenghu Du, Junyin Wang, Yi Rong et al.

AAAI 2025
#3083

Latent Diffusion-Enhanced Virtual Try-On via Optimized Pseudo-Label Generation

Chenghu Du, Junyin Wang, Feng Yu et al.

AAAI 2025
#3084

HybridReg: Robust 3D Point Cloud Registration with Hybrid Motions

Keyu Du, Hao Xu, Haipeng Li et al.

AAAI 2025 • arXiv:2503.07019
#3085

A Diffusion-Based Framework for Occluded Object Movement

Zheng-Peng Duan, Jiawei Zhang, Siyu Liu et al.

AAAI 2025 • arXiv:2504.01873
#3086

IniRetinex: Rethinking Retinex-type Low-Light Image Enhancer via Initialization Perspective

Guodong Fan, Zishu Yao, Guang-Yong Chen et al.

AAAI 2025
#3087

Vision-guided Text Mining for Unsupervised Cross-modal Hashing with Community Similarity Quantization

Haozhi Fan, Yuan Cao

AAAI 2025
#3088

EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs

Zhen Fan, Peng Dai, Zhuo Su et al.

AAAI 2025 • arXiv:2408.17168
#3089

CoSDA: Enhancing the Robustness of Inversion-based Generative Image Watermarking Framework

Han Fang, Kejiang Chen, Zijin Yang et al.

AAAI 2025
#3090

SSUN-Net: Spatial-Spectral Prior-Aware Unfolding Network for Pan-Sharpening

Shijie Fang, Hongping Gan

AAAI 2025
#3091

AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scenes

Chaoran Feng, Wangbo Yu, Xinhua Cheng et al.

AAAI 2025 • arXiv:2501.02807
#3092

VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

Chun-Mei Feng, Yang Bai, Tao Luo et al.

AAAI 2025 • arXiv:2312.12273
#3093

Weakly Supervised Gland Segmentation with Class Semantic Consistency and Purified Labels Filtration

Siyang Feng, Huadeng Wang, Chu Han et al.

AAAI 2025
#3094

HDLayout: Hierarchical and Directional Layout Planning for Arbitrary Shaped Visual Text Generation

Tonghui Feng, Chunsheng Yan, Qianru Wang et al.

AAAI 2025
#3095

Simplifying Control Mechanism in Text-to-Image Diffusion Models

Zhida Feng, Li Chen, Yuenan Sun et al.

AAAI 2025
#3096

BGHR: Bridging the Gap Between HBox-Supervised and RBox-Supervised Oriented Object Detection via Adaptive Fine-Grained Sample Mining

Chenlin Fu, Yingying Zhu

AAAI 2025
#3097

Foundation Model Driven Appearance Extraction for Robust Multiple Object Tracking

Teng Fu, Haiyang Yu, Ke Niu et al.

AAAI 2025
#3098

MFL-Owner: Ownership Protection for Multi-modal Federated Learning via Orthogonal Transform Watermark

Keke Gai, Dongjue Wang, Jing Yu et al.

AAAI 2025
#3099

DFDNet: Disentangling and Filtering Dynamics for Enhanced Video Prediction

Lianqiang Gan, Junyu Lai, Jingze Ju et al.

AAAI 2025
#3100

PNVC: Towards Practical INR-based Video Compression

Ge Gao, Ho Man Kwan, Fan Zhang et al.

AAAI 2025 • arXiv:2409.00953
#3101

AIM: Let Any Multimodal Large Language Models Embrace Efficient In-Context Learning

Jun Gao, Qian Qiao, Tianxiang Wu et al.

AAAI 2025
#3102

TC-LLaVA: Rethinking the Transfer of LLava from Image to Video Understanding with Temporal Considerations

Mingze Gao, Jingyu Liu, Mingda Li et al.

AAAI 2025
#3103

EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction

Chengjie Ge, Xueyang Fu, Peng He et al.

AAAI 2025 • arXiv:2503.19721
#3104

Implicit Location-Caption Alignment via Complementary Masking for Weakly-Supervised Dense Video Captioning

Shiping Ge, Qiang Chen, Zhiwei Jiang et al.

AAAI 2025 • arXiv:2412.12791
#3105

ParseCaps: An Interpretable Parsing Capsule Network for Medical Image Diagnosis

Xinyu Geng, Jiaming Wang, Xiaolin Huang et al.

AAAI 2025 • arXiv:2411.01564
#3106

MaintaAvatar: A Maintainable Avatar Based on Neural Radiance Fields by Continual Learning

Shengbo Gu, Yu-Kun Qiu, Yu-Ming Tang et al.

AAAI 2025 • arXiv:2502.02372
#3107

OT-StainNet: Optimal Transport Driven Semantic Matching for Weakly Paired H&E-to-IHC Stain Transfer

Xianchao Guan, Yifeng Wang, Ye Zhang et al.

AAAI 2025
#3108

Surgical Workflow Recognition and Blocking Effectiveness Detection in Laparoscopic Liver Resection with Pringle Maneuver

Diandian Guo, Weixin Si, Zhixi Li et al.

AAAI 2025 • arXiv:2408.10538
#3109

Enhancing Low-Rank Adaptation with Recoverability-Based Reinforcement Pruning for Object Counting

Haojie Guo, Junyu Gao, Yuan Yuan

AAAI 2025
#3110

PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts

Kun Guo, Qiang Ling

AAAI 2025 • arXiv:2412.12460
#3111

OpenVIS: Open-vocabulary Video Instance Segmentation

Pinxue Guo, Hao Huang, Peiyang He et al.

AAAI 2025 • arXiv:2305.16835
#3112

SpikeGS: Reconstruct 3D Scene Captured by a Fast-Moving Bio-Inspired Camera

Yijia Guo, Liwen Hu, Yuanxi Bai et al.

AAAI 2025
#3113

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Yongxin Guo, Jingyu Liu, Mingda Li et al.

AAAI 2025 • arXiv:2405.13382
#3114

LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies

Ameer Hamza, Abdullah, Yong Hyun Ahn et al.

AAAI 2025 • arXiv:2410.04749
#3115

DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving

Wencheng Han, Dongqian Guo, Cheng-Zhong Xu et al.

AAAI 2025 • arXiv:2401.03641
#3116

ID-Sculpt: ID-aware 3D Head Generation from Single In-the-wild Portrait Image

Jinkun Hao, Junshu Tang, Jiangning Zhang et al.

AAAI 2025 • arXiv:2406.16710
#3117

Efficient Online Training for Zero-Shot Time-Lapse Microscopy Denoising and Super-Resolution

Ruian He, Ri Cheng, Xinkai Lyu et al.

AAAI 2025
#3118

MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

Xu He, Zhiyong Wu, Xiaoyu Li et al.

AAAI 2025 • arXiv:2408.14211
#3119

Long-Tailed Out-of-Distribution Detection: Prioritizing Attention to Tail

Yina He, Lei Peng, Yongcun Zhang et al.

AAAI 2025 • arXiv:2408.06742
#3120

FashionTailor: Controllable Clothing Editing for Human Images with Appearance Preserving

Jie Hou, Jianghong Ma, Xiangyu Mu et al.

AAAI 2025
#3121

Prompt Tuning In a Compact Attribute Space

Shiyu Hou, Tianfei Zhou, Shuai Zhang et al.

AAAI 2025
#3122

BloomScene: Lightweight Structured 3D Gaussian Splatting for Crossmodal Scene Generation

Xiaolu Hou, Mingcheng Li, Dingkang Yang et al.

AAAI 2025 • arXiv:2501.10462
#3123

Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References

Teng-Fang Hsiao, Bo-Kai Ruan, Hong-Han Shuai

AAAI 2025 • arXiv:2404.12900
#3124

GaussianSR: High Fidelity 2D Gaussian Splatting for Arbitrary-Scale Image Super-Resolution

Jintong Hu, Bin Xia, Bin Chen et al.

AAAI 2025 • arXiv:2407.18046
#3125

VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression

Qiang Hu, Houqiang Zhong, Zihan Zheng et al.

AAAI 2025 • arXiv:2412.11362
#3126

Identity-Text Video Corpus Grounding

Bin Huang, Xin Wang, Hong Chen et al.

AAAI 2025
#3127

SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control

Binyuan Huang, Yuqing Wen, Yucheng Zhao et al.

AAAI 2025 • arXiv:2403.19438
#3128

Wavelet-Assisted Multi-Frequency Attention Network for Pansharpening

Jie Huang, Rui Huang, Jinghao Xu et al.

AAAI 2025 • arXiv:2502.04903
#3129

AUTE: Peer-Alignment and Self-Unlearning Boost Adversarial Robustness for Training Ensemble Models

Lifeng Huang, Tian Su, Chengying Gao et al.

AAAI 2025
#3130

EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding

Muye Huang, Han Lai, Xinyu Zhang et al.

AAAI 2025 • arXiv:2409.01577
#3131

Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation

Qihan Huang, Siming Fu, Jinlong Liu et al.

AAAI 2025 • arXiv:2409.17920
#3132

Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation

Shaofei Huang, Rui Ling, Hongyu Li et al.

AAAI 2025 • arXiv:2408.15876
#3133

DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors

Tianyu Huang, Haoze Zhang, Yihan Zeng et al.

AAAI 2025 • arXiv:2406.01476
#3134

Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence

Wenbo Huang, Jinghui Zhang, Guang Li et al.

AAAI 2025 • arXiv:2412.07481
#3135

CLIP-RestoreX: Restore Image Structure and Perception in Exposure Correction

Xiang Huang, Qing Zhang, Jian-Fang Hu et al.

AAAI 2025
#3136

Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine

Xiaoshuang Huang, Lingdong Shen, Jia Liu et al.

AAAI 2025 • arXiv:2412.09278
#3137

PSReg: Prior-guided Sparse Mixture of Experts for Point Cloud Registration

Xiaoshui Huang, Zhou Huang, Yifan Zuo et al.

AAAI 2025 • arXiv:2501.07762
#3138

Medical MLLM Is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models

Xijie Huang, Xinyuan Wang, Hantao Zhang et al.

AAAI 2025 • arXiv:2405.20775
#3139

L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection

Xun Huang, Ziyu Xu, Hai Wu et al.

AAAI 2025 • arXiv:2408.03677
#3140

SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Yongle Huang, Haodong Chen, Zhenbang Xu et al.

AAAI 2025 • arXiv:2501.01245
#3141

PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model

Yunlong Huang, Junshuo Liu, Ke Xian et al.

AAAI 2025 • arXiv:2408.03540
#3142

EGSRAL: An Enhanced 3D Gaussian Splatting Based Renderer with Automated Labeling for Large-Scale Driving Scene

Yixiong Huo, Guangfeng Jiang, Hongyang Wei et al.

AAAI 2025 • arXiv:2412.15550
#3143

High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion

Junhwa Hur, Charles Herrmann, Saurabh Saxena et al.

AAAI 2025 • arXiv:2410.11838
#3144

Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior

Lee Hyoseok, Kyeong Seon Kim, Kwon Byung-Ki et al.

AAAI 2025 • arXiv:2502.06338
#3145

VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting

Muhammet Furkan Ilaslan, Ali Köksal, Kevin Qinghong Lin et al.

AAAI 2025 • arXiv:2412.11621
#3146

Every Component Counts: Rethinking the Measure of Success for Medical Semantic Segmentation in Multi-Instance Segmentation Tasks

Alexander Jaus, Constantin Marc Seibold, Simon Reiß et al.

AAAI 2025 • arXiv:2410.18684
#3147

Game4Loc: A UAV Geo-Localization Benchmark from Game Data

Yuxiang Ji, Boyong He, Zhuoyue Tan et al.

AAAI 2025 • arXiv:2409.16925
#3148

Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection

Mingda Jia, Liming Zhao, Ge Li et al.

AAAI 2025 • arXiv:2412.08506
#3149

FlexiTex: Enhancing Texture Generation via Visual Guidance

Dadong Jiang, Xianghui Yang, Zibo Zhao et al.

AAAI 2025 • arXiv:2409.12431
#3150

ARNet: Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling

Jianan Jiang, Hao Tang, Zhilin Jiang et al.

AAAI 2025 • arXiv:2406.11551
#3151

SCCS: Deep Neural Spectral Clustering for Self-Supervised Subcellular Structure Segmentation

Jimao Jiang, Diya Sun, Tianbing Wang et al.

AAAI 2025
#3152

Restabilizing Diffusion Models with Predictive Noise Fusion Strategy for Image Super-Resolution

Luoqian Jiang, Yong Guo, Bingna Xu et al.

AAAI 2025
#3153

Query Quantized Neural SLAM

Sijia Jiang, Jing Hua, Zhizhong Han

AAAI 2025 • arXiv:2412.16476
#3154

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective

Can Jin, Tianjin Huang, Yihua Zhang et al.

AAAI 2025 • arXiv:2312.01397
#3155

Pedestrian Attribute Recognition: A New Benchmark Dataset and a Large Language Model Augmented Framework

Jiandong Jin, Xiao Wang, Qian Zhu et al.

AAAI 2025 • arXiv:2408.09720
#3156

A Method for Enhancing Generalization of Adam by Multiple Integrations

Long Jin, Han Nong, Liangming Chen et al.

AAAI 2025 • arXiv:2412.12473
#3157

Bridging the Semantic Granularity Gap Between Text and Frame Representations for Partially Relevant Video Retrieval

WooJin Jun, WonJun Moon, Cheol-Ho Cho et al.

AAAI 2025
#3158

CodecNeRF: Toward Fast Encoding and Decoding, Compact, and High-quality Novel-view Synthesis

Gyeongjin Kang, Younggeun Lee, Seungjun Oh et al.

AAAI 2025 • arXiv:2404.04913
#3159

DiffusionREC: Diffusion Model with Adaptive Condition for Referring Expression Comprehension

Jingcheng Ke, Waikeung Wong, Jia Wang et al.

AAAI 2025
#3160

PLATYPUS: Progressive Local Surface Estimator for Arbitrary-Scale Point Cloud Upsampling

Donghyun Kim, Hyeonkyeong Kwon, Yumin Kim et al.

AAAI 2025 • arXiv:2411.00432
#3161

Generalized Zero-Shot Learning for Point Cloud Segmentation with Evidence-Based Dynamic Calibration

Hyeonseok Kim, Byeongkeun Kang, Yeejin Lee

AAAI 2025 • arXiv:2509.08280
#3162

APR-RD: Complemental Two Steps for Self-Supervised Real Image Denoising

Hyunjun Kim, Nam Ik Cho

AAAI 2025
#3163

DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation

Jisoo Kim, Jungbin Cho, Joonho Park et al.

AAAI 2025 • arXiv:2408.06010
#3164

ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder

Jungho Kim, Changwon Kang, Dongyoung Lee et al.

AAAI 2025 • arXiv:2412.08774
#3165

MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation

Seyeon Kim, Siyoon Jin, Jihye Park et al.

AAAI 2025 • arXiv:2403.19144
#3166

TSDF-Based Efficient Motion-Compensated Temporal Interpolation for 3D Dynamic Sequences

Soowoong Kim, Minseong Kwon, Junho Choi et al.

AAAI 2025
#3167

ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning

Taewhan Kim, Soeun Lee, Si-Woo Kim et al.

AAAI 2025 • arXiv:2412.19289
#3168

Sequence Matters: Harnessing Video Models in 3D Super-Resolution

Hyun-kyu Ko, Dongheok Park, Youngin Park et al.

AAAI 2025 • arXiv:2412.11525
#3169

UniDet3D: Multi-dataset Indoor 3D Object Detection

Maksim Kolodiazhnyi, Anna Vorontsova, Matvey Skripkin et al.

AAAI 2025 • arXiv:2409.04234
#3170

Do Not DeepFake Me: Privacy-Preserving Neural 3D Head Reconstruction Without Sensitive Images

Jiayi Kong, Xurui Song, Shuo Huai et al.

AAAI 2025 • arXiv:2312.04106
#3171

Real-Time Neural Denoising with Render-Aware Knowledge Distillation

Mengxun Kong, Jie Guo, Chen Wang et al.

AAAI 2025
#3172

Stable Mean Teacher for Semi-supervised Video Action Detection

Akash Kumar, Sirshapan Mitra, Yogesh Singh Rawat

AAAI 2025 • arXiv:2412.07072
#3173

A Unified Degradation-Robust Approach to SSL and UDA for 3D Medical Images

Suruchi Kumari, Pravendra Singh

AAAI 2025
#3174

SAFIRE: Segment Any Forged Image Region

Myung-Joon Kwon, Wonjun Lee, Seung-Hun Nam et al.

AAAI 2025 • arXiv:2412.08197
#3175

Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training

Yunwei Lan, Zhigao Cui, Chang Liu et al.

AAAI 2025 • arXiv:2503.15017
#3176

Color Transfer with Modulated Flows

Maria Larchenko, Alexander Lobashev, Dmitry Guskov et al.

AAAI 2025 • arXiv:2503.19062
#3177

Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space

Hyunjee Lee, Youngsik Yun, Jeongmin Bae et al.

AAAI 2025 • arXiv:2408.07416
#3178

NBA3D: Neighbor-Based Confidence Adjustment for 3D Rare Object Detection Using LiDAR

Jooyoung Lee, Jaeyoon Lee, Jongwon Choi

AAAI 2025
#3179

MAMS: Model-Agnostic Module Selection Framework for Video Captioning

Sangho Lee, Il Yong Chun, Hogun Park

AAAI 2025 • arXiv:2501.18269
#3180

Enabling Region-Specific Control via Lassos in Point-Based Colorization

Sanghyeon Lee, Jooyeol Yun, Jaegul Choo

AAAI 2025 • arXiv:2412.13469
#3181

Concept Matching with Agent for Out-of-Distribution Detection

Yuxiao Lee, Xiaofeng Cao, Jingcai Guo et al.

AAAI 2025 • arXiv:2405.16766
#3182

FNIN: A Fourier Neural Operator-based Numerical Integration Network for Surface-from-gradients

Jiaqi Leng, Yakun Ju, Yuanxu Duan et al.

AAAI 2025
#3183

Disentangled Motion Modeling for Video Frame Interpolation

Jaihyun Lew, Jooyoung Choi, Chaehun Shin et al.

AAAI 2025 • arXiv:2406.17256
#3184

StyO: Stylize Your Face in Only One-Shot

Bonan Li, Zicheng Zhang, Xuecheng Nie et al.

AAAI 2025 • arXiv:2303.03231
#3185

FEAST-Mamba: FEAture and SpaTial Aware Mamba Network with Bidirectional Orthogonal Fusion for Cross-Modal Point Cloud Segmentation

Chade Li, Pengju Zhang, Bo Liu et al.

AAAI 2025
#3186

RemDet: Rethinking Efficient Model Design for UAV Object Detection

Chen Li, Rui Zhao, Zeyu Wang et al.

AAAI 2025 • arXiv:2412.10040
#3187

U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation

Chenxin Li, Xinyu Liu, Wuyang Li et al.

AAAI 2025 • arXiv:2406.02918
#3188

Consistency of Compositional Generalization Across Multiple Levels

Chuanhao Li, Zhen Li, Chenchen Jing et al.

AAAI 2025 • arXiv:2412.13636
#3189

An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques

Chunxiao Li, Xiaoxiao Wang, Boming Miao et al.

AAAI 2025 • arXiv:2412.09063
#3190

Cascaded Diffusion Models for Virtual Try-On: Improving Control and Resolution

Guangyuan Li, Yongkang Wang, Junsheng Luan et al.

AAAI 2025
#3191

MaskViM: Domain Generalized Semantic Segmentation with State Space Models

Jiahao Li, Yang Lu, Yuan Xie et al.

AAAI 2025
#3192

Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation

Ke Li, Gengyu Lyu, Hao Chen et al.

AAAI 2025
#3193

Similar Modality Enhancement and Action Consistency Learning for Weakly Supervised Temporal Action Localization

Maodong Li, Chao Zheng, Jian Wang et al.

AAAI 2025
#3194

REGNav: Room Expert Guided Image-Goal Navigation

Pengna Li, Kangyi Wu, Jingwen Fu et al.

AAAI 2025 • arXiv:2502.10785
#3195

Region-aware Difference Distilling with Attribute-guided Contrastive Regularization for Change Captioning

Rong Li, Liang Li, Jiehua Zhang et al.

AAAI 2025
#3196

Enhancing Generalizability via Utilization of Unlabeled Data for Occupancy Perception

Ruihang Li, Tao Li, Shanding Ye et al.

AAAI 2025
#3197

A Compact Implicit Neural Representation for Efficient Storage of Massive 4D Functional Magnetic Resonance Imaging

Ruoran Li, Runzhao Yang, Wenxin Xiang et al.

AAAI 2025 • arXiv:2312.00082
#3198

DigitalLLaVA: Incorporating Digital Cognition Capability for Physical World Comprehension in Multimodal LLMs

Shiyu Li, Pengxu Wei, Pengchong Qiao et al.

AAAI 2025
#3199

Transferable Adversarial Face Attack with Text Controlled Attribute

Wenyun Li, Zheng Zhang, Xiangyuan Lan et al.

AAAI 2025 • arXiv:2412.11735
#3200

MambaLCT: Boosting Tracking via Long-term Context State Space Model

Xiaohai Li, Bineng Zhong, Qihua Liang et al.

AAAI 2025 • arXiv:2412.13615