Most Cited AAAI "fixed-viewpoint video" Papers

5,317 papers found • Page 17 of 27

#3201

HVDualformer: Histogram-Vision Dual Transformer for White Balance

Yan-Tsung Peng, Guan-Rong Chen

AAAI 2025paper
#3202

Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance

Duc-Hai Pham, Duc-Dung Nguyen, Anh Pham et al.

AAAI 2025paperarXiv:2408.11559
#3203

Leveraging Anatomical Consistency for Multi-Object Detection in Ultrasound Images via Source-free Unsupervised Domain Adaptation

Bin Pu, Xingguo Lv, Jiewen Yang et al.

AAAI 2025paper
#3204

Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery

Jiahao Qi, Xingyue Liu, Chen Chen et al.

AAAI 2025paper
#3205

PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement

Wei Qian, Gaoji Su, Dan Guo et al.

AAAI 2025paper
#3206

Holistic Correction with Object Prototype for Video Object Segmentation

Shengye Qiao, Changqun Xia, Yanjie Liang et al.

AAAI 2025paper
#3207

Integrating Low-Level Visual Cues for Enhanced Unsupervised Semantic Segmentation

Yuhao Qing, Dan Zeng, Shaorong Xie et al.

AAAI 2025paper
#3208

PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation

Shoumeng Qiu, Xinrun Li, Xiangyang Xue et al.

AAAI 2025paperarXiv:2412.14821
#3209

High-Fidelity Polarimetric Implicit 3D Reconstruction with View-Dependent Physical Representation

Yu Qiu, Sijia Wen, Hainan Zhang et al.

AAAI 2025paper
#3210

HSOD-BIT-V2: A Challenging Benchmark for Hyperspectral Salient Object Detection

Yuhao Qiu, Shuyan Bai, Tingfa Xu et al.

AAAI 2025paper
#3211

Universal Features Guided Zero-Shot Category-Level Object Pose Estimation

Wentian Qu, Chenyu Meng, Heng Li et al.

AAAI 2025paperarXiv:2501.02831
#3212

GHOST: Gaussian Hypothesis Open-Set Technique

Ryan Rabinowitz, Steve Cruz, Manuel Günther et al.

AAAI 2025paperarXiv:2502.03359
#3213

CDTR: Semantic Alignment for Video Moment Retrieval Using Concept Decomposition Transformer

Ran Ran, Jiwei Wei, Xiangyi Cai et al.

AAAI 2025paper
#3214

Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

AAAI 2025paperarXiv:2412.18844
#3215

GenHMR: Generative Human Mesh Recovery

Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Pu Wang et al.

AAAI 2025paperarXiv:2412.14444
#3216

FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models

Mohammadreza Samadi, Fred X. Han, Mohammad Salameh et al.

AAAI 2025paperarXiv:2408.08495
#3217

PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks

Sheng Shang, Chenglong Zhao, Ruixin Zhang et al.

AAAI 2025paperarXiv:2503.02547
#3218

Video Summarization Using Denoising Diffusion Probabilistic Model

Zirui Shang, Yubo Zhu, Hongxi Li et al.

AAAI 2025paperarXiv:2412.08357
#3219

IMAGDressing-v1: Customizable Virtual Dressing

Fei Shen, Xin Jiang, Xin He et al.

AAAI 2025paperarXiv:2407.12705
#3220

In2NeCT: Inter-class and Intra-class Neural Collapse Tuning for Semantic Segmentation of Imbalanced Remote Sensing Images

Junao Shen, Qiyun Hu, Tian Feng et al.

AAAI 2025paper
#3221

Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity

Tianqi Shen, Shaohua Liu, Jiaqi Feng et al.

AAAI 2025paperarXiv:2412.16619
#3222

Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera

Haixin Shi, Yinlin Hu, Daniel Koguciuk et al.

AAAI 2025paperarXiv:2405.05858
#3223

Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes

Ji Shi, Xianghua Ying, Ruohao Guo et al.

AAAI 2025paperarXiv:2501.09460
#3224

Neural Block Compression: Variable Bitrates Feature Blocks for Texture Representation

Rui Shi, Yishun Dou, Zhong Zheng et al.

AAAI 2025paper
#3225

HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection

Zican Shi, Jing Hu, Jie Ren et al.

AAAI 2025paperarXiv:2412.10116
#3226

SdalsNet: Self-Distilled Attention Localization and Shift Network for Unsupervised Camouflaged Object Detection

Peiyao Shou, Yixiu Liu, Wei Wang et al.

AAAI 2025paper
#3227

OGP-Net: Optical Guidance Meets Pixel-Level Contrastive Distillation for Robust Multi-Modal and Missing Modality Segmentation

Aniruddh Sikdar, Jayant Teotia, Suresh Sundaram

AAAI 2025paper
#3228

Fine-Grained Perception in Panoramic Scenes: A Novel Task, Dataset, and Method for Object Importance Ranking

Jia Song, Chenglizhao Chen, Xu Yu et al.

AAAI 2025paper
#3229

CtrlAvatar: Controllable Avatars Generation via Disentangled Invertible Networks

Wenfeng Song, Yang Ding, Fei Hou et al.

AAAI 2025paper
#3230

ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps

Xingke Song, Xiaoying Yang, Chenglin Yao et al.

AAAI 2025paperarXiv:2504.09608
#3231

Temporal Coherent Object Flow for Multi-Object Tracking

Zikai Song, Run Luo, Lintao Ma et al.

AAAI 2025paper
#3232

Toward Improving Robustness and Accuracy in Unsupervised Domain Adaptation

Aishwarya Soni, Tanima Dutta

AAAI 2025paper
#3233

Hierarchical Vector Quantization for Unsupervised Action Segmentation

Federico Spurio, Emad Bahrami, Gianpiero Francesca et al.

AAAI 2025paperarXiv:2412.17640
#3234

Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Spare-Coding Transformer

Lei Su, Xiaochen Ma, Xuekang Zhu et al.

AAAI 2025paperarXiv:2412.14598
#3235

EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution

Xi Su, Xiangfei Shen, Mingyang Wan et al.

AAAI 2025paperarXiv:2409.04050
#3236

Dual-branch Graph Feature Learning for NLOS Imaging

Xiongfei Su, Tianyi Zhu, Lina Liu et al.

AAAI 2025paperarXiv:2502.19683
#3237

Explicit Relational Reasoning Network for Scene Text Detection

Yuchen Su, Zhineng Chen, Yongkun Du et al.

AAAI 2025paperarXiv:2412.14692
#3238

3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving

Boyi Sun, Yuhang Liu, Xingxia Wang et al.

AAAI 2025paperarXiv:2405.15286
#3239

NeuralFlix: A Simple While Effective Framework for Semantic Decoding of Videos from Non-invasive Brain Recordings

Jingyuan Sun, Mingxiao Li, Marie-Francine Moens

AAAI 2025paper
#3240

Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation

Shoukun Sun, Min Xian, Tiankai Yao et al.

AAAI 2025paperarXiv:2412.12771
#3241

M2Flow: A Motion Information Fusion Framework for Enhanced Unsupervised Optical Flow Estimation in Autonomous Driving

Xunpei Sun, Gang Chen, Zuoxun Hou

AAAI 2025paper
#3242

Leveraging Large Vision-Language Model as User Intent-Aware Encoder for Composed Image Retrieval

Zelong Sun, Dong Jing, Guoxing Yang et al.

AAAI 2025paperarXiv:2412.11087
#3243

C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection

Chuangchuang Tan, Renshuai Tao, Huan Liu et al.

AAAI 2025paperarXiv:2408.09647
#3244

Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation

Feilong Tang, Zhongxing Xu, Ming Hu et al.

AAAI 2025paperarXiv:2412.19871
#3245

MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval

Haoran Tang, Meng Cao, Jinfa Huang et al.

AAAI 2025paperarXiv:2408.10575
#3246

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

Tao Tang, Dafeng Wei, Zhengyu Jia et al.

AAAI 2025paperarXiv:2401.01065
#3247

More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding

Yuan Tang, Xu Han, Xianzhi Li et al.

AAAI 2025paperarXiv:2408.15966
#3248

RAGG: Retrieval-Augmented Grasp Generation Model

Zhenhua Tang, Bin Zhu, Yanbin Hao et al.

AAAI 2025paper
#3249

From Representation Space to Prognostic Insights: Whole Slide Image Generation with Hierarchical Diffusion Model for Survival Prediction

Zhihao Tang, Xi Zhang, Chaozhuo Li

AAAI 2025paper
#3250

3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling

Zichen Tang, Hongyu Yang, Hanchen Zhang et al.

AAAI 2025paper
#3251

Stitch, Contrast, and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos

Haitao Tian, Pierre Payeur

AAAI 2025paper
#3252

Unsupervised Self-Prior Embedding Neural Representation for Iterative Sparse-View CT Reconstruction

Xuanyu Tian, Lixuan Chen, Qing Wu et al.

AAAI 2025paperarXiv:2502.05445
#3253

AI-generated Image Quality Assessment in Visual Communication

Yu Tian, Yixuan Li, Baoliang Chen et al.

AAAI 2025paperarXiv:2412.15677
#3254

G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o

Tony Cheng Tong, Sirui He, Zhiwen Shao et al.

AAAI 2025paperarXiv:2412.13647
#3255

Memory-Augmented Re-Completion for 3D Semantic Scene Completion

Yu-Wen Tseng, Sheng-Ping Yang, Jhih-Ciang Wu et al.

AAAI 2025paper
#3256

VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language

Zishuo Wan, Yu Gao, Wanyuan Pang et al.

AAAI 2025paperarXiv:2501.03482
#3257

ParGo: Bridging Vision-Language with Partial and Global Views

An-Lan Wang, Bin Shan, Wei Shi et al.

AAAI 2025paperarXiv:2408.12928
#3258

RA-GAR: A Richly Annotated Benchmark for Gait Attribute Recognition

Chenye Wang, Saihui Hou, Aoqi Li et al.

AAAI 2025paper
#3259

Towards Efficient Object Re-Identification with a Novel Cloud-Edge Collaborative Framework

Chuanming Wang, Yuxin Yang, Mengshi Qi et al.

AAAI 2025paperarXiv:2401.02041
#3260

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance

Cunzheng Wang, Ziyuan Guo, Yuxuan Duan et al.

AAAI 2025paperarXiv:2409.01347
#3261

A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection

Fu Wang, Yanghao Zhang, Xiangyu Yin et al.

AAAI 2025paperarXiv:2412.13913
#3262

Scene Graph-Grounded Image Generation

Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.

AAAI 2025paper
#3263

S³-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation

Gui Wang, Yuexiang Li, Wenting Chen et al.

AAAI 2025paper
#3264

BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs

Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.

AAAI 2025paperarXiv:2409.07304
#3265

EMControl: Adding Conditional Control to Text-to-Image Diffusion Models via Expectation-Maximization

He Wang, Longquan Dai, Jinhui Tang

AAAI 2025paper
#3266

M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images

Hongyi Wang, Xiuju Du, Jing Liu et al.

AAAI 2025paperarXiv:2409.15092
#3267

RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution

Jiangang Wang, Qingnan Fan, Jinwei Chen et al.

AAAI 2025paperarXiv:2412.07149
#3268

MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding

Jiaze Wang, Yi Wang, Ziyu Guo et al.

AAAI 2025paperarXiv:2405.18523
#3269

OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

Junjie Wang, Bin Chen, Bin Kang et al.

AAAI 2025paperarXiv:2405.17913
#3270

InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models

Kai Wang, Shaozhang Niu, Qixian Hao et al.

AAAI 2025paperarXiv:2501.02816
#3271

Tracking Everything Everywhere across Multiple Cameras

Li-Heng Wang, YuJu Cheng, Tyng-Luh Liu

AAAI 2025paper
#3272

VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion

Meng Wang, Huilong Pi, Ruihui Li et al.

AAAI 2025paperarXiv:2503.06219
#3273

Deep Multi-modal Graph Clustering via Graph Transformer Network

Qianqian Wang, Haiming Xu, Zihao Zhang et al.

AAAI 2025paper
#3274

The Parables of the Mustard Seed and the Yeast: Extremely Low-Budget, High-Performance Nighttime Semantic Segmentation

Shiqin Wang, Xin Xu, Haoyang Chen et al.

AAAI 2025paper
#3275

GFlow: Recovering 4D World from Monocular Video

Shizun Wang, Xingyi Yang, Qiuhong Shen et al.

AAAI 2025paperarXiv:2405.18426
#3276

Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph

Weihao Wang, Yu Lan, Mingyu You et al.

AAAI 2025paper
#3277

MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences

Weitao Wang, Haoran Xu, Yuxiao Yang et al.

AAAI 2025paperarXiv:2412.06614
#3278

FreeGen: Bridging Visual-Linguistic Discrepancies Towards Diffusion-based Pixel-level Data Synthesis

Wenzhuang Wang, Mingcan Ma, Yong Chen et al.

AAAI 2025paper
#3279

DCTMamba: Advancing JPEG Image Restoration Through Long-Sequence Modeling and Adaptive Frequency Strategy

Xi Wang, Xueyang Fu, Liang Li et al.

AAAI 2025paper
#3280

From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach

Xilin Wang, Jia Zheng, Yuanchao Hu et al.

AAAI 2025paperarXiv:2412.11892
#3281

Lifting Scheme-Based Implicit Disentanglement of Emotion-Related Facial Dynamics in the Wild

Xingjian Wang, Li Chai

AAAI 2025paperarXiv:2412.13168
#3282

MIMTrack: In-Context Tracking via Masked Image Modeling

Xingmei Wang, Guohao Nie, Jiaxiang Meng et al.

AAAI 2025paper
#3283

From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization

Xueyi Wang, Lele Zhang, Zheng Fan et al.

AAAI 2025paper
#3284

RefDetector: A Simple Yet Effective Matching-based Method for Referring Expression Comprehension

Yabing Wang, Zhuotao Tian, Zheng Qin et al.

AAAI 2025paper
#3285

Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension

Yaxian Wang, Henghui Ding, Shuting He et al.

AAAI 2025paperarXiv:2501.01416
#3286

Breaking Barriers in Physical-World Adversarial Examples: Improving Robustness and Transferability via Robust Feature

Yichen Wang, Yuxuan Chou, Ziqi Zhou et al.

AAAI 2025paperarXiv:2412.16958
#3287

Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units

Youjia Wang, Yiwen Wu, Hengan Zhou et al.

AAAI 2025paperarXiv:2402.03944
#3288

Re-Attentional Controllable Video Diffusion Editing

Yuanzhi Wang, Yong Li, Mengyi Liu et al.

AAAI 2025paperarXiv:2412.11710
#3289

MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt

Yuhao Wang, Xuehu Liu, Tianyu Yan et al.

AAAI 2025paperarXiv:2412.10707
#3290

IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis

Yuji Wang, Jingchen Ni, Yong Liu et al.

AAAI 2025paperarXiv:2503.00936
#3291

Target Scanpath-Guided 360-Degree Image Enhancement

Yujia Wang, Fang-Lue Zhang, Neil A. Dodgson

AAAI 2025paper
#3292

DualNet: Robust Self-Supervised Stereo Matching with Pseudo-Label Supervision

Yun Wang, Jiahao Zheng, Chenghao Zhang et al.

AAAI 2025paper
#3293

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

Zeyu Wang, Chen Li, Huiying Xu et al.

AAAI 2025paperarXiv:2406.05835
#3294

Style Nursing with Spatial and Semantic Guidance for Zero-Shot Traffic Scene Style Transfer

Zhen Wang, Zihang Lin, Meng Yuan et al.

AAAI 2025paper
#3295

Thermal-Aware Low-Light Image Enhancement: A Real-World Benchmark and a New Light-Weight Model

Zhen Wang, Yaozu Wu, Dongyuan Li et al.

AAAI 2025paper
#3296

Attention-Imperceptible Backdoor Attacks on Vision Transformers

Zhishen Wang, Rui Wang, Lihua Jing

AAAI 2025paper
#3297

LLM-RG4: Flexible and Factual Radiology Report Generation Across Diverse Input Contexts

Zhuhao Wang, Yihua Sun, Zihan Li et al.

AAAI 2025paperarXiv:2412.12001
#3298

MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds

Zihao Wang, Yiming Huang, Gengyu Lyu et al.

AAAI 2025paper
#3299

GlyphSR: A Simple Glyph-Aware Framework for Scene Text Image Super-Resolution

Baole Wei, Yuxuan Zhou, Liangcai Gao et al.

AAAI 2025paper
#3300

Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning

Yang Wei, Jingyu Tan, Guowen Xu et al.

AAAI 2025paper
#3301

Achieving Lightweight Super-Resolution for Real-Time Computer Graphics

Yu Wen, Chen Zhang, Chenhao Xie et al.

AAAI 2025paper
#3302

Multi-axis Prompt and Multi-dimension Fusion Network for All-in-one Weather-degraded Image Restoration

Yuanbo Wen, Tao Gao, Jing Zhang et al.

AAAI 2025paper
#3303

USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation

Wanjiang Weng, Hongsong Wang, Junbo Wang et al.

AAAI 2025paperarXiv:2412.09220
#3304

Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

Dantong Wu, Zhiqiang Chen, Tianjiao Du et al.

AAAI 2025paper
#3305

Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation

Dongyue Wu, Zilin Guo, Li Yu et al.

AAAI 2025paperarXiv:2412.12672
#3306

SVRMamba: Slice-to-Volume Reconstruction from Multiple MRI Stacks with Slice Sequence Guided Mamba

Jiangjie Wu, Hongjiang Wei, Yuyao Zhang

AAAI 2025paper
#3307

VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval

Peng Wu, Wanshun Su, Xiangteng He et al.

AAAI 2025paper
#3308

Realistic Noise Synthesis with Diffusion Models

Qi Wu, Mingyan Han, Ting Jiang et al.

AAAI 2025paperarXiv:2305.14022
#3309

PanAdapter: Two-Stage Fine-Tuning with Spatial-Spectral Priors Injecting for Pansharpening

RuoCheng Wu, Zien Zhang, Shangqi Deng et al.

AAAI 2025paperarXiv:2409.06980
#3310

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

Tao Wu, Yong Zhang, Xintao Wang et al.

AAAI 2025paperarXiv:2408.13239
#3311

Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation

Yirui Wu, Yuhang Xia, Hao Li et al.

AAAI 2025paper
#3312

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

Yongliang Wu, Wenbo Zhu, Jiawang Cao et al.

AAAI 2025paperarXiv:2412.08879
#3313

MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency

Yue Wu, Zhipeng Wang, Yongzhe Yuan et al.

AAAI 2025paper
#3314

Unified Knowledge Maintenance Pruning and Progressive Recovery with Weight Recalling for Large Vision-Language Models

Zimeng Wu, Jiaxin Chen, Yunhong Wang

AAAI 2025paper
#3315

RETRACTED: GEONet: Global Enhancement and Optimization Network for Lane Detection

Suyang Xi, Yunhao Liu, Hong Ding et al.

AAAI 2025paper
#3316

PlaNet: Learning to Mitigate Atmospheric Turbulence in Planetary Images

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

AAAI 2025paper
#3317

CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing

Xiaole Xian, Xilin He, Zenghao Niu et al.

AAAI 2025paperarXiv:2412.13565
#3318

SMR-Net: Semantic-Guided Mutually Reinforcing Network for Cross-Modal Image Fusion and Salient Object Detection

Guobao Xiao, Xinyu Liu, Zebin Lin et al.

AAAI 2025paper
#3319

Boosting Vision State Space Model with Fractal Scanning

Haoke Xiao, Lv Tang, Peng-tao Jiang et al.

AAAI 2025paper
#3320

Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval

Jian Xiao, Zhenzhen Hu, Jia Li et al.

AAAI 2025paperarXiv:2410.06618
#3321

Cross-modulated Attention Transformer for RGBT Tracking

Yun Xiao, Jiacong Zhao, Andong Lu et al.

AAAI 2025paperarXiv:2408.02222
#3322

Omni-Query Active Learning for Source-Free Domain Adaptive Cross-Modality 3D Semantic Segmentation

Jianxiang Xie, Yao Wu, Yachao Zhang et al.

AAAI 2025paper
#3323

TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning

Jingjing Xie, Yuxin Zhang, Jun Peng et al.

AAAI 2025paperarXiv:2412.08176
#3324

Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration

Lianxin Xie, Bingbing Zheng, Wen Xue et al.

AAAI 2025paperarXiv:2501.09960
#3325

Expand VSR Benchmark for VLLM to Expertize in Spatial Rules

Peijin Xie, Lin Sun, Bingquan Liu et al.

AAAI 2025paperarXiv:2412.18224
#3326

PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis

Yifan Xie, Tao Feng, Xin Zhang et al.

AAAI 2025paperarXiv:2412.08504
#3327

HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models

Zhifeng Xie, Hao Li, Huiming Ding et al.

AAAI 2025paperarXiv:2401.07450
#3328

Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation

Jingqiao Xiu, Mengze Li, Zongxin Yang et al.

AAAI 2025paper
#3329

DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles

Chejian Xu, Aleksandr Petiushko, Ding Zhao et al.

AAAI 2025paper
#3330

FR²Seg: Continual Segmentation Across Multiple Sites via Fourier Style Replay and Adaptive Consistency Regularization

Cheng Xu, Weiwen Zhang, Hongrui Zhang et al.

AAAI 2025paper
#3331

Less Is More: Token Context-Aware Learning for Object Tracking

Chenlong Xu, Bineng Zhong, Qihua Liang et al.

AAAI 2025paperarXiv:2501.00758
#3332

3DHumanEdit: Multi-modal Body Part-aware Conditioning Information Integration for 3D Human Manipulation

FeiFan Xu, Tianyi Chen, Fan Yang et al.

AAAI 2025paper
#3333

Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model

Jiahua Xu, Dawei Zhou, Lei Hu et al.

AAAI 2025paperarXiv:2412.07590
#3334

OmniSR: Shadow Removal Under Direct and Indirect Lighting

Jiamin Xu, Zelong Li, Yuxin Zheng et al.

AAAI 2025paperarXiv:2410.01719
#3335

Multiple Feature Refining Network for Visual Emotion Distribution Learning

Qinfu Xu, Shaozu Yuan, Yiwei Wei et al.

AAAI 2025paper
#3336

SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection

Ruoyu Xu, Zhiyu Xiang, Chenwei Zhang et al.

AAAI 2025paperarXiv:2412.14571
#3337

LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data

Shaocong Xu, Pengfei Li, Qianpu Sun et al.

AAAI 2025paperarXiv:2309.10230
#3338

Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models

Yifang Xu, Yunzhuo Sun, Benxiang Zhai et al.

AAAI 2025paperarXiv:2501.07972
#3339

HOIMamba: Efficient Mamba-based Disentangled Progressive Learning for HOI Detection

Yongchao Xu, Jiawei Liu, Sen Tao et al.

AAAI 2025paper
#3340

OOTDiffusion: Outfitting Fusion Based Latent Diffusion for Controllable Virtual Try-On

Yuhao Xu, Tao Gu, Weifeng Chen et al.

AAAI 2025paperarXiv:2403.01779
#3341

FLAME: Learning to Navigate with Multimodal LLM in Urban Environments

Yunzhe Xu, Yiyuan Pan, Zhe Liu et al.

AAAI 2025paperarXiv:2408.11051
#3342

FATE: Feature-Adapted Parameter Tuning for Vision-Language Models

Zhengqin Xu, Zelin Peng, Xiaokang Yang et al.

AAAI 2025paper
#3343

Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP

Zhongxing Xu, Feilong Tang, Zhe Chen et al.

AAAI 2025paperarXiv:2412.19650
#3344

RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting

Wen Xue, Chun Ding, Ruotao Xu et al.

AAAI 2025paper
#3345

Physical Marker: Revealing Invisible Hyperlinks Hidden in Printed Trademarks

Yuliang Xue, Lei Tan, Guobiao Li et al.

AAAI 2025paper
#3346

Towards Universal Rainy Image Restoration: Benchmark and Baseline

Hujie Yan

AAAI 2025paper
#3347

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

Ke Yan, Qing Cai, Fan Zhang et al.

AAAI 2025paperarXiv:2412.15526
#3348

Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models

YangTian Yan, Jinyu Tian

AAAI 2025paperarXiv:2503.22205
#3349

Robust Image Hashing Based on Contrastive Masked Autoencoder with Weak-Strong Augmentation Alignment

Cundian Yang, Guibo Luo, Yuesheng Zhu et al.

AAAI 2025paper
#3350

PlanLLM: Video Procedure Planning with Refinable Large Language Models

Dejie Yang, Zijing Zhao, Yang Liu

AAAI 2025paperarXiv:2412.19139
#3351

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection

Enquan Yang, Peng Xing, Hanyang Sun et al.

AAAI 2025paper
#3352

Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution

Jiarui Yang, Tao Dai, Yufei Zhu et al.

AAAI 2025paperarXiv:2412.16552
#3353

SMamba: Sparse Mamba for Event-based Object Detection

Nan Yang, Yang Wang, Zhanwen Liu et al.

AAAI 2025paperarXiv:2501.11971
#3354

One-Shot Reference-based Structure-Aware Image to Sketch Synthesis

Rui Yang, Honghong Yang, Li Zhao et al.

AAAI 2025paper
#3355

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

Senqiao Yang, Jiaming Liu, Renrui Zhang et al.

AAAI 2025paperarXiv:2312.14074
#3356

Asymmetric Hierarchical Difference-aware Interaction Network for Event-guided Motion Deblurring

Wen Yang, Jinjian Wu, Leida Li et al.

AAAI 2025paper
#3357

Dual Information Purification for Lightweight SAR Object Detection

Xi Yang, Jiachen Sun, Songsong Duan et al.

AAAI 2025paper
#3358

DriveGazen: Event-Based Driving Status Recognition Using Conventional Camera

Xiaoyin Yang, Xin Yang

AAAI 2025paperarXiv:2412.11753
#3359

Semantic Segmentation on Raindrop Degraded Images Using Two-Stage Dual Teacher-Student Learning

Xin Yang, Wending Yan, Yuan Yuan et al.

AAAI 2025paper
#3360

ERF: A Benchmark Dataset for Robust Semantic Segmentation Under Extreme Rainfall Conditions

Xin Yang, Xin Zhang, Xinchao Wang

AAAI 2025paper
#3361

FreqTS: Frequency-Aware Token Selection for Accelerating Diffusion Models

Xinye Yang, Yuxin Yang, Haoran Pang et al.

AAAI 2025paper
#3362

Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving

Yu Yang, Jianbiao Mei, Yukai Ma et al.

AAAI 2025paperarXiv:2408.14197
#3363

UAWTrack: Universal 3D Single Object Tracking in Adverse Weather

Yuxiang Yang, Hongjie Gu, Yingqi Deng et al.

AAAI 2025paper
#3364

RealPortrait: Realistic Portrait Animation with Diffusion Transformers

Zejun Yang, Huawei Wei, Zhisheng Wang

AAAI 2025paper
#3365

Single Image Rolling Shutter Removal with Diffusion Models

Zhanglei Yang, Haipeng Li, Mingbo Hong et al.

AAAI 2025paperarXiv:2407.02906
#3366

MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation

Zhifei Yang, Keyang Lu, Chao Zhang et al.

AAAI 2025paperarXiv:2502.05874
#3367

MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

AAAI 2025paperarXiv:2412.11076
#3368

MM-Tracker: Motion Mamba for UAV-platform Multiple Object Tracking

Mufeng Yao, Jinlong Peng, Qingdong He et al.

AAAI 2025paper
#3369

As Pseudo-Label Free as Possible: Leveraging Adaptive Feature Generation for Sparsely Annotated Object Detection

Shuilian Yao, Yu Liu, Qi Jia et al.

AAAI 2025paper
#3370

Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation

Chengyang Ye, Yunzhi Zhuge, Pingping Zhang

AAAI 2025paperarXiv:2412.19492
#3371

VersaFusion: A Versatile Diffusion-Based Framework for Fine-Grained Image Editing and Enhancement

Haocun Ye, Xinlong Jiang, Chenlong Gao et al.

AAAI 2025paper
#3372

PromptHaze: Prompting Real-world Dehazing via Depth Anything Model

Tian Ye, Sixiang Chen, Haoyu Chen et al.

AAAI 2025paper
#3373

Optimized Gradient Clipping for Noisy Label Learning

Xichen Ye, Yifan Wu, Weizhong Zhang et al.

AAAI 2025paperarXiv:2412.08941
#3374

Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language

Jeong Hun Yeo, Chae Won Kim, Hyunjun Kim et al.

AAAI 2025paperarXiv:2409.00986
#3375

FlexDataset: Crafting Annotated Dataset Generation for Diverse Applications

Ellen Yi-Ge, Leo Shawn

AAAI 2025paper
#3376

ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition

Seungdong Yoa, Seungjun Lee, Hye-Seung Cho et al.

AAAI 2025paperarXiv:2412.16491
#3377

FOCUS: Towards Universal Foreground Segmentation

Zuyao You, Lingyu Kong, Lingchen Meng et al.

AAAI 2025paperarXiv:2501.05238
#3378

SGFormer: Semantic-Geometry Fusion Transformer for Multi-modal 3D Panoptic Segmentation

Hongqi Yu, Sixian Chan, Xiaolong Zhou et al.

AAAI 2025paper
#3379

Separating the Wheat from the Chaff: Spatio-Temporal Transformer with View-interweaved Attention for Photon-Efficient Depth Sensing

Letian Yu, Jiaxi Yang, Bo Dong et al.

AAAI 2025paper
#3380

ReMoGPT: Part-Level Retrieval-Augmented Motion-Language Models

Qing Yu, Mikihiro Tanaka, Kent Fujiwara

AAAI 2025paper
#3381

STGC-NeRF: Spatial-Temporal Geometric Consistency for LiDAR Neural Radiance Fields in Dynamic Scenes

Shangshu Yu, Xiaotian Sun, Wen Li et al.

AAAI 2025paper
#3382

Fine-grained Adaptive Visual Prompt for Generative Medical Visual Question Answering

Ting Yu, Zixuan Tong, Jun Yu et al.

AAAI 2025paper
#3383

OTPNet: ODE-inspired Tuning-free Proximal Network for Remote Sensing Image Fusion

Wei Yu, Zonglin Li, Qinglin Liu et al.

AAAI 2025paper
#3384

Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective

Xinmiao Yu, Xiaocheng Feng, Yun Li et al.

AAAI 2025paperarXiv:2412.17787
#3385

Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP

Yating Yu, Congqi Cao, Yueran Zhang et al.

AAAI 2025paperarXiv:2412.09895
#3386

OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition

Yiheng Yu, Sheng Liu, Yuan Feng et al.

AAAI 2025paperarXiv:2503.08205
#3387

Where Precision Meets Efficiency: Transformation Diffusion Model for Point Cloud Registration

Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.

AAAI 2025paper
#3388

Efficient Neural Network Encoding for 3D Color Lookup Tables

Vahid Zehtab, David B. Lindell, Marcus A. Brubaker et al.

AAAI 2025paperarXiv:2412.15438
#3389

Gaze Label Alignment: Alleviating Domain Shift for Gaze Estimation

Guanzhong Zeng, Jingjing Wang, Zefu Xu et al.

AAAI 2025paperarXiv:2412.15601
#3390

TGFormer: Transformer with Track Query Group for Multi-Object Tracking

Rui Zeng, Yuanzhou Huang, Songwei Pei

AAAI 2025paper
#3391

World Knowledge-Enhanced Reasoning Using Instruction-Guided Interactor in Autonomous Driving

Mingliang Zhai, Cheng Li, Zengyuan Guo et al.

AAAI 2025paperarXiv:2412.06324
#3392

DetRF: Detachable Novel Views Synthesis of Dynamic Scenes Using Backdrop-Driven Neural Radiance Fields

Boyu Zhang, Zheng Zhu, Wenbo Xu

AAAI 2025paper
#3393

Training-Free and Hardware-Friendly Acceleration for Diffusion Models via Similarity-based Token Pruning

Evelyn Zhang, Jiayi Tang, Xuefei Ning et al.

AAAI 2025paper
#3394

When Open-Vocabulary Visual Question Answering Meets Causal Adapter: Benchmark and Approach

Feifei Zhang, Zhaoyi Zhang, Xi Zhang et al.

AAAI 2025paper
#3395

DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

Jiaxin Zhang, Wentao Yang, Songxuan Lai et al.

AAAI 2025paperarXiv:2406.19101
#3396

Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm

Jinrong Zhang, Penghui Wang, Chunxiao Liu et al.

AAAI 2025paperarXiv:2412.10719
#3397

R^2-Art: Category-Level Articulation Pose Estimation from Single RGB Image via Cascade Render Strategy

Li Zhang, Haonan Jiang, Yukang Huo et al.

AAAI 2025paper
#3398

Common Sense Bias Modeling for Classification Tasks

Miao Zhang, Zee Fryer, Ben Colman et al.

AAAI 2025paperarXiv:2401.13213
#3399

IRMamba: Pixel Difference Mamba with Layer Restoration for Infrared Small Target Detection

Mingjin Zhang, Xiaolong Li, Fei Gao et al.

AAAI 2025paper
#3400

MOCID: Motion Context and Displacement Information Learning for Moving Infrared Small Target Detection

Mingjin Zhang, Yuanjun Ouyang, Fei Gao et al.

AAAI 2025paper