Most Cited AAAI "physics-guided architecture" Papers
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
Duc-Hai Pham, Duc-Dung Nguyen, Anh Pham et al.
Leveraging Anatomical Consistency for Multi-Object Detection in Ultrasound Images via Source-free Unsupervised Domain Adaptation
Bin Pu, Xingguo Lv, Jiewen Yang et al.
Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery
Jiahao Qi, Xingyue Liu, Chen Chen et al.
PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement
Wei Qian, Gaoji Su, Dan Guo et al.
Holistic Correction with Object Prototype for Video Object Segmentation
Shengye Qiao, Changqun Xia, Yanjie Liang et al.
Integrating Low-Level Visual Cues for Enhanced Unsupervised Semantic Segmentation
Yuhao Qing, Dan Zeng, Shaorong Xie et al.
PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation
Shoumeng Qiu, Xinrun Li, Xiangyang Xue et al.
High-Fidelity Polarimetric Implicit 3D Reconstruction with View-Dependent Physical Representation
Yu Qiu, Sijia Wen, Hainan Zhang et al.
HSOD-BIT-V2: A Challenging Benchmark for Hyperspectral Salient Object Detection
Yuhao Qiu, Shuyan Bai, Tingfa Xu et al.
Universal Features Guided Zero-Shot Category-Level Object Pose Estimation
Wentian Qu, Chenyu Meng, Heng Li et al.
GHOST: Gaussian Hypothesis Open-Set Technique
Ryan Rabinowitz, Steve Cruz, Manuel Günther et al.
CDTR: Semantic Alignment for Video Moment Retrieval Using Concept Decomposition Transformer
Ran Ran, Jiwei Wei, Xiangyi Cai et al.
Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path
Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.
GenHMR: Generative Human Mesh Recovery
Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Pu Wang et al.
FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models
Mohammadreza Samadi, Fred X. Han, Mohammad Salameh et al.
PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks
Sheng Shang, Chenglong Zhao, Ruixin Zhang et al.
Video Summarization Using Denoising Diffusion Probabilistic Model
Zirui Shang, Yubo Zhu, Hongxi Li et al.
IMAGDressing-v1: Customizable Virtual Dressing
Fei Shen, Xin Jiang, Xin He et al.
In2NeCT: Inter-class and Intra-class Neural Collapse Tuning for Semantic Segmentation of Imbalanced Remote Sensing Images
Junao Shen, Qiyun Hu, Tian Feng et al.
Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity
Tianqi Shen, Shaohua Liu, Jiaqi Feng et al.
Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera
Haixin Shi, Yinlin Hu, Daniel Koguciuk et al.
Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes
Ji Shi, Xianghua Ying, Ruohao Guo et al.
Neural Block Compression: Variable Bitrates Feature Blocks for Texture Representation
Rui Shi, Yishun Dou, Zhong Zheng et al.
HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection
Zican Shi, Jing Hu, Jie Ren et al.
SdalsNet: Self-Distilled Attention Localization and Shift Network for Unsupervised Camouflaged Object Detection
Peiyao Shou, Yixiu Liu, Wei Wang et al.
OGP-Net: Optical Guidance Meets Pixel-Level Contrastive Distillation for Robust Multi-Modal and Missing Modality Segmentation
Aniruddh Sikdar, Jayant Teotia, Suresh Sundaram
Fine-Grained Perception in Panoramic Scenes: A Novel Task, Dataset, and Method for Object Importance Ranking
Jia Song, Chenglizhao Chen, Xu Yu et al.
CtrlAvatar: Controllable Avatars Generation via Disentangled Invertible Networks
Wenfeng Song, Yang Ding, Fei Hou et al.
ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps
Xingke Song, Xiaoying Yang, Chenglin Yao et al.
Temporal Coherent Object Flow for Multi-Object Tracking
Zikai Song, Run Luo, Lintao Ma et al.
Toward Improving Robustness and Accuracy in Unsupervised Domain Adaptation
Aishwarya Soni, Tanima Dutta
Hierarchical Vector Quantization for Unsupervised Action Segmentation
Federico Spurio, Emad Bahrami, Gianpiero Francesca et al.
Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Spare-Coding Transformer
Lei Su, Xiaochen Ma, Xuekang Zhu et al.
EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution
Xi Su, Xiangfei Shen, Mingyang Wan et al.
Dual-branch Graph Feature Learning for NLOS Imaging
Xiongfei Su, Tianyi Zhu, Lina Liu et al.
Explicit Relational Reasoning Network for Scene Text Detection
Yuchen Su, Zhineng Chen, Yongkun Du et al.
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving
Boyi Sun, Yuhang Liu, Xingxia Wang et al.
NeuralFlix: A Simple While Effective Framework for Semantic Decoding of Videos from Non-invasive Brain Recordings
Jingyuan Sun, Mingxiao Li, Marie-Francine Moens
Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation
Shoukun Sun, Min Xian, Tiankai Yao et al.
M2Flow: A Motion Information Fusion Framework for Enhanced Unsupervised Optical Flow Estimation in Autonomous Driving
Xunpei Sun, Gang Chen, Zuoxun Hou
Leveraging Large Vision-Language Model as User Intent-Aware Encoder for Composed Image Retrieval
Zelong Sun, Dong Jing, Guoxing Yang et al.
C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection
Chuangchuang Tan, Renshuai Tao, Huan Liu et al.
Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation
Feilong Tang, Zhongxing Xu, Ming Hu et al.
MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval
Haoran Tang, Meng Cao, Jinfa Huang et al.
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving
Tao Tang, Dafeng Wei, Zhengyu Jia et al.
More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding
Yuan Tang, Xu Han, Xianzhi Li et al.
RAGG: Retrieval-Augmented Grasp Generation Model
Zhenhua Tang, Bin Zhu, Yanbin Hao et al.
From Representation Space to Prognostic Insights: Whole Slide Image Generation with Hierarchical Diffusion Model for Survival Prediction
Zhihao Tang, Xi Zhang, Chaozhuo Li
3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling
Zichen Tang, Hongyu Yang, Hanchen Zhang et al.
Stitch, Contrast, and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos
Haitao Tian, Pierre Payeur
Unsupervised Self-Prior Embedding Neural Representation for Iterative Sparse-View CT Reconstruction
Xuanyu Tian, Lixuan Chen, Qing Wu et al.
AI-generated Image Quality Assessment in Visual Communication
Yu Tian, Yixuan Li, Baoliang Chen et al.
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o
Tony Cheng Tong, Sirui He, Zhiwen Shao et al.
Memory-Augmented Re-Completion for 3D Semantic Scene Completion
Yu-Wen Tseng, Sheng-Ping Yang, Jhih-Ciang Wu et al.
VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language
Zishuo Wan, Yu Gao, Wanyuan Pang et al.
ParGo: Bridging Vision-Language with Partial and Global Views
An-Lan Wang, Bin Shan, Wei Shi et al.
RA-GAR: A Richly Annotated Benchmark for Gait Attribute Recognition
Chenye Wang, Saihui Hou, Aoqi Li et al.
Towards Efficient Object Re-Identification with a Novel Cloud-Edge Collaborative Framework
Chuanming Wang, Yuxin Yang, Mengshi Qi et al.
Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance
Cunzheng Wang, Ziyuan Guo, Yuxuan Duan et al.
A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection
Fu Wang, Yanghao Zhang, Xiangyu Yin et al.
Scene Graph-Grounded Image Generation
Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.
S³-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation
Gui Wang, Yuexiang Li, Wenting Chen et al.
BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs
Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.
EMControl: Adding Conditional Control to Text-to-Image Diffusion Models via Expectation-Maximization
He Wang, Longquan Dai, Jinhui Tang
M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images
Hongyi Wang, Xiuju Du, Jing Liu et al.
RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution
Jiangang Wang, Qingnan Fan, Jinwei Chen et al.
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
Jiaze Wang, Yi Wang, Ziyu Guo et al.
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
Junjie Wang, Bin Chen, Bin Kang et al.
InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models
Kai Wang, Shaozhang Niu, Qixian Hao et al.
Tracking Everything Everywhere across Multiple Cameras
Li-Heng Wang, YuJu Cheng, Tyng-Luh Liu
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
Meng Wang, Huilong Pi, Ruihui Li et al.
Deep Multi-modal Graph Clustering via Graph Transformer Network
Qianqian Wang, Haiming Xu, Zihao Zhang et al.
The Parables of the Mustard Seed and the Yeast: Extremely Low-Budget, High-Performance Nighttime Semantic Segmentation
Shiqin Wang, Xin Xu, Haoyang Chen et al.
GFlow: Recovering 4D World from Monocular Video
Shizun Wang, Xingyi Yang, Qiuhong Shen et al.
Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph
Weihao Wang, Yu Lan, Mingyu You et al.
MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences
Weitao Wang, Haoran Xu, Yuxiao Yang et al.
FreeGen: Bridging Visual-Linguistic Discrepancies Towards Diffusion-based Pixel-level Data Synthesis
Wenzhuang Wang, Mingcan Ma, Yong Chen et al.
DCTMamba: Advancing JPEG Image Restoration Through Long-Sequence Modeling and Adaptive Frequency Strategy
Xi Wang, Xueyang Fu, Liang Li et al.
From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach
Xilin Wang, Jia Zheng, Yuanchao Hu et al.
Lifting Scheme-Based Implicit Disentanglement of Emotion-Related Facial Dynamics in the Wild
Xingjian Wang, Li Chai
MIMTrack: In-Context Tracking via Masked Image Modeling
Xingmei Wang, Guohao Nie, Jiaxiang Meng et al.
From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization
Xueyi Wang, Lele Zhang, Zheng Fan et al.
RefDetector: A Simple Yet Effective Matching-based Method for Referring Expression Comprehension
Yabing Wang, Zhuotao Tian, Zheng Qin et al.
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
Yaxian Wang, Henghui Ding, Shuting He et al.
Breaking Barriers in Physical-World Adversarial Examples: Improving Robustness and Transferability via Robust Feature
Yichen Wang, Yuxuan Chou, Ziqi Zhou et al.
Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units
Youjia Wang, Yiwen Wu, Hengan Zhou et al.
Re-Attentional Controllable Video Diffusion Editing
Yuanzhi Wang, Yong Li, Mengyi Liu et al.
MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt
Yuhao Wang, Xuehu Liu, Tianyu Yan et al.
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
Yuji Wang, Jingchen Ni, Yong Liu et al.
Target Scanpath-Guided 360-Degree Image Enhancement
Yujia Wang, Fang-Lue Zhang, Neil A. Dodgson
DualNet: Robust Self-Supervised Stereo Matching with Pseudo-Label Supervision
Yun Wang, Jiahao Zheng, Chenghao Zhang et al.
Mamba YOLO: A Simple Baseline for Object Detection with State Space Model
Zeyu Wang, Chen Li, Huiying Xu et al.
Style Nursing with Spatial and Semantic Guidance for Zero-Shot Traffic Scene Style Transfer
Zhen Wang, Zihang Lin, Meng Yuan et al.
Thermal-Aware Low-Light Image Enhancement: A Real-World Benchmark and a New Light-Weight Model
Zhen Wang, Yaozu Wu, Dongyuan Li et al.
Attention-Imperceptible Backdoor Attacks on Vision Transformers
Zhishen Wang, Rui Wang, Lihua Jing
LLM-RG4: Flexible and Factual Radiology Report Generation Across Diverse Input Contexts
Zhuhao Wang, Yihua Sun, Zihan Li et al.
MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds
Zihao Wang, Yiming Huang, Gengyu Lyu et al.
GlyphSR: A Simple Glyph-Aware Framework for Scene Text Image Super-Resolution
Baole Wei, Yuxuan Zhou, Liangcai Gao et al.
Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning
Yang Wei, Jingyu Tan, Guowen Xu et al.
Achieving Lightweight Super-Resolution for Real-Time Computer Graphics
Yu Wen, Chen Zhang, Chenhao Xie et al.
Multi-axis Prompt and Multi-dimension Fusion Network for All-in-one Weather-degraded Image Restoration
Yuanbo Wen, Tao Gao, Jing Zhang et al.
USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation
Wanjiang Weng, Hongsong Wang, Junbo Wang et al.
Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection
Dantong Wu, Zhiqiang Chen, Tianjiao Du et al.
Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation
Dongyue Wu, Zilin Guo, Li Yu et al.
SVRMamba: Slice-to-Volume Reconstruction from Multiple MRI Stacks with Slice Sequence Guided Mamba
Jiangjie Wu, Hongjiang Wei, Yuyao Zhang
VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval
Peng Wu, Wanshun Su, Xiangteng He et al.
Realistic Noise Synthesis with Diffusion Models
Qi Wu, Mingyan Han, Ting Jiang et al.
PanAdapter: Two-Stage Fine-Tuning with Spatial-Spectral Priors Injecting for Pansharpening
RuoCheng Wu, Zien Zhang, Shangqi Deng et al.
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
Tao Wu, Yong Zhang, Xintao Wang et al.
Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation
Yirui Wu, Yuhang Xia, Hao Li et al.
Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark
Yongliang Wu, Wenbo Zhu, Jiawang Cao et al.
MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency
Yue Wu, Zhipeng Wang, Yongzhe Yuan et al.
Unified Knowledge Maintenance Pruning and Progressive Recovery with Weight Recalling for Large Vision-Language Models
Zimeng Wu, Jiaxin Chen, Yunhong Wang
RETRACTED: GEONet: Global Enhancement and Optimization Network for Lane Detection
Suyang Xi, Yunhao Liu, Hong Ding et al.
PlaNet: Learning to Mitigate Atmospheric Turbulence in Planetary Images
Yifei Xia, Chu Zhou, Chengxuan Zhu et al.
CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing
Xiaole Xian, Xilin He, Zenghao Niu et al.
SMR-Net: Semantic-Guided Mutually Reinforcing Network for Cross-Modal Image Fusion and Salient Object Detection
Guobao Xiao, Xinyu Liu, Zebin Lin et al.
Boosting Vision State Space Model with Fractal Scanning
Haoke Xiao, Lv Tang, Peng-tao Jiang et al.
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval
Jian Xiao, Zhenzhen Hu, Jia Li et al.
Cross-modulated Attention Transformer for RGBT Tracking
Yun Xiao, Jiacong Zhao, Andong Lu et al.
Omni-Query Active Learning for Source-Free Domain Adaptive Cross-Modality 3D Semantic Segmentation
Jianxiang Xie, Yao Wu, Yachao Zhang et al.
TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning
Jingjing Xie, Yuxin Zhang, Jun Peng et al.
Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration
Lianxin Xie, Bingbing Zheng, Wen Xue et al.
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules
Peijin Xie, Lin Sun, Bingquan Liu et al.
PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis
Yifan Xie, Tao Feng, Xin Zhang et al.
HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models
Zhifeng Xie, Hao Li, Huiming Ding et al.
Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation
Jingqiao Xiu, Mengze Li, Zongxin Yang et al.
DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles
Chejian Xu, Aleksandr Petiushko, Ding Zhao et al.
FR²Seg: Continual Segmentation Across Multiple Sites via Fourier Style Replay and Adaptive Consistency Regularization
Cheng Xu, Weiwen Zhang, Hongrui Zhang et al.
Less Is More: Token Context-Aware Learning for Object Tracking
Chenlong Xu, Bineng Zhong, Qihua Liang et al.
3DHumanEdit: Multi-modal Body Part-aware Conditioning Information Integration for 3D Human Manipulation
FeiFan Xu, Tianyi Chen, Fan Yang et al.
Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model
Jiahua Xu, Dawei Zhou, Lei Hu et al.
OmniSR: Shadow Removal Under Direct and Indirect Lighting
Jiamin Xu, Zelong Li, Yuxin Zheng et al.
Multiple Feature Refining Network for Visual Emotion Distribution Learning
Qinfu Xu, Shaozu Yuan, Yiwei Wei et al.
SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection
Ruoyu Xu, Zhiyu Xiang, Chenwei Zhang et al.
LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data
Shaocong Xu, Pengfei Li, Qianpu Sun et al.
Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models
Yifang Xu, Yunzhuo Sun, Benxiang Zhai et al.
HOIMamba: Efficient Mamba-based Disentangled Progressive Learning for HOI Detection
Yongchao Xu, Jiawei Liu, Sen Tao et al.
OOTDiffusion: Outfitting Fusion Based Latent Diffusion for Controllable Virtual Try-On
Yuhao Xu, Tao Gu, Weifeng Chen et al.
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments
Yunzhe Xu, Yiyuan Pan, Zhe Liu et al.
FATE: Feature-Adapted Parameter Tuning for Vision-Language Models
Zhengqin Xu, Zelin Peng, Xiaokang Yang et al.
Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP
Zhongxing Xu, Feilong Tang, Zhe Chen et al.
RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting
Wen Xue, Chun Ding, Ruotao Xu et al.
Physical Marker: Revealing Invisible Hyperlinks Hidden in Printed Trademarks
Yuliang Xue, Lei Tan, Guobiao Li et al.
Towards Universal Rainy Image Restoration: Benchmark and Baseline
Hujie Yan
SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation
Ke Yan, Qing Cai, Fan Zhang et al.
Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models
YangTian Yan, Jinyu Tian
Robust Image Hashing Based on Contrastive Masked Autoencoder with Weak-Strong Augmentation Alignment
Cundian Yang, Guibo Luo, Yuesheng Zhu et al.
PlanLLM: Video Procedure Planning with Refinable Large Language Models
Dejie Yang, Zijing Zhao, Yang Liu
3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection
Enquan Yang, Peng Xing, Hanyang Sun et al.
Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution
Jiarui Yang, Tao Dai, Yufei Zhu et al.
SMamba: Sparse Mamba for Event-based Object Detection
Nan Yang, Yang Wang, Zhanwen Liu et al.
One-Shot Reference-based Structure-Aware Image to Sketch Synthesis
Rui Yang, Honghong Yang, Li Zhao et al.
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
Senqiao Yang, Jiaming Liu, Renrui Zhang et al.
Asymmetric Hierarchical Difference-aware Interaction Network for Event-guided Motion Deblurring
Wen Yang, Jinjian Wu, Leida Li et al.
Dual Information Purification for Lightweight SAR Object Detection
Xi Yang, Jiachen Sun, Songsong Duan et al.
DriveGazen: Event-Based Driving Status Recognition Using Conventional Camera
Xiaoyin Yang, Xin Yang
Semantic Segmentation on Raindrop Degraded Images Using Two-Stage Dual Teacher-Student Learning
Xin Yang, Wending Yan, Yuan Yuan et al.
ERF: A Benchmark Dataset for Robust Semantic Segmentation Under Extreme Rainfall Conditions
Xin Yang, Xin Zhang, Xinchao Wang
FreqTS: Frequency-Aware Token Selection for Accelerating Diffusion Models
Xinye Yang, Yuxin Yang, Haoran Pang et al.
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving
Yu Yang, Jianbiao Mei, Yukai Ma et al.
UAWTrack: Universal 3D Single Object Tracking in Adverse Weather
Yuxiang Yang, Hongjie Gu, Yingqi Deng et al.
RealPortrait: Realistic Portrait Animation with Diffusion Transformers
Zejun Yang, Huawei Wei, Zhisheng Wang
Single Image Rolling Shutter Removal with Diffusion Models
Zhanglei Yang, Haipeng Li, Mingbo Hong et al.
MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation
Zhifei Yang, Keyang Lu, Chao Zhang et al.
MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation
Zhiwei Yang, Yucong Meng, Kexue Fu et al.
MM-Tracker: Motion Mamba for UAV-platform Multiple Object Tracking
Mufeng Yao, Jinlong Peng, Qingdong He et al.
As Pseudo-Label Free as Possible: Leveraging Adaptive Feature Generation for Sparsely Annotated Object Detection
Shuilian Yao, Yu Liu, Qi Jia et al.
Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation
Chengyang Ye, Yunzhi Zhuge, Pingping Zhang
VersaFusion: A Versatile Diffusion-Based Framework for Fine-Grained Image Editing and Enhancement
Haocun Ye, Xinlong Jiang, Chenlong Gao et al.
PromptHaze: Prompting Real-world Dehazing via Depth Anything Model
Tian Ye, Sixiang Chen, Haoyu Chen et al.
Optimized Gradient Clipping for Noisy Label Learning
Xichen Ye, Yifan Wu, Weizhong Zhang et al.
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language
Jeong Hun Yeo, Chae Won Kim, Hyunjun Kim et al.
FlexDataset: Crafting Annotated Dataset Generation for Diverse Applications
Ellen Yi-Ge, Leo Shawn
ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition
Seungdong Yoa, Seungjun Lee, Hye-Seung Cho et al.
FOCUS: Towards Universal Foreground Segmentation
Zuyao You, Lingyu Kong, Lingchen Meng et al.
SGFormer: Semantic-Geometry Fusion Transformer for Multi-modal 3D Panoptic Segmentation
Hongqi Yu, Sixian Chan, Xiaolong Zhou et al.
Separating the Wheat from the Chaff: Spatio-Temporal Transformer with View-interweaved Attention for Photon-Efficient Depth Sensing
Letian Yu, Jiaxi Yang, Bo Dong et al.
ReMoGPT: Part-Level Retrieval-Augmented Motion-Language Models
Qing Yu, Mikihiro Tanaka, Kent Fujiwara
STGC-NeRF: Spatial-Temporal Geometric Consistency for LiDAR Neural Radiance Fields in Dynamic Scenes
Shangshu Yu, Xiaotian Sun, Wen Li et al.
Fine-grained Adaptive Visual Prompt for Generative Medical Visual Question Answering
Ting Yu, Zixuan Tong, Jun Yu et al.
OTPNet: ODE-inspired Tuning-free Proximal Network for Remote Sensing Image Fusion
Wei Yu, Zonglin Li, Qinglin Liu et al.
Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective
Xinmiao Yu, Xiaocheng Feng, Yun Li et al.
Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP
Yating Yu, Congqi Cao, Yueran Zhang et al.
OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition
Yiheng Yu, Sheng Liu, Yuan Feng et al.
Where Precision Meets Efficiency: Transformation Diffusion Model for Point Cloud Registration
Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.
Efficient Neural Network Encoding for 3D Color Lookup Tables
Vahid Zehtab, David B. Lindell, Marcus A. Brubaker et al.
Gaze Label Alignment: Alleviating Domain Shift for Gaze Estimation
Guanzhong Zeng, Jingjing Wang, Zefu Xu et al.
TGFormer: Transformer with Track Query Group for Multi-Object Tracking
Rui Zeng, Yuanzhou Huang, Songwei Pei
World Knowledge-Enhanced Reasoning Using Instruction-Guided Interactor in Autonomous Driving
Mingliang Zhai, Cheng Li, Zengyuan Guo et al.
DetRF: Detachable Novel Views Synthesis of Dynamic Scenes Using Backdrop-Driven Neural Radiance Fields
Boyu Zhang, Zheng Zhu, Wenbo Xu
Training-Free and Hardware-Friendly Acceleration for Diffusion Models via Similarity-based Token Pruning
Evelyn Zhang, Jiayi Tang, Xuefei Ning et al.
When Open-Vocabulary Visual Question Answering Meets Causal Adapter: Benchmark and Approach
Feifei Zhang, Zhaoyi Zhang, Xi Zhang et al.
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
Jiaxin Zhang, Wentao Yang, Songxuan Lai et al.
Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm
Jinrong Zhang, Penghui Wang, Chunxiao Liu et al.
R²-Art: Category-Level Articulation Pose Estimation from Single RGB Image via Cascade Render Strategy
Li Zhang, Haonan Jiang, Yukang Huo et al.
Common Sense Bias Modeling for Classification Tasks
Miao Zhang, Zee Fryer, Ben Colman et al.
IRMamba: Pixel Difference Mamba with Layer Restoration for Infrared Small Target Detection
Mingjin Zhang, Xiaolong Li, Fei Gao et al.
MOCID: Motion Context and Displacement Information Learning for Moving Infrared Small Target Detection
Mingjin Zhang, Yuanjun Ouyang, Fei Gao et al.
Decoupling Scattering: Pseudo-Label Guided NeRF for Scenes with Scattering Media
Mingyang Zhang, Junkang Zhang, Faming Fang et al.