Most Cited ICCV "differentiable structural information" Papers
2,701 papers found • Page 12 of 14
Federated Domain Generalization with Domain-specific Soft Prompts Generation
Jianhan Wu, Xiaoyang Qu, Zhangcheng Huang et al.
ForgeLens: Data-Efficient Forgery Focus for Generalizable Forgery Image Detection
Yingjian Chen, Lei Zhang, Yakun Niu
Incremental Few-Shot Semantic Segmentation via Multi-Level Switchable Visual Prompts
Maoxian Wan, Kaige Li, Qichuan Geng et al.
Embodied Representation Alignment with Mirror Neurons
Wentao Zhu, Zhining Zhang, Yuwei Ren et al.
Selective Contrastive Learning for Weakly Supervised Affordance Grounding
WonJun Moon, Hyun Seok Seong, Jae-Pil Heo
EVOLVE: Event-Guided Deformable Feature Transfer and Dual-Memory Refinement for Low-Light Video Object Segmentation
Jong Hyeon Baek, Jiwon Oh, Yeong Jun Koh
AG2aussian: Anchor-Graph Structured Gaussian Splatting for Instance-Level 3D Scene Understanding and Editing
Zhaonan Wang, Manyi Li, Changhe Tu
InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior
Minghao Wen, Shengjie Wu, Kangkan Wang et al.
Benchmarking Multimodal Large Language Models Against Image Corruptions
Xinkuan Qiu, Meina Kan, Yongbin Zhou et al.
Deterministic Object Pose Confidence Region Estimation
Jinghao Wang, Zhang Li, Zi Wang et al.
Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning
Liwei Luo, Shuaitengyuan Li, Dongwei Ren et al.
ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation
Qizhen Lan, Qing Tian
MotionCtrl: A Real-time Controllable Vision-Language-Motion Model
Bin Cao, Sipeng Zheng, Ye Wang et al.
SALAD -- Semantics-Aware Logical Anomaly Detection
Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj
VLR-Driver: Large Vision-Language-Reasoning Models for Embodied Autonomous Driving
Fanjie Kong, Yitong Li, Weihuang Chen et al.
Vid-Group: Temporal Video Grounding Pretraining from Unlabeled Videos in the Wild
Peijun Bao, Chenqi Kong, Siyuan Yang et al.
Temperature in Cosine-based Softmax Loss
Takumi Kobayashi
Multi-modal Segment Anything Model for Camouflaged Scene Segmentation
Guangyu Ren, Hengyan Liu, Michalis Lazarou et al.
Can We Achieve Efficient Diffusion Without Self-Attention? Distilling Self-Attention into Convolutions
ZiYi Dong, Chengxing Zhou, Weijian Deng et al.
Ultra-Precision 6DoF Pose Estimation Using 2-D Interpolated Discrete Fourier Transform
Guowei Shi, Zian Mao, Peisen Huang
AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation
Haifeng Zhong, Fan Tang, Zhuo Chen et al.
Prompt Guidance and Human Proximal Perception for HOT Prediction with Regional Joint Loss
Yuxiao Wang, Yu Lei, Zhenao Wei et al.
Coupling the Generator with Teacher for Effective Data-Free Knowledge Distillation
Xu Chen, Yang Li, Yahong Han et al.
Towards a Universal Image Degradation Model via Content-Degradation Disentanglement
Wenbo Yang, Zhongling Wang, Zhou Wang
Know Your Attention Maps: Class-specific Token Masking for Weakly Supervised Semantic Segmentation
Joëlle Hanna, Damian Borth
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective
Yingyu Liang, Zhizhou Sha, Zhenmei Shi et al.
FDPT: Federated Discrete Prompt Tuning for Black-Box Visual-Language Models
Jiaqi Wu, Simin Chen, Jing Tang et al.
A Tiny Change, A Giant Leap: Long-Tailed Class-Incremental Learning via Geometric Prototype Alignment
Xinyi Lai, Luojun Lin, Weijie Chen et al.
Sparfels: Fast Reconstruction from Sparse Unposed Imagery
Shubhendu Jena, Amine Ouasfi, Mae Younes et al.
Underwater Visual SLAM with Depth Uncertainty and Medium Modeling
Rui Liu, Sheng Fan, Wenguan Wang et al.
LangBridge: Interpreting Image as a Combination of Language Embeddings
Jiaqi Liao, Yuwei Niu, Fanqing Meng et al.
Embodied Navigation with Auxiliary Task of Action Description Prediction
Haru Kondoh, Asako Kanezaki
Contrastive Flow Matching
George Stoica, Vivek Ramanujan, Xiang Fan et al.
HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation
Qinqian Lei, Bo Wang, Robby Tan
AllGCD: Leveraging All Unlabeled Data for Generalized Category Discovery
Xinzi Cao, Ke Chen, Feidiao Yang et al.
Towards Long-Horizon Vision-Language-Action System: Reasoning, Acting and Memory
Daixun Li, Yusi Zhang, Mingxiang Cao et al.
UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments
Dayong Su, Yafei Zhang, Huafeng Li et al.
CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks
Zhixiang Guo, Siyuan Liang, Aishan Liu et al.
Learnable Logit Adjustment for Imbalanced Semi-Supervised Learning under Class Distribution Mismatch
Lee Hyuck, Taemin Park, Heeyoung Kim
DiffPCI: Large Motion Point Cloud Frame Interpolation with Diffusion Model
Tianyu Zhang, Haobo Jiang, Jian Yang et al.
Local Dense Logit Relations for Enhanced Knowledge Distillation
Liuchi Xu, Kang Liu, Jinshuai Liu et al.
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding
Jiahe Zhao, RuiBing Hou, Zejie Tian et al.
Soft Local Completeness: Rethinking Completeness in XAI
Ziv Weiss Haddad, Oren Barkan, Yehonatan Elisha et al.
PBFG: A New Physically-Based Dataset and Removal of Lens Flares and Glares
Jie Zhu, Sungkil Lee
Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild
Haoran Wang, Zekun Li, Jian Zhang et al.
An Information-Theoretic Regularizer for Lossy Neural Image Compression
Zhang Yingwen, Meng Wang, Xihua Sheng et al.
Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation
Yooshin Cho, Hanbyel Cho, Janghyeon Lee et al.
KV-Edit: Training-Free Image Editing for Precise Background Preservation
Tianrui Zhu, Shiyi Zhang, Jiawei Shao et al.
FusionPhys: A Flexible Framework for Fusing Complementary Sensing Modalities in Remote Physiological Measurement
Chenhang Ying, Huiyu Yang, Jieyi Ge et al.
DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations
Xiaohui Li, Yihao Liu, Shuo Cao et al.
Power of Cooperative Supervision: Multiple Teachers Framework for Advanced 3D Semi-Supervised Object Detection
Jin-Hee Lee, Jae-keun Lee, Jeseok Kim et al.
Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining
Qi Fan, Kaiqi Liu, Nian Liu et al.
ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching
Yuxuan Yuan, Luyao Tang, Chaoqi Chen et al.
COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion
Zekun Qian, Ruize Han, Zhixiang Wang et al.
CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance
Peiqi Chen, Lei Yu, Yi Wan et al.
MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance
Zihan Cao, Yu Zhong, Ziqi Wang et al.
Blind Video Super-Resolution based on Implicit Kernels
Qiang Zhu, Yuxuan Jiang, Shuyuan Zhu et al.
Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts
Chiao-An Yang, Kuan-Chuan Peng, Raymond A. Yeh
Adversarial Robustness of Discriminative Self-Supervised Learning in Vision
Ömer Veysel Çağatan, Ömer Tal, M. Emre Gursoy
HPSv3: Towards Wide-Spectrum Human Preference Score
Yuhang Ma, Keqiang Sun, Xiaoshi Wu et al.
UNIS: A Unified Framework for Achieving Unbiased Neural Implicit Surfaces in Volume Rendering
Junkai Deng, Hanting Niu, Jiaze Li et al.
IntrinsicControlNet: Cross-distribution Image Generation with Real and Unreal
Jiayuan Lu, Rengan Xie, Zixuan Xie et al.
Advancing Text-to-3D Generation with Linearized Lookahead Variational Score Distillation
Yu Lei, Bingde Liu, Qingsong Xie et al.
Steering Guidance for Personalized Text-to-Image Diffusion Models
Sunghyun Park, Seokeon Choi, Hyoungwoo Park et al.
ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models
Zifu Wan, Ce Zhang, Silong Yong et al.
Domain-aware Category-level Geometry Learning Segmentation for 3D Point Clouds
Pei He, Lingling Li, Licheng Jiao et al.
Event-aided Dense and Continuous Point Tracking: Everywhere and Anytime
Zhexiong Wan, Jianqin Luo, Yuchao Dai et al.
Context-Aware Academic Emotion Dataset and Benchmark
Luming Zhao, Jingwen Xuan, Jiamin Lou et al.
FlowSeek: Optical Flow Made Easier with Depth Foundation Models and Motion Bases
Matteo Poggi, Fabio Tosi
TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging
Qinglei Cao, Ziyao Tang, Xiaoqin Tang
Efficient Visual Place Recognition Through Multimodal Semantic Knowledge Integration
Sitao Zhang, Hongda Mao, Qingshuang Chen et al.
COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets
Lingyu Chen, Yawen Zeng, Yue Wang et al.
NATRA: Noise-Agnostic Framework for Trajectory Prediction with Noisy Observations
Rongqing Li, Changsheng Li, Ruilin Lv et al.
MS3D: High-Quality 3D Generation via Multi-Scale Representation Modeling
Guan Luo, Jianfeng Zhang
UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation
Zhengyin Liang, Hui Yin, Min Liang et al.
PLAN: Proactive Low-Rank Allocation for Continual Learning
Xiequn Wang, Zhan Zhuang, Yu Zhang
Leveraging Spatial Invariance to Boost Adversarial Transferability
Zihan Zhou, Li Li, Yanli Ren et al.
TerraMind: Large-Scale Generative Multimodality for Earth Observation
Johannes Jakubik, Felix Yang, Benedikt Blumenstiel et al.
SD2Actor: Continuous State Decomposition via Diffusion Embeddings for Robotic Manipulation
Jiayi Li
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis
Xinyu Hou, Zongsheng Yue, Xiaoming Li et al.
Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification
Guibao Shen, Luozhou Wang, Jiantao Lin et al.
ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection
Hongchi Ma, Guanglei Yang, Debin Zhao et al.
TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction
Dadong Jiang, Zhi Hou, Zhihui Ke et al.
Backdoor Mitigation by Distance-Driven Detoxification
Shaokui Wei, Jiayin Liu, Hongyuan Zha
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
Fangwei Zhong, Kui Wu, Churan Wang et al.
HFD-Teacher: High-Frequency Depth Distillation from Depth Foundation Models for Enhanced Depth Completion
Zhiyuan Yang, Anqi Cheng, Haiyue Zhu et al.
Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection
Hanshi Wang, Jin Gao, Weiming Hu et al.
SMSTracker: Tri-path Score Mask Sigma Fusion for Multi-Modal Tracking
Sixian Chan, Zedong Li, Xiaoqin Zhang et al.
Two Losses, One Goal: Balancing Conflict Gradients for Semi-supervised Semantic Segmentation
Rui Sun, Huayu Mai, Wangkai Li et al.
Region-based Cluster Discrimination for Visual Representation Learning
Yin Xie, Kaicheng Yang, Xiang An et al.
CMB-ML: A Cosmic Microwave Background Dataset for the Oldest Possible Computer Vision Task
James Amato, Yunan Xie, Leonel Medina-Varela et al.
Shape of Motion: 4D Reconstruction from a Single Video
Qianqian Wang, Vickie Ye, Hang Gao et al.
EditCLIP: Representation Learning for Image Editing
Qian Wang, Aleksandar Cvejic, Abdelrahman Eldesokey et al.
MOVE: Motion-Guided Few-Shot Video Object Segmentation
Kaining Ying, Hengrui Hu, Henghui Ding
CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation
Dengke Zhang, Fagui Liu, Quan Tang
mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework
Bingyi Liu, Jian Teng, Hongfei Xue et al.
FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers
Junjie Zhang, Haisheng Su, Feixiang Song et al.
RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control
Teng Li, Guangcong Zheng, Rui Jiang et al.
VAGUE: Visual Contexts Clarify Ambiguous Expressions
Heejeong Nam, Jinwoo Ahn, Keummin Ka et al.
What's Making That Sound Right Now? Video-centric Audio-Visual Localization
Hahyeon Choi, Junhoo Lee, Nojun Kwak
RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning
Chengyu Zheng, Honghua Chen, Jin Huang et al.
OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection
Adrian Chow, Evelien Riddell, Yimu Wang et al.
SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection
Chaesong Park, Eunbin Seo, Jihyeon Hwang et al.
Exploring the Visual Feature Space for Multimodal Neural Decoding
Weihao Xia, Cengiz Oztireli
Backdoor Defense via Enhanced Splitting and Trap Isolation
Hongrui Yu, Lu Qi, Wanyu Lin et al.
ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction
Soonwoo Cha, Jiwoo Song, Juan Yeo et al.
D3: Training-Free AI-Generated Video Detection Using Second-Order Features
Chende Zheng, Ruiqi Suo, Chenhao Lin et al.
Overcoming Dual Drift for Continual Long-Tailed Visual Question Answering
Feifei Zhang, Zhihao Wang, Xi Zhang et al.
χ: Symmetry Understanding of 3D Shapes via Chirality Disentanglement
Weikang Wang, Tobias Weißberg, Nafie El Amrani et al.
VideoAuteur: Towards Long Narrative Video Generation
Junfei Xiao, Feng Cheng, Lu Qi et al.
Robust and Efficient 3D Gaussian Splatting for Urban Scene Reconstruction
Zhensheng Yuan, Haozhi Huang, Zhen Xiong et al.
Neural Architecture Search Driven by Locally Guided Diffusion for Personalized Federated Learning
Peng Liao, Xilu Wang, Yaochu Jin et al.
Bridging Local Inductive Bias and Long-Range Dependencies with Pixel-Mamba for End-to-end Whole Slide Image Analysis
Zhongwei Qiu, Hanqing Chao, Tiancheng Lin et al.
Neuroverse3D: Developing In-Context Learning Universal Model for Neuroimaging in 3D
Jiesi Hu, Hanyang Peng, Yanwu Yang et al.
Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening
Zihan Cao, Yu Zhong, Liang-Jian Deng
ZeroKey: Point-Level Reasoning and Zero-Shot 3D Keypoint Detection from Large Language Models
Bingchen Gong, Diego Gomez, Abdullah Hamdi et al.
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game
Ziyue Wang, Yurui Dong, Fuwen Luo et al.
Towards Human-like Virtual Beings: Simulating Human Behavior in 3D Scenes
Chen Liang, Wenguan Wang, Yi Yang
S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction
Guangting Zheng, Jiajun Deng, Xiaomeng Chu et al.
The Source Image is the Best Attention for Infrared and Visible Image Fusion
Song Wang, Xie Han, Liqun Kuang et al.
Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization
Hao Ju, Shaofei Huang, Si Liu et al.
CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation
Lin Sun, Jiale Cao, Jin Xie et al.
Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection
Qiao Zhang, Mingwen Shao, Xinyuan Chen et al.
Scendi Score: Prompt-Aware Diversity Evaluation via Schur Complement of CLIP Embeddings
Azim Ospanov, Mohammad Jalali, Farzan Farnia
Scaling Laws for Native Multimodal Models
Mustafa Shukor, Enrico Fini, Victor Guilherme Turrisi da Costa et al.
VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data
Jian Shi, Peter Wonka
A View-consistent Sampling Method for Regularized Training of Neural Radiance Fields
Aoxiang Fan, Corentin Dumery, Nicolas Talabot et al.
Autoregressive Denoising Score Matching is a Good Video Anomaly Detector
Hanwen Zhang, Congqi Cao, Qinyi Lv et al.
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation
Yinwei Wu, Xianpan Zhou, Bing Ma et al.
A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds
Jizong Peng, Tze Ho Elden Tse, Kai Xu et al.
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
Xingyu Miao, Haoran Duan, Quanhao Qian et al.
EYE3: Turn Anything into Naked-eye 3D
Yingde Song, Zongyuan Yang, Baolin Liu et al.
C2MIL: Synchronizing Semantic and Topological Causalities in Multiple Instance Learning for Robust and Interpretable Survival Analysis
Min Cen, Zhenfeng Zhuang, Yuzhe Zhang et al.
SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
Bo Zhao, Haoran Wang, Jinghui Wang et al.
TryOn-Refiner: Conditional Rectified-flow-based TryOn Refiner for More Accurate Detail Reconstruction
Wen Qian
Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos
Yuang Feng, Shuyong Gao, Fuzhen Yan et al.
Recognizing Actions from Robotic View for Natural Human-Robot Interaction
Ziyi Wang, Peiming Li, Hong Liu et al.
Addressing Text Embedding Leakage in Diffusion-based Image Editing
Sunung Mun, Jinhwan Nam, Sunghyun Cho et al.
Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning
Weitai Kang, Haifeng Huang, Yuzhang Shang et al.
FRET: Feature Redundancy Elimination for Test Time Adaptation
Linjing You, Jiabao Lu, Xiayuan Huang et al.
Motion-2-to-3: Leveraging 2D Motion Data for 3D Motion Generations
Ruoxi Guo, Huaijin Pi, Zehong Shen et al.
A₀: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
Rongtao Xu, Jian Zhang, Minghao Guo et al.
PVMamba: Parallelizing Vision Mamba via Dynamic State Aggregation
Fei Xie, Zhongdao Wang, Weijia Zhang et al.
Controllable and Expressive One-Shot Video Head Swapping
Chaonan Ji, Jinwei Qi, Peng Zhang et al.
Adversarial Training for Probabilistic Robustness
Yi Zhang, Yuhang Chen, Zhen Chen et al.
Learning to See Inside Opaque Liquid Containers using Speckle Vibrometry
Matan Kichler, Shai Bagon, Mark Sheinin
LightBSR: Towards Lightweight Blind Super-Resolution via Discriminative Implicit Degradation Representation Learning
Jiang Yuan, Ji Ma, Bo Wang et al.
When Pixel Difference Patterns Meet ViT: PiDiViT for Few-Shot Object Detection
Hongliang Zhou, Yongxiang Liu, Canyu Mo et al.
SPD: Shallow Backdoor Protecting Deep Backdoor Against Backdoor Detection
Shunjie Yuan, Xinghua Li, Xuelin Cao et al.
Rethinking DPO-style Diffusion Aligning Frameworks
Xun Wu, Shaohan Huang, Lingjie Jiang et al.
Ensemble Foreground Management for Unsupervised Object Discovery
Ziling Wu, Armaghan Moemeni, Praminda Caleb-Solly
Hierarchical Variational Test-Time Prompt Generation for Zero-Shot Generalization
Zhaoyang Wu, Fang Liu, Licheng Jiao et al.
OCSplats: Observation Completeness Quantification and Label Noise Separation in 3DGS
Han Ling, Yinghui Sun, Xian Xu et al.
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation
Guanxing Lu, Baoxiong Jia, Puhao Li et al.
Boosting Multimodal Learning via Disentangled Gradient Learning
Shicai Wei, Chunbo Luo, Yang Luo
TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity
Yuzhuo Chen, Zehua Ma, Han Fang et al.
HORT: Monocular Hand-held Objects Reconstruction with Transformers
Zerui Chen, Rolandos Alexandros Potamias, Shizhe Chen et al.
CaliMatch: Adaptive Calibration for Improving Safe Semi-supervised Learning
Jinsoo Bae, Seoung Bum Kim, Hyungrok Do
Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy
Yaxin Xiao, Qingqing Ye, Li Hu et al.
Tensor-aggregated LoRA in Federated Fine-tuning
Zhixuan Li, Binqian Xu, Xiangbo Shu et al.
QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation
Jiahui Yang, Yongjia Ma, Donglin Di et al.
Self-Supervised Sparse Sensor Fusion for Long Range Perception
Edoardo Palladin, Samuel Brucker, Filippo Ghilotti et al.
AccidentalGS: 3D Gaussian Splatting from Accidental Camera Motion
Mao Mao, Xujie Shen, Guyuan Chen et al.
Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification
Daqian Shi, Xiaolei Diao, Xu Chen et al.
Unified Adversarial Augmentation for Improving Palmprint Recognition
Jianlong Jin, Chenglong Zhao, Ruixin Zhang et al.
Adding Additional Control to One-Step Diffusion with Joint Distribution Matching
Yihong Luo, Tianyang Hu, Yifan Song et al.
Unified Multi-Agent Trajectory Modeling with Masked Trajectory Diffusion
Songru Yang, Zhenwei Shi, Zhengxia Zou
Bridging Class Imbalance and Partial Labeling via Spectral-Balanced Energy Propagation for Skeleton-based Action Recognition
Yandan Wang, Chenqi Guo, Yinglong Ma et al.
ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting
Sandro Papais, Letian Wang, Brian Cheong et al.
Dual Domain Control via Active Learning for Remote Sensing Domain Incremental Object Detection
Jiachen Sun, De Cheng, Xi Yang et al.
Enpowering Your Pansharpening Models with Generalizability: Unified Distribution is All You Need
Yongchuan Cui, Peng Liu, Hui Zhang
Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes
Chuyan Zhang, Kefan Wang, Yun Gu
OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography
Li Caoshuo, Zengmao Ding, Xiaobin Hu et al.
COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation
Siqi Zhang, Yanyuan Qiao, Qunbo Wang et al.
CoStoDet-DDPM: Collaborative Training of Stochastic and Deterministic Models Improves Surgical Workflow Anticipation and Recognition
Kaixiang Yang, Xin Li, Qiang Li et al.
Exploring Weather-aware Aggregation and Adaptation for Semantic Segmentation under Adverse Conditions
Yuwen Pan, Rui Sun, Wangkai Li et al.
Transparent Vision: A Theory of Hierarchical Invariant Representations
Shuren Qi, Yushu Zhang, Chao Wang et al.
TemCoCo: Temporally Consistent Multi-modal Video Fusion with Visual-Semantic Collaboration
Gong Meiqi, Hao Zhang, Xunpeng Yi et al.
RetinexMCNet: A Memory Controller Dominated Network for Low-Light Video Enhancement Based on Retinex
Meiao Wang, Xuejing Kang, Yaxi Lu et al.
Sliced Wasserstein Bridge for Open-Vocabulary Video Instance Segmentation
Zheyun Qin, Deng Yu, Chuanchen Luo et al.
Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis
Zhuokun Chen, Jugang Fan, Zhuowei Yu et al.
Lightweight Gradient-Aware Upscaling of 3D Gaussian Splatting Images
Simon Niedermayr, Christoph Neuhauser, Rüdiger Westermann
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
Kaidong Zhang, Rongtao Xu, Ren Pengzhen et al.
3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation
Tianrui Lou, Xiaojun Jia, Siyuan Liang et al.
LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection
Wei Liao, Chunyan Xu, Chenxu Wang et al.
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
Yang Jingyi, Xun Lin, Zitong Yu et al.
PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency
Haotian Wang, Aoran Xiao, Xiaoqin Zhang et al.
SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models
Kien Nguyen, Anh Tran, Cuong Pham
Recovering Parametric Scenes from Very Few Time-of-Flight Pixels
Carter Sifferman, Yiquan Li, Yiming Li et al.
MCAM: Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding
Tongtong Cheng, Rongzhen Li, Yixin Xiong et al.
Engage for All: Making Ordinary Image Descriptions Appealing Again!
Yuyan Chen, Yifan Jiang, Li Zhou et al.
HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image
Junyi Guo, Jingxuan Zhang, Fangyu Wu et al.
Geometry Distributions
Biao Zhang, Jing Ren, Peter Wonka
Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning
Fei Zhou, Peng Wang, Lei Zhang et al.
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy
Yiting Yang, Hao Luo, Yuan Sun et al.
ConsistentCity: Semantic Flow-guided Occupancy DiT for Temporally Consistent Driving Scene Synthesis
Benjin Zhu, Xiaogang Wang, Hongsheng Li
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
Bowen Zhang, Sicheng Xu, Chuxin Wang et al.
Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction
Haonan Wang, Qixiang Zhang, Lehan Wang et al.
Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps
Chong Cheng, Sicheng Yu, Zijian Wang et al.
RogSplat: Robust Gaussian Splatting via Generative Priors
Hanyang Kong, Xingyi Yang, Xinchao Wang