Most Cited CVPR "active regression" Papers
5,589 papers found • Page 10 of 28
Conference
Panorama Generation From NFoV Image Done Right
Dian Zheng, Cheng Zhang, Xiao-Ming Wu et al.
Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels
Pierre Vuillecard, Jean-marc Odobez
Monocular Identity-Conditioned Facial Reflectance Reconstruction
Xingyu Ren, Jiankang Deng, Yuhao Cheng et al.
A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening
Jie Huang, Haorui Chen, Jiaxuan Ren et al.
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan, Xianghong Li, Tao Xiang et al.
Benchmarking Segmentation Models with Mask-Preserved Attribute Editing
Zijin Yin, Kongming Liang, Bing Li et al.
AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning
Xuecheng Wu, Heli Sun, Yifan Wang et al.
Detail-Preserving Latent Diffusion for Stable Shadow Removal
Jiamin Xu, Yuxin Zheng, Zelong Li et al.
PixelRNN: In-pixel Recurrent Neural Networks for End-to-end–optimized Perception with Neural Sensors
Haley So, Laurie Bose, Piotr Dudek et al.
Towards Generalizable Scene Change Detection
Jae-Woo KIM, Ue-Hwan Kim
DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models
Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.
HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization
Zitang Zhou, Ke Mei, Yu Lu et al.
Progress-Aware Video Frame Captioning
Zihui Xue, Joungbin An, Xitong Yang et al.
A Generative Approach for Wikipedia-Scale Visual Entity Recognition
Mathilde Caron, Ahmet Iscen, Alireza Fathi et al.
DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting
Seungjun Lee, Gim Hee Lee
CLIPtone: Unsupervised Learning for Text-based Image Tone Adjustment
Hyeongmin Lee, Kyoungkook Kang, Jungseul Ok et al.
InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
Jinlu Zhang, Yixin Chen, Zan Wang et al.
Gradient-Guided Annealing for Domain Generalization
Aristotelis Ballas, Christos Diou
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement
Ian Huang, Yanan Bao, Karen Truong et al.
G3DR: Generative 3D Reconstruction in ImageNet
Pradyumna Reddy, Ismail Elezi, Jiankang Deng
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.
Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Yuying Ge, Yizhuo Li, Yixiao Ge et al.
SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
Peishan Cong, Ziyi Wang, Yuexin Ma et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
Instance-based Max-margin for Practical Few-shot Recognition
Minghao Fu, Ke Zhu
When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach
TAO MA, Bing Bai, Haozhe Lin et al.
TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
Yabiao Wang, Shuo Wang, Jiangning Zhang et al.
InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation
Sirui Xu, Dongting Li, Yucheng Zhang et al.
STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding
Aaryan Garg, Akash Kumar, Yogesh S. Rawat
Towards Backward-Compatible Continual Learning of Image Compression
Zhihao Duan, Ming Lu, Justin Yang et al.
L-MAGIC: Language Model Assisted Generation of Images with Coherence
zhipeng cai, Matthias Mueller, Reiner Birkl et al.
Physics-Aware Hand-Object Interaction Denoising
Haowen Luo, Yunze Liu, Li Yi
PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields
Sean Wu, Shamik Basu, Tim Broedermann et al.
ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
Jiayi Gao, Zijin Yin, Changcheng Hua et al.
Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder
Junjie Zhou, Jiao Tang, Yingli Zuo et al.
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng, Haoyu Zhang, Meng Liu et al.
PFStorer: Personalized Face Restoration and Super-Resolution
Tuomas Varanka, Tapani Toivonen, Soumya Tripathy et al.
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai, Felix Juefei-Xu, Miao Liu et al.
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.
CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning
Jiangpeng He, Zhihao Duan, Fengqing Zhu
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
Shuling Zhao, Fa-Ting Hong, Xiaoshui Huang et al.
ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping
Youxin Pang, Ruizhi Shao, Jiajun Zhang et al.
Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.
Contextual AD Narration with Interleaved Multimodal Sequence
Hanlin Wang, Zhan Tong, Kecheng Zheng et al.
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
Yanbiao Ma, Wei Dai, Wenke Huang et al.
M3amba: Memory Mamba is All You Need for Whole Slide Image Classification
Tingting Zheng, Kui Jiang, Yi Xiao et al.
LEAD: Exploring Logit Space Evolution for Model Selection
Zixuan Hu, Xiaotong Li, SHIXIANG TANG et al.
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
Yuze He, Yanning Zhou, Wang Zhao et al.
KITRO: Refining Human Mesh by 2D Clues and Kinematic-tree Rotation
Fengyuan Yang, Kerui Gu, Angela Yao
Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification
Yanghao Wang, Long Chen
GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks
Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.
Unsupervised Deep Unrolling Networks for Phase Unwrapping
Zhile Chen, Yuhui Quan, Hui Ji
Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models
Chen Chen, Daochang Liu, Mubarak Shah et al.
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.
Querying as Prompt: Parameter-Efficient Learning for Multimodal Language Model
Tian Liang, Jing Huang, Ming Kong et al.
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang, Jinzhao Li, Xin Fei et al.
Scene Map-based Prompt Tuning for Navigation Instruction Generation
Sheng Fan, Rui Liu, Wenguan Wang et al.
Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map
Xinyuan Chang, Maixuan Xue, Xinran Liu et al.
Multiscale Vision Transformers Meet Bipartite Matching for Efficient Single-stage Action Localization
Ioanna Ntinou, Enrique Sanchez, Georgios Tzimiropoulos
LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
Zhengqin Li, Dilin Wang, Ka chen et al.
EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark
Ming Li, Jike Zhong, Tianle Chen et al.
Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion
Vitor Guizilini, Muhammad Zubair Irshad, Dian Chen et al.
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion
Zexin He, Tengfei Wang, Xin Huang et al.
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living
Dominick Reilly, Rajatsubhra Chakraborty, Arkaprava Sinha et al.
Fun with Flags: Robust Principal Directions via Flag Manifolds
Tolga Birdal, Nathan Mankovich
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
Lirui Zhao, Yue Yang, Kaipeng Zhang et al.
LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking
Jialin Li, Qiang Nie, Weifu Fu et al.
CrossOver: 3D Scene Cross-Modal Alignment
Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.
Noise Modeling in One Hour: Minimizing Preparation Efforts for Self-supervised Low-Light RAW Image Denoising
Feiran Li, Haiyang Jiang, Daisuke Iso
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Xinghui Li, Qichao Sun, Pengze Zhang et al.
Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts
Dominik Scheuble, Chenyang Lei, Mario Bijelic et al.
Neural Lineage
Runpeng Yu, Xinchao Wang
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics
Lee Chae-Yeon, Oh Hyun-Bin, Han EunGi et al.
ChatHuman: Chatting about 3D Humans with Tools
Jing Lin, Yao Feng, Weiyang Liu et al.
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting
Chengyou Jia, Changliang Xia, Zhuohang Dang et al.
BodyMAP - Jointly Predicting Body Mesh and 3D Applied Pressure Map for People in Bed
Abhishek Tandon, Anujraaj Goyal, Henry M. Clever et al.
3D Gaussian Head Avatars with Expressive Dynamic Appearances by Compact Tensorial Representations
yating wang, Xuan Wang, Ran Yi et al.
Extreme Point Supervised Instance Segmentation
Hyeonjun Lee, Sehyun Hwang, Suha Kwak
GauSTAR: Gaussian Surface Tracking and Reconstruction
Chengwei Zheng, Lixin Xue, Juan Jose Zarate et al.
SEAL: Semantic Attention Learning for Long Video Representation
Lan Wang, Yujia Chen, Wen-Sheng Chu et al.
POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality
Joey Wilson, Marcelino M. de Almeida, Sachit Mahajan et al.
EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events
Shuoyan Wei, Feng Li, Shengeng Tang et al.
Exploring Historical Information for RGBE Visual Tracking with Mamba
Chuanyu Sun, Jiqing Zhang, Yang Wang et al.
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang, BIN CHEN, Yulin Li et al.
DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers
Li Ren, Chen Chen, Liqiang Wang et al.
VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models
Chi-Pin Huang, Yen-Siang Wu, Hung-Kai Chung et al.
CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images
Guanlin Shen, Jingwei Huang, Zhihua Hu et al.
ROICtrl: Boosting Instance Control for Visual Generation
Yuchao Gu, Yipin Zhou, Yunfan Ye et al.
ModeSeq: Taming Sparse Multimodal Motion Prediction with Sequential Mode Modeling
Zikang Zhou, Hengjian Zhou, Haibo Hu et al.
MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation
Jinnan Chen, Lingting Zhu, Zeyu HU et al.
FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification
Zhengrui Guo, Conghao Xiong, Jiabo MA et al.
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes
Yuji Wang, Haoran Xu, Yong Liu et al.
Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs
Youyi Zhan, Tianjia Shao, Yin Yang et al.
UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation
Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.
Linear Attention Modeling for Learned Image Compression
Donghui Feng, Zhengxue Cheng, Shen Wang et al.
Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment
Alvi Md Ishmam, Chris Thomas
MATCHA: Towards Matching Anything
Fei Xue, Sven Elflein, Laura Leal-Taixe et al.
Fully Geometric Panoramic Localization
Junho Kim, Jiwon Jeong, Young Min Kim
Point Clouds Meets Physics: Dynamic Acoustic Field Fitting Network for Point Cloud Understanding
Changshuo Wang, Shuting He, Xiang Fang et al.
Cross-view and Cross-pose Completion for 3D Human Understanding
Matthieu Armando, Salma Galaaoui, Fabien Baradel et al.
Language-conditioned Detection Transformer
Jang Hyun Cho, Philipp Krähenbühl
Weak-to-Strong 3D Object Detection with X-Ray Distillation
Alexander Gambashidze, Aleksandr Dadukin, Maksim Golyadkin et al.
LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
Jian Jin, Zhenbo Yu, Yang Shen et al.
A Unified Model for Compressed Sensing MRI Across Undersampling Patterns
Armeet Singh Jatyani, Jiayun Wang, Aditi Chandrashekar et al.
Reconstructing Humans with a Biomechanically Accurate Skeleton
Yan Xia, Xiaowei Zhou, Etienne Vouga et al.
Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographical Robustness in Object Recognition
Kyle Buettner, Sina Malakouti, Xiang Li et al.
Semantic-Aware Multi-Label Adversarial Attacks
Hassan Mahmood, Ehsan Elhamifar
Learning to Produce Semi-dense Correspondences for Visual Localization
Khang Truong Giang, Soohwan Song, Sungho Jo
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Yongqi Huang, Peng Ye, Chenyu Huang et al.
Unveiling the Unknown: Unleashing the Power of Unknown to Known in Open-Set Source-Free Domain Adaptation
Fuli Wan, Han Zhao, Xu Yang et al.
Reference-Based 3D-Aware Image Editing with Triplanes
Bahri Batuhan Bilecen, Yiğit Yalın, Ning Yu et al.
Hearing Anywhere in Any Environment
Xiulong Liu, Anurag Kumar, Paul Calamia et al.
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji, Silvan Weder, Francis Engelmann et al.
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
Tomas Soucek, Prajwal Gatti, Michael Wray et al.
OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition
Yuchen Pan, Junjun Jiang, Kui Jiang et al.
Language-Guided Audio-Visual Learning for Long-Term Sports Assessment
Huangbiao Xu, Xiao Ke, Huanqi Wu et al.
In-Context Matting
He Guo, Zixuan Ye, Zhiguo Cao et al.
Denoising Functional Maps: Diffusion Models for Shape Correspondence
Aleksei Zhuravlev, Zorah Lähner, Vladislav Golyanik
Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
Shiyang Zhou, Haijin Zeng, Yunfan Lu et al.
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
Lingen Li, Zhaoyang Zhang, Yaowei Li et al.
Functionality Understanding and Segmentation in 3D Scenes
Jaime Corsetti, Francesco Giuliari, Alice Fasoli et al.
Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory
Jonas Kälble, Sascha Wirges, Maxim Tatarchenko et al.
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal, Yushi Hu, Oscar Michel et al.
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li, Kai Cheng, Anindita Ghosh et al.
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios
Ziming Huang, Xurui Li, Haotian Liu et al.
Fixed Point Diffusion Models
Luke Melas-Kyriazi, Xingjian Bai
Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
Shuyang Hao, Bryan Hooi, Jun Liu et al.
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
Andong Deng, Zhongpai Gao, Anwesa Choudhuri et al.
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
Chat-based Person Retrieval via Dialogue-Refined Cross-Modal Alignment
Yang Bai, Yucheng Ji, Min Cao et al.
Lift3D Policy: Lifting 2D Foundation Models for Robust 3D Robotic Manipulation
Yueru Jia, Jiaming Liu, Sixiang Chen et al.
Defense without Forgetting: Continual Adversarial Defense with Anisotropic & Isotropic Pseudo Replay
Yuhang Zhou, Zhongyun Hua
Order-Robust Class Incremental Learning: Graph-Driven Dynamic Similarity Grouping
Guannan Lai, Yujie Li, Xiangkun Wang et al.
MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval
Reno Kriz, Kate Sanders, David Etter et al.
3D-Aware Face Editing via Warping-Guided Latent Direction Learning
Yuhao Cheng, Zhuo Chen, Xingyu Ren et al.
AutoURDF: Unsupervised Robot Modeling from Point Cloud Frames Using Cluster Registration
Jiong Lin, Lechen Zhang, Kwansoo Lee et al.
FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors
Changlong Shi, He Zhao, Bingjie Zhang et al.
CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation
Reza Abbasi, Ali Nazari, Aminreza Sefid et al.
CMA: A Chromaticity Map Adapter for Robust Detection of Screen-Recapture Document Images
Changsheng Chen, Liangwei Lin, Yongqi Chen et al.
MIRE: Matched Implicit Neural Representations
Dhananjaya Jayasundara, Heng Zhao, Demetrio Labate et al.
From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
Zhiwei Huang, Hailin Yu, Yichun Shentu et al.
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam, Soowon Son, Zhan Xu et al.
MOS: Modeling Object-Scene Associations in Generalized Category Discovery
Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.
3D-MVP: 3D Multiview Pretraining for Manipulation
Shengyi Qian, Kaichun Mo, Valts Blukis et al.
POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation
Jian Wang, Tianhong Dai, Bingfeng Zhang et al.
Uncertain Multimodal Intention and Emotion Understanding in the Wild
Qu Yang, QingHongYa Shi, Tongxin Wang et al.
Keyframe-Guided Creative Video Inpainting
Yuwei Guo, Ceyuan Yang, Anyi Rao et al.
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
Darshana Saravanan, Varun Gupta, Darshan Singh S et al.
ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models
Meng-Li Shih, Wei-Chiu Ma, Lorenzo Boyice et al.
Combining Frame and GOP Embeddings for Neural Video Representation
Jens Eirik Saethre, Roberto Azevedo, Christopher Schroers
Improving Gaussian Splatting with Localized Points Management
Haosen Yang, Chenhao Zhang, Wenqing Wang et al.
Generating Multimodal Driving Scenes via Next-Scene Prediction
Yanhao Wu, Haoyang Zhang, Tianwei Lin et al.
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos et al.
FreqDebias: Towards Generalizable Deepfake Detection via Consistency-Driven Frequency Debiasing
Hossein Kashiani, Niloufar Alipour Talemi, Fatemeh Afghah
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li, Luyuan Zhang, Zedong Wang et al.
Improving Out-of-Distribution Generalization in Graphs via Hierarchical Semantic Environments
Yinhua Piao, Sangseon Lee, Yijingxiu Lu et al.
RCL: Reliable Continual Learning for Unified Failure Detection
Fei Zhu, Zhen Cheng, Xu-Yao Zhang et al.
TCFG: Tangential Damping Classifier-free Guidance
Mingi Kwon, Shin seong Kim, Jaeseok Jeong et al.
Motion Modes: What Could Happen Next?
Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.
Logits DeConfusion with CLIP for Few-Shot Learning
Shuo Li, Fang Liu, Zehua Hao et al.
PICD: Versatile Perceptual Image Compression with Diffusion Rendering
Tongda Xu, Jiahao Li, Bin Li et al.
Real-IAD D³: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection
wenbing zhu, Lidong Wang, Ziqing Zhou et al.
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
Zhuguanyu Wu, Shihe Wang, Jiayi Zhang et al.
Golden Cudgel Network for Real-Time Semantic Segmentation
Guoyu Yang, Yuan Wang, Daming Shi et al.
Flexible Depth Completion for Sparse and Varying Point Densities
Jinhyung Park, Yu-Jhe Li, Kris Kitani
DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition
Caoshuo Li, Tanzhe Li, Xiaobin Hu et al.
Prompt Augmentation for Self-supervised Text-guided Image Manipulation
Rumeysa Bodur, Binod Bhattarai, Tae-Kyun Kim
EVOS: Efficient Implicit Neural Training via EVOlutionary Selector
Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.
Rethinking Spiking Self-Attention Mechanism: Implementing α-XNOR Similarity Calculation in Spiking Transformers
Yichen Xiao, Shuai Wang, Dehao Zhang et al.
HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting
Jingyu Lin, Jiaqi Gu, Lubin Fan et al.
Efficient Multitask Dense Predictor via Binarization
Yuzhang Shang, Dan Xu, Gaowen Liu et al.
Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models
Itay Benou, Tammy Riklin Raviv
Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields
Tianqi Liu, Xinyi Ye, Min Shi et al.
Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection
Marc-Antoine Lavoie, Anas Mahmoud, Steven L. Waslander
MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception
Wenzhuo Liu, Wenshuo Wang, Yicheng Qiao et al.
Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis
Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.
RePerformer: Immersive Human-centric Volumetric Videos from Playback to Photoreal Reperformance
Yuheng Jiang, Zhehao Shen, Chengcheng Guo et al.
IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement
Zhihao Shi, Dong Huo, Yuhongze Zhou et al.
Scene-Centric Unsupervised Panoptic Segmentation
Oliver Hahn, Christoph Reich, Nikita Araslanov et al.
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings
Aayush Dhakal, Srikumar Sastry, Subash Khanal et al.
Dual-Enhanced Coreset Selection with Class-wise Collaboration for Online Blurry Class Incremental Learning
Yutian Luo, Shiqi Zhao, Haoran Wu et al.
Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach
Beichen Zhang, Xiaoxing Wang, Xiaohan Qin et al.
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
Hongda Liu, Longguang Wang, Ye Zhang et al.
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
S P Sharan, Minkyu Choi, Sahil Shah et al.
Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence
Haolin Liu, Xiaohang Zhan, Zizheng Yan et al.
GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping
Jinfeng Liu, Lingtong Kong, Bo Li et al.
KAC: Kolmogorov-Arnold Classifier for Continual Learning
Yusong Hu, Zichen Liang, Fei Yang et al.
Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization
Lahav Lipson, Jia Deng
SparseAlign: a Fully Sparse Framework for Cooperative Object Detection
Yunshuang Yuan, Yan Xia, Daniel Cremers et al.
GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation
Weihang Li, Hongli XU, Junwen Huang et al.
Towards In-the-wild 3D Plane Reconstruction from a Single Image
Jiachen Liu, Rui Yu, Sili Chen et al.
CNC-Net: Self-Supervised Learning for CNC Machining Operations
Mohsen Yavartanoo, Sangmin Hong, Reyhaneh Neshatavar et al.
EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering
Toshiya Yura, Ashkan Mirzaei, Igor Gilitschenski
Towards Realistic Example-based Modeling via 3D Gaussian Stitching
Xinyu Gao, Ziyi Yang, Bingchen Gong et al.
VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis
Zhifeng Wang, Renjiao Yi, Xin Wen et al.
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
ruojun xu, Weijie Xi, Xiaodi Wang et al.
Generative Sparse-View Gaussian Splatting
Hanyang Kong, Xingyi Yang, Xinchao Wang
Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models
Quan Zhang, Jinwei Fang, Rui Yuan et al.