Most Cited CVPR "long-tailed distribution" Papers
5,589 papers found • Page 26 of 28
Conference
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
Syed Talal Wasim, Muzammal Naseer, Salman Khan et al.
Investigating the Role of Weight Decay in Enhancing Nonconvex SGD
Tao Sun, Yuhao Huang, Li Shen et al.
Descriptor-In-Pixel : Point-Feature Tracking For Pixel Processor Arrays
Laurie Bose, Piotr Dudek, Jianing Chen
Sheared Backpropagation for Fine-tuning Foundation Models
Zhiyuan Yu, Li Shen, Liang Ding et al.
CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections
Thomas Walker, Salvatore Esposito, Daniel Rebain et al.
ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency
Dong Wei, Xiaoning Sun, Xizhan Gao et al.
GIF: Generative Inspiration for Face Recognition at Scale
Mohammad Saadabadi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei et al.
CLIB-FIQA: Face Image Quality Assessment with Confidence Calibration
Fu-Zhao Ou, Chongyi Li, Shiqi Wang et al.
Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning
Mi Luo, Zihui Xue, Alex Dimakis et al.
Differentiable Micro-Mesh Construction
Yishun Dou, Zhong Zheng, Qiaoqiao Jin et al.
AdMiT: Adaptive Multi-Source Tuning in Dynamic Environments
Xiangyu Chang, Fahim Faisal Niloy, Sk Miraj Ahmed et al.
Fortifying Federated Learning Towards Trustworthiness via Auditable Data Valuation and Verifiable Client Contribution
Naveen Kumar Kummari, Ranjeet Ranjan Jha, Krishna Mohan Chalavadi et al.
DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos
Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin
Can’t Make an Omelette Without Breaking Some Eggs: Plausible Action Anticipation Using Large Video-Language Models
Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo et al.
Unsupervised 3D Structure Inference from Category-Specific Image Collections
Weikang Wang, Dongliang Cao, Florian Bernard
Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video
Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.
AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes
Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.
Are Conventional SNNs Really Efficient? A Perspective from Network Quantization
Guobin Shen, Dongcheng Zhao, Tenglong Li et al.
RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation
Zeyuan Yang, LIU JIAGENG, Peihao Chen et al.
Sharingan: A Transformer Architecture for Multi-Person Gaze Following
Samy Tafasca, Anshul Gupta, Jean-marc Odobez
FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Objection Detection
Chenxu Dang, Pei An, Xinmin Zhang et al.
Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes
Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.
Dynamic Support Information Mining for Category-Agnostic Pose Estimation
Pengfei Ren, Yuanyuan Gao, Haifeng Sun et al.
MART: Masked Affective RepresenTation Learning via Masked Temporal Distribution Distillation
Zhicheng Zhang, Pancheng Zhao, Eunil Park et al.
Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants
Chong Yu, Tao Chen, Zhongxue Gan
Seeing is Not Believing: Adversarial Natural Object Optimization for Hard-Label 3D Scene Attacks
Daizong Liu, Wei Hu
HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion
Ding Ding, Yueming Pan, Ruoyu Feng et al.
CrossMAE: Cross-Modality Masked Autoencoders for Region-Aware Audio-Visual Pre-Training
Yuxin Guo, Siyang Sun, Shuailei Ma et al.
Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
Yicheng Xiao, Zhuoyan Luo, Yong Liu et al.
VS: Reconstructing Clothed 3D Human from Single Image via Vertex Shift
Leyuan Liu, Yuhan Li, Yunqi Gao et al.
Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses
Yongfan Liu, Hyoukjun Kwon
Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues
Yuhui Liu, Liangxun Ou, Qiang Fu et al.
Point Transformer V3: Simpler Faster Stronger
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang et al.
Query Efficient Black-Box Visual Prompting with Subspace Learning
Haozhen Zhang, Zhaogeng Liu, Hualin Zhang et al.
Fingerprinting Denoising Diffusion Probabilistic Models
Huan Teng, Yuhui Quan, Chengyu Wang et al.
AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation
Jingyi Xie, Jintao Yang, Zhunchen Luo et al.
SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling
Qi Zhu, Jiangwei Lao, Deyi Ji et al.
EVS-assisted Joint Deblurring Rolling-Shutter Correction and Video Frame Interpolation through Sensor Inverse Modeling
Rui Jiang, Fangwen Tu, Yixuan Long et al.
Empowering Resampling Operation for Ultra-High-Definition Image Enhancement with Model-Aware Guidance
Yu, Jie Huang, Li et al.
READ: Retrieval-Enhanced Asymmetric Diffusion for Motion Planning
Takeru Oba, Matthew Walter, Norimichi Ukita
MeshPose: Unifying DensePose and 3D Body Mesh Reconstruction
Eric-Tuan Le, Antonios Kakolyris, Petros Koutras et al.
MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation
Xiaolong Deng, Huisi Wu, Runhao Zeng et al.
iG-6DoF: Model-free 6DoF Pose Estimation for Unseen Object via Iterative 3D Gaussian Splatting
Tuo Cao, Fei LUO, Jiongming Qin et al.
UNIALIGN: Scaling Multimodal Alignment within One Unified Model
bo zhou, Liulei Li, Yujia Wang et al.
Hyperspectral Pansharpening via Diffusion Models with Iteratively Zero-Shot Guidance
Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng et al.
Large-scale Multi-view Tensor Clustering with Implicit Linear Kernels
Jiyuan Liu, Xinwang Liu, chuankun Li et al.
TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions
Wang Yu-Hang, Junkang Guo, Aolei Liu et al.
The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation
Marcus Nordström, Atsuto Maki, Henrik Hult
TIGER: Time-Varying Denoising Model for 3D Point Cloud Generation with Diffusion Process
Zhiyuan Ren, Minchul Kim, Feng Liu et al.
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models
Yinan Liang, Ziwei Wang, Xiuwei Xu et al.
Learning Continual Compatible Representation for Re-indexing Free Lifelong Person Re-identification
Zhenyu Cui, Jiahuan Zhou, Xun Wang et al.
Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture
Kenkun Liu, Yurong Fu, Weihao Yuan et al.
Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment
Huakai Lai, Guoxin Xiong, Huayu Mai et al.
Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning
Buzhen Huang, Chen Li, Chongyang Xu et al.
M3GYM: A Large-Scale Multimodal Multi-view Multi-person Pose Dataset for Fitness Activity Understanding in Real-world Settings
Qingzheng Xu, Ru Cao, Xin Shen et al.
Star with Bilinear Mapping
Zelin Peng, Yu Huang, Zhengqin Xu et al.
Z*: Zero-shot Style Transfer via Attention Reweighting
Yingying Deng, Xiangyu He, Fan Tang et al.
HOT: Hadamard-based Optimized Training
Seonggon Kim, Juncheol Shin, Seung-taek Woo et al.
Spike-guided Motion Deblurring with Unknown Modal Spatiotemporal Alignment
Jiyuan Zhang, Shiyan Chen, Yajing Zheng et al.
ConCon-Chi: Concept-Context Chimera Benchmark for Personalized Vision-Language Tasks
Andrea Rosasco, Stefano Berti, Giulia Pasquale et al.
Instance-aware Contrastive Learning for Occluded Human Mesh Reconstruction
Mi-Gyeong Gwon, Gi-Mun Um, Won-Sik Cheong et al.
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement
Qianhan Feng, Wenshuo Li, Tong Lin et al.
Learning Textual Prompts for Open-World Semi-Supervised Learning
Yuxin Fan, Junbiao Cui, Jiye Liang
UniMODE: Unified Monocular 3D Object Detection
Zhuoling Li, Xiaogang Xu, Ser-Nam Lim et al.
Enhancing Few-Shot Class-Incremental Learning via Training-Free Bi-Level Modality Calibration
Yiyang Chen, Tianyu Ding, Lei Wang et al.
Animate and Sound an Image
Xihua Wang, Ruihua Song, Chongxuan Li et al.
Towards Explainable and Unprecedented Accuracy in Matching Challenging Finger Crease Patterns
Zhenyu Zhou, Chengdong Dong, Ajay Kumar
Investigating Compositional Challenges in Vision-Language Models for Visual Grounding
Yunan Zeng, Yan Huang, Jinjin Zhang et al.
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation
Chuandong Liu, Xingxing Weng, Shuguo Jiang et al.
Less is More: Efficient Image Vectorization with Adaptive Parameterization
Kaibo Zhao, Liang Bao, Yufei Li et al.
Accept the Modality Gap: An Exploration in the Hyperbolic Space
Sameera Ramasinghe, Violetta Shevchenko, Gil Avraham et al.
MirageRoom: 3D Scene Segmentation with 2D Pre-trained Models by Mirage Projection
Haowen Sun, Yueqi Duan, Juncheng Yan et al.
Joint Scheduling of Causal Prompts and Tasks for Multi-Task Learning
Chaoyang Li, Jianyang Qin, Jinhao Cui et al.
DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI
Sangmin Lee, Sungyong Park, Heewon Kim
VideoGEM: Training-free Action Grounding in Videos
Felix Vogel, Walid Bousselham, Anna Kukleva et al.
Event-Equalized Dense Video Captioning
Kangyi Wu, Pengna Li, Jingwen Fu et al.
GazeGene: Large-scale Synthetic Gaze Dataset with 3D Eyeball Annotations
Yiwei Bao, Zhiming Wang, Feng Lu
Feature Information Driven Position Gaussian Distribution Estimation for Tiny Object Detection
Jinghao Bian, Mingtao Feng, Weisheng Dong et al.
PRaDA: Projective Radial Distortion Averaging
Daniil Sinitsyn, Linus Härenstam-Nielsen, Daniel Cremers
Random Entangled Tokens for Adversarially Robust Vision Transformer
Huihui Gong, Minjing Dong, Siqi Ma et al.
Structure from Collision
Takuhiro Kaneko
DYSON: Dynamic Feature Space Self-Organization for Online Task-Free Class Incremental Learning
Yuhang He, YingJie Chen, Yuhan Jin et al.
Learned Trajectory Embedding for Subspace Clustering
Yaroslava Lochman, Christopher Zach, Carl Olsson
Language-Guided Salient Object Ranking
Fang Liu, Yuhao Liu, Ke Xu et al.
Weakly Supervised Video Individual Counting
Xinyan Liu, Guorong Li, Yuankai Qi et al.
Beyond Generation: A Diffusion-based Low-level Feature Extractor for Detecting AI-generated Images
Nan Zhong, Haoyu Chen, Yiran Xu et al.
S2D-LFE: Sparse-to-Dense Light Field Event Generation
Yutong Liu, Wenming Weng, Yueyi Zhang et al.
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting
jun huang, Ting Liu, Yihang Wu et al.
PolarNeXt: Rethink Instance Segmentation with Polar Representation
Jiacheng Sun, Xinghong Zhou, Yiqiang Wu et al.
Label Shift Meets Online Learning: Ensuring Consistent Adaptation with Universal Dynamic Regret
Yucong Dai, Shilin Gu, Ruidong Fan et al.
A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification
Zexian Yang, Dayan Wu, Chenming Wu et al.
ROLL: Robust Noisy Pseudo-label Learning for Multi-View Clustering with Noisy Correspondence
Yuan Sun, Yongxiang Li, Zhenwen Ren et al.
Bridging Viewpoint Gaps: Geometric Reasoning Boosts Semantic Correspondence
Qiyang Qian, Hansheng Chen, Masayoshi Tomizuka et al.
From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation
Hyeokjun Kweon, Kuk-Jin Yoon
Asynchronous Collaborative Graph Representation for Frames and Events
Dianze Li, Jianing Li, Xu Liu et al.
Theory-Inspired Deep Multi-View Multi-Label Learning with Incomplete Views and Noisy Labels
Quanjiang Li, Tingjin Luo, Jiahui Liao
Improving Semi-Supervised Semantic Segmentation with Sliced-Wasserstein Feature Alignment and Uniformity
Chen Yi Lu, Kasra Derakhshandeh, Somali Chaterji
Hierarchical Adaptive Filtering Network for Text Image Specular Highlight Removal
Zhi Jiang, Jingbo Hu, Ling Zhang et al.
R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization
Kennard Chan, Fayao Liu, Guosheng Lin et al.
HERA: Hybrid Explicit Representation for Ultra-Realistic Head Avatars
Hongrui Cai, Yuting Xiao, Xuan Wang et al.
Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention Alignment and Prompt Tuning
Leslie Ching Ow Tiong, Dick Sigmund, Chen-Hui Chan et al.
SGSST: Scaling Gaussian Splatting Style Transfer
Bruno Galerne, Jianling WANG, Lara Raad et al.
Unified Medical Lesion Segmentation via Self-referring Indicator
Shijie Chang, Xiaoqi Zhao, Lihe Zhang et al.
Class Incremental Learning with Multi-Teacher Distillation
Haitao Wen, Lili Pan, Yu Dai et al.
Parameter Efficient Self-Supervised Geospatial Domain Adaptation
Linus Scheibenreif, Michael Mommert, Damian Borth
Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning
Nirat Saini, Khoi Pham, Abhinav Shrivastava
GRAE-3DMOT: Geometry Relation-Aware Encoder for Online 3D Multi-Object Tracking
Hyunseop Kim, Hyo-Jun Lee, Yonguk Lee et al.
Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging
Bo Wang, Dingwei Tan, Yen-Ling Kuo et al.
Navigating the Unseen: Zero-shot Scene Graph Generation via Capsule-Based Equivariant Features
Wenhuan Huang, Yi JI, guiqian zhu et al.
Non-Natural Image Understanding with Advancing Frequency-based Vision Encoders
Wang Lin, Qingsong Wang, Yueying Feng et al.
Hunyuan-Portrait: Implicit Condition Control for Enhanced Portrait Animation
Zunnan Xu, Zhentao Yu, Zixiang Zhou et al.
AnyScene: Customized Image Synthesis with Composited Foreground
Ruidong Chen, Lanjun Wang, Weizhi Nie et al.
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
Mert Bülent Sarıyıldız, Philippe Weinzaepfel, Thomas Lucas et al.
Task-Aware Clustering for Prompting Vision-Language Models
Fusheng Hao, Fengxiang He, Fuxiang Wu et al.
Endow SAM with Keen Eyes: Temporal-spatial Prompt Learning for Video Camouflaged Object Detection
Wenjun Hui, Zhenfeng Zhu, Shuai Zheng et al.
NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning
Mustafa B Gurbuz, Jean Moorman, Constantine Dovrolis
Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales
Shuokai Pan, Gerti Tuzi, Sudarshan Sreeram et al.
Beyond Words: Augmenting Discriminative Richness via Diffusions in Unsupervised Prompt Learning
Hairui Ren, Fan Tang, He Zhao et al.
Activating Sparse Part Concepts for 3D Class Incremental Learning
Zhenya Tian, Jun Xiao, Liu lupeng et al.
Model Diagnosis and Correction via Linguistic and Implicit Attribute Editing
Xuanbai Chen, Xiang Xu, Zhihua Li et al.
Noisy One-point Homographies are Surprisingly Good
Yaqing Ding, Jonathan Astermark, Magnus Oskarsson et al.
PS-EIP: Robust Photometric Stereo Based on Event Interval Profile
Kazuma Kitazawa, Takahito Aoto, Satoshi Ikehata et al.
Three-view Focal Length Recovery From Homographies
Yaqing Ding, Viktor Kocur, Zuzana Berger Haladova et al.
Visual Representation Learning through Causal Intervention for Controllable Image Editing
Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.
Dynamic Content Prediction with Motion-aware Priors for Blind Face Video Restoration
Lianxin Xie, csbingbing zheng, Si Wu et al.
Self-Calibrating Vicinal Risk Minimisation for Model Calibration
Jiawei Liu, Changkun Ye, Ruikai Cui et al.
CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image
Donggeun Yoon, Donghyeon Cho
ScoreHypo: Probabilistic Human Mesh Estimation with Hypothesis Scoring
Yuan Xu, Xiaoxuan Ma, Jiajun Su et al.
BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks
Shangqian Gao, Yanfu Zhang, Feihu Huang et al.
Pattern Analogies: Learning to Perform Programmatic Image Edits by Analogy
Aditya Ganeshan, Thibault Groueix, Paul Guerrero et al.
Device-Wise Federated Network Pruning
Shangqian Gao, Junyi Li, Zeyu Zhang et al.
Harnessing Global-Local Collaborative Adversarial Perturbation for Anti-Customization
Long Xu, Jiakai Wang, Haojie Hao et al.
Plug-and-Play PPO: An Adaptive Point Prompt Optimizer Making SAM Greater
Xueyu Liu, Rui Wang, Yexin Lai et al.
Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos
Yuhan Shen, Ehsan Elhamifar
Learning to Segment Referred Objects from Narrated Egocentric Videos
Yuhan Shen, Huiyu Wang, Xitong Yang et al.
A Unified Approach to Interpreting Self-supervised Pre-training Methods for 3D Point Clouds via Interactions
Qiang Li, Jian Ruan, Fanghao Wu et al.
Open Set Label Shift with Test Time Out-of-Distribution Reference
Changkun Ye, Russell Tsuchida, Lars Petersson et al.
Inlier Confidence Calibration for Point Cloud Registration
Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.
PDFactor: Learning Tri-Perspective View Policy Diffusion Field for Multi-Task Robotic Manipulation
Jingyi Tian, Le Wang, Sanping Zhou et al.
Incomplete Multi-View Multi-label Learning via Disentangled Representation and Label Semantic Embedding
Xu Yan, Jun Yin, Jie Wen
CocoER: Aligning Multi-Level Feature by Competition and Coordination for Emotion Recognition
Xuli Shen, Hua Cai, Weilin Shen et al.
Brain-Inspired Spiking Neural Networks for Energy-Efficient Object Detection
Ziqi Li, Tao Gao, Yisheng An et al.
PointSR: Self-Regularized Point Supervision for Drone-View Object Detection
Weizhuo Li, Yue Xi, Wenjing Jia et al.
Camouflage Anything: Learning to Hide using Controlled Out-painting and Representation Engineering
Biplab Das, Viswanath Gopalakrishnan
Leveraging Temporal Cues for Semi-Supervised Multi-View 3D Object Detection
Jinhyung Park, Navyata Sanghvi, Hiroki Adachi et al.
Compositional Targeted Multi-Label Universal Perturbations
Hassan Mahmood, Ehsan Elhamifar
ODA-GAN: Orthogonal Decoupling Alignment GAN Assisted by Weakly-supervised Learning for Virtual Immunohistochemistry Staining
Tong Wang, Mingkang Wang, Zhongze Wang et al.
Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning
Ziming Hong, Li Shen, Tongliang Liu
MaxQ: Multi-Axis Query for N:M Sparsity Network
Jingyang Xiang, Siqi Li, Junhao Chen et al.
ViKIENet: Towards Efficient 3D Object Detection with Virtual Key Instance Enhanced Network
Zhuochen Yu, Bijie Qiu, Andy W. H. Khong
Efficient Scene Recovery Using Luminous Flux Prior
ZhongYu Li, Lei Zhang
Revisiting Global Translation Estimation with Feature Tracks
Peilin Tao, Hainan Cui, Mengqi Rong et al.
Beyond Single-Modal Boundary: Cross-Modal Anomaly Detection through Visual Prototype and Harmonization
Kai Mao, Ping Wei, Yiyang Lian et al.
Task-Specific Gradient Adaptation for Few-Shot One-Class Classification
Yunlong Li, Xiabi Liu, Liyuan Pan et al.
Text Augmented Correlation Transformer For Few-shot Classification & Segmentation
Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng et al.
LAMP: Learn A Motion Pattern for Few-Shot Video Generation
Rui-Qi Wu, Liangyu Chen, Tong Yang et al.
TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model
Zhichao Zhai, Guikun Chen, Wenguan Wang et al.
Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency
Yuqi Zhang, Han Luo, Yinjie Lei
All-Day Multi-Camera Multi-Target Tracking
Huijie Fan, Yu Qiao, Yihao Zhen et al.
Neural Fields as Distributions: Signal Processing Beyond Euclidean Space
Daniel Rebain, Soroosh Yazdani, Kwang Moo Yi et al.
Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding
Wenbo Chen, Zhen Xu, Ruotao Xu et al.
Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes
Xiaotian Sun, Qingshan Xu, Xinjie Yang et al.
Hazy Low-Quality Satellite Video Restoration Via Learning Optimal Joint Degradation Patterns and Continuous-Scale Super-Resolution Reconstruction
Ning Ni, Libao Zhang
The STVchrono Dataset: Towards Continuous Change Recognition in Time
Yanjun Sun, Yue Qiu, Mariia Khan et al.
Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection
Ke Li, Di Wang, Zhangyuan Hu et al.
ADD: Attribution-Driven Data Augmentation Framework for Boosting Image Super-Resolution
Zeyu Mi, Yu-Bin Yang
SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds
Jinfeng Xu, Xianzhi Li, Yuan Tang et al.
Pixel-Aligned Language Model
Jiarui Xu, Xingyi Zhou, Shen Yan et al.
All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising
Xiaoling Zhou, Zhemg Lee, Wei Ye et al.
DejaVid: Encoder-Agnostic Learned Temporal Matching for Video Classification
Darryl Ho, Samuel Madden
Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation
Qiang Zhang, Mengsheng Zhao, Jiawei Liu et al.
A Focused Human Body Model for Accurate Anthropometric Measurements Extraction
Shuhang Chen, Xianliang Huang, Zhizhou Zhong et al.
CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-driven Video Editing
Guiwei Zhang, Tianyu Zhang, Guanglin Niu et al.
Be More Specific: Evaluating Object-centric Realism in Synthetic Images
Anqi Liang, Ciprian Adrian Corneanu, Qianli Feng et al.
GPVK-VL: Geometry-Preserving Virtual Keyframes for Visual Localization under Large Viewpoint Changes
Yunxuan Li, Lei Fan, Xiaoying Xing et al.
A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction
Jin Gong, Runzhao Yang, Weihang Zhang et al.
NAPGuard: Towards Detecting Naturalistic Adversarial Patches
Siyang Wu, Jiakai Wang, Jiejie Zhao et al.
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao, Zhan Tong, Kevin Qinghong Lin et al.
Layered Motion Fusion: Lifting Motion Segmentation to 3D in Egocentric Videos
Vadim Tschernezki, Diane Larlus, Andrea Vedaldi et al.
Adapting Pre-trained 3D Models for Point Cloud Video Understanding via Cross-frame Spatio-temporal Perception
Baixuan Lv, Yaohua Zha, Tao Dai et al.
Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion
Saad Lahlali, Sandra Kara, Hejer AMMAR et al.
Rethinking Reconstruction and Denoising in the Dark: New Perspective, General Architecture and Beyond
Long Ma, Tengyu Ma, Ziye Li et al.
Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline
Yu chen, Fei Gao, YanguangZhang et al.
Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation
Yuheng Feng, Changsong Wen, Zelin Peng et al.
Domain Separation Graph Neural Networks for Saliency Object Ranking
Zijian Wu, Jun Lu, Jing Han et al.
GeoAvatar: Geometrically-Consistent Multi-Person Avatar Reconstruction from Sparse Multi-View Videos
Soohyun Lee, SeoYeon Kim, HeeKyung Lee et al.
Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy
Zesen Cheng, Hang Zhang, Kehan Li et al.
GeoMM: On Geodesic Perspective for Multi-modal Learning
Shibin Mei, Hang Wang, Bingbing Ni
Resource-Efficient Transformer Pruning for Finetuning of Large Models
Fatih Ilhan, Gong Su, Selim Tekin et al.
Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning
Shouhang Zhu, Chenglin Li, Yuankun Jiang et al.
Font-Agent: Enhancing Font Understanding with Large Language Models
Yingxin Lai, Cuijie Xu, Haitian Shi et al.
Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack
Sabbir Ahmed, RANYANG ZHOU, Shaahin Angizi et al.
Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets
Muhammad Abdullah Jamal, Omid Mohareri
STINR: Deciphering Spatial Transcriptomics via Implicit Neural Representation
Yisi Luo, Xile Zhao, Kai Ye et al.
3D-SLNR: A Super Lightweight Neural Representation for Large-scale 3D Mapping
Chenhui Shi, Fulin Tang, Ning An et al.
Language-aware Visual Semantic Distillation for Video Question Answering
Bo Zou, Chao Yang, Yu Qiao et al.
DiLiGenRT: A Photometric Stereo Dataset with Quantified Roughness and Translucency
Heng Guo, Jieji Ren, Feishi Wang et al.
StyLitGAN: Image-Based Relighting via Latent Control
Anand Bhattad, James Soole, David Forsyth
Label-Efficient Group Robustness via Out-of-Distribution Concept Curation
Yiwei Yang, Anthony Liu, Robert Wolfe et al.
Batch Normalization Alleviates the Spectral Bias in Coordinate Networks
Zhicheng Cai, Hao Zhu, Qiu Shen et al.