Most Cited ECCV "event-level predictions" Papers
2,387 papers found • Page 11 of 12
CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians
Avinash Paliwal, Wei Ye, Jinhui Xiong et al.
Linearly Controllable GAN: Unsupervised Feature Categorization and Decomposition for Image Generation and Manipulation
Sehyung Lee, Mijung Kim, Yeongnam Chae et al.
Learning Quantized Adaptive Conditions for Diffusion Models
Yuchen Liang, Yuchuan Tian, Lei Yu et al.
Learn to Optimize Denoising Scores: A Unified and Improved Diffusion Prior for 3D Generation
Xiaofeng Yang, Yiwen Chen, Cheng Chen et al.
Discovering Unwritten Visual Classifiers with Large Language Models
Mia Chiquier, Utkarsh Mall, Carl Vondrick
D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction
Bowen Fu, Gu Wang, Chenyangguang Zhang et al.
Decoupling Common and Unique Representations for Multimodal Self-supervised Learning
Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham et al.
Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach
Taolin Zhang, Jiawang Bai, Zhihe Lu et al.
EgoPoser: Robust Real-Time Egocentric Pose Estimation from Sparse and Intermittent Observations Everywhere
Jiaxi Jiang, Paul Streli, Manuel Meier et al.
Synchronous Diffusion for Unsupervised Smooth Non-Rigid 3D Shape Matching
Dongliang Cao, Zorah Laehner, Florian Bernard
MemBN: Robust Test-Time Adaptation via Batch Norm with Statistics Memory
Juwon Kang, Nayeong Kim, Jungseul Ok et al.
SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images
Jintu Zheng, Yi Ding, Qizhe Liu et al.
Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation
Zeyang Zhao, Qilong Xue, Yifan Bai et al.
On the Approximation Risk of Few-Shot Class-Incremental Learning
Xuan Wang, Zhong Ji, Xiyao Liu et al.
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
Dahyun Kang, Minsu Cho
RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation
Zhiyuan Zhang, Licheng Yang, Zhiyu Xiang
Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds
Zicheng Wang, Zhen Zhao, Yiming Wu et al.
CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks
Hao Fang, Jiawei Kong, Bin Chen et al.
An Economic Framework for 6-DoF Grasp Detection
Xiao-Ming Wu, Jia-Feng Cai, Jian-Jian Jiang et al.
URS-NeRF: Unordered Rolling Shutter Bundle Adjustment for Neural Radiance Fields
Bo Xu, Liu Ziao, Mengqi Guo et al.
Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos
Subin Jeon, In Cho, Minsu Kim et al.
SEDiff: Structure Extraction for Domain Adaptive Depth Estimation via Denoising Diffusion Models
Dongseok Shim, Hyoun Jin Kim
3DSA: Multi-View 3D Human Pose Estimation With 3D Space Attention Mechanisms
Po Han Chen, Chia-Chi Tsai
Gaze Target Detection Based on Head-Local-Global Coordination
Yaokun Yang, Feng Lu
Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models
Nishad Singhi, Jae Myung Kim, Karsten Roth et al.
Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis
Yuanhao Cai, Yixun Liang, Jiahao Wang et al.
TCC-Det: Temporarily consistent cues for weakly-supervised 3D detection
Jan Skvrna, Lukas Neumann
Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model
Qi Song, Ziyuan Luo, Ka Chun Cheung et al.
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos
Yihong Sun, Bharath Hariharan
V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation
Pooja Guhan, Tsung-Wei Huang, Guan-Ming Su et al.
Zero-Shot Detection of AI-Generated Images
Davide Cozzolino, Giovanni Poggi, Matthias Niessner et al.
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
Jiachen Lu, Ze Huang, Zeyu Yang et al.
EAFormer: Scene Text Segmentation with Edge-Aware Transformers
Haiyang Yu, Teng Fu, Bin Li et al.
Uncertainty-Driven Spectral Compressive Imaging with Spatial-Frequency Transformer
Lintao Peng, Siyu Xie, Liheng Bian
SUMix: Mixup with Semantic and Uncertain Information
Huafeng Qin, Xin Jin, Hongyu Zhu et al.
Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment
Yang Jin, Yadong Mu
Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence
Hongyuan Wang, Lizhi Wang, Jiang Xu et al.
Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models
Kent Fujiwara, Mikihiro Tanaka, Qing Yu
Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis
Qi Sun, Hang Zhou, Wengang Zhou et al.
SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow
Yuanzhi Zhu, Xingchao Liu, Qiang Liu
Domain Reduction Strategy for Non-Line-of-Sight Imaging
Hyunbo Shim, In Cho, Daekyu Kwon et al.
Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging
In Cho, Hyunbo Shim, Seon Joo Kim
Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
Mengchen Zhang, Tong Wu, Tai Wang et al.
High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding
Qi Zuo, Xiaodong Gu, Yuan Dong et al.
FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation
Honghao Xu, Juzhan Xu, Zeyu Huang et al.
Fully Sparse 3D Occupancy Prediction
Haisong Liu, Yang Chen, Haiguang Wang et al.
Text to Layer-wise 3D Clothed Human Generation
Junting Dong, Qi Fang, Zehuan Huang et al.
A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
Riccardo Fogliato, Pratik Patil, Mathew Monfort et al.
DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation
Rakshith Subramanyam, Kowshik Thopalli, Vivek Sivaraman Narayanaswamy et al.
SemTrack: A Large-scale Dataset for Semantic Tracking in the Wild
Pengfei Wang, Xiaofei Hui, Jing Wu et al.
ExMatch: Self-guided Exploitation for Semi-Supervised Learning with Scarce Labeled Samples
Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee
Distribution Alignment for Fully Test-Time Adaptation with Dynamic Online Data Streams
Ziqiang Wang, Zhixiang Chi, Yanan Wu et al.
CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering
Haidong Zhu, Tianyu Ding, Tianyi Chen et al.
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis
Basile Van Hoorick, Rundi Wu, Ege Ozguroglu et al.
Open-Vocabulary RGB-Thermal Semantic Segmentation
Guoqiang Zhao, JunJie Huang, Xiaoyun Yan et al.
SIGMA: Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker et al.
3DFG-PIFu: 3D Feature Grids for Human Digitization from Sparse Views
Kennard Yanting Chan, Fayao Liu, Guosheng Lin et al.
UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang et al.
Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
Mainak Singha, Ankit Jha, Divyam Gupta et al.
Unsupervised Moving Object Segmentation with Atmospheric Turbulence
Dehao Qin, Ripon Saha, Woojeh Chung et al.
Modeling Label Correlations with Latent Context for Multi-Label Recognition
Zhao-Min Chen, Quan Cui, Ruoxi Deng et al.
FreestyleRet: Retrieving Images from Style-Diversified Queries
Hao Li, Yanhao Jia, Peng Jin et al.
PreLAR: World Model Pre-training with Learnable Action Representation
Lixuan Zhang, Meina Kan, Shiguang Shan et al.
GENIXER: Empowering Multimodal Large Language Models as a Powerful Data Generator
Hengyuan Zhao, Pan Zhou, Mike Zheng Shou
Eliminating Feature Ambiguity for Few-Shot Segmentation
Qianxiong Xu, Guosheng Lin, Chen Change Loy et al.
Towards Reliable Advertising Image Generation Using Human Feedback
Zhenbang Du, Wei Feng, Haohan Wang et al.
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin et al.
Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection
Kwanyong Park, Kuniaki Saito, Donghyun Kim
TurboEdit: Real-time text-based disentangled real image editing
Zongze Wu, Nicholas I Kolkin, Jonathan Brandt et al.
The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers
Seungwoo Son, Jegwang Ryu, Namhoon Lee et al.
Improving Vision and Language Concepts Understanding with Multimodal Counterfactual Samples
Chengen Lai, Shengli Song, Sitong Yan et al.
Functional Transform-Based Low-Rank Tensor Factorization for Multi-Dimensional Data Recovery
Jian-Li Wang, Xi-Le Zhao
G2fR: Frequency Regularization in Grid-based Feature Encoding Neural Radiance Fields
Shuxiang Xie, Shuyi Zhou, Ken Sakurada et al.
Clean & Compact: Efficient Data-Free Backdoor Defense with Model Compactness
Huy Phan, Jinqi Xiao, Yang Sui et al.
Restoring Images in Adverse Weather Conditions via Histogram Transformer
Shangquan Sun, Wenqi Ren, Xinwei Gao et al.
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan, Xiaojian Ma, Rujie Wu et al.
Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation
Haoyu Ji, Bowen Chen, Xinglong Xu et al.
Resilience of Entropy Model in Distributed Neural Networks
Milin Zhang, Mohammad Abdi, Shahriar Rifat et al.
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Carlos Hinojosa, Shuming Liu, Bernard Ghanem
Cocktail Universal Adversarial Attack on Deep Neural Networks
Shaoxin Li, Xiaofeng Liao, Xin Che et al.
Momentum Auxiliary Network for Supervised Local Learning
Junhao Su, Changpeng Cai, Feiyu Zhu et al.
Binomial Self-compensation for Motion Error in Dynamic 3D Scanning
Geyou Zhang, Ce Zhu, Kai Liu
A Geometric Distortion Immunized Deep Watermarking Framework with Robustness Generalizability
Linfeng Ma, Han Fang, Tianyi Wei et al.
Free-Viewpoint Video of Outdoor Sports Using a Drone
Zhengdong Hong
Blind image deblurring with noise-robust kernel estimation
Chanseok Lee, Jeongsol Kim, Seungmin Lee et al.
BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos
Pilhyeon Lee, Hyeran Byun
LetsMap: Unsupervised Representation Learning for Label-Efficient Semantic BEV Mapping
Nikhil Gosala, Kürsat Petek, B Ravi Kiran et al.
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
Xiaohan Wang, Yuhui Zhang, Orr Zohar et al.
Disentangling Masked Autoencoders for Unsupervised Domain Generalization
An Zhang, Han Wang, Xiang Wang et al.
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
Yu Wang, Xiaogeng Liu, Yu Li et al.
L-DiffER: Single Image Reflection Removal with Language-based Diffusion Model
Yuchen Hong, Haofeng Zhong, Shuchen Weng et al.
MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration
Yulin Ren, Xin Li, Bingchen Li et al.
Train Till You Drop: Towards Stable and Robust Source-free Unsupervised 3D Domain Adaptation
Bjoern Michele, Alexandre Boulch, Tuan Hung Vu et al.
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing
Vadim Titov, Madina Khalmatova, Alexandra Ivanova et al.
Audio-driven Talking Face Generation with Stabilized Synchronization Loss
Dogucan Yaman, Fevziye Irem Eyiokur Yaman, Leonard Bärmann et al.
Adaptive Human Trajectory Prediction via Latent Corridors
Neerja Thakkar, Karttikeya Mangalam, Andrea Bajcsy et al.
Generalizable Facial Expression Recognition
Yuhang Zhang, Xiuqi Zheng, Chenyi Liang et al.
RS-NeRF: Neural Radiance Fields from Rolling Shutter Images
Muyao Niu, Tong Chen, Yifan Zhan et al.
MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain
Timothy Chase, Karthik Dantu
Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation
Taekyung Ki, Dongchan Min, Gyeongsu Chae
How Video Meetings Change Your Expression
Sumit Sarin, Utkarsh Mall, Purva Tendulkar et al.
Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning
Pengyu Li, Biao Wang, Tianchu Guo et al.
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
Zixu Cheng, Yujiang Pu, Shaogang Gong et al.
SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks
Peishen Yan, Hao Wang, Tao Song et al.
Enhanced Motion Forecasting with Visual Relation Reasoning
Sungjune Kim, Hadam Baek, Seunggwan Lee et al.
Few-shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt
Chenxi Liu, Zhenyi Wang, Tianyi Xiong et al.
Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization
Jiajun Hu, Jian Zhang, Lei Qi et al.
DSA: Discriminative Scatter Analysis for Early Smoke Segmentation
Lujian Yao, Haitao Zhao, Jingchao Peng et al.
DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences
Peidong Li, Wancheng Shen, Qihao Huang et al.
Continuous SO(3) Equivariant Convolution for 3D Point Cloud Analysis
Jaein Kim, Hee Bin Yoo, Dong-Sig Han et al.
Rethinking Deep Unrolled Model for Accelerated MRI Reconstruction
Bingyu Xin, Meng Ye, Leon Axel et al.
MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks
Elad Hirsch, Gefen Dawidowicz, Ayellet Tal
Towards Unified Representation of Invariant-Specific Features in Missing Modality Face Anti-Spoofing
Guanghao Zheng, Yuchen Liu, Wenrui Dai et al.
Retrieval Robust to Object Motion Blur
Rong Zou, Marc Pollefeys, Denys Rozumnyi
Augmented Neural Fine-tuning for Efficient Backdoor Purification
Md Nazmul Karim, Abdullah Al Arafat, Umar Khalid et al.
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor, Yash Parag Butala, Melisa A Russak et al.
TPA3D: Triplane Attention for Fast Text-to-3D Generation
Bin-Shih Wu, Hong-En Chen, Sheng-Yu Huang et al.
Self-Supervised Underwater Caustics Removal and Descattering via Deep Monocular SLAM
Jonathan Sauder, Devis Tuia
Scalable Group Choreography via Variational Phase Manifold Learning
Nhat Le, Khoa Do, Xuan Bui et al.
Bi-directional Contextual Attention for 3D Dense Captioning
Minjung Kim, Hyung Suk Lim, Soonyoung Lee et al.
I Can't Believe It's Not Scene Flow!
Ishan Khatri, Kyle Vedder, Neehar Peri et al.
MambaIR: A Simple Baseline for Image Restoration with State-Space Model
Hang Guo, Jinmin Li, Tao Dai et al.
Decomposition Betters Tracking Everything Everywhere
Rui Li, Dong Liu
SCAPE: A Simple and Strong Category-Agnostic Pose Estimator
Yujia Liang, Zixuan Ye, Wenze Liu et al.
MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo
Tianqi Liu, Guangcong Wang, Shoukang Hu et al.
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim, Boseung Jeong, Donghyun Kim et al.
Image-to-Lidar Relational Distillation for Autonomous Driving Data
Anas Mahmoud, Ali Harakeh, Steven Waslander
Invertible Neural Warp for NeRF
Shin-Fang Chng, Ravi Garg, Hemanth Saratchandran et al.
T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
Weijie Wei, Fatemeh Karimi Nejadasl, Theo Gevers et al.
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding
Qirui Wu, Sonia Raychaudhuri, Daniel Ritchie et al.
IGNORE: Information Gap-based False Negative Loss Rejection for Single Positive Multi-Label Learning
Gyeong Ryeol Song, Noo-ri Kim, Jin-Seop Lee et al.
G3R: Gradient Guided Generalizable Reconstruction
Yun Chen, Jingkang Wang, Ze Yang et al.
Uni3DL: A Unified Model for 3D Vision-Language Understanding
Xiang Li, Jian Ding, Zhaoyang Chen et al.
Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration
Zhihao Liang, Qi Zhang, Wenbo Hu et al.
COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
Jiefeng Li, Ye Yuan, Davis Rempe et al.
Learning to Robustly Reconstruct Dynamic Scenes from Low-light Spike Streams
Liwen Hu, Gang Ding, Mianzhi Liu et al.
CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection
Shuang Hao, Chunlin Zhong, He Tang
Siamese Vision Transformers are Scalable Audio-visual Learners
Yan-Bo Lin, Gedas Bertasius
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang, Lechao Cheng, Weikai Chen et al.
Rethinking Image Super Resolution from Training Data Perspectives
Go Ohtani, Ryu Tadokoro, Ryosuke Yamada et al.
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation
Kalliopi Basioti, Mohamed A Abdelsalam, Federico Fancellu et al.
Visual Relationship Transformation
Xiaoyu Xu, Jiayan Qiu, Baosheng Yu et al.
Scene-aware Human Motion Forecasting via Mutual Distance Prediction
Chaoyue Xing, Wei Mao, Miaomiao Liu
SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Jack Chen et al.
Human Hair Reconstruction with Strand-Aligned 3D Gaussians
Egor Zakharov, Vanessa Sklyarova, Michael J. Black et al.
Elysium: Exploring Object-level Perception in Videos through Semantic Integration Using MLLMs
Han Wang, Yanjie Wang, Ye Yongjie et al.
Learning 3D-aware GANs from Unposed Images with Template Feature Field
Xinya Chen, Hanlei Guo, Yanrui Bin et al.
Soft Shadow Diffusion (SSD): Physics-inspired Learning for 3D Computational Periscopy
Fadlullah Raji, John Murray-Bruce
General and Task-Oriented Video Segmentation
Mu Chen, Liulei Li, Wenguan Wang et al.
Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-wise Hidden Bias
Jinhyeok Jang, ByungOk Han, Jaehong Kim et al.
Federated Learning with Local Openset Noisy Labels
Zonglin Di, Zhaowei Zhu, Xiaoxiao Li et al.
Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching
Junpeng Jing, Ye Mao, Krystian Mikolajczyk
FisherRF: Active View Selection and Mapping with Radiance Fields using Fisher Information
Wen Jiang, Boshu Lei, Kostas Daniilidis
Towards Multi-modal Transformers in Federated Learning
Guangyu Sun, Matias Mendieta, Aritra Dutta et al.
Scaling Backwards: Minimal Synthetic Pre-training?
Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada et al.
Adaptive Parametric Activation
Konstantinos P Alexandridis, Jiankang Deng, Anh Nguyen et al.
LayeredFlow: A Real-World Benchmark for Non-Lambertian Multi-Layer Optical Flow
Hongyu Wen, Erich Liang, Jia Deng
IRGen: Generative Modeling for Image Retrieval
Yidan Zhang, Ting Zhang, Dong Chen et al.
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
Bowen Zhang, Yiji Cheng, Chunyu Wang et al.
Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance
Kuan-Chih Huang, Yi-Hsuan Tsai, Ming-Hsuan Yang
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary, Or Patashnik, Kfir Aberman et al.
Towards Open-Ended Visual Recognition with Large Language Models
Qihang Yu, Xiaohui Shen, Liang-Chieh Chen
Efficient Bias Mitigation Without Privileged Information
Mateo Espinosa Zarlenga, Sankaranarayanan, Jerone Andrews et al.
GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views
Vinayak Gupta, Rongali Simhachala Venkata Girish, Mukund Varma T et al.
PoseSOR: Human Pose Can Guide Our Attention
Huankang Guan, Rynson W.H. Lau
AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion
Zhiheng Fu, Longguang Wang, Lian Xu et al.
SpeedUpNet: A Plug-and-Play Adapter Network for Accelerating Text-to-Image Diffusion Models
Weilong Chai, Dandan Zheng, Jiajiong Cao et al.
Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation
Genki Kinoshita, Ko Nishino
Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification
Hai Ci, Pei Yang, Yiren Song et al.
GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
Jing Wu, Jiawang Bian, Xinghui Li et al.
Think before Placement: Common Sense Enhanced Transformer for Object Placement
Yaxuan Qin, Jiayu Xu, Ruiping Wang et al.
Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken
Peifu Liu, Tingfa Xu, Jie Wang et al.
Optimal Transport of Diverse Unsupervised Tasks for Robust Learning from Noisy Few-Shot Data
Xiaofan Que, Qi Yu
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
Jinxing Zhou, Dan Guo, Yuxin Mao et al.
WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians
Dmytro Kotovenko, Olga Grebenkova, Nikolaos Sarafianos et al.
LITA: Language Instructed Temporal-Localization Assistant
De-An Huang, Shijia Liao, Subhashree Radhakrishnan et al.
DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators
Hanyang Kong, Dongze Lian, Michael Bi Mi et al.
SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images
Nir Barel, Ron Aharon Shapira Weber, Nir Mualem et al.
BurstM: Deep Burst Multi-scale SR using Fourier Space with Optical Flow
EungGu Kang, Byeonghun Lee, Sunghoon Im et al.
Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model
Seonghui Min, Hyun-Jic Oh, Won-Ki Jeong
Unsupervised Dense Prediction using Differentiable Normalized Cuts
Yanbin Liu, Stephen Gould
Look Around and Learn: Self-Training Object Detection by Exploration
Gianluca Scarpellini, Stefano Rosa, Pietro Morerio et al.
uCAP: An Unsupervised Prompting Method for Vision-Language Models
A. Tuan Nguyen, Kai Sheng Tai, Bor-Chun Chen et al.
Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration
Emanuel Sanchez Aimar, Nathaniel D Helgesen, Yonghao Xu et al.
Collaborative Control for Geometry-Conditioned PBR Image Generation
Shimon Vainer, Mark Boss, Mathias Parger et al.
Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction
Xinhang Liu, Jiaben Chen, Shiu-Hong Kao et al.
Efficient Frequency-Domain Image Deraining with Contrastive Regularization
Ning Gao, Xingyu Jiang, Xiuhui Zhang et al.
MyVLM: Personalizing VLMs for User-Specific Queries
Yuval Alaluf, Elad Richardson, Sergey Tulyakov et al.
Deep Cost Ray Fusion for Sparse Depth Video Completion
Jungeon Kim, Soongjin Kim, Jaesik Park et al.
TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks
Jinjie Mai, Wenxuan Zhu, Sara Rojas Martinez et al.
SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning
Mengxin Zheng, Jiaqi Xue, Zihao Wang et al.
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee, Yinxiao Li, Junjie Ke et al.
AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering
Xiuyuan Chen, Yuan Lin, Yuchen Zhang et al.
Norma: A Noise Robust Memory-Augmented Framework for Whole Slide Image Classification
Yu Bai, Bo Zhang, Zheng Zhang et al.
Adaptive High-Frequency Transformer for Diverse Wildlife Re-Identification
Chenyue Li, Shuoyi Chen, Mang Ye
Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification
Linhao Qu, Dingkang Yang, Dan Huang et al.
DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution
Shrey Singh, Prateek Keserwani, Masakazu Iwamura et al.
Local All-Pair Correspondence for Point Tracking
Seokju Cho, Jiahui Huang, Jisu Nam et al.
Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization
Naiyu Yin, Hanjing Wang, Yue Yu et al.
An accurate detection is not all you need to combat label noise in web-noisy datasets
Paul Albert, Kevin McGuinness, Eric Arazo et al.