Most Cited ECCV "tradeoff analysis" Papers
Collaborative Control for Geometry-Conditioned PBR Image Generation
Shimon Vainer, Mark Boss, Mathias Parger et al.
Open-set Domain Adaptation via Joint Error based Multi-class Positive and Unlabeled Learning
Dexuan Zhang, Thomas Westfechtel, Tatsuya Harada
Look Around and Learn: Self-Training Object Detection by Exploration
Gianluca Scarpellini, Stefano Rosa, Pietro Morerio et al.
Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model
Seonghui Min, Hyun-Jic Oh, Won-Ki Jeong
On the Vulnerability of Skip Connections to Model Inversion Attacks
Jun Hao Koh, Sy-Tuyen Ho, Ngoc-Bao Nguyen et al.
Adaptive Human Trajectory Prediction via Latent Corridors
Neerja Thakkar, Karttikeya Mangalam, Andrea Bajcsy et al.
Generalizable Symbolic Optimizer Learning
Xiaotian Song, Peng Zeng, Yanan Sun et al.
FreestyleRet: Retrieving Images from Style-Diversified Queries
Hao Li, Yanhao Jia, Peng Jin et al.
AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion
Zhiheng Fu, Longguang Wang, Lian Xu et al.
Efficient Bias Mitigation Without Privileged Information
Mateo Espinosa Zarlenga, Sankaranarayanan, Jerone Andrews et al.
Towards Open-Ended Visual Recognition with Large Language Models
Qihang Yu, Xiaohui Shen, Liang-Chieh Chen
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
Bowen Zhang, Yiji Cheng, Chunyu Wang et al.
IRGen: Generative Modeling for Image Retrieval
Yidan Zhang, Ting Zhang, Dong Chen et al.
LayeredFlow: A Real-World Benchmark for Non-Lambertian Multi-Layer Optical Flow
Hongyu Wen, Erich Liang, Jia Deng
Adaptive Parametric Activation
Konstantinos P Alexandridis, Jiankang Deng, Anh Nguyen et al.
Towards Multi-modal Transformers in Federated Learning
Guangyu Sun, Matias Mendieta, Aritra Dutta et al.
GENIXER: Empowering Multimodal Large Language Models as a Powerful Data Generator
Hengyuan Zhao, Pan Zhou, Mike Zheng Shou
FisherRF: Active View Selection and Mapping with Radiance Fields using Fisher Information
Wen Jiang, Boshu Lei, Kostas Daniilidis
Soft Shadow Diffusion (SSD): Physics-inspired Learning for 3D Computational Periscopy
Fadlullah Raji, John Murray-Bruce
Learning 3D-aware GANs from Unposed Images with Template Feature Field
Xinya Chen, Hanlei Guo, Yanrui Bin et al.
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin et al.
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Guez Aflalo et al.
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation
Kalliopi Basioti, Mohamed A Abdelsalam, Federico Fancellu et al.
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang, Lechao Cheng, Weikai Chen et al.
Learning to Robustly Reconstruct Dynamic Scenes from Low-light Spike Streams
Liwen Hu, Gang Ding, Mianzhi Liu et al.
Restoring Images in Adverse Weather Conditions via Histogram Transformer
Shangquan Sun, Wenqi Ren, Xinwei Gao et al.
COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
Jiefeng Li, Ye Yuan, Davis Rempe et al.
Resilience of Entropy Model in Distributed Neural Networks
Milin Zhang, Mohammad Abdi, Shahriar Rifat et al.
Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration
Zhihao Liang, Qi Zhang, Wenbo Hu et al.
Generalizable Facial Expression Recognition
Yuhang Zhang, Xiuqi Zheng, Chenyi Liang et al.
Invertible Neural Warp for NeRF
Shin-Fang Chng, Ravi Garg, Hemanth Saratchandran et al.
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Shilong Liu, Hao Cheng, Haotian Liu et al.
Efficient Frequency-Domain Image Deraining with Contrastive Regularization
Ning Gao, Xingyu Jiang, Xiuhui Zhang et al.
Align before Collaborate: Mitigating Feature Misalignment for Robust Multi-Agent Perception
Dingkang Yang, Ke Li, Dongling Xiao et al.
MambaIR: A Simple Baseline for Image Restoration with State-Space Model
Hang Guo, Jinmin Li, Tao Dai et al.
I Can't Believe It's Not Scene Flow!
Ishan Khatri, Kyle Vedder, Neehar Peri et al.
Bi-directional Contextual Attention for 3D Dense Captioning
Minjung Kim, Hyung Suk Lim, Soonyoung Lee et al.
Scalable Group Choreography via Variational Phase Manifold Learning
Nhat Le, Khoa Do, Xuan Bui et al.
RS-NeRF: Neural Radiance Fields from Rolling Shutter Images
Muyao Niu, Tong Chen, Yifan Zhan et al.
Retrieval Robust to Object Motion Blur
Rong Zou, Marc Pollefeys, Denys Rozumnyi
Binomial Self-compensation for Motion Error in Dynamic 3D Scanning
Geyou Zhang, Ce Zhu, Kai Liu
Free-Viewpoint Video of Outdoor Sports Using a Drone
Zhengdong Hong
Blind image deblurring with noise-robust kernel estimation
Chanseok Lee, Jeongsol Kim, Seungmin Lee et al.
How Video Meetings Change Your Expression
Sumit Sarin, Utkarsh Mall, Purva Tendulkar et al.
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
Zhiyu Tan, Mengping Yang, Luozheng Qin et al.
LetsMap: Unsupervised Representation Learning for Label-Efficient Semantic BEV Mapping
Nikhil Gosala, Kürsat Petek, B Ravi Kiran et al.
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
Yu Wang, Xiaogeng Liu, Yu Li et al.
Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density
Peiyu Yang, Naveed Akhtar, Mubarak Shah et al.
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Carlos Hinojosa, Shuming Liu, Bernard Ghanem
Train Till You Drop: Towards Stable and Robust Source-free Unsupervised 3D Domain Adaptation
Bjoern Michele, Alexandre Boulch, Tuan Hung Vu et al.
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan, Xiaojian Ma, Rujie Wu et al.
Audio-driven Talking Face Generation with Stabilized Synchronization Loss
Dogucan Yaman, Fevziye Irem Eyiokur Yaman, Leonard Bärmann et al.
G2fR: Frequency Regularization in Grid-based Feature Encoding Neural Radiance Fields
Shuxiang Xie, Shuyi Zhou, Ken Sakurada et al.
Eliminating Feature Ambiguity for Few-Shot Segmentation
Qianxiong Xu, Guosheng Lin, Chen Change Loy et al.
PreLAR: World Model Pre-training with Learnable Action Representation
Lixuan Zhang, Meina Kan, Shiguang Shan et al.
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
Zixu Cheng, Yujiang Pu, Shaogang Gong et al.
SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks
Peishen Yan, Hao Wang, Tao Song et al.
Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
Mainak Singha, Ankit Jha, Divyam Gupta et al.
Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization
Jiajun Hu, Jian Zhang, Lei Qi et al.
Learning Representation for Multitask Learning through Self-Supervised Auxiliary Learning
Seokwon Shin, Hyungrok Do, Youngdoo Son
SemTrack: A Large-scale Dataset for Semantic Tracking in the Wild
Pengfei Wang, Xiaofei Hui, Jing Wu et al.
Fully Sparse 3D Occupancy Prediction
Haisong Liu, Yang Chen, Haiguang Wang et al.
Rethinking Deep Unrolled Model for Accelerated MRI Reconstruction
Bingyu Xin, Meng Ye, Leon Axel et al.
EAFormer: Scene Text Segmentation with Edge-Aware Transformers
Haiyang Yu, Teng Fu, Bin Li et al.
Zero-Shot Detection of AI-Generated Images
Davide Cozzolino, Giovanni Poggi, Matthias Niessner et al.
Augmented Neural Fine-tuning for Efficient Backdoor Purification
Md Nazmul Karim, Abdullah Al Arafat, Umar Khalid et al.
MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo
Tianqi Liu, Guangcong Wang, Shoukang Hu et al.
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim, Boseung Jeong, Donghyun Kim et al.
T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
Weijie Wei, Fatemeh Karimi Nejadasl, Theo Gevers et al.
MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain
Timothy Chase, Karthik Dantu
Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models
Nishad Singhi, Jae Myung Kim, Karsten Roth et al.
G3R: Gradient Guided Generalizable Reconstruction
Yun Chen, Jingkang Wang, Ze Yang et al.
Gaze Target Detection Based on Head-Local-Global Coordination
Yaokun Yang, Feng Lu
An Economic Framework for 6-DoF Grasp Detection
Xiao-Ming Wu, Jia-Feng Cai, Jian-Jian Jiang et al.
Uni3DL: A Unified Model for 3D Vision-Language Understanding
Xiang Li, Jian Ding, Zhaoyang Chen et al.
Rethinking Image Super Resolution from Training Data Perspectives
Go Ohtani, Ryu Tadokoro, Ryosuke Yamada et al.
Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds
Zicheng Wang, Zhen Zhao, Yiming Wu et al.
SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Jack Chen et al.
SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images
Jintu Zheng, Yi Ding, Qizhe Liu et al.
Human Hair Reconstruction with Strand-Aligned 3D Gaussians
Egor Zakharov, Vanessa Sklyarova, Michael J. Black et al.
Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning
Pengyu Li, Biao Wang, Tianchu Guo et al.
General and Task-Oriented Video Segmentation
Mu Chen, Liulei Li, Wenguan Wang et al.
MemBN: Robust Test-Time Adaptation via Batch Norm with Statistics Memory
Juwon Kang, Nayeong Kim, Jungseul Ok et al.
StereoGlue: Joint Feature Matching and Robust Estimation
Daniel Barath, Dmytro Mishkin, Luca Cavalli et al.
Scaling Backwards: Minimal Synthetic Pre-training?
Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada et al.
Enhanced Motion Forecasting with Visual Relation Reasoning
Sungjune Kim, Hadam Baek, Seunggwan Lee et al.
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary, Or Patashnik, Kfir Aberman et al.
Unleashing the Power of Prompt-driven Nucleus Instance Segmentation
Zhongyi Shui, Yunlong Zhang, Kai Yao et al.
Domain-adaptive Video Deblurring via Test-time Blurring
Jin-Ting He, Fu-Jen Tsai, Jia-Hao Wu et al.
GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views
Vinayak Gupta, Rongali Simhachala Venkata Girish, Mukund Varma T et al.
Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation
Genki Kinoshita, Ko Nishino
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
Muyao Niu, Xiaodong Cun, Xintao Wang et al.
GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
Jing Wu, Jiawang Bian, Xinghui Li et al.
Think before Placement: Common Sense Enhanced Transformer for Object Placement
Yaxuan Qin, Jiayu Xu, Ruiping Wang et al.
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
Jinxing Zhou, Dan Guo, Yuxin Mao et al.
OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models
Kong Zhe, Yong Zhang, Tianyu Yang et al.
WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians
Dmytro Kotovenko, Olga Grebenkova, Nikolaos Sarafianos et al.
SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis
Huan-ang Gao, Mingju Gao, Jiaju Li et al.
GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang, Hao Tang, Li Jiang et al.
Object-Aware NIR-to-Visible Translation
Yunyi Gao, Lin Gu, Qiankun Liu et al.
Integer-Valued Training and Spike-driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection
Xinhao Luo, Man Yao, Yuhong Chou et al.
DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators
Hanyang Kong, Dongze Lian, Michael Bi Mi et al.
SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images
Nir Barel, Ron Aharon Shapira Weber, Nir Mualem et al.
Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos
Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang et al.
Multi-modal Relation Distillation for Unified 3D Representation Learning
Huiqun Wang, Yiping Bao, Panwang Pan et al.
Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction
Xinhang Liu, Jiaben Chen, Shiu-Hong Kao et al.
TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks
Jinjie Mai, Wenxuan Zhu, Sara Rojas Martinez et al.
DSA: Discriminative Scatter Analysis for Early Smoke Segmentation
Lujian Yao, Haitao Zhao, Jingchao Peng et al.
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
Rui Huang, Songyou Peng, Ayca Takmaz et al.
Efficient Training of Spiking Neural Networks with Multi-Parallel Implicit Stream Architecture
Zhigao Cao, Meng Li, Xiashuang Wang et al.
GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections
Shiyue Zhang, Zheng Chong, Xujie Zhang et al.
Continuous SO(3) Equivariant Convolution for 3D Point Cloud Analysis
Jaein Kim, Hee Bin Yoo, Dong-Sig Han et al.
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee, Yinxiao Li, Junjie Ke et al.
FlexAttention for Efficient High-Resolution Vision-Language Models
Junyan Li, Delin Chen, Tianle Cai et al.
EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation
Nikolai Körber, Eduard Kromer, Andreas Siebert et al.
AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering
Xiuyuan Chen, Yuan Lin, Yuchen Zhang et al.
MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks
Elad Hirsch, Gefen Dawidowicz, Ayellet Tal
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
Xingyu Peng, Yan Bai, Chen Gao et al.
Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification
Linhao Qu, Dingkang Yang, Dan Huang et al.
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models
Longxiang Tang, Zhuotao Tian, Kai Li et al.
U-COPE: Taking a Further Step to Universal 9D Category-level Object Pose Estimation
Li Zhang, Weiqing Meng, Yan Zhong et al.
DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution
Shrey Singh, Prateek Keserwani, Masakazu Iwamura et al.
Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning
Thanh Thong Nguyen, Yi Bin, Xiaobao Wu et al.
Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization
Naiyu Yin, Hanjing Wang, Yue Yu et al.
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
Junfei Xiao, Ziqi Zhou, Wenxuan Li et al.
Plain-Det: A Plain Multi-Dataset Object Detector
Cheng Shi, Yuchen Zhu, Sibei Yang
Ex2Eg-MAE: A Framework for Adaptation of Exocentric Video Masked Autoencoders for Egocentric Social Role Understanding
Minh Tran, Yelin Kim, Che-Chun Su et al.
Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation
Rong Wang, Wei Mao, Changsheng Lu et al.
iMatching: Imperative Correspondence Learning
Chen Wang, Dasong Gao, Yun-Jou Lin et al.
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor, Yash Parag Butala, Melisa A Russak et al.
MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution
Yuxuan Jiang, Chen Feng, Fan Zhang et al.
ReMatching: Low-Resolution Representations for Scalable Shape Correspondence
Filippo Maggioli, Daniele Baieri, Emanuele Rodola et al.
ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild
Chen Guo, Tianjian Jiang, Manuel Kaufmann et al.
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.
Syn-to-Real Domain Adaptation for Point Cloud Completion via Part-based Approach
Yunseo Yang, Jihun Kim, Kuk-Jin Yoon
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu, Tai Wang, Wenwei Zhang et al.
TrajPrompt: Aligning Color Trajectory with Vision-Language Representations
Li-Wu Tsao, Hao-Tang Tsui, Yu-Rou Tuan et al.
Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training
Hyesong Choi, Hyejin Park, Kwang Moo Yi et al.
Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects
Zicong Fan, Takehiko Ohkawa, Linlin Yang et al.
Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors
Jae Joong Lee, Bosheng Li, Sara Beery et al.
LoA-Trans: Enhancing Visual Grounding by Location-Aware Transformers
Ziling Huang, Shin’ichi Satoh
DomainFusion: Generalizing To Unseen Domains with Latent Diffusion Models
Yuyang Huang, Yabo Chen, Yuchen Liu et al.
3D Single-object Tracking in Point Clouds with High Temporal Variation
Qiao Wu, Kun Sun, Pei An et al.
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
Yasi Zhang, Peiyu Yu, Ying Nian Wu
RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation
Luis Li, Hubert P. H. Shum, Toby P Breckon
Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis
Chirag Vashist, Shichong Peng, Ke Li
Solving Motion Planning Tasks with a Scalable Generative Model
Yihan Hu, Siqi Chai, Zhening Yang et al.
SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model
Armen Avetisyan, Christopher Xie, Henry Howard-Jenkins et al.
Implicit Filtering for Learning Neural Signed Distance Functions from 3D Point Clouds
Shengtao Li, Ge Gao, Yudong Liu et al.
PLOT: Text-based Person Search with Part Slot Attention for Corresponding Part Discovery
Jicheol Park, Dongwon Kim, Boseung Jeong et al.
Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras
Hoonhee Cho, Sung-Hoon Yoon, Hyeokjun Kweon et al.
LiDAR-Event Stereo Fusion with Hallucinations
Luca Bartolomei, Matteo Poggi, Andrea Conti et al.
Tensorial template matching for fast cross-correlation with rotations and its application for tomography
Antonio Martinez-Sanchez, Ulrike Homberg, J. M. Almira et al.
Cross-Input Certified Training for Universal Perturbations
Changming Xu, Gagandeep Singh
Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution
Xi Yang, Chenhang He, Jianqi Ma et al.
SAFARI: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag, Koustava Goswami, Srikrishna Karanam
BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
Xinmin Qiu, Congying Han, Zicheng Zhang et al.
A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks
Yixiang Qiu, Hao Fang, Hongyao Yu et al.
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
Lin Chen, Jinsong Li, Xiaoyi Dong et al.
Text2Place: Affordance-aware Text Guided Human Placement
Rishubh Parihar, Harsh Gupta, Sachidanand VS et al.
Adaptive Multi-task Learning for Few-shot Object Detection
Yan Ren, Yanling Li, Wai-Kin Adams Kong
Textual-Visual Logic Challenge: Understanding and Reasoning in Text-to-Image Generation
Peixi Xiong, Michael A Kozuch, Nilesh Jain
Spectral Subsurface Scattering for Material Classification
Haejoon Lee, Aswin C. Sankaranarayanan
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
Masashi Hatano, Ryo Hachiuma, Ryo Fujii et al.
AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models
Xuelong Dai, Kaisheng Liang, Bin Xiao
Merlin: Empowering Multimodal LLMs with Foresight Minds
En Yu, Liang Zhao, Yana Wei et al.
HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects
Xintao Lv, Liang Xu, Yichao Yan et al.
High-Resolution and Few-shot View Synthesis from Asymmetric Dual-lens Inputs
Ruikang Xu, Mingde Yao, Yue Li et al.
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
Yanlong Li, Chamara Madarasingha, Kanchana Thilakarathna
BRAVE: Broadening the visual encoding of vision-language models
Oguzhan Fatih Kar, Alessio Tonioni, Petra Poklukar et al.
GroupDiff: Diffusion-based Group Portrait Editing
Yuming Jiang, Nanxuan Zhao, Qing Liu et al.
BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos
Pilhyeon Lee, Hyeran Byun
Disentangling Masked Autoencoders for Unsupervised Domain Generalization
An Zhang, Han Wang, Xiang Wang et al.
Geospecific View Generation - Geometry-Context Aware High-resolution Ground View Inference from Satellite Views
Ningli Xu, Rongjun Qin
Co-Student: Collaborating Strong and Weak Students for Sparsely Annotated Object Detection
Lianjun Wu, Jiangxiao Han, Zengqiang Zheng et al.
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan Li, Gyungin Shin
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
Han Zhou, Wei Dong, Xiaohong Liu et al.
Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation
Taekyung Ki, Dongchan Min, Gyeongsu Chae
4D Contrastive Superflows are Dense 3D Representation Learners
Xiang Xu, Lingdong Kong, Hui Shuai et al.
Panel-Specific Degradation Representation for Raw Under-Display Camera Image Restoration
Youngjin Oh, Keuntek Lee, Jooyoung Lee et al.
Diffusion-Guided Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Jaeseok Jeong et al.
On the Approximation Risk of Few-Shot Class-Incremental Learning
Xuan Wang, Zhong Ji, Xiyao Liu et al.
Online Vectorized HD Map Construction using Geometry
Zhixin Zhang, Yiyuan Zhang, Xiaohan Ding et al.
Self-Supervised Underwater Caustics Removal and Descattering via Deep Monocular SLAM
Jonathan Sauder, Devis Tuia
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Seokhun Choi, Hyeonseop Song, Jaechul Kim et al.
AdversariaLeak: External Information Leakage Attack Using Adversarial Samples on Face Recognition Systems
Roye Katzav, Amit Giloni, Edita Grolman et al.
MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation
Jiaxi Jiang, Paul Streli, Xuejing Luo et al.
Disentangled Generation and Aggregation for Robust Radiance Fields
Shihe Shen, Huachen Gao, Wangze Xu et al.
Momentum Auxiliary Network for Supervised Local Learning
Junhao Su, Changpeng Cai, Feiyu Zhu et al.
JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation
ChenHan Jiang, Yihan Zeng, Tianyang Hu et al.
Implicit Neural Models to Extract Heart Rate from Video
Pradyumna Chari, Anirudh Bindiganavale Harish, Adnan Armouti et al.
Occupancy as Set of Points
Yiang Shi, Tianheng Cheng, Qian Zhang et al.
Cocktail Universal Adversarial Attack on Deep Neural Networks
Shaoxin Li, Xiaofeng Liao, Xin Che et al.
FLAT: Flux-aware Imperceptible Adversarial Attacks on 3D Point Clouds
Keke Tang, Lujie Huang, Weilong Peng et al.
Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation
Hyun Seok Seong, WonJun Moon, SuBeen Lee et al.
AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation
Shengkun Tang, Yaqing Wang, Caiwen Ding et al.
SCAPE: A Simple and Strong Category-Agnostic Pose Estimator
Yujia Liang, Zixuan Ye, Wenze Liu et al.
FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models
Zhikai Zhang, Yitang Li, Haofeng Huang et al.
Norma: A Noise Robust Memory-Augmented Framework for Whole Slide Image Classification
Yu Bai, Bo Zhang, Zheng Zhang et al.
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights
Shunqi Mao, Chaoyi Zhang, Hang Su et al.