Most Cited 2024 "demographic group misclassification" Papers
12,324 papers found • Page 8 of 62
Conference
Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation
Jihyun Kim, Changjae Oh, Hoseok Do et al.
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
Yuwei Tang, ZhenYi Lin, Qilong Wang et al.
Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
Zhenghao Peng, Wenjie Luo, Yiren Lu et al.
Probabilistically Rewired Message-Passing Neural Networks
Chendi Qian, Andrei Manolache, Kareem Ahmed et al.
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection
Yuanpeng Tu, Boshen Zhang, Liang Liu et al.
Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering
Zaid Khan, Yun Fu
Context-Aware Meta-Learning
Christopher Fifty, Dennis Duan, Ronald Junkins et al.
NodeMixup: Tackling Under-Reaching for Graph Neural Networks
Weigang Lu, Ziyu Guan, Wei Zhao et al.
Catalyst for Clustering-Based Unsupervised Object Re-identification: Feature Calibration
Huafeng Li, Qingsong Hu, Zhanxuan Hu
Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation
Xianghui Xie, Bharat Lal Bhatnagar, Jan Lenssen et al.
Training-Free Pretrained Model Merging
Zhengqi Xu, Ke Yuan, Huiqiong Wang et al.
AesFA: An Aesthetic Feature
Aware Arbitrary Neural Style Transfer
Quasi-Monte Carlo for 3D Sliced Wasserstein
Khai Nguyen, Nicola Bariletto, Nhat Ho
Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li, Anh Dao, Wentao Bao et al.
Supervised Anomaly Detection for Complex Industrial Images
Aimira Baitieva, David Hurych, Victor Besnier et al.
Garment Recovery with Shape and Deformation Priors
Ren Li, Corentin Dumery, Benoît Guillard et al.
360+x: A Panoptic Multi-modal Scene Understanding Dataset
Hao Chen, Yuqi Hou, Chenyuan Qu et al.
Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes
Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das et al.
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
Yaoting Wang, Peiwen Sun, Dongzhan Zhou et al.
MonoDiff: Monocular 3D Object Detection and Pose Estimation with Diffusion Models
Yasiru Ranasinghe, Deepti Hegde, Vishal M. Patel
Contrastive Learning for DeepFake Classification and Localization via Multi-Label Ranking
Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
Zhangbin Li, Jinxing Zhou, Dan Guo et al.
Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models
Ziyu Wang, Lejun Min, Gus Xia
Instructive Decoding: Instruction-Tuned Large Language Models are Self-Refiner from Noisy Instructions
Taehyeon Kim, JOONKEE KIM, Gihun Lee et al.
Diffusion Time-step Curriculum for One Image to 3D Generation
YI Xuanyu, Zike Wu, Qingshan Xu et al.
FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing
Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong et al.
EgoGen: An Egocentric Synthetic Data Generator
Gen Li, Kaifeng Zhao, Siwei Zhang et al.
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering
Ruofan Liang, Zan Gojcic, Merlin Nimier-David et al.
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.
Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting
Zijie Chen, Lichao Zhang, Fangsheng Weng et al.
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering
Pingyi Chen, Chenglu Zhu, Sunyi Zheng et al.
SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-Supervised Skeleton-Based Action Recognition
Cong Wu, Xiao-Jun Wu, Josef Kittler et al.
Unknown Prompt the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization
Mainak Singha, Ankit Jha, Shirsha Bose et al.
Runtime Analysis of the SMS-EMOA for Many-Objective Optimization
Weijie Zheng, Benjamin Doerr
EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading
Molei Qin, Shuo Sun, Wentao Zhang et al.
VkD: Improving Knowledge Distillation using Orthogonal Projections
Roy Miles, Ismail Elezi, Jiankang Deng
LISO: Lidar-only Self-Supervised 3D Object Detection
Stefan Baur, Frank Moosmann, Andreas Geiger
Semantic Residual Prompts for Continual Learning
Martin Menabue, Emanuele Frascaroli, Matteo Boschini et al.
EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks
Ziming Wang, Ziling Wang, Huaning Li et al.
Enhancing Vectorized Map Perception with Historical Rasterized Maps
Xiaoyu Zhang, Guangwei Liu, Zihao Liu et al.
Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning
Yan Li, Weiwei Guo, Xue Yang et al.
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.
MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty
Tim Broedermann, David Brüggemann, Christos Sakaridis et al.
Tyche: Stochastic In-Context Learning for Medical Image Segmentation
Marianne Rakic, Hallee Wong, Jose Javier Gonzalez Ortiz et al.
Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification
Bohan Li, Xiao Xu, Xinghao Wang et al.
OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection
Hu Zhang, xu jianhua, Tao Tang et al.
Would Deep Generative Models Amplify Bias in Future Models?
Tianwei Chen, Yusuke Hirota, Mayu Otani et al.
L2MAC: Large Language Model Automatic Computer for Extensive Code Generation
Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar
PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios
Jingbo Wang, Zhengyi Luo, Ye Yuan et al.
VAREN: Very Accurate and Realistic Equine Network
Silvia Zuffi, Ylva Mellbin, Ci Li et al.
What's in a Prior? Learned Proximal Networks for Inverse Problems
Zhenghan Fang, Sam Buchanan, Jeremias Sulam
Deep Equilibrium Diffusion Restoration with Parallel Sampling
Jiezhang Cao, Yue Shi, Kai Zhang et al.
Some Fundamental Aspects about Lipschitz Continuity of Neural Networks
Grigory Khromov, Sidak Pal Singh
Towards Robust 3D Object Detection with LiDAR and 4D Radar Fusion in Various Weather Conditions
Yujeong Chae, Hyeonseong Kim, Kuk-Jin Yoon
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
Byeongjun Park, Hyojun Go, Jin-Young Kim et al.
6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation
Li Xu, Haoxuan Qu, Yujun Cai et al.
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
Wen Li, Muyuan Fang, Cheng Zou et al.
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su, Judith Li, Qingqing Huang et al.
SANeRF-HQ: Segment Anything for NeRF in High Quality
Yichen Liu, Benran Hu, Chi-Keung Tang et al.
Non-exemplar Online Class-Incremental Continual Learning via Dual-Prototype Self-Augment and Refinement
Fushuo Huo, Wenchao Xu, Jingcai Guo et al.
HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations
Yilan Dong, Chunlin Yu, Ruiyang Ha et al.
Retrieval-Augmented Primitive Representations for Compositional Zero-Shot Learning
Chenchen Jing, Yukun Li, Hao Chen et al.
Face2Diffusion for Fast and Editable Face Personalization
Kaede Shiohara, Toshihiko Yamasaki
Learning Equi-angular Representations for Online Continual Learning
Minhyuk Seo, Hyunseo Koh, Wonje Jeung et al.
Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation
Xiaohan Cui, Long Ma, Tengyu Ma et al.
Multi-Domain Incremental Learning for Face Presentation Attack Detection
Keyao Wang, Guosheng Zhang, Haixiao Yue et al.
VideoMamba: Spatio-Temporal Selective State Space Model
Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.
Learning Time Slot Preferences via Mobility Tree for Next POI Recommendation
Tianhao Huang, Xuan Pan, Xiangrui Cai et al.
MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation
Yanzuo Lu, Meng Shen, Andy J Ma et al.
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
Siyi Du, Shaoming Zheng, Yinsong Wang et al.
PALM: Predicting Actions through Language Models
Sanghwan Kim, Daoji Huang, Yongqin Xian et al.
milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing
Fangqiang Ding, Zhen Luo, Peijun Zhao et al.
Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models
Gihyun Kwon, Simon Jenni, Ding Li et al.
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
Guohao Sun, Can Qin, JIAMINAN WANG et al.
G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection
Fan Wu, Jinling Gao, Lanqing Hong et al.
Pre-Training Goal-based Models for Sample-Efficient Reinforcement Learning
Haoqi Yuan, Zhancun Mu, Feiyang Xie et al.
FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance Head-pose and Facial Expression Features
Andre Rochow, Max Schwarz, Sven Behnke
FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning
Chenhao Li, Elijah Stanger-Jones, Steve Heim et al.
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
Siteng Huang, Biao Gong, Yutong Feng et al.
Prioritized Semantic Learning for Zero-shot Instance Navigation
Xinyu Sun, Lizhao Liu, Hongyan Zhi et al.
Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar, Pekka Marttinen
DataDream: Few-shot Guided Dataset Generation
Jae Myung Kim, Jessica Bader, Stephan Alaniz et al.
PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis
Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo et al.
FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection
Jianwei Zhao, Xin Li, Fan Yang et al.
Does Few-Shot Learning Suffer from Backdoor Attacks?
Xinwei Liu, Xiaojun Jia, Jindong Gu et al.
Bayesian Diffusion Models for 3D Shape Reconstruction
Haiyang Xu, Yu lei, Zeyuan Chen et al.
Revisit Anything: Visual Place Recognition via Image Segment Retrieval
Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.
GeoCalib: Learning Single-image Calibration with Geometric Optimization
Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger et al.
MANUS: Markerless Grasp Capture using Articulated 3D Gaussians
Chandradeep Pokhariya, Ishaan Shah, Angela Xing et al.
ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation
Kim-Celine Kahl, Carsten Lüth, Maximilian Zenk et al.
Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint
Sixiang Chen, Tian Ye, Kai Zhang et al.
Beyond Mimicking Under-Represented Emotions: Deep Data Augmentation with Emotional Subspace Constraints for EEG-Based Emotion Recognition
Zhi ZHANG, Sheng-hua Zhong, Yan Liu
Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
Xiang Fan, Anand Bhattad, Ranjay Krishna
Generalizable Sleep Staging via Multi-Level Domain Alignment
Jiquan Wang, Sha Zhao, Haiteng Jiang et al.
Learning to Reweight for Generalizable Graph Neural Network
Zhengyu Chen, Teng Xiao, Kun Kuang et al.
Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank
Zihan Wang, Arthur Jacot
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
Le Yang, Ziwei Zheng, Yizeng Han et al.
Noise Map Guidance: Inversion with Spatial Context for Real Image Editing
Hansam Cho, Jonghyun Lee, Seoung Bum Kim et al.
WeditGAN: Few-Shot Image Generation via Latent Space Relocation
Yuxuan Duan, Li Niu, Yan Hong et al.
SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection
Hongcheng Zhang, Liu Liang, Pengxin Zeng et al.
Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning
Yixiong Zou, Yicong Liu, Yiman Hu et al.
LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning
Bolin Lai, Xiaoliang Dai, Lawrence Chen et al.
Spherical Mask: Coarse-to-Fine 3D Point Cloud Instance Segmentation with Spherical Representation
Sangyun Shin, Kaichen Zhou, Madhu Vankadari et al.
SGFormer: Semantic Graph Transformer for Point Cloud-Based 3D Scene Graph Generation
Changsheng Lv, Mengshi Qi, Xia Li et al.
MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance
Ernie Chu, Tzuhsuan Huang, Shuo-Yen LIN et al.
Test-Time Adaptation for Depth Completion
Hyoungseob Park, Anjali W Gupta, Alex Wong
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer
Yuan Dong, Chuan Fang, Liefeng Bo et al.
Deep SE(3)-Equivariant Geometric Reasoning for Precise Placement Tasks
Ben Eisner, Yi Yang, Todor Davchev et al.
Domain Prompt Learning with Quaternion Networks
Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
Dongmei Zhang, Chang Li, Renrui Zhang et al.
Category-Level Multi-Part Multi-Joint 3D Shape Assembly
Yichen Li, Kaichun Mo, Yueqi Duan et al.
Simple Image-Level Classification Improves Open-Vocabulary Object Detection
Ruohuan Fang, Guansong Pang, Xiao Bai
Understanding Certified Training with Interval Bound Propagation
Yuhao Mao, Mark N Müller, Marc Fischer et al.
Rethinking Multi-view Representation Learning via Distilled Disentangling
Guanzhou Ke, Bo Wang, Xiao-Li Wang et al.
Extend Your Own Correspondences: Unsupervised Distant Point Cloud Registration by Progressive Distance Extension
Quan Liu, Hongzi Zhu, Zhenxi Wang et al.
Summarizing Stream Data for Memory-Constrained Online Continual Learning
Jianyang Gu, Kai Wang, Wei Jiang et al.
On the Provable Advantage of Unsupervised Pretraining
Jiawei Ge, Shange Tang, Jianqing Fan et al.
GEARS: Local Geometry-aware Hand-object Interaction Synthesis
Keyang Zhou, Bharat Lal Bhatnagar, Jan Lenssen et al.
GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection
Xiaotian Li, Baojie Fan, Jiandong Tian et al.
Robust Calibration of Large Vision-Language Adapters
Balamurali Murugesan, Julio Silva-Rodríguez, Ismail Ben Ayed et al.
Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation
Konstantin Hess, Valentyn Melnychuk, Dennis Frauen et al.
7471 PNeRFLoc: Visual Localization with Point-Based Neural Radiance Fields
Boming Zhao, Luwei Yang, Mao Mao et al.
Reliability in Semantic Segmentation: Can We Use Synthetic Data?
Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.
StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation
Sidi Wu, Yizi Chen, Loic Landrieu et al.
Semantic-aware SAM for Point-Prompted Instance Segmentation
Zhaoyang Wei, Pengfei Chen, Xuehui Yu et al.
NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction
Beibei Lin, Yeying Jin, Wending Yan et al.
On the Role of Server Momentum in Federated Learning
Jianhui Sun, Xidong Wu, Heng Huang et al.
OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations
Yiming Zuo, Jia Deng
RadEdit: stress-testing biomedical vision models via diffusion image editing
Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez et al.
UniGen: A Unified Generative Framework for Retrieval and Question Answering with Large Language Models
Xiaoxi Li, Yujia Zhou, Zhicheng Dou
Graph Contrastive Invariant Learning from the Causal Perspective
9672 Yanhu Mo, Xiao Wang, Shaohua Fan et al.
Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers
Zhibo Yang, Sounak Mondal, Seoyoung Ahn et al.
DiffAIL: Diffusion Adversarial Imitation Learning
Bingzheng Wang, Guoqiang Wu, Teng Pang et al.
COCONut: Modernizing COCO Segmentation
Xueqing Deng, Qihang Yu, Peng Wang et al.
LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures
Vimal Thilak, Chen Huang, Omid Saremi et al.
Learning to Prompt Knowledge Transfer for Open-World Continual Learning
Yujie Li, Xin Yang, Hao Wang et al.
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields
Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini et al.
Towards Explainable Joint Models via Information Theory for Multiple Intent Detection and Slot Filling
Xianwei Zhuang, Xuxin Cheng, Yuexian Zou
Text2LiDAR: Text-guided LiDAR Point Clouds Generation via Equirectangular Transformer
Yang Wu, Kaihua Zhang, Jianjun Qian et al.
MotionChain: Conversational Motion Controllers via Multimodal Prompts
Biao Jiang, Xin Chen, Chi Zhang et al.
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan Pasca, Alexey Gavryushin, Muhammad Hamza et al.
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin J Biggs, Arjun Seshadri, Yang Zou et al.
LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment
yiming ren, xiao han, Chengfeng Zhao et al.
How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?
Soumya Suvra Ghosal, Yiyou Sun, Yixuan Li
DIM: Dyadic Interaction Modeling for Social Behavior Generation
Minh Tran, Di Chang, Maksim Siniukov et al.
ViLA: Efficient Video-Language Alignment for Video Question Answering
Xijun Wang, Junbang Liang, Chun-Kai Wang et al.
Targeted Representation Alignment for Open-World Semi-Supervised Learning
Ruixuan Xiao, Lei Feng, Kai Tang et al.
A Diffusion-Based Pre-training Framework for Crystal Property Prediction
Zixing Song, Ziqiao Meng, Irwin King
SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic
Kashyap Chitta, Daniel Dauner, Andreas Geiger
Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval
Weihang Su, Qingyao Ai, Xiangsheng Li et al.
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
Yuanming Li, Wei-Jin Huang, An-Lan Wang et al.
Large Language Models are Good Prompt Learners for Low-Shot Image Classification
Zhaoheng Zheng, Jingmin Wei, Xuefeng Hu et al.
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
Jie Yang, Xuesong Niu, Nan Jiang et al.
Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-Free Multi-Exposure Image Fusion
Guanyao Wu, Hongming Fu, Jinyuan Liu et al.
Mind Marginal Non-Crack Regions: Clustering-Inspired Representation Learning for Crack Segmentation
zhuangzhuang chen, Zhuonan Lai, Jie Chen et al.
Spatio-Temporal Turbulence Mitigation: A Translational Perspective
Xingguang Zhang, Nicholas M Chimitt, Yiheng Chi et al.
Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature
Wu Yun, Mengshi Qi, Chuanming Wang et al.
Meaning Representations from Trajectories in Autoregressive Models
Tian Yu Liu, Matthew Trager, Alessandro Achille et al.
ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems
Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother
Object-Centric Diffusion for Efficient Video Editing
Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.
NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors
Yannan He, Garvita Tiwari, Tolga Birdal et al.
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
Desai Xie, Jiahao Li, Hao Tan et al.
Region-Adaptive Transform with Segmentation Prior for Image Compression
Yuxi Liu, Wenhan Yang, Huihui Bai et al.
Rethinking Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An, Guolei Sun, Yun Liu et al.
PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation
Ruining Deng, Quan Liu, Can Cui et al.
IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models
Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu et al.
An Incremental Unified Framework for Small Defect Inspection
Jiaqi Tang, Hao Lu, Xiaogang Xu et al.
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal et al.
Visible and Clear: Finding Tiny Objects in Difference Map
Bing Cao, Haiyu Yao, Pengfei Zhu et al.
MonoHair: High-Fidelity Hair Modeling from a Monocular Video
Keyu Wu, LINGCHEN YANG, Zhiyi Kuang et al.
Question Calibration and Multi-Hop Modeling for Temporal Question Answering
Chao Xue, Di Liang, Pengfei Wang et al.
Composed Video Retrieval via Enriched Context and Discriminative Embeddings
Omkar Thawakar, Muzammal Naseer, Rao Anwer et al.
Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement
Jing Wang, Jiangyun Li, Chen Chen et al.
Language-Driven Anchors for Zero-Shot Adversarial Robustness
Xiao Li, Wei Zhang, Yining Liu et al.
Self-Supervised Multi-Object Tracking with Path Consistency
Zijia Lu, Bing Shuai, Yanbei Chen et al.
VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
Liao Wang, Kaixin Yao, Chengcheng Guo et al.
Clustering Propagation for Universal Medical Image Segmentation
Yuhang Ding, Liulei Li, Wenguan Wang et al.
MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design
Xiang Fu, Tian Xie, Andrew Rosen et al.
Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints
Qianyi Wu, Jianmin Zheng, Jianfei Cai
Online Zero-Shot Classification with CLIP
Qi Qian, JUHUA HU
PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors
Tianyuan Yuan, Mao Yucheng, Jiawei Yang et al.
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
Songchun Zhang, Yibo Zhang, Quan Zheng et al.
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding
Quan Kong, Yuki Kawana, Rajat Saini et al.
Learning to Adapt SAM for Segmenting Cross-domain Point Clouds
Xidong Peng, Runnan Chen, Feng Qiao et al.
Pathologies of Predictive Diversity in Deep Ensembles
Geoff Pleiss, Taiga Abe, E. Kelly Buchanan et al.
VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions
Seokha Moon, Hyun Woo, Hongbeen Park et al.
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Haomiao Ni, Bernhard Egger, Suhas Lohit et al.
HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations
Peng Dai, Yang Zhang, Tao Liu et al.
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
yaofeng xie, Lingwei Kong, Kai Chen et al.
AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis
Dongze Li, Kang Zhao, Wei Wang et al.
SAVSR: Arbitrary-Scale Video Super-resolution via a Learned Scale-Adaptive Network
Zekun Li, Hongying Liu, Fanhua Shang et al.
Hyperspectral Image Reconstruction via Combinatorial Embedding of Cross-Channel Spatio-Spectral Clues
Xingxing Yang, Jie Chen, Zaifeng Yang
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.
SEED: A Simple and Effective 3D DETR in Point Clouds
Zhe Liu, Jinghua Hou, Xiaoqing Ye et al.
DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling
Linqi Zhou, Andy Shih, Chenlin Meng et al.
One-Shot Diffusion Mimicker for Handwritten Text Generation
Gang Dai, Yifan Zhang, Quhui Ke et al.
Guided Slot Attention for Unsupervised Video Object Segmentation
Minhyeok Lee, Suhwan Cho, Dogyoon Lee et al.
Aligning Geometric Spatial Layout in Cross-View Geo-Localization via Feature Recombination
Qingwang Zhang, Yingying Zhu