Most Cited 2024 "physics-informed neural operator" Papers
12,324 papers found • Page 51 of 62
Conference
Bi-level Learning of Task-Specific Decoders for Joint Registration and One-Shot Medical Image Segmentation
Xin Fan, Xiaolin Wang, Jiaxin Gao et al.
FairCLIP: Harnessing Fairness in Vision-Language Learning
Yan Luo, MIN SHI, Muhammad Osama Khan et al.
Navigate Beyond Shortcuts: Debiased Learning Through the Lens of Neural Collapse
Yining Wang, Junjie Sun, Chenyue Wang et al.
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language
Mark Hamilton, Andrew Zisserman, John Hershey et al.
Learning Visual Prompt for Gait Recognition
Kang Ma, Ying Fu, Chunshui Cao et al.
ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models
Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.
PolarRec: Improving Radio Interferometric Data Reconstruction Using Polar Coordinates
Ruoqi Wang, Zhuoyang Chen, Jiayi Zhu et al.
LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network
Hao Yang, Liyuan Pan, Yan Yang et al.
StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN
Jongwoo Choi, Kwanggyoon Seo, Amirsaman Ashtari et al.
MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints
Pengfei Xie, Wenqiang Xu, Tutian Tang et al.
CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion
Xiaoyu Wu, Yang Hua, Chumeng Liang et al.
Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation
Ming Xu, Stephen Gould
ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization
Weiyao Wang, Pierre Gleize, Hao Tang et al.
Learning Large-Factor EM Image Super-Resolution with Generative Priors
Jiateng Shou, Zeyu Xiao, Shiyu Deng et al.
PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion
Ying-Tian Liu, Yuan-Chen Guo, Guan Luo et al.
Learning for Transductive Threshold Calibration in Open-World Recognition
Qin ZHANG, DONGSHENG An, Tianjun Xiao et al.
Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation
Zhiwei Yang, Kexue Fu, Minghong Duan et al.
Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence
Junyi Zhang, Charles Herrmann, Junhwa Hur et al.
Amodal Ground Truth and Completion in the Wild
Guanqi Zhan, Chuanxia Zheng, Weidi Xie et al.
MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding
Chun-Peng Chang, Shaoxiang Wang, Alain Pagani et al.
Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling
Jianan Fan, Dongnan Liu, Hang Chang et al.
Real-Time Exposure Correction via Collaborative Transformations and Adaptive Sampling
Ziwen Li, Feng Zhang, Meng Cao et al.
Gaussian Splatting SLAM
Hidenobu Matsuki, Riku Murai, Paul Kelly et al.
PromptCoT: Align Prompt Distribution via Adapted Chain-of-Thought
Junyi Yao, Yijiang Liu, Zhen Dong et al.
NeLF-Pro: Neural Light Field Probes for Multi-Scale Novel View Synthesis
Zinuo You, Andreas Geiger, Anpei Chen
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume, Anurag Vaidya, Richard J. Chen et al.
Practical Measurements of Translucent Materials with Inter-Pixel Translucency Prior
Zhenyu Chen, Jie Guo, Shuichang Lai et al.
RTracker: Recoverable Tracking via PN Tree Structured Memory
Yuqing Huang, Xin Li, Zikun Zhou et al.
View-Category Interactive Sharing Transformer for Incomplete Multi-View Multi-Label Learning
Shilong Ou, Zhe Xue, Yawen Li et al.
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
Yabin Zhang, Wenjie Zhu, Hui Tang et al.
A Versatile Framework for Continual Test-Time Domain Adaptation: Balancing Discriminability and Generalizability
Xu Yang, Xuan chen, Moqi Li et al.
Efficient Solution of Point-Line Absolute Pose
Petr Hruby, Timothy Duff, Marc Pollefeys
SPIN: Simultaneous Perception Interaction and Navigation
Shagun Uppal, Ananye Agarwal, Haoyu Xiong et al.
CAMixerSR: Only Details Need More "Attention"
Yan Wang, Yi Liu, Shijie Zhao et al.
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures
Lisa Mais, Peter Hirsch, Claire Managan et al.
POPDG: Popular 3D Dance Generation with PopDanceSet
Zhenye Luo, Min Ren, Xuecai Hu et al.
RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation
Huayu Mai, Rui Sun, Tianzhu Zhang et al.
CoDe: An Explicit Content Decoupling Framework for Image Restoration
Enxuan Gu, Hongwei Ge, Yong Guo
D^4: Dataset Distillation via Disentangled Diffusion Model
Duo Su, Junjie Hou, Weizhi Gao et al.
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.
Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data
Xinting Liao, Weiming Liu, Chaochao Chen et al.
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
Jiangbo Shi, Chen Li, Tieliang Gong et al.
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
Kanchana Ranasinghe, Satya Narayan Shukla, Omid Poursaeed et al.
CaDeT: a Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Junrui Zhang, Amir Rasouli
Continual Forgetting for Pre-trained Vision Models
Hongbo Zhao, Bolin Ni, Junsong Fan et al.
Boosting Neural Representations for Videos with a Conditional Decoder
XINJIE ZHANG, Ren Yang, Dailan He et al.
Unsupervised Feature Learning with Emergent Data-Driven Prototypicality
Yunhui Guo, Youren Zhang, Yubei Chen et al.
Text-Guided 3D Face Synthesis - From Generation to Editing
Yunjie Wu, Yapeng Meng, Zhipeng Hu et al.
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Huan Ling, Seung Wook Kim, Antonio Torralba et al.
IReNe: Instant Recoloring of Neural Radiance Fields
Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.
Constrained Layout Generation with Factor Graphs
Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.
URHand: Universal Relightable Hands
Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo et al.
Neural Implicit Morphing of Face Images
Guilherme Schardong, Tiago Novello, Hallison Paz et al.
Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation
Feng Liu, Minchul Kim, Zhiyuan Ren et al.
Snapshot Lidar: Fourier Embedding of Amplitude and Phase for Single-Image Depth Reconstruction
Sarah Friday, Yunzi Shi, Yaswanth Kumar Cherivirala et al.
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification
Haoran Lai, Qingsong Yao, Zihang Jiang et al.
MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant
Chenlu Zhan, Gaoang Wang, Yu LIN et al.
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
Jihao Liu, Jinliang Zheng, Yu Liu et al.
Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion
Sofia Casarin, Cynthia Ugwu, Sergio Escalera et al.
Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction
Jinzhi Zheng, Heng Fan, Libo Zhang
NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation
Sicheng Li, Hao Li, Yiyi Liao et al.
Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing
Hyelin Nam, Gihyun Kwon, Geon Yeong Park et al.
PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar
Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.
DiffLoc: Diffusion Model for Outdoor LiDAR Localization
Wen Li, Yuyang Yang, Shangshu Yu et al.
Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation
Mingyu Lee, Jongwon Choi
Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models
Pengze Zhang, Hubery Yin, Chen Li et al.
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Siteng Huang, Biao Gong, Yutong Feng et al.
Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement
Daiwei Yu, Zhuorong Li, Lina Wei et al.
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang, Feng Cheng, Gedas Bertasius
WinSyn: : A High Resolution Testbed for Synthetic Data
Tom Kelly, John Femiani, Peter Wonka
Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
Daichi Horita, Naoto Inoue, Kotaro Kikuchi et al.
Wired Perspectives: Multi-View Wire Art Embraces Generative AI
Zhiyu Qu, LAN YANG, Honggang Zhang et al.
Small Scale Data-Free Knowledge Distillation
He Liu, Yikai Wang, Huaping Liu et al.
Transfer CLIP for Generalizable Image Denoising
Jun Cheng, Dong Liang, Shan Tan
Validating Privacy-Preserving Face Recognition under a Minimum Assumption
Hui Zhang, Xingbo Dong, YenLungLai et al.
CLiC: Concept Learning in Context
Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.
IDGuard: Robust General Identity-centric POI Proactive Defense Against Face Editing Abuse
Yunshu Dai, Jianwei Fei, Fangjun Huang
Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology
Wenhao Tang, Fengtao ZHOU, Sheng Huang et al.
SpatialTracker: Tracking Any 2D Pixels in 3D Space
Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.
TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models
Zhongwei Zhang, Fuchen Long, Yingwei Pan et al.
Perceptual Assessment and Optimization of HDR Image Rendering
Peibei Cao, Rafal Mantiuk, Kede Ma
Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction
Ruixuan Yu, Jian Sun
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Yuwen Xiong, Zhiqi Li, Yuntao Chen et al.
Multimodal Representation Learning by Alternating Unimodal Adaptation
Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.
Compositional Video Understanding with Spatiotemporal Structure-based Transformers
Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text
Junshu Tang, Yanhong Zeng, Ke Fan et al.
Coherent Temporal Synthesis for Incremental Action Segmentation
Guodong Ding, Hans Golong, Angela Yao
Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing
ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.
Estimating Extreme 3D Image Rotations using Cascaded Attention
Shay Dekel, Yosi Keller, Martin Čadík
Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network
Yong Shu, Liquan Shen, Xiangyu Hu et al.
Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular Stereo and RGB-D Cameras
Huajian Huang, Longwei Li, Hui Cheng et al.
Attention Calibration for Disentangled Text-to-Image Personalization
Yanbing Zhang, Mengping Yang, Qin Zhou et al.
SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control
Jaskirat Singh, Jianming Zhang, Qing Liu et al.
GraCo: Granularity-Controllable Interactive Segmentation
Yian Zhao, Kehan Li, Zesen Cheng et al.
Segment Every Out-of-Distribution Object
Wenjie Zhao, Jia Li, Xin Dong et al.
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Fanghua Yu, Jinjin Gu, Zheyuan Li et al.
Masked and Shuffled Blind Spot Denoising for Real-World Images
Hamadi Chihaoui, Paolo Favaro
Open-Vocabulary Object 6D Pose Estimation
Jaime Corsetti, Davide Boscaini, Changjae Oh et al.
Generative Region-Language Pretraining for Open-Ended Object Detection
Chuang Lin, Yi Jiang, Lizhen Qu et al.
Boosting Diffusion Models with Moving Average Sampling in Frequency Domain
Yurui Qian, Qi Cai, Yingwei Pan et al.
Discovering Syntactic Interaction Clues for Human-Object Interaction Detection
Jinguo Luo, Weihong Ren, Weibo Jiang et al.
Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture
Juanwu Lu, Can Cui, Yunsheng Ma et al.
Generative Latent Coding for Ultra-Low Bitrate Image Compression
Zhaoyang Jia, Jiahao Li, Bin Li et al.
Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization
Jimyeong Kim, Jungwon Park, Wonjong Rhee
SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
Yaxu Xie, Alain Pagani, Didier Stricker
Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features
Thomas Wimmer, Peter Wonka, Maks Ovsjanikov
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Zeyi Sun, Ye Fang, Tong Wu et al.
DemoFusion: Democratising High-Resolution Image Generation With No $$$
Ruoyi DU, Dongliang Chang, Timothy Hospedales et al.
Activity-Biometrics: Person Identification from Daily Activities
Shehreen Azad, Yogesh S. Rawat
Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras
Ashwath Shetty, Marc Habermann, Guoxing Sun et al.
Neighbor Relations Matter in Video Scene Detection
Jiawei Tan, Hongxing Wang, Jiaxin Li et al.
Fast ODE-based Sampling for Diffusion Models in Around 5 Steps
Zhenyu Zhou, Defang Chen, Can Wang et al.
Referring Image Editing: Object-level Image Editing via Referring Expressions
Chang Liu, Xiangtai Li, Henghui Ding
InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields
Dongqing Wang, Tong Zhang, Alaa Abboud et al.
From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior
Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.
Unsupervised Blind Image Deblurring Based on Self-Enhancement
Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.
Mask Grounding for Referring Image Segmentation
Yong Xien Chng, Henry Zheng, Yizeng Han et al.
SignGraph: A Sign Sequence is Worth Graphs of Nodes
Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.
Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion
Zixian Gao, Xun Jiang, Xing Xu et al.
DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching
Shuzhe Wang, Juho Kannala, Daniel Barath
FreeDrag: Feature Dragging for Reliable Point-based Image Editing
Pengyang Ling, Lin Chen, Pan Zhang et al.
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang, Ruihao Gong, Jing Liu et al.
GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians
Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.
Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users
Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang, Hongye Fu, Wei Zou et al.
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data
Hanrong Ye, Dan Xu
Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion
Jiangtong Tan, Jie Huang, Naishan Zheng et al.
FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding
Jinglin Xu, Guohao Zhao, Sibo Yin et al.
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.
Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization
Guopeng Li, Ming Qian, Gui-Song Xia
FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning
Qiwei Li, Yuxin Peng, Jiahuan Zhou
Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation
Yi Zhang, Meng-Hao Guo, Miao Wang et al.
GALA: Generating Animatable Layered Assets from a Single Scan
Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.
Improving Graph Contrastive Learning via Adaptive Positive Sampling
Jiaming Zhuo, Feiyang Qin, Can Cui et al.
Hearing Anything Anywhere
Mason Wang, Ryosuke Sawata, Samuel Clarke et al.
Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.
Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation
Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.
BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics
Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.
Bayesian Exploration of Pre-trained Models for Low-shot Image Classification
Yibo Miao, Yu lei, Feng Zhou et al.
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions
Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.
RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation
Yi Rong, Haoran Zhou, Kang Xia et al.
4K4D: Real-Time 4D View Synthesis at 4K Resolution
Zhen Xu, Sida Peng, Haotong Lin et al.
Context-Guided Spatio-Temporal Video Grounding
Xin Gu, Heng Fan, Yan Huang et al.
TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.
Re-thinking Data Availability Attacks Against Deep Neural Networks
Bin Fang, Bo Li, Shuang Wu et al.
Logit Standardization in Knowledge Distillation
Shangquan Sun, Wenqi Ren, Jingzhi Li et al.
A Unified Approach for Text- and Image-guided 4D Scene Generation
Yufeng Zheng, Xueting Li, Koki Nagano et al.
CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models
Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.
SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction
Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.
Video-Based Human Pose Regression via Decoupled Space-Time Aggregation
Jijie He, Wenwu Yang
Neural Refinement for Absolute Pose Regression with Feature Synthesis
Shuai Chen, Yash Bhalgat, Xinghui Li et al.
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.
Boosting Image Restoration via Priors from Pre-trained Models
Xiaogang Xu, Shu Kong, Tao Hu et al.
CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing
Zhen Guo, Hongping Gan
GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects
Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.
PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling
Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers
Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.
RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features
Geonho Bang, Kwangjin Choi, Jisong Kim et al.
Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning
Pierre Marza, Laetitia Matignon, Olivier Simonin et al.
EasyDrag: Efficient Point-based Manipulation on Diffusion Models
Xingzhong Hou, Boxiao Liu, Yi Zhang et al.
Learned Lossless Image Compression based on Bit Plane Slicing
Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.
BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning
Hongwei Zheng, Linyuan Zhou, Han Li et al.
Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
Ziyu Wang, Yue Xu, Cewu Lu et al.
Frequency-Adaptive Dilated Convolution for Semantic Segmentation
Linwei Chen, Lin Gu, Dezhi Zheng et al.
TexTile: A Differentiable Metric for Texture Tileability
Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.
MatSynth: A Modern PBR Materials Dataset
Giuseppe Vecchio, Valentin Deschaintre
Image Processing GNN: Breaking Rigidity in Super-Resolution
Yuchuan Tian, Hanting Chen, Chao Xu et al.
ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation
Suraj Patni, Aradhye Agarwal, Chetan Arora
Riemannian Multinomial Logistics Regression for SPD Neural Networks
Ziheng Chen, Yue Song, Gaowen Liu et al.
LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising
Yuxing Duan
NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging
Takahiro Shirakawa, Seiichi Uchida
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM
Yutao Hu, Tianbin, Quanfeng Lu et al.
Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models
Daniel Geng, Inbum Park, Andrew Owens
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.
Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings
Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.
Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering
Suyuan Liu, KE LIANG, Zhibin Dong et al.
Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging
Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
Honghui Yang, Sha Zhang, Di Huang et al.
ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
Maitreya Patel, Changhoon Kim, Sheng Cheng et al.
Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning
Haoyu Chen, Wenbo Li, Jinjin Gu et al.
Neural Video Compression with Feature Modulation
Jiahao Li, Bin Li, Yan Lu
Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks
Boheng Li, Yishuo Cai, Haowei Li et al.
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu, Guozhen Zhang, Jing Tan et al.
Discriminative Probing and Tuning for Text-to-Image Generation
Leigang Qu, Wenjie Wang, Yongqi Li et al.
GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes
Haozhe Lin, Chunyu Wei, Li He et al.
Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata
Dongsu Zhang, Francis Williams, Žan Gojčič et al.
Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods
Mingqi Jiang, Saeed Khorram, Li Fuxin
Continual Segmentation with Disentangled Objectness Learning and Class Recognition
Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.
Image Sculpting: Precise Object Editing with 3D Geometry Control
Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.
Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability
Yan Huang, Zhang Zhang, Qiang Wu et al.
Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection
Chen Chen, Jiahao Qi, Xingyue Liu et al.
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization
Deng Li, Aming Wu, Yaowei Wang et al.
EscherNet: A Generative Model for Scalable View Synthesis
Xin Kong, Shikun Liu, Xiaoyang Lyu et al.
MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction
Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita
OHTA: One-shot Hand Avatar via Data-driven Implicit Priors
Xiaozheng Zheng, Chao Wen, Zhuo Su et al.
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator
Wenjun Wu, Lingling Zhang, Jun Liu et al.
MultiPhys: Multi-Person Physics-aware 3D Motion Estimation
Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.
LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Hao Shao, Yuxuan Hu, Letian Wang et al.