Most Cited 2025 "parameterized environment configurations" Papers
Co-Speech Gesture Video Generation with Implicit Motion-Audio Entanglement
Xinjie Li, Ziyi Chen, Xinlu Yu et al.
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu, Xiaokang Chen, Zhiyu Wu et al.
FATE: Full-head Gaussian Avatar with Textural Editing from Monocular Video
Jiawei Zhang, Zijian Wu, Zhiyang Liang et al.
Can Knowledge be Transferred from Unimodal to Multimodal? Investigating the Transitivity of Multimodal Knowledge Editing
Lingyong Fang, Xinzhong Wang, Depeng Wang et al.
LOMIA: Label-Only Membership Inference Attacks against Pre-trained Large Vision-Language Models
Yihao Liu, Xinqi Lyu, Dong Wang et al.
ConsNoTrainLoRA: Data-driven Weight Initialization of Low-rank Adapters using Constraints
Debasmit Das, Hyoungwoo Park, Munawar Hayat et al.
Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models
Junyi Li, Hwee Tou Ng
UDC-VIT: A Real-World Video Dataset for Under-Display Cameras
Kyusu Ahn, JiSoo Kim, Sangik Lee et al.
Is Visual in-Context Learning for Compositional Medical Tasks within Reach?
Simon Reiß, Zdravko Marinov, Alexander Jaus et al.
Feature Unlearning: Theoretical Foundations and Practical Applications with Shuffling
Yue Yang, Jinhao Li, Hao Wang
Efficient semantic uncertainty quantification in language models via diversity-steered sampling
Ji Won Park, Kyunghyun Cho
Optimal Transport for Brain-Image Alignment: Unveiling Redundancy and Synergy in Neural Information Processing
Yang Xiao, Wang Lu, Jie Ji et al.
A Physics-preserved Transfer Learning Method for Differential Equations
Hao-Ran Yang, Chuan-Xian Ren
On the sample complexity of semi-supervised multi-objective learning
Tobias Wegel, Geelon So, Junhyung Park et al.
Stable Cinemetrics: Structured Taxonomy and Evaluation for Professional Video Generation
Agneet Chatterjee, Rahim Entezari, Maksym Zhuravinskyi et al.
Chimera: Improving Generalist Model with Domain-Specific Experts
Tianshuo Peng, Mingsheng Li, Jiakang Yuan et al.
Enhanced Event-based Dense Stereo via Cross-Sensor Knowledge Distillation
Haihao Zhang, Yunjian Zhang, Jianing Li et al.
Vector Quantization in the Brain: Grid-like Codes in World Models
Xiangyuan Peng, Xingsi Dong, Si Wu
The Nuclear Route: Sharp Asymptotics of ERM in Overparameterized Quadratic Networks
Vittorio Erba, Emanuele Troiani, Lenka Zdeborová et al.
Not Only Vision: Evolve Visual Speech Recognition via Peripheral Information
Zhaoxin Yuan, Shuang Yang, Shiguang Shan et al.
Don’t Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation
Woojin Kim, Jaeyoung Do
ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation
Jimyeong Kim, Jungwon Park, Yeji Song et al.
Distance-informed Neural Processes
Aishwarya Venkataramanan, Joachim Denzler
Imbalance in Balance: Online Concept Balancing in Generation Models
Yukai Shi, Jiarong Ou, Rui Chen et al.
RALoc: Enhancing Outdoor LiDAR Localization via Rotation Awareness
Yuyang Yang, Wen Li, Sheng Ao et al.
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology
Siyuan Yan, Ming Hu, Yiwen Jiang et al.
Large Language Diffusion Models
Shen Nie, Fengqi Zhu, Zebin You et al.
Generative Caching for Structurally Similar Prompts and Responses
Sarthak Chakraborty, Suman Nath, Xuchao Zhang et al.
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization
Hengjia Li, Lifan Jiang, Xi Xiao et al.
OMiSO: Adaptive optimization of state-dependent brain stimulation to shape neural population states
Yuki Minai, Joana Soldado-Magraner, Byron M Yu et al.
Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests
Fitim Abdullahu, Helmut Grabner
D-Attn: Decomposed Attention for Large Vision-and-Language Model
Chia-Wen Kuo, Sijie Zhu, Fan Chen et al.
Understanding Personal Concept in Open-Vocabulary Semantic Segmentation
Sunghyun Park, Jungsoo Lee, Shubhankar Borse et al.
Influence Functions for Edge Edits in Non-Convex Graph Neural Networks
Jaeseung Heo, Kyeongheung Yun, Seokwon Yoon et al.
Bézier Splatting for Fast and Differentiable Vector Graphics Rendering
Xi Liu, Chaoyi Zhou, Nanxuan Zhao et al.
GeoClip: Geometry-Aware Clipping for Differentially Private SGD
Atefeh Gilani, Naima Tasnim, Lalitha Sankar et al.
CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving
Rui Song, Chenwei Liang, Yan Xia et al.
UnZipLoRA: Separating Content and Style from a Single Image
Chang Liu, Viraj Shah, Aiyu Cui et al.
SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures
Yi Qin, Rui Wang, Tao Huang et al.
Johnson-Lindenstrauss Lemma Beyond Euclidean Geometry
Chengyuan Deng, Jie Gao, Kevin Lu et al.
Mamba Only Glances Once (MOGO): A Lightweight Framework for Efficient Video Action Detection
Yunqing Liu, Nan Zhang, Fangjun Wang et al.
Semi-supervised Concept Bottleneck Models
Lijie Hu, Tianhao Huang, Huanyi Xie et al.
WINS: Winograd Structured Pruning for Fast Winograd Convolution
Cheonjun Park, Hyunjae Oh, Mincheol Park et al.
Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation
Nairouz Mrabah, Nicolas Richet, Ismail Ayed et al.
ART: Adaptive Relation Tuning for Generalized Relation Prediction
Gopika Sudhakaran, Hikaru Shindo, Patrick Schramowski et al.
Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion
Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.
No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views
Ranran Huang, Krystian Mikolajczyk
Cooperative Pseudo Labeling for Unsupervised Federated Classification
Kuangpu Guo, Lijun Sheng, Yongcan Yu et al.
MemDistill: Distilling LiDAR Knowledge into Memory for Camera-Only 3D Object Detection
Donghyeon Kwon, Youngseok Yoon, Hyeongseok Son et al.
From Sharp to Blur: Unsupervised Domain Adaptation for 2D Human Pose Estimation Under Extreme Motion Blur Using Event Cameras
Youngho Kim, Hoonhee Cho, Kuk-Jin Yoon
Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning
Zhengxuan Wei, Jiajin Tang, Sibei Yang
PAN-Crafter: Learning Modality-Consistent Alignment for PAN-Sharpening
Jeonghyeok Do, Sungpyo Kim, Geunhyuk Youk et al.
Differentially Private Fine-Tuning of Diffusion Models
Yu-Lin Tsai, Yizhe Li, Zekai Chen et al.
IRGPT: Understanding Real-world Infrared Image with Bi-cross-modal Curriculum on Large-scale Benchmark
Zhe Cao, Jin Zhang, Ruiheng Zhang
One Object, Multiple Lies: A Benchmark for Cross-task Adversarial Attack on Unified Vision-Language Models
Jiale Zhao, Xinyang Jiang, Junyao Gao et al.
Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization
Xu Zheng, Yuanhuiyi Lyu, Lutao Jiang et al.
PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization
Bing Fan, Yunhe Feng, Yapeng Tian et al.
Language-Driven Multi-Label Zero-Shot Learning with Semantic Granularity
Shouwen Wang, Qian Wan, Junbin Gao et al.
IM360: Large-scale Indoor Mapping with 360 Cameras
Dongki Jung, Jaehoon Choi, Yonghan Lee et al.
PersonaCraft: Personalized and Controllable Full-Body Multi-Human Scene Generation Using Occlusion-Aware 3D-Conditioned Diffusion
Gwanghyun Kim, Suh Jeon Jeon, Seunggyu Lee et al.
Elastic ViTs from Pretrained Models without Retraining
Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort et al.
On Logic-based Self-Explainable Graph Neural Networks
Alessio Ragno, Marc Plantevit, Céline Robardet
MA-CIR: A Multimodal Arithmetic Benchmark for Composed Image Retrieval
Jaeseok Byun, Young Kyun Jang, Seokhyeon Jeong et al.
Adaptive Learning of High-Value Regions for Semi-Supervised Medical Image Segmentation
Tao Lei, Ziyao Yang, Xingwu Wang et al.
Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning
Xinyao Liu, Diping Song
Differential Privacy for Euclidean Jordan Algebra with Applications to Private Symmetric Cone Programming
Zhao Song, Jianfei Xue, Lichen Zhang
Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines
Jiayuan Chen, Thai-Hoang Pham, Yuanlong Wang et al.
Spectral Sensitivity Estimation with an Uncalibrated Diffraction Grating
Lilika Makabe, Hiroaki Santo, Fumio Okura et al.
TransiT: Transient Transformer for Non-line-of-sight Videography
Ruiqian Li, Siyuan Shen, Suan Xia et al.
Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Zi Liang, Qingqing Ye, Xuan Liu et al.
On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations
Amir Mehrpanah, Matteo Gamba, Kevin Smith et al.
A TRIANGLE Enables Multimodal Alignment Beyond Cosine Similarity
Giordano Cicchetti, Eleonora Grassucci, Danilo Comminiello
Brain-Inspired fMRI-to-Text Decoding via Incremental and Wrap-Up Language Modeling
Wentao Lu, Dong Nie, Pengcheng Xue et al.
FedDifRC: Unlocking the Potential of Text-to-Image Diffusion Models in Heterogeneous Federated Learning
Huan Wang, Haoran Li, Huaming Chen et al.
Category-Specific Selective Feature Enhancement for Long-Tailed Multi-Label Image Classification
Ruiqi Du, Xu Tang, Xiangrong Zhang et al.
Registration beyond Points: General Affine Subspace Alignment via Geodesic Distance on Grassmann Manifold
Jaeho Shin, Hyeonjae Gil, Junwoo Jang et al.
An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval
Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim et al.
Find a Scapegoat: Poisoning Membership Inference Attack and Defense to Federated Learning
Wenjin Mo, Zhiyuan Li, Minghong Fang et al.
To Label or Not to Label: PALM – A Predictive Model for Evaluating Sample Efficiency in Active Learning Models
Julia Machnio, Mads Nielsen, Mostafa Mehdipour Ghazi
Personalized Federated Learning under Local Supervision
Qiqi Liu, Jiaqiang Li, Yuchen Liu et al.
Radiant Foam: Real-Time Differentiable Ray Tracing
Shrisudhan Govindarajan, Daniel Rebain, Kwang Moo Yi et al.
COSTARR: Consolidated Open Set Technique with Attenuation for Robust Recognition
Ryan Rabinowitz, Steve Cruz, Walter Scheirer et al.
Information Density Principle for MLLM Benchmarks
Chunyi Li, Xiaozhe Li, Zicheng Zhang et al.
Perspective-Aware Teaching: Adapting Knowledge for Heterogeneous Distillation
Jhe-Hao Lin, Yi Yao, Chan-Feng Hsu et al.
Is Meta-Learning Out? Rethinking Unsupervised Few-Shot Classification with Limited Entropy
Yunchuan Guan, Yu Liu, Ke Zhou et al.
Long-Tailed Classification with Multi-Granularity Semantics
Yuting Liu, Liu Yang, Yu Wang
Computable universal online learning
Dariusz Kalociński, Tomasz Steifer
Decoding Causal Structure: End-to-End Mediation Pathways Inference
Yulong Li, Xiwei Liu, Feilong Tang et al.
ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Shaofeng Yin, Ting Lei, Yang Liu
Token-Level Self-Play with Importance-Aware Guidance for Large Language Models
Tue Le, Hoang Tran, Quyen Tran et al.
FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection
Brian Isaac-Medina, Mauricio Che, Yona Falinie A. Gaus et al.
Adversarial Purification via Super-Resolution and Diffusion
Mincheol Park, Cheonjun Park, Seungseop Lim et al.
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models
Xianfu Cheng, Wei Zhang, Shiwei Zhang et al.
ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges
Jiaxin Ai, Pengfei Zhou, Xu Pan et al.
Failure Cases Are Better Learned But Boundary Says Sorry: Facilitating Smooth Perception Change for Accuracy-Robustness Trade-Off in Adversarial Training
Yanyun Wang, Li Liu
Secure On-Device Video OOD Detection Without Backpropagation
Li Li, Peilin Cai, Yuxiao Zhou et al.
Learning Counterfactually Decoupled Attention for Open-World Model Attribution
Yu Zheng, Boyang Gong, Fanye Kong et al.
Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning
Wenxuan Bao, Ruxi Deng, Ruizhong Qiu et al.
Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
Zixin Wang, Dong Gong, Sen Wang et al.
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
Qifan Yu, Zhebei Shen, Zhongqi Yue et al.
Flow Matching Neural Processes
Hussen Abu Hamad, Dan Rosenbaum
Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations
Chongjie Si, Zhiyi Shi, Xuehui Wang et al.
Partial Forward Blocking: A Novel Data Pruning Paradigm for Lossless Training Acceleration
Dongyue Wu, Zilin Guo, Jialong Zuo et al.
CIARD: Cyclic Iterative Adversarial Robustness Distillation
Liming Lu, Shuchao Pang, Xu Zheng et al.
Learning Chern Numbers of Multiband Topological Insulators with Gauge Equivariant Neural Networks
Longde Huang, Oleksandr Balabanov, Hampus Linander et al.
InfoBridge: Balanced Multimodal Integration through Conditional Dependency Modeling
Chenxin Li, Yifan Liu, Panwang Pan et al.
ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning
Zhengzhuo Xu, Sinan Du, Yiyan Qi et al.
DiffRefine: Diffusion-based Proposal Specific Point Cloud Densification for Cross-Domain Object Detection
Sangyun Shin, Yuhang He, Xinyu Hou et al.
Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features
Shangbo Wu, Yu-an Tan, Ruinan Ma et al.
Divide-and-Conquer for Enhancing Unlabeled Learning, Stability, and Plasticity in Semi-supervised Continual Learning
Yue Duan, Taicai Chen, Lei Qi et al.
Towards Building Model/Prompt-Transferable Attackers against Large Vision-Language Models
Xiaowen Cai, Daizong Liu, Xiaoye Qu et al.
Confound from All Sides, Distill with Resilience: Multi-Objective Adversarial Paths to Zero-Shot Robustness
Junhao Dong, Jiao Liu, Xinghua Qu et al.
Dual-Path Temporal Decoder for End-to-End Multi-Object Tracking
Hyunseop Kim, Juheon Jeong, Hanul Kim et al.
Mitigating Object Hallucinations via Sentence-Level Early Intervention
Shangpin Peng, Senqiao Yang, Li Jiang et al.
Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
Haoteng Yin, Rongzhe Wei, Eli Chien et al.
Open-Unfairness Adversarial Mitigation for Generalized Deepfake Detection
Zhaoyang Li, Zhu Teng, Baopeng Zhang et al.
Spatial Preference Rewarding for MLLMs Spatial Understanding
Han Qiu, Peng Gao, Lewei Lu et al.
Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue
Guohao Sun, Can Qin, Yihao Feng et al.
GLVD: Guided Learned Vertex Descent
Pol Caselles Rico, Francesc Moreno-Noguer
Steering Large Language Model Activations in Sparse Spaces
Reza Bayat, Ali Rahimi-Kalahroudi, Mohammad Pezeshki et al.
Self-Evolving Critique Abilities in Large Language Models
Zhengyang Tang, Ziniu Li, Zhenyang Xiao et al.
A Framework for Double-Blind Federated Adaptation of Foundation Models
Nurbek Tastan, Karthik Nandakumar
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
Hanlin Zhu, Shibo Hao, Zhiting Hu et al.
MMOne: Representing Multiple Modalities in One Scene
Zhifeng Gu, Bing Wang
VisionMath: Vision-Form Mathematical Problem-Solving
Zongyang Ma, Yuxin Chen, Ziqi Zhang et al.
Quanta Neural Networks: From Photons to Perception
Varun Sundar, Tianyi Zhang, Sacha Jungerman et al.
OpenSubstance: A High-quality Measured Dataset of Multi-View and -Lighting Images and Shapes
Fan Pei, Jinchen Bai, Xiang Feng et al.
VGMamba: Attribute-to-Location Clue Reasoning for Quantity-Agnostic 3D Visual Grounding
Zhu Yihang, Jinhao Zhang, Yuxuan Wang et al.
Computational Budget Should Be Considered in Data Selection
Weilin Wan, Weizhong Zhang, Cheng Jin
RMultiplex200K: Toward Reliable Multimodal Process Supervision for Visual Language Models on Telecommunications
Sijia Chen, Bin Song
EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients
Meihan Wu, Tao Chang, Cui Miao et al.
Target Bias Is All You Need: Zero-Shot Debiasing of Vision-Language Models with Bias Corpus
Taeuk Jang, Hoin Jung, Xiaoqian Wang
Kernel von Mises Formula of the Influence Function
Yaroslav Mukhin
The quest for the GRAph Level autoEncoder (GRALE)
Paul Krzakala, Gabriel Melo, Charlotte Laclau et al.
Policy Gradient Methods Converge Globally in Imperfect-Information Extensive-Form Games
Fivos Kalogiannis, Gabriele Farina
Multi-Cache Enhanced Prototype Learning for Test-Time Generalization of Vision-Language Models
Xinyu Chen, Haotian Zhai, Can Zhang et al.
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization
Kesen Zhao, Beier Zhu, Qianru Sun et al.
TRNAS: A Training-Free Robust Neural Architecture Search
Yeming Yang, Qingling Zhu, Jianping Luo et al.
The Inter-Intra Modal Measure: A Predictive Lens on Fine-Tuning Outcomes in Vision-Language Models
Laura Niss, Kevin Vogt-Lowell, Theodoros Tsiligkaridis
What to Distill? Fast Knowledge Distillation with Adaptive Sampling
Byungchul Chae, Seonyeong Heo
Generative Modeling of Shape-Dependent Self-Contact Human Poses
Takehiko Ohkawa, Jihyun Lee, Shunsuke Saito et al.
Met2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems
Shaohan Li, Hao Yang, Min Chen et al.
Beyond RGB: Adaptive Parallel Processing for RAW Object Detection
Shani Gamrian, Hila Barel, Feiran Li et al.
PoseSyn: Synthesizing Diverse 3D Pose Data from In-the-Wild 2D Data
Changhee Yang, Hyeonseop Song, Seokhun Choi et al.
TorchAdapt: Towards Light-Agnostic Real-Time Visual Perception
Khurram Azeem Hashmi, Karthik Suresh, Didier Stricker et al.
Human-in-the-Loop Local Corrections of 3D Scene Layouts via Infilling
Christopher Xie, Armen Avetisyan, Henry Howard-Jenkins et al.
DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion
Qingcheng Zhao, Xiang Zhang, Haiyang Xu et al.
Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design
Yuhao Sun, Yihua Zhang, Gaowen Liu et al.
Real3D: Towards Scaling Large Reconstruction Models with Real Images
Hanwen Jiang, Qixing Huang, Georgios Pavlakos
Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels
Olaf Dünkel, Thomas Wimmer, Christian Theobalt et al.
MultiNet: Adaptive Multi-Viewed Subgraph Convolutional Networks for Graph Classification
Xinya Qin, Lu Bai, Lixin Cui et al.
Partner Modelling Emerges in Recurrent Agents (But Only When It Matters)
Ruaridh Mon-Williams, Max Taylor-Davies, Elizabeth Mieczkowski et al.
Plug-and-play Feature Causality Decomposition for Multimodal Representation Learning
Ye Liu, Zihan Ji, Hongmin Cai
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling
Yang Xiao, Jiashuo Wang, Ruifeng Yuan et al.
CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy
Dongyoung Kim, Mahmoud Afifi, Dongyun Kim et al.
Zero-shot Inexact CAD Model Alignment from a Single Image
Pattaramanee Arsomngern, Sasikarn Khwanmuang, Matthias Nießner et al.
MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling
Mahdi Karami, Ali Behrouz, Peilin Zhong et al.
Motal: Unsupervised 3D Object Detection by Modality and Task-specific Knowledge Transfer
Hai Wu, Hongwei Lin, Xusheng Guo et al.
MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation
Pingrui Zhang, Xianqiang Gao, Yuhan Wu et al.
OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection
Heng Su, Mengying Xie, Nieqing Cao et al.
Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs
Yuan He, Bailan He, Zifeng Ding et al.
X-Capture: An Open-Source Portable Device for Multi-Sensory Learning
Samuel Clarke, Suzannah Wistreich, Yanjie Ze et al.
GloPER: Unsupervised Animal Pattern Extraction from Local Reconstruction
Bowen Chen, Yun Sing Koh, Gillian Dobbie
Focal Plane Visual Feature Generation and Matching on a Pixel Processor Array
Hongyi Zhang, Laurie Bose, Jianing Chen et al.
Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation
Hongyu Wen, Yiming Zuo, Venkat Subramanian et al.
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
Dejie Yang, Zijing Zhao, Yang Liu
Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection
Jae Young Kang, Hoonhee Cho, Kuk-Jin Yoon
PlaneRAS: Learning Planar Primitives for 3D Plane Recovery
Fang Zhang, Wenzhao Zheng, Linqing Zhao et al.
Depth-Supervised Fusion Network for Seamless-Free Image Stitching
Zhiying Jiang, Ruhao Yan, Zengxi Zhang et al.
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark
Wufei Ma, Haoyu Chen, Guofeng Zhang et al.
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Xuying Zhang, Yutong Liu, Yangguang Li et al.
Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs
Xuange Zhang, Dengjie Li, Bo Liu et al.
HccePose (BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation
Yulin Wang, Mengting Hu, Hongli Li et al.
Tabula: A Tabular Self-Supervised Foundation Model for Single-Cell Transcriptomics
Jiayuan Ding, Jianhui Lin, Shiyu Jiang et al.
CamSAM2: Segment Anything Accurately in Camouflaged Videos
Yuli Zhou, Yawei Li, Yuqian Fu et al.
Improving Monte Carlo Tree Search for Symbolic Regression
Zhengyao Huang, Daniel Huang, Tiannan Xiao et al.
MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs
Erik Daxberger, Nina Wenzel, David Griffiths et al.
Understanding Flatness in Generative Models: Its Role and Benefits
Taehwan Lee, Kyeongkook Seo, Jaejun Yoo et al.
Image-Guided Shape-from-Template Using Mesh Inextensibility Constraints
Dinh-Vinh-Thuy Tran, Ruochen Chen, Shaifali Parashar
PHD: Personalized 3D Human Body Fitting with Point Diffusion
Hsuan-I Ho, Chen Guo, Po-Chen Wu et al.
ScoreHOI: Physically Plausible Reconstruction of Human-Object Interaction via Score-Guided Diffusion
Ao Li, Jinpeng Liu, Yixuan Zhu et al.
MonoSOWA: Scalable monocular 3D Object detector Without human Annotations
Jan Skvrna, Lukas Neumann
Estimating 2D Camera Motion with Hybrid Motion Basis
Haipeng Li, Tianhao Zhou, Zhanglei Yang et al.
Unified 2D-3D Discrete Priors for Noise-Robust and Calibration-Free Multiview 3D Human Pose Estimation
Geng Chen, Pengfei Ren, Xufeng Jian et al.
OceanBench: A Benchmark for Data-Driven Global Ocean Forecasting systems
Anass El Aouni, Quentin Gaudel, J. Emmanuel Johnson et al.
TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras
Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski
Separating the 'what' and 'how' of compositional computation to enable reuse and continual learning
Haozhe Shan, Sun Minni, Lea Duncker
DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding
Zixuan Liu, Siavash H. Khajavi, Guangkai Jiang
Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision
Xiao Fang, Minhyek Jeon, Zheyang Qin et al.
AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
Chenyue Li, Wen Deng, Mengqian Lu et al.
Revisiting Image Fusion for Multi-Illuminant White-Balance Correction
David Serrano, Aditya Arora, Luis Herranz et al.
Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
Jiani Liu, Zhiyuan Wang, Zeliang Zhang et al.
Uncertainty-Aware Gradient Stabilization for Small Object Detection
Huixin Sun, Yanjing Li, Linlin Yang et al.
CryoFastAR: Fast Cryo-EM Ab initio Reconstruction Made Easy
Jiakai Zhang, Shouchen Zhou, Haizhao Dai et al.
Event-guided Unified Framework for Low-light Video Enhancement, Frame Interpolation, and Deblurring
Taewoo Kim, Kuk-Jin Yoon
Spatial Alignment and Temporal Matching Adapter for Video-Radar Remote Physiological Measurement
Qian Liang, Ruixu Geng, Jinbo Chen et al.
Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Yusuke Hirota, Ryo Hachiuma, Boyi Li et al.
SEHDR: Single-Exposure HDR Novel View Synthesis via 3D Gaussian Bracketing
Yiyu Li, Haoyuan Wang, Ke Xu et al.
AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
Yu Shang, Peijie Liu, Yuwei Yan et al.
MaGS: Reconstructing and Simulating Dynamic 3D Objects with Mesh-adsorbed Gaussian Splatting
Shaojie Ma, Yawei Luo, Wei Yang et al.