Most Cited 2025 "experiment design" Papers
22,274 papers found • Page 36 of 112
Forgetting Through Transforming: Enabling Federated Unlearning via Class-Aware Representation Transformation
Qi Guo, Zhen Tian, Minghao Yao et al.
Learning Streaming Video Representation via Multitask Training
Yibin Yan, Jilan Xu, Shangzhe Di et al.
BATCLIP: Bimodal Online Test-Time Adaptation for CLIP
Sarthak Kumar Maharana, Baoming Zhang, Leonid Karlinsky et al.
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang, Sheng Miao, Bangbang Yang et al.
Semantic Causality-Aware Vision-Based 3D Occupancy Prediction
Dubing Chen, Huan Zheng, Yucheng Zhou et al.
Cross-Subject Mind Decoding from Inaccurate Representations
Yangyang Xu, Bangzhen Liu, Wenqi Shao et al.
X2-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction
Weihao Yu, Yuanhao Cai, Ruyi Zha et al.
SHeaP: Self-supervised Head Geometry Predictor Learned via 2D Gaussians
Liam Schoneveld, Zhe Chen, Davide Davoli et al.
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton, Ji Woo Hong, Chang Yoo
Preserve or Modify? Context-Aware Evaluation for Balancing Preservation and Modification in Text-Guided Image Editing
Yoonjeon Kim, Soohyun Ryu, Yeonsung Jung et al.
Large-scale Pre-training for Grounded Video Caption Generation
Evangelos Kazakos, Cordelia Schmid, Josef Sivic
TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection
Yoon Gyo Jung, Jaewoo Park, Jaeho Yoon et al.
Collaborative Instance Object Navigation: Leveraging Uncertainty-Awareness to Minimize Human-Agent Dialogues
Francesco Taioli, Edoardo Zorzi, Gianni Franchi et al.
TokensGen: Harnessing Condensed Tokens for Long Video Generation
Wenqi Ouyang, Zeqi Xiao, Danni Yang et al.
Inference-Time Reward Hacking in Large Language Models
Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling et al.
Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow
Ruyang Liu, Shangkun Sun, Haoran Tang et al.
LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling
Li Huaqiu, Yong Wang, Tongwen Huang et al.
Stable Diffusion Models are Secretly Good at Visual In-Context Learning
Trevine Oorloff, Vishwanath Sindagi, Wele Gedara Chaminda Bandara et al.
LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs
Hanyu Zhou, Gim Hee Lee
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
Guanjie Chen, Xinyu Zhao, Yucheng Zhou et al.
DMesh++: An Efficient Differentiable Mesh for Complex Shapes
Sanghyun Son, Matheus Gadelha, Yang Zhou et al.
ReasonVQA: A Multi-hop Reasoning Benchmark with Structural Knowledge for Visual Question Answering
Duong T. Tran, Trung-Kien Tran, Manfred Hauswirth et al.
Grouped Speculative Decoding for Autoregressive Image Generation
Junhyuk So, Juncheol Shin, Hyunho Kook et al.
SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs
Jiahui Wang, Zuyan Liu, Yongming Rao et al.
SCAN: Bootstrapping Contrastive Pre-training for Data Efficiency
Yangyang Guo, Mohan Kankanhalli
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
Philipp Becker, Abhinav Mehrotra, Ruchika Chavhan et al.
Dark-ISP: Enhancing RAW Image Processing for Low-Light Object Detection
Jiasheng Guo, Xin Gao, Yuxiang Yan et al.
Bridging Domain Generalization to Multimodal Domain Generalization via Unified Representations
Hai Huang, Yan Xia, Sashuai Zhou et al.
Enhancing Image Restoration Transformer via Adaptive Translation Equivariance
JiaKui Hu, Zhengjian Yao, Lujia Jin et al.
Reanimating Images using Neural Representations of Dynamic Stimuli
Jacob Yeung, Andrew Luo, Gabriel Sarch et al.
Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning
Yafei Zhang, Lingqi Kong, Huafeng Li et al.
Generalizable Object Re-Identification via Visual In-Context Prompting
Zhizhong Huang, Xiaoming Liu
Task Vector Quantization for Memory-Efficient Model Merging
Youngeun Kim, Seunghwan Lee, Aecheon Jung et al.
Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation
Zhenjun Yu, Wenqiang Xu, Pengfei Xie et al.
PGC: Physics-Based Gaussian Cloth from a Single Pose
Michelle Guo, Matt Jen-Yuan Chiang, Igor Santesteban et al.
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
Congyi Fan, Jian Guan, Xuanjia Zhao et al.
Progressive Test Time Energy Adaptation for Medical Image Segmentation
Xiaoran Zhang, Byung-Woo Hong, Hyoungseob Park et al.
On Large Multimodal Models as Open-World Image Classifiers
Alessandro Conti, Massimiliano Mancini, Enrico Fini et al.
Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle
Miroslav Purkrabek, Jiri Matas
FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation
Yunpeng Bai, Qixing Huang
Breaking the Encoder Barrier for Seamless Video-Language Understanding
Handong Li, Yiyuan Zhang, Longteng Guo et al.
Pinco: Position-induced Consistent Adapter for Diffusion Transformer in Foreground-conditioned Inpainting
Guangben Lu, Yuzhen, Zhimin Sun et al.
Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment
Xudong Li, Wenjie Nie, Yan Zhang et al.
D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition
Haoran Wang, Xinji Mai, Zeng Tao et al.
Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing
Davin Choo XianJun, Yuqi Pan, Tonghan Wang et al.
PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
Atharva Gundawar, Som Sagar, Ransalu Senanayake
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.
Improve Representation for Imbalanced Regression through Geometric Constraints
Zijian Dong, Yilei Wu, Chongyao Chen et al.
Spatial-Temporal Aware Visuomotor Diffusion Policy Learning
Zhenyang Liu, Yikai Wang, Kuanning Wang et al.
Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees
Sangwoo Park, Matteo Zecchin, Osvaldo Simeone
Articulated Kinematics Distillation from Video Diffusion Models
Xuan Li, Qianli Ma, Tsung-Yi Lin et al.
Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
Ting Han, Linara Adilova, Henning Petzka et al.
How Can Objects Help Video-Language Understanding?
Zitian Tang, Shijie Wang, Junho Cho et al.
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
Licong Lin, Song Mei
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Jungsoo Lee, Debasmit Das, Munawar Hayat et al.
Efficient Video Super-Resolution for Real-time Rendering with Decoupled G-buffer Guidance
Mingjun Zheng, Long Sun, Jiangxin Dong et al.
Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
Songsong Yu, Yuxin Chen, Zhongang Qi et al.
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Junyu Xie, Tengda Han, Max Bain et al.
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation
Jiaben Chen, Zixin Wang, Ailing Zeng et al.
FSNet: Feasibility-Seeking Neural Network for Constrained Optimization with Guarantees
Hoang Nguyen, Priya Donti
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers
Andrew Nam, Henry Conklin, Yukang Yang et al.
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
Zhengrui Ma, Yang Feng, Chenze Shao et al.
Face Forgery Video Detection via Temporal Forgery Cue Unraveling
Zonghui Guo, YingJie Liu, Jie Zhang et al.
MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation
Aviral Chharia, Wenbo Gou, Haoye Dong
FedSPA: Generalizable Federated Graph Learning under Homophily Heterogeneity
Zihan Tan, Guancheng Wan, Wenke Huang et al.
Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems
Minchan Jeong, Jongha (Jon) Ryu, Se-Young Yun et al.
AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise
Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.
CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
Hyungyung Lee, Geon Choi, Jung-Oh Lee et al.
FiRe: Fixed-points of Restoration Priors for Solving Inverse Problems
Matthieu Terris, Ulugbek Kamilov, Thomas Moreau
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play
Ran Xu, Yuchen Zhuang, Zihan Dong et al.
Detecting Adversarial Data Using Perturbation Forgery
Qian Wang, Chen Li, Yuchen Luo et al.
Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning
Yiju Guo, Wenkai Yang, Zexu Sun et al.
SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model
Chongkai Yu, Ting Liu, Li Anqi et al.
Balanced Rate-Distortion Optimization in Learned Image Compression
Yichi Zhang, Zhihao Duan, Yuning Huang et al.
Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer
Ziyi Liu, Yangcen Liu
PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds
Barza Nisar, Steven L. Waslander
ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis
Yun Chang, Leonor Fermoselle, Duy Ta et al.
Enhancing Diversity for Data-free Quantization
Kai Zhao, Zhihao Zhuang, Miao Zhang et al.
FIction: 4D Future Interaction Prediction from Video
Kumar Ashutosh, Georgios Pavlakos, Kristen Grauman
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.
Monitoring Risks in Test-Time Adaptation
Mona Schirmer, Metod Jazbec, Christian Andersson Naesseth et al.
Efficient Adaptive Federated Optimization
Su Hyeong Lee, Sidharth Sharma, Manzil Zaheer et al.
scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration
Jianle Sun, Chaoqi Liang, Ran Wei et al.
Provably Efficient Online RLHF with One-Pass Reward Modeling
Long-Fei Li, Yu-Yang Qian, Peng Zhao et al.
CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation
Xiao Lin, Yun Peng, Liuyi Wang et al.
Person De-reidentification: A Variation-guided Identity Shift Modeling
Yi-Xing Peng, Yu-Ming Tang, Kun-Yu Lin et al.
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.
Transformers for Mixed-type Event Sequences
Felix Draxler, Yang Meng, Kai Nelson et al.
Flexible Group Count Enables Hassle-Free Structured Pruning
Jiamu Zhang, Shaochen Zhong, Andrew Ye et al.
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
Zhizhen Zhang, Lei Zhu, Zhen Fang et al.
Representation Consistency for Accurate and Coherent LLM Answer Aggregation
Junqi Jiang, Tom Bewley, Salim I. Amoukou et al.
Exploring the Translation Mechanism of Large Language Models
Hongbin Zhang, Kehai Chen, Xuefeng Bai et al.
PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
Xinzhe Zheng, Hao Du, Fanding Xu et al.
ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction
Han Yu, Kehan Li, Dongbai Li et al.
MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective
Weitian Wang, Shubham Rai, Cecilia De la Parra et al.
Describe, Don’t Dictate: Semantic Image Editing with Natural Language Intent
En Ci, Shanyan Guan, Yanhao Ge et al.
SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications
Yana Hasson, Pauline Luc, Liliane Momeni et al.
Bring Your Rear Cameras for Egocentric 3D Human Pose Estimation
Hiroyasu Akada, Jian Wang, Vladislav Golyanik et al.
FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning
Qian Feng, Jiahang Tu, Mintong Kang et al.
IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features
Anand Kumar, Jiteng Mu, Nuno Vasconcelos
Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework
Mustafa Hajij, Lennart Bastian, Sarah Osentoski et al.
Where the Devil Hides: Deepfake Detectors Can No Longer Be Trusted
Shuaiwei Yuan, Junyu Dong, Yuezun Li
Self-Evolving Visual Concept Library using Vision-Language Critics
Atharva Sehgal, Patrick Yuan, Ziniu Hu et al.
Risk-aware Direct Preference Optimization under Nested Risk Measure
Lijun Zhang, Lin Li, Yajie Qi et al.
GRIP: A Graph-Based Reasoning Instruction Producer
Jiankang Wang, Jianjun Xu, Xiaorui Wang et al.
Bridging Theory and Practice in Link Representation with Graph Neural Networks
Veronica Lachi, Francesco Ferrini, Antonio Longa et al.
Learn2Mix: Training Neural Networks Using Adaptive Data Integration
Shyam Venkatasubramanian, Vahid Tarokh
Diff-Palm: Realistic Palmprint Generation with Polynomial Creases and Intra-Class Variation Controllable Diffusion Models
Jianlong Jin, Chenglong Zhao, Ruixin Zhang et al.
The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
Kaito Takanami, Takashi Takahashi, Ayaka Sakata
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization
Xavier Thomas, Deepti Ghadiyaram
A Theory for Worst-Case vs. Average-Case Guarantees for LLMs
Noga Amit, Shafi Goldwasser, Orr Paradise et al.
PIDSR: Complementary Polarized Image Demosaicing and Super-Resolution
Shuangfan Zhou, Chu Zhou, Youwei Lyu et al.
Reward Reasoning Models
Jiaxin Guo, Zewen Chi, Li Dong et al.
Object-level Correlation for Few-Shot Segmentation
Chunlin Wen, Yu Zhang, Jie Fan et al.
Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable
Bicheng Ying, Zhe Li, Haibo Yang
Prior2Former - Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation
Sebastian Schmidt, Julius Koerner, Dominik Fuchsgruber et al.
Detecting Generated Images by Fitting Natural Image Distributions
Yonggang Zhang, Jun Nie, Xinmei Tian et al.
Mixture-of-Experts Meets In-Context Reinforcement Learning
Wenhao Wu, Fuhong Liu, Haoru Li et al.
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
Qianli Ma, Xuefei Ning, Dongrui Liu et al.
Tail-Optimized Caching for LLM Inference
Wenxin Zhang, Yueying Li, Ciamac C Moallemi et al.
NOBLE - Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models
Luca Ghafourpour, Valentin Duruisseaux, Bahareh Tolooshams et al.
Learning Dynamic Collaborative Network for Semi-supervised 3D Vessel Segmentation
Jiao Xu, Xin Chen, Lihe Zhang
Insightful Instance Features for 3D Instance Segmentation
Wonseok Roh, Hwanhee Jung, Giljoo Nam et al.
An Analysis of Concept Bottleneck Models: Measuring, Understanding, and Mitigating the Impact of Noisy Annotations
Seonghwan Park, Jueun Mun, Donghyun Oh et al.
Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
Zhixin Xie, Xurui Song, Jun Luo
Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design
Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour
Scaling Diffusion Transformers Efficiently via $\mu$P
Chenyu Zheng, Xinyu Zhang, Rongzhen Wang et al.
Feedback-Aware MCTS for Goal-Oriented Information Seeking
Harshita Chopra, Chirag Shah
Stylized-Face: A Million-level Stylized Face Dataset for Face Recognition
Zhengyuan Peng, Jianqing Xu, Yuge Huang et al.
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.
AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners
Reiss Koh, Wonbeen Oh, Jaein Jang et al.
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
Tai Nguyen, Aref Azizpour, Matthew Stamm
Influence Guided Context Selection for Effective Retrieval-Augmented Generation
Jiale Deng, Yanyan Shen, Ziyuan Pei et al.
Contrastive Self-Supervised Learning As Neural Manifold Packing
Guanming Zhang, David Heeger, Stefano Martiniani
CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis
Florian Barthel, Wieland Morgenstern, Paul Hinzer et al.
Robustness in Both Domains: CLIP Needs a Robust Text Encoder
Elias Abad Rocamora, Christian Schlarmann, Naman Deep Singh et al.
CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation
Yuxing Long, Jiyao Zhang, Mingjie Pan et al.
Quantum Doubly Stochastic Transformers
Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.
SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting
Shengjie Lin, Jiading Fang, Muhammad Zubair Irshad et al.
GradMetaNet: An Equivariant Architecture for Learning on Gradients
Yoav Gelberg, Yam Eitan, Aviv Navon et al.
Learn2Synth: Learning Optimal Data Synthesis Using Hypergradients for Brain Image Segmentation
Xiaoling Hu, Xiangrui Zeng, Oula Puonti et al.
EAP-GS: Efficient Augmentation of Pointcloud for 3D Gaussian Splatting in Few-shot Scene Reconstruction
Dongrui Dai, Yuxiang Xing
SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem
Ahmed Heakl, Yahia Salaheldin Shaaban, Salem Lahlou et al.
Scale Your Instructions: Enhance the Instruction-Following Fidelity of Unified Image Generation Model by Self-Adaptive Attention Scaling
Chao Zhou, Tianyi Wei, Nenghai Yu
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
Bingchen Zhao, Despoina Magka, Minqi Jiang et al.
PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors
Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.
Second-Order Convergence in Private Stochastic Non-Convex Optimization
Youming Tao, Zuyuan Zhang, Dongxiao Yu et al.
COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation
Sanghyun Jo, Seo Lee, Seungwoo Lee et al.
DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering
Rongjia Zheng, Qing Zhang, Chengjiang Long et al.
Language Modeling by Language Models
Junyan Cheng, Peter Clark, Kyle Richardson
Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control
Danfeng Li, Hui Zhang, Sheng Wang et al.
ETA: Energy-based Test-time Adaptation for Depth Completion
Younjoon Chung, Hyoungseob Park, Patrick Rim et al.
SDMatte: Grafting Diffusion Models for Interactive Matting
Longfei Huang, Yu Liang, Hao Zhang et al.
FairImagen: Post-Processing for Bias Mitigation in Text-to-Image Models
Zihao Fu, Ryan Brown, Shun Shao et al.
DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image
Jijun Xiang, Xuan Zhu, Xianqi Wang et al.
Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection
Yun Zhu, Le Hui, Hang Yang et al.
Temporal Logic-Based Multi-Vehicle Backdoor Attacks against Offline RL Agents in End-to-end Autonomous Driving
Xuan Chen, Shiwei Feng, Zikang Xiong et al.
FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering
Jingqiu Zhou, Lue Fan, Linjiang Huang et al.
Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion
Vinh Tong, Trung-Dung Hoang, Anji Liu et al.
Generative Map Priors for Collaborative BEV Semantic Segmentation
Jiahui Fu, Yue Gong, Luting Wang et al.
DeepCompress-ViT: Rethinking Model Compression to Enhance Efficiency of Vision Transformers at the Edge
Sabbir Ahmed, Abdullah Al Arafat, Deniz Najafi et al.
Coherent 3D Portrait Video Reconstruction via Triplane Fusion
Shengze Wang, Xueting Li, Chao Liu et al.
TexGarment: Consistent Garment UV Texture Generation via Efficient 3D Structure-Guided Diffusion Transformer
Jialun Liu, Jinbo Wu, Xiaobo Gao et al.
Cross-Architecture Distillation Made Simple with Redundancy Suppression
Weijia Zhang, Yuehao Liu, Wu Ran et al.
RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills
Chunru Lin, Haotian Yuan, Yian Wang et al.
Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos
Junyi Wu, Jiachen Tao, Haoxuan Wang et al.
Multivariate Latent Recalibration for Conditional Normalizing Flows
Victor Dheur, Souhaib Ben Taieb
BLADE: Single-view Body Mesh Estimation through Accurate Depth Estimation
Shengze Wang, Jiefeng Li, Tianye Li et al.
AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy
Sebastian Joseph, Syed M. Husain, Stella Offner et al.
A Circular Argument: Does RoPE need to be Equivariant for Vision?
Chase van de Geijn, Timo Lüddecke, Polina Turishcheva et al.
MAP Estimation with Denoisers: Convergence Rates and Guarantees
Scott Pesme, Giacomo Meanti, Michael Arbel et al.
Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens
Zijian Dong, Ruilin Li, Joanna Chong et al.
LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents
Rui Li, Zixuan Hu, Wenxi Qu et al.
Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis
Byung Hyun Lee, Wongi Jeong, Woojae Han et al.
Guard Me If You Know Me: Protecting Specific Face-Identity from Deepfakes
Kaiqing Lin, Zhiyuan Yan, Ke-Yue Zhang et al.
Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion
Enyu Liu, En Yu, Sijia Chen et al.
Composition and Alignment of Diffusion Models using Constrained Learning
Shervin Khalafi, Ignacio Hounie, Dongsheng Ding et al.
C-SEO Bench: Does Conversational SEO Work?
Haritz Puerto, Martin Gubri, Tommaso Green et al.
Robust Transfer Learning with Unreliable Source Data
Jianqing Fan, Cheng Gao, Jason Klusowski
Dual-Granularity Semantic Guided Sparse Routing Diffusion Model for General Pansharpening
Yinghui Xing, Qu Li Tao, Shizhou Zhang et al.
DH-Set: Improving Vision-Language Alignment with Diverse and Hybrid Set-Embeddings Learning
Kun Zhang, Jingyu Li, Zhe Li et al.
Conformal Arbitrage: Risk-Controlled Balancing of Competing Objectives in Language Models
William Overman, Mohsen Bayati
See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
Yuan Wu, Zhiqiang Yan, Yigong Zhang et al.
Adaptive Inference-Time Scaling via Cyclic Diffusion Search
Gyubin Lee, Bao Truong, Jaesik Yoon et al.
Robust Multi-Object 4D Generation for In-the-wild Videos
Wen-Hsuan Chu, Lei Ke, Jianmeng Liu et al.
AdsQA: Towards Advertisement Video Understanding
Xinwei Long, Kai Tian, Peng Xu et al.
Boosting Vision Semantic Density with Anatomy Normality Modeling for Medical Vision-language Pre-training
Weiwei Cao, Jianpeng Zhang, Zhongyi Shui et al.
Measuring Fingerprints of Web-filtered Text Datasets and Fingerprint Propagation Through Training
Youssef Mansour, Reinhard Heckel
FLAVC: Learned Video Compression with Feature Level Attention
Chun Zhang, Heming Sun, Jiro Katto
Multimodal Prompt Alignment for Facial Expression Recognition
Fuyan Ma, Yiran He, Bin Sun et al.
Stable Score Distillation
Haiming Zhu, Yangyang Xu, Chenshu Xu et al.
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model
Weilin Lin, Nanjun Zhou, Yanyun Wang et al.
Joint Asymmetric Loss for Learning with Noisy Labels
Jialiang Wang, Xianming Liu, Xiong Zhou et al.
ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive
Xinhao Luo, Zihan Liu, Yangjie Zhou et al.
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
From Black-box to Causal-box: Towards Building More Interpretable Models
Inwoo Hwang, Yushu Pan, Elias Bareinboim
Certifying Stability of Reinforcement Learning Policies using Generalized Lyapunov Functions
Kehan Long, Jorge Cortes, Nikolay Atanasov
Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation
Tanner Schmidt, Richard Newcombe
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation
Wenyao Zhang, Hongsi Liu, Bohan Li et al.
GLEAM: Enhanced Transferable Adversarial Attacks for Vision-Language Pre-training Models via Global-Local Transformations
Yunqi Liu, Xiaohui Cui, Ouyang Xue