Most Cited 2025 "squared error loss" Papers
DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup
Zhen Qu, Xian Tao, Xinyi Gong et al.
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling
Yingyue Li, Bencheng Liao, Wenyu Liu et al.
LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion
Yisu Zhang, Chenjie Cao, Chaohui Yu et al.
InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes
Zesong Yang, Bangbang Yang, Wenqi Dong et al.
JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models
Xiaolong Jin, Zixuan Weng, Hanxi Guo et al.
Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation
Seogkyu Jeon, Kibeom Hong, Hyeran Byun
Semantic versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification
Yuan Tian, Shuo Wang, Rongzhao Zhang et al.
Supercharging Floorplan Localization with Semantic Rays
Yuval Grader, Hadar Averbuch-Elor
G2SF: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection
Chengyu Tao, Xuanming Cao, Juan Du
One Polyp Identifies All: One-Shot Polyp Segmentation with SAM via Cascaded Priors and Iterative Prompt Evolution
Xinyu Mao, Xiaohan Xing, Fei Meng et al.
You Are Your Own Best Teacher: Achieving Centralized-level Performance in Federated Learning under Heterogeneous and Long-tailed Data
Shanshan Yan, Zexi Li, Chao Wu et al.
GT-Mean Loss: A Simple Yet Effective Solution for Brightness Mismatch in Low-Light Image Enhancement
Jingxi Liao, Shijie Hao, Richang Hong et al.
Learn2Synth: Learning Optimal Data Synthesis Using Hypergradients for Brain Image Segmentation
Xiaoling Hu, Xiangrui Zeng, Oula Puonti et al.
COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation
Sanghyun Jo, Seo Lee, Seungwoo Lee et al.
Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model
Chengxu Liu, Lu Qi, Jinshan Pan et al.
Identity-aware Language Gaussian Splatting for Open-vocabulary 3D Semantic Segmentation
SungMin Jang, Wonjun Kim
Structure-aware Semantic Discrepancy and Consistency for 3D Medical Image Self-supervised Learning
Tan Pan, Zhaorui Tan, Kaiyu Guo et al.
DASH: 4D Hash Encoding with Self-Supervised Decomposition for Real-Time Dynamic Scene Rendering
Jie Chen, Zhangchi Hu, Peixi Wu et al.
EVT: Efficient View Transformation for Multi-Modal 3D Object Detection
Yongjin Lee, Hyeon-Mun Jeong, Yurim Jeon et al.
An Inversion-based Measure of Memorization for Diffusion Models
Zhe Ma, Qingming Li, Xuhong Zhang et al.
Beyond Simple Edits: Composed Video Retrieval with Dense Modifications
Omkar Thawakar, Dmitry Demidov, Ritesh Thawkar et al.
Balanced Sharpness-Aware Minimization for Imbalanced Regression
Yahao Liu, Qin Wang, Lixin Duan et al.
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang, Munan Ning, Zheyuan Liu et al.
FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation
Wenzhuang Wang, Yifan Zhao, Mingcan Ma et al.
GLEAM: Enhanced Transferable Adversarial Attacks for Vision-Language Pre-training Models via Global-Local Transformations
Yunqi Liu, Xiaohui Cui, Ouyang Xue
Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation
Yong Liu, Song-Li Wu, Sule Bai et al.
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding
Jiahe Zhao, Ruibing Hou, Zejie Tian et al.
Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining
Qi Fan, Kaiqi Liu, Nian Liu et al.
DOGR: Towards Versatile Visual Document Grounding and Referring
Yinan Zhou, Yuxin Chen, Haokun Lin et al.
ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition
Sanjoy Kundu, Shanmukha Vellamcheti, Sathyanarayanan Aakur
Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion
Enyu Liu, En Yu, Sijia Chen et al.
Adaptive Routing of Text-to-Image Generation Requests Between Large Cloud Model and Light-Weight Edge Model
Zewei Xin, Qinya Li, Chaoyue Niu et al.
AnyPortal: Zero-Shot Consistent Video Background Replacement
Wenshuo Gao, Xicheng Lan, Shuai Yang
LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation
Min Wu Jeong, Chae Eun Rhee
SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications
Yana Hasson, Pauline Luc, Liliane Momeni et al.
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.
Counting Stacked Objects
Corentin Dumery, Noa Ette, Aoxiang Fan et al.
Bi-Level Optimization for Self-Supervised AI-Generated Face Detection
Mian Zou, Nan Zhong, Baosheng Yu et al.
DCT-Shield: A Robust Frequency Domain Defense against Malicious Image Editing
Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal et al.
MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances
Yunzhe Shao, Xinyu Yi, Lu Yin et al.
Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models
Young Kyun Jang, Ser-Nam Lim
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing
Achint Soni, Meet Soni, Sirisha Rambhatla
From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning
Yuhui Zeng, Haoxiang Wu, Wenjie Nie et al.
CarGait: Cross-Attention based Re-ranking for Gait recognition
Gavriel Habib, Noa Barzilay, Or Shimshi et al.
PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior
Seunggwan Lee, Hwanhee Jung, ByoungSoo Koh et al.
Preacher: Paper-to-Video Agentic System
Jingwei Liu, Ling Yang, Hao Luo et al.
MR-FIQA: Face Image Quality Assessment with Multi-Reference Representations from Synthetic Data Generation
Fu-Zhao Ou, Chongyi Li, Shiqi Wang et al.
Gait-X: Exploring X modality for Generalized Gait Recognition
Zengbin Wang, Saihui Hou, Junjie Li et al.
Adaptive Articulated Object Manipulation On The Fly with Foundation Model Reasoning and Part Grounding
Xiaojie Zhang, Yuanfei Wang, Ruihai Wu et al.
Do Your Best and Get Enough Rest for Continual Learning
Hankyul Kang, Gregor Seifer, Donghyun Lee et al.
Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling
Hayeon Kim, Ji Ha Jang, Se Young Chun
VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data
Jian Shi, Peter Wonka
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
Huy Ta, Duy Anh Huynh, Yutong Xie et al.
LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models
Mert Sonmezer, Matthew Zheng, Pinar Yanardag
Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation
Guanyi Qin, Ziyue Wang, Daiyun Shen et al.
Autoregressive Denoising Score Matching is a Good Video Anomaly Detector
Hanwen Zhang, Congqi Cao, Qinyi Lv et al.
SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
Bo Zhao, Haoran Wang, Jinghui Wang et al.
RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS
Chuanyu Fu, Yuqi Zhang, Kunbin Yao et al.
Denoising Token Prediction in Masked Autoregressive Models
Ting Yao, Yehao Li, Yingwei Pan et al.
Consistency Trajectory Matching for One-Step Generative Super-Resolution
Weiyi You, Mingyang Zhang, Leheng Zhang et al.
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang, Sergio Sancho, Jingwei Tang et al.
Supercharged One-step Text-to-Image Diffusion Models with Negative Prompts
Viet Nguyen, Anh Nguyen, Trung Dao et al.
CVPT: Cross Visual Prompt Tuning
Lingyun Huang, Jianxu Mao, Junfei Yi et al.
Addressing Text Embedding Leakage in Diffusion-based Image Editing
Sunung Mun, Jinhwan Nam, Sunghyun Cho et al.
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
Tao Han, Wanghan Xu, Junchao Gong et al.
Efficient Multi-Person Motion Prediction by Lightweight Spatial and Temporal Interactions
Yuanhong Zheng, Ruixuan Yu, Jian Sun
FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models
Mainak Singha, Subhankar Roy, Sarthak Mehrotra et al.
CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models
Junho Kim, Hyungjin Chung, Byung-Hoon Kim
Subjective Camera 1.0: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion
Haoyang Chen, Dongfang Sun, Caoyuan Ma et al.
Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions
Liang Xu, Chengqun Yang, Zili Lin et al.
ViLU: Learning Vision-Language Uncertainties for Failure Prediction
Marc Lafon, Yannis Karmim, Julio Silva-Rodríguez et al.
High-Fidelity Lightweight Mesh Reconstruction from Point Clouds
Chen Zhang, Wentao Wang, Ximeng Li et al.
HouseTour: A Virtual Real Estate A(I)gent
Ata Çelen, Iro Armeni, Daniel Barath et al.
Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection
Yehao Lu, Minghe Weng, Zekang Xiao et al.
SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting
Zihui Gao, Jia-Wang Bian, Guosheng Lin et al.
Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces
Aniruddha Mahapatra, Long Mai, David Bourgin et al.
Enhancing Transformers Through Conditioned Embedded Tokens
Hemanth Saratchandran, Simon Lucey
Joint Asymmetric Loss for Learning with Noisy Labels
Jialiang Wang, Xianming Liu, Xiong Zhou et al.
Trade-offs in Image Generation: How Do Different Dimensions Interact?
Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al.
LOTA: Bit-Planes Guided AI-Generated Image Detection
Renxi Cheng, Hongsong Wang, Yang Zhang et al.
Minimal Variance Model Aggregation: A principled, non-intrusive, and versatile integration of black box models
Theo Bourdais, Houman Owhadi
DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering
Rongjia Zheng, Qing Zhang, Chengjiang Long et al.
Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models
Hyungjin Kim, Seokho Ahn, Young-Duk Seo
Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing
Chen Liao, Yan Shen, Dan Li et al.
Stylized-Face: A Million-level Stylized Face Dataset for Face Recognition
Zhengyuan Peng, Jianqing Xu, Yuge Huang et al.
MoSiC: Optimal-Transport Motion Trajectory for Dense Self-Supervised Learning
Mohammadreza Salehi, Shashanka Venkataramanan, Ioana Simion et al.
Holistic Tokenizer for Autoregressive Image Generation
Anlin Zheng, Haochen Wang, Yucheng Zhao et al.
PLA: Prompt Learning Attack against Text-to-Image Generative Models
Xinqi Lyu, Yihao Liu, Yanjie Li et al.
MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective
Weitian Wang, Shubham Rai, Cecilia De la Parra et al.
CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation
Xiao Lin, Yun Peng, Liuyi Wang et al.
LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering
Xiaohang Zhan, Dingming Liu
Generalized Few-Shot Point Cloud Segmentation via LLM-Assisted Hyper-Relation Matching
Zhaoyang Li, Yuan Wang, Guoxin Xiong et al.
MIRACLE 3D: Memory-efficient Integrated Robust Approach for Continual Learning on 3D Point Clouds via Shape Model Construction
Hossein Resani, Behrooz Nasihatkon
Monocular Facial Appearance Capture in the Wild
Yingyan Xu, Kate Gadola, Prashanth Chandran et al.
AMD: Adaptive Momentum and Decoupled Contrastive Learning Framework for Robust Long-Tail Trajectory Prediction
Bin Rao, Haicheng Liao, Yanchen Guan et al.
Deep Incomplete Multi-view Clustering with Distribution Dual-Consistency Recovery Guidance
Jiaqi Jin, Siwei Wang, Zhibin Dong et al.
Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval
Ziwei Wang, Sameera Ramasinghe, Chenchen Xu et al.
D2ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei, Qizhong Tan, Guangming Lu et al.
FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy
Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.
DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models
Revant Teotia, Candace Ross, Karen Ullrich et al.
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
Yang Jingyi, Xun Lin, Zitong Yu et al.
Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization
Bingqing Zhang, Zhuo Cao, Heming Du et al.
Timestep-Aware Diffusion Model for Extreme Image Rescaling
Ce Wang, Zhenyu Hu, Wanjie Sun et al.
Adversarial Attention Perturbations for Large Object Detection Transformers
Zachary Yahn, Selim Tekin, Fatih Ilhan et al.
HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image
Junyi Guo, Jingxuan Zhang, Fangyu Wu et al.
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi, Davide Bucciarelli, Federico Betti et al.
DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data
Junjie Wu, Jiangtao Xie, Zhaolin Zhang et al.
Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation
Siyu Chen, Ting Han, Changshe Zhang et al.
Aligning Constraint Generation with Design Intent in Parametric CAD
Evan Casey, Tianyu Zhang, Shu Ishida et al.
CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting
Lei Tian, Xiaomin Li, Liqian Ma et al.
SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting
Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos et al.
MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion
Fei Peng, Junqiang Wu, Yan Li et al.
StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance
Jaeseok Jeong, Junho Kim, Youngjung Uh et al.
Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data
Zeyi Sun, Tong Wu, Pan Zhang et al.
MoGA: 3D Generative Avatar Prior for Monocular Gaussian Avatar Reconstruction
Zijian Dong, Longteng Duan, Jie Song et al.
MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation
Yanchen Liu, Yanan Sun, Zhening Xing et al.
GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding
Zijun Lin, Shuting He, Cheston Tan et al.
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation
Shiqi Huang, Shuting He, Huaiyuan Qin et al.
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions
Bimsara Pathiraja, Maitreya Patel, Shivam Singh et al.
InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild
Yiyi Ma, Yuanzhi Liang, Xiu Li et al.
SPADE: Spatial-Aware Denoising Network for Open-vocabulary Panoptic Scene Graph Generation with Long- and Local-range Context Reasoning
Xin Hu, Ke Qin, Guiduo Duan et al.
Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance
Shuchao Pang, Zhenghan Chen, Shen Zhang et al.
DepthSync: Diffusion Guidance-Based Depth Synchronization for Scale- and Geometry-Consistent Video Depth Estimation
Yue-Jiang Dong, Wang Zhao, Jiale Xu et al.
PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions
Mahesh Bhosale, Abdul Wasi, Yuanhao Zhai et al.
SDMatte: Grafting Diffusion Models for Interactive Matting
Longfei Huang, Yu Liang, Hao Zhang et al.
Retinex-MEF: Retinex-based Glare Effects Aware Unsupervised Multi-Exposure Image Fusion
Haowen Bai, Jiangshe Zhang, Zixiang Zhao et al.
When Schrödinger Bridge Meets Real-World Image Dehazing with Unpaired Training
Yunwei Lan, Zhigao Cui, Xin Luo et al.
Hawaii: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models
Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards et al.
When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product
Youqi Wu, Jingwei Zhang, Farzan Farnia
Limitations of Normalization in Attention
Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.
MatchDiffusion: Training-free Generation of Match-Cuts
Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.
Information Theoretic Learning for Diffusion Models with Warm Start
Yirong Shen, Lu Gan, Cong Ling
Resource-Constrained Federated Continual Learning: What Does Matter?
Yichen Li, Yuying Wang, Jiahua Dong et al.
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
Hanyue Lou, Jinxiu Liang, Minggui Teng et al.
IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features
Anand Kumar, Jiteng Mu, Nuno Vasconcelos
AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models
Xinyi Wang, Xun Yang, Yanlong Xu et al.
Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples
Suqin Yuan, Lei Feng, Bo Han et al.
NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding
Wei Xu, Cheng Wang, Dingkang Liang et al.
Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers
Nima Hosseini Dashtbayaz, Hesam Salehipour, Adrian Butscher et al.
LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables
Xunpeng Yi, Yibing Zhang, Xinyu Xiang et al.
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.
Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO
Kaiyang Guo, Yinchuan Li, Zhitang Chen
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning
Zongqian Li, Yixuan Su, Nigel Collier
Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
Teodora Reu, Sixtine Dromigny, Michael Bronstein et al.
Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool
Jiangtong Li, Dongyi Liu, Kun Zhu et al.
SPARKE: Scalable Prompt-Aware Diversity and Novelty Guidance in Diffusion Models via RKE Score
Mohammad Jalali, Haoyu Lei, Amin Gohari et al.
Reward-Aware Proto-Representations in Reinforcement Learning
Hon Tik Tse, Siddarth Chandrasekar, Marlos C. Machado
Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling
Li Xiaojie, Ronghui Li, Shukai Fang et al.
TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence
Feng Jiang, Mangal Prakash, Hehuan Ma et al.
Towards Self-Refinement of Vision-Language Models with Triangular Consistency
Yunlong Deng, Guangyi Chen, Tianpei Gu et al.
A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
Gal Fadlon, Idan Arbiv, Nimrod Berman et al.
Towards Explicit Exoskeleton for the Reconstruction of Complicated 3D Human Avatars
Yifan Zhan, Qingtian Zhu, Muyao Niu et al.
On-Device Diffusion Transformer Policy for Efficient Robot Manipulation
Yiming Wu, Huan Wang, Zhenghao Chen et al.
Metric Convolutions: A Unifying Theory to Adaptive Image Convolutions
Thomas Dagès, Michael Lindenbaum, Alfred Bruckstein
A duality framework for analyzing random feature and two-layer neural networks
Hongrui Chen, Jihao Long, Lei Wu
Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling
Alberto González-Sanz, François Bachoc, Jean-Michel Loubes et al.
AI Testing Should Account for Sophisticated Strategic Behaviour
Vojta Kovarik, Eric Chen, Sami Petersen et al.
Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor
Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn et al.
VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction
Martin de La Gorce, Charlie Hewitt, Tibor Takács et al.
OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models
Ziheng Cheng, Yixiao Huang, Hui Xu et al.
Struct-Bench: A Benchmark for Differentially Private Structured Text Generation
Shuaiqi Wang, Vikas Raunak, Arturs Backurs et al.
Factorio Learning Environment
Jack Hopkins, Mart Bakler, Akbir Khan
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Ashwinee Panda, Vatsal Baherwani, Zain Sarwar et al.
Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data
Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich
Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition
Pulkit Kumar, Shuaiyi Huang, Matthew Walmer et al.
DisenQ: Disentangling Q-Former for Activity-Biometrics
Shehreen Azad, Yogesh Rawat
QUT-DV25: A Dataset for Dynamic Analysis of Next-Gen Software Supply Chain Attacks
Sk Tanzir Mehedi, Raja Jurdak, Chadni Islam et al.
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval
Zengrong Lin, Zheng Wang, Tianwen Qian et al.
SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing
Heyi Sun, Cong Wang, Tian-Xing Xu et al.
ChartCap: Mitigating Hallucination of Dense Chart Captioning
Junyoung Lim, Jaewoo Ahn, Gunhee Kim
AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
Aruna Gauba, Irene Pi, Yunze Man et al.
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
Hang Hua, Ziyun Zeng, Yizhi Song et al.
Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
Samuel (Min-Hsuan) Yeh, Sharon Li
Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
Tal Barami, Nimrod Berman, Ilan Naiman et al.
CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning
Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research
A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen et al.
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
Tianhao Peng, Haochen Wang, Yuanxing Zhang et al.
IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A
Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.
Blind2Sound: Self-Supervised Image Denoising without Residual Noise
Jiazheng Liu, Zejin Wang, Bohao Chen et al.
Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts
Andrea Pugnana, Riccardo Massidda, Francesco Giannini et al.
EngiBench: A Framework for Data-Driven Engineering Design Research
Florian Felten, Gabriel Apaza, Gerhard Bräunlich et al.
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images
Boyang Deng, Kyle Genova, Songyou Peng et al.
MTBBench: A Multimodal Sequential Clinical Decision-Making Benchmark in Oncology
Kiril Vasilev, Alexandre Misrahi, Eeshaan Jain et al.
BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks
Anna Sokol, Elizabeth Daly, Michael Hind et al.
Bridging the Skeleton-Text Modality Gap: Diffusion-Powered Modality Alignment for Zero-shot Skeleton-based Action Recognition
Jeonghyeok Do, Munchurl Kim
PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image
Geonhee Sim, Gyeongsik Moon
LIFEBENCH: Evaluating Length Instruction Following in Large Language Models
Wei Zhang, Zhenhong Zhou, Kun Wang et al.
ExAct: A Video-Language Benchmark for Expert Action Analysis
Han Yi, Yulu Pan, Feihong He et al.
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model
Weilin Lin, Nanjun Zhou, Yanyun Wang et al.
Measuring Fingerprints of Web-filtered Text Datasets and Fingerprint Propagation Through Training
Youssef Mansour, Reinhard Heckel
Alchemist: Turning Public Text-to-Image Data into Generative Gold
Valerii Startsev, Alexander Ustyuzhanin, Alexey Kirillov et al.
Synchronization of Multiple Videos
Avihai Naaman, Ron Shapira Weber, Oren Freifeld
AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy
Sebastian Joseph, Syed M. Husain, Stella Offner et al.
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha et al.
PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors
Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
Bingchen Zhao, Despoina Magka, Minqi Jiang et al.
DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios
Yao Huang, Yitong Sun, Yichi Zhang et al.
A Practical Guide for Incorporating Symmetry in Diffusion Policy
Dian Wang, Boce Hu, Shuran Song et al.
SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem
Ahmed Heakl, Yahia Salaheldin Shaaban, Salem Lahlou et al.
SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference
Yi Zhao, Yajuan Peng, Nguyen Cam-Tu et al.