Most Cited 2025 "safety benchmarks" Papers
Continuous Bayesian Model Selection for Multivariate Causal Discovery
Anish Dhir, Ruby Sedgwick, Avinash Kori et al.
Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation
Mohit Pandey, Gopeshh Subbaraj, Artem Cherkasov et al.
Visual Attention Never Fades: Selective Progressive Attention ReCalibration for Detailed Image Captioning in Multimodal Large Language Models
Mingi Jung, Saehyung Lee, Eunji Kim et al.
Text-to-LoRA: Instant Transformer Adaption
Rujikorn Charakorn, Edoardo Cetin, Yujin Tang et al.
On the Tension between Byzantine Robustness and No-Attack Accuracy in Distributed Learning
Yi-Rui Yang, Chang-Wei Shi, Wu-Jun Li
Can Large Language Models Understand Intermediate Representations in Compilers?
Hailong Jiang, Jianfeng Zhu, Yao Wan et al.
Editable Noise Map Inversion: Encoding Target-image into Noise For High-Fidelity Image Manipulation
Mingyu Kang, Yong Suk Choi
Learnware Specification via Dual Alignment
Wei Chen, Jun-Xiang Mao, Xiaozheng Wang et al.
On the Importance of Gaussianizing Representations
Daniel Eftekhari, Vardan Papyan
CodeSync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale
Chenlong Wang, Zhaoyang Chu, Zhengxiang Cheng et al.
Adversarial Inception Backdoor Attacks against Reinforcement Learning
Ethan Rathbun, Alina Oprea, Christopher Amato
Gradient Flow Provably Learns Robust Classifiers for Orthonormal GMMs
Hancheng Min, Rene Vidal
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training
Matteo Saponati, Pascal J. Sager, Pau Vilimelis Aceituno et al.
Novelty Detection in Reinforcement Learning with World Models
Geigh Zollicoffer, Kenneth Eaton, Jonathan Balloch et al.
Relational Invariant Learning for Robust Solvation Free Energy Prediction
Yeyun Chen
Unconstrained Robust Online Convex Optimization
Jiujia Zhang, Ashok Cutkosky
The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions
Wenbo Pan, Zhichao Liu, Qiguang Chen et al.
Outlier-Aware Post-Training Quantization for Discrete Graph Diffusion Models
Zheng Gong, Ying Sun
ReverB-SNN: Reversing Bit of the Weight and Activation for Spiking Neural Networks
Yufei Guo, Yuhan Zhang, Zhou Jie et al.
Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection
Louis Béthune, David Grangier, Dan Busbridge et al.
Relative Error Fair Clustering in the Weak-Strong Oracle Model
Vladimir Braverman, Prathamesh Dharangutte, Shaofeng Jiang et al.
Doubly Protected Estimation for Survival Outcomes Utilizing External Controls for Randomized Clinical Trials
Chenyin Gao, Shu Yang, Mingyang Shan et al.
Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models
Anshuman Chhabra, Bo Li, Jian Chen et al.
BaWA: Automatic Optimizing Pruning Metric for Large Language Models with Balanced Weight and Activation
Lian Liu, Xiandong Zhao, Guanchen Li et al.
Geometric Contact Flows: Contactomorphisms for Dynamics and Control
Andrea Testa, Søren Hauberg, Tamim Asfour et al.
Neural Genetic Search in Discrete Spaces
Hyeonah Kim, Sanghyeok Choi, Jiwoo Son et al.
Sparse Video-Gen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Haocheng Xi, Shuo Yang, Yilong Zhao et al.
PieClam: A Universal Graph Autoencoder Based on Overlapping Inclusive and Exclusive Communities
Daniel Zilberg, Ron Levie
Efficient Source-free Unlearning via Energy-Guided Data Synthesis and Discrimination-Aware Multitask Optimization
Xiuyuan Wang, Chaochao Chen, Weiming Liu et al.
Ad Hoc Teamwork via Offline Goal-Based Decision Transformers
Xinzhi Zhang, Hoehi Chan, Deheng Ye et al.
AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement
Pranjal Aggarwal, Bryan Parno, Sean Welleck
The Ripple Effect: On Unforeseen Complications of Backdoor Attacks
Rui Zhang, Yun Shen, Hongwei Li et al.
A Sharper Global Convergence Analysis for Average Reward Reinforcement Learning via an Actor-Critic Approach
Swetha Ganesh, Washim Mondal, Vaneet Aggarwal
Contrastive Localized Language-Image Pre-Training
Hong-You Chen, Zhengfeng Lai, Haotian Zhang et al.
Implicit Bias of Gradient Descent for Non-Homogeneous Deep Networks
Yuhang Cai, Kangjie Zhou, Jingfeng Wu et al.
Improved and Oracle-Efficient Online $\ell_1$-Multicalibration
Rohan Ghuge, Vidya Muthukumar, Sahil Singla
One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation
Jianze Li, Jiezhang Cao, Yong Guo et al.
Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation
Jan Pauls, Max Zimmer, Berkant Turan et al.
Towards the Efficient Inference by Incorporating Automated Computational Phenotypes under Covariate Shift
Chao Ying, Jun Jin, Yi Guo et al.
A Sample Efficient Conditional Independence Test in the Presence of Discretization
Boyang Sun, Yu Yao, Xinshuai Dong et al.
"Why Is There a Tumor?": Tell Me the Reason, Show Me the Evidence
Mengmeng Ma, Tang Li, Yunxiang Peng et al.
Diverging Preferences: When do Annotators Disagree and do Models Know?
Michael Zhang, Zhilin Wang, Jena Hwang et al.
Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning
Qi Xu, Junyang Zhu, Dongdong Zhou et al.
Active Treatment Effect Estimation via Limited Samples
Zhiheng Zhang, Haoxiang Wang, Haoxuan Li et al.
Random Policy Evaluation Uncovers Policies of Generative Flow Networks
Haoran He, Emmanuel Bengio, Qingpeng Cai et al.
Generalized Category Discovery via Reciprocal Learning and Class-Wise Distribution Regularization
Duo Liu, Zhiquan Tan, Linglan Zhao et al.
Inductive Gradient Adjustment for Spectral Bias in Implicit Neural Representations
Kexuan Shi, Hai Chen, Leheng Zhang et al.
Learning Efficient Robotic Garment Manipulation with Standardization
zhou changshi, Feng Luan, hujiarui et al.
Efficient Heterogeneity-Aware Federated Active Data Selection
Yingpeng Tang, Chao Ren, Xiaoli Tang et al.
Preconditioned Riemannian Gradient Descent Algorithm for Low-Multilinear-Rank Tensor Completion
Yuanwei Zhang, Fengmiao Bian, Xiaoqun Zhang et al.
Grammar-Forced Translation of Natural Language to Temporal Logic using LLMs
William English, Dominic Simon, Sumit Jha et al.
Equivariant Neural Tangent Kernels
Philipp Misof, Pan Kessel, Jan Gerken
Empowering World Models with Reflection for Embodied Video Prediction
Xiaowei Chi, Chun-Kai Fan, Hengyuan Zhang et al.
LoRA-Gen: Specializing Large Language Model via Online LoRA Generation
Yicheng Xiao, Lin Song, Rui Yang et al.
Protein Structure Tokenization: Benchmarking and New Recipe
Xinyu Yuan, Zichen Wang, Marcus Collins et al.
How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects
Wonkwang Lee, Jongwon Jeong, Taehong Moon et al.
Efficiently Serving Large Multimodal Models Using EPD Disaggregation
Gursimran Singh, Xinglu Wang, Yifan Hu et al.
Tensorized Multi-View Multi-Label Classification via Laplace Tensor Rank
Qiyu Zhong, Yi Shan, Haobo Wang et al.
Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding
Tian Jin, Ellie Cheng, Zachary Ankner et al.
Learning Multi-Level Features with Matryoshka Sparse Autoencoders
Bart Bussmann, Noa Nabeshima, Adam Karvonen et al.
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov, Felix Steinbauer, Gjergji Kasneci
Code-Generated Graph Representations Using Multiple LLM Agents for Material Properties Prediction
Jiao Huang, Qianli Xing, Jinglong Ji et al.
FeatSharp: Your Vision Model Features, Sharper
Mike Ranzinger, Greg Heinrich, Pavlo Molchanov et al.
Private Lossless Multiple Release
Joel Daniel Andersson, Lukas Retschmeier, Boel Nelson et al.
Disentangling and Integrating Relational and Sensory Information in Transformer Architectures
Awni Altabaa, John Lafferty
Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning
Zeyu Gan, Yun Liao, Yong Liu
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Samira Abnar, Harshay Shah, Dan Busbridge et al.
A Reductions Approach to Risk-Sensitive Reinforcement Learning with Optimized Certainty Equivalents
Kaiwen Wang, Dawen Liang, Nathan Kallus et al.
LIMEFLDL: A Local Interpretable Model-Agnostic Explanations Approach for Label Distribution Learning
Xiuyi Jia, Jinchi Li, Yunan Lu et al.
Hardware and Software Platform Inference
Cheng Zhang, Hanna Foerster, Robert Mullins et al.
Nonlinearly Preconditioned Gradient Methods under Generalized Smoothness
Konstantinos Oikonomidis, Jan Quan, Emanuel Laude et al.
What Limits Bidirectional Model's Generative Capabilities? A Uni-Bi-Directional Mixture-of-Expert Method For Bidirectional Fine-tuning
Zuchao Li, Yonghua Hei, Qiwei Li et al.
How to Evaluate and Mitigate IP Infringement in Visual Generative AI?
Zhenting Wang, Chen Chen, Vikash Sehwag et al.
Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing
Zhuoran Zhang, Yongxiang Li, Zijian Kan et al.