🧬Applications

Scientific Machine Learning

ML for scientific computing and discovery

100 papers · 1,520 total citations
Feb '24 to Jan '26: 345 papers
Also includes: scientific machine learning, physics-informed, partial differential equations, pde, scientific computing

Top Papers

#1

Fast Machine Unlearning without Retraining through Selective Synaptic Dampening

Jack Foster, Stefan Schoepf, Alexandra Brintrup

AAAI 2024 · arXiv:2308.07707
machine unlearning, selective synaptic dampening, fisher information matrix, post hoc unlearning, +3
170
citations
#2

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Jun Shern Chan, Neil Chowdhury, Oliver Jaffe et al.

ICLR 2025
127
citations
#3

Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction

Xiang Fu, Brandon Wood, Luis Barroso-Luque et al.

ICML 2025
87
citations
#4

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

Renrui Zhang, Xinyu Wei, Dongzhi Jiang et al.

ICLR 2025
74
citations
#5

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

Sachin Goyal, Pratyush Maini, Zachary Lipton et al.

CVPR 2024
67
citations
#6

SWE-smith: Scaling Data for Software Engineering Agents

John Yang, Kilian Lieret, Carlos Jimenez et al.

NeurIPS 2025
64
citations
#7

DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?

Liqiang Jing, Zhehui Huang, Xiaoyang Wang et al.

ICLR 2025 · arXiv:2409.07703
data science agents, large language models, large vision-language models, data analysis tasks, +4
62
citations
#8

CycleResearcher: Improving Automated Research via Automated Review

Yixuan Weng, Minjun Zhu, Guangsheng Bao et al.

ICLR 2025
62
citations
#9

LLM-SR: Scientific Equation Discovery via Programming with Large Language Models

Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.

ICLR 2025
55
citations
#10

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

Xurui Li, Ziming Huang, Feng Xue et al.

ICLR 2024
55
citations
#11

STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?

Yun Li, Yiming Zhang, Tao Lin et al.

ICCV 2025
36
citations
#12

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal et al.

ICLR 2025 · arXiv:2407.01725
data-driven discovery, large language models, code generation, function calling, +4
36
citations
#13

Scaling Wearable Foundation Models

Girish Narayanswamy, Xin Liu, Kumar Ayush et al.

ICLR 2025
33
citations
#14

The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Filippo Bigi, Marcel Langer, Michele Ceriotti

ICML 2025
29
citations
#15

From Mechanistic Interpretability to Mechanistic Biology: Training, Evaluating, and Interpreting Sparse Autoencoders on Protein Language Models

Etowah Adams, Liam Bai, Minji Lee et al.

ICML 2025
28
citations
#16

Machine Unlearning Fails to Remove Data Poisoning Attacks

Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.

ICLR 2025
28
citations
#17

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025 · arXiv:2410.08196
mathematical reasoning, code generation, continued pretraining, synthetic data generation, +2
27
citations
#18

HyperFast: Instant Classification for Tabular Data

David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.

AAAI 2024 · arXiv:2402.14335
tabular data classification, hypernetwork architecture, meta-trained models, instant inference, +4
26
citations
#19

Learning to design protein-protein interactions with enhanced generalization

Anton Bushuiev, Roman Bushuiev, Petr Kouba et al.

ICLR 2024
25
citations
#20

ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World

Weixiang Yan, Haitian Liu, Tengxiao Wu et al.

NeurIPS 2025
22
citations
#21

Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish et al.

ICLR 2025 · arXiv:2406.16257
machine unlearning, exact unlearning, parameter-efficient fine-tuning, parameter isolation, +4
22
citations
#22

Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians

Ishan Amin, Sanjeev Raja, Aditi Krishnapriyan

ICLR 2025
21
citations
#23

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.

ICML 2025
19
citations
#24

Benchmarking Predictive Coding Networks -- Made Simple

Luca Pinchetti, Chang Qi, Oleh Lokshyn et al.

ICLR 2025 · arXiv:2407.01163
predictive coding networks, bio-plausible deep learning, scalability in pcns, benchmarking neural networks, +1
18
citations
#25

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Manuel Brenner, Elias Weber, Georgia Koppe et al.

ICLR 2025
16
citations
#26

AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

Edan Toledo, Karen Hambardzumyan, Martin Josifoski et al.

NeurIPS 2025 · arXiv:2507.02554
ai research agents, automated machine learning, search policies, mcts algorithms, +4
15
citations
#27

Learning MDL Logic Programs from Noisy Data

Céline Hocquette, Andreas Niskanen, Matti Järvisalo et al.

AAAI 2024 · arXiv:2308.09393
inductive logic programming, minimal description length, noisy data learning, recursive program synthesis, +2
15
citations
#28

Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming

Haoyang Liu, Jie Wang, Zijie Geng et al.

ICLR 2025 · arXiv:2503.01129
mixed-integer linear programming, neural solving framework, trust-region search, problem reduction, +4
15
citations
#29

BatteryML: An Open-source Platform for Machine Learning on Battery Degradation

Han Zhang, Xiaofan Gui, Shun Zheng et al.

ICLR 2024
11
citations
#30

LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content

Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh et al.

ICLR 2025 · arXiv:2410.10783
multi-modal models, visual question answering, test data contamination, scientific document understanding, +4
11
citations
#31

Adaptive Self-improvement LLM Agentic System for ML Library Development

Genghan Zhang, Weixin Liang, Olivia Hsu et al.

ICML 2025
10
citations
#32

The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.

NeurIPS 2025 · arXiv:2505.06371
inference energy consumption, energy measurement benchmark, generative ai services, automated optimization recommendations, +2
10
citations
#33

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Weikang Qiu, Zheng Huang, Haoyu Hu et al.

ICML 2025
10
citations
#34

Fast training and sampling of Restricted Boltzmann Machines

Nicolas Bereux, Aurélien Decelle, Cyril Furtlehner et al.

ICLR 2025
10
citations
#35

Deep Nonlinear Sufficient Dimension Reduction

Yinfeng Chen, Yuling Jiao, Rui Qiu et al.

NeurIPS 2025
9
citations
#36

Neural Auto-designer for Enhanced Quantum Kernels

Cong Lei, Yuxuan Du, Peng Mi et al.

ICLR 2024
8
citations
#37

Bridging the Semantic Latent Space between Brain and Machine: Similarity Is All You Need

Jiaxuan Chen, Yu Qi, Yueming Wang et al.

AAAI 2024
8
citations
#38

On Harmonizing Implicit Subpopulations

Feng Hong, Jiangchao Yao, Yueming Lyu et al.

ICLR 2024
8
citations
#39

Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development

Daoyuan Chen, Haibin Wang, Yilun Huang et al.

ICML 2025
7
citations
#40

FlashMD: long-stride, universal prediction of molecular dynamics

Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi et al.

NeurIPS 2025 · arXiv:2505.19350
molecular dynamics simulation, hamiltonian dynamics, thermodynamic ensembles, long-stride prediction, +3
7
citations
#41

Learning-Augmented Search Data Structures

Chunkai Fu, Brandon G. Nguyen, Jung Seo et al.

ICLR 2025 · arXiv:2402.10457
learning-augmented algorithms, search data structures, skip lists, kd trees, +4
6
citations
#42

Causal Discovery from Conditionally Stationary Time Series

Carles Balsells-Rodas, Xavier Sumba, Tanmayee Narendra et al.

ICML 2025
6
citations
#43

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

Jing Yang

ICML 2025
6
citations
#44

ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency

Shaocheng Yan, Pengcheng Shi, Jiayuan Li

ECCV 2024
6
citations
#45

Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods

Lise Le Boudec, Emmanuel de Bézenac, Louis Serrano et al.

ICLR 2025
6
citations
#46

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models

Daoyuan Chen, Yilun Huang, Xuchen Pan et al.

NeurIPS 2025
6
citations
#47

Scalable Bayesian Learning with posteriors

Samuel Duffield, Kaelan Donatella, Johnathan Chiu et al.

ICLR 2025
6
citations
#48

PINNsAgent: Automated PDE Surrogation with Large Language Models

Qingpo Wuwu, Chonghan Gao, Tianyu Chen et al.

ICML 2025
5
citations
#49

Understanding Generalization in Quantum Machine Learning with Margins

Tak Hur, Daniel Kyungdeock Park

ICML 2025
5
citations
#50

DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning

Aaditya Naik, Jason Liu, Claire Wang et al.

ICML 2025
5
citations
#51

In-Context Learning of Stochastic Differential Equations with Foundation Inference Models

Patrick Seifner, Kostadin Cvejoski, David Berghaus et al.

NeurIPS 2025
5
citations
#52

SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning

Hong Wang, Jie Wang, Minghao Ma et al.

NeurIPS 2025
5
citations
#53

Scaling Physical Reasoning with the PHYSICS Dataset

Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.

NeurIPS 2025
5
citations
#54

X-Hacking: The Threat of Misguided AutoML

Rahul Sharma, Sumantrak Mukherjee, Andrea Šipka et al.

ICML 2025
4
citations
#55

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.

NeurIPS 2025
4
citations
#56

Towards Establishing Guaranteed Error for Learned Database Operations

Sepanta Zeighami, Cyrus Shahabi

ICLR 2024
4
citations
#57

An LLM-Empowered Adaptive Evolutionary Algorithm for Multi-Component Deep Learning Systems

Haoxiang Tian, Xingshuo Han, Guoquan Wu et al.

AAAI 2025
4
citations
#58

Online Continuous Generalized Category Discovery

Keon-Hee Park, Hakyung Lee, Kyungwoo Song et al.

ECCV 2024
4
citations
#59

Epistemic Monte Carlo Tree Search

Yaniv Oren, Viliam Vadocz, Matthijs T. J. Spaan et al.

ICLR 2025
4
citations
#60

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu Lai, Sixiang Chen, Yunlong Lin et al.

CVPR 2025
4
citations
#61

Causal-StoNet: Causal Inference for High-Dimensional Complex Data

Yaxin Fang, Faming Liang

ICLR 2024
4
citations
#62

Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring

Yuyan Chen, Nico Lang, B. Schmidt et al.

NeurIPS 2025
3
citations
#63

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.

NeurIPS 2025
3
citations
#64

Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks

Emanuel Sommer, Jakob Robnik, Giorgi Nozadze et al.

ICLR 2025
3
citations
#65

ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials

Pin Chen, Zexin Xu, Qing Mo et al.

ICLR 2025
electronic charge density, density functional theory, crystalline materials, machine learning prediction, +3
3
citations
#66

No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs

Krzysztof Kacprzyk, Mihaela van der Schaar

ICLR 2025
3
citations
#67

AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise

Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.

NeurIPS 2025 · arXiv:2507.00310
autonomous scientific discovery, bayesian surprise, monte carlo tree search, hypothesis generation, +4
3
citations
#68

PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors

Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.

NeurIPS 2025
2
citations
#69

Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms

Philippe Wyder, Judah Goldfeder, Alexey Yermakov et al.

NeurIPS 2025
2
citations
#70

AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Sebastian Joseph, Syed M. Husain, Stella Offner et al.

NeurIPS 2025
2
citations
#71

CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning

Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.

NeurIPS 2025
2
citations
#72

Prices, Bids, Values: One ML-Powered Combinatorial Auction to Rule Them All

Ermis Soumalias, Jakob Heiss, Jakob Weissteiner et al.

ICML 2025
2
citations
#73

AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing

Sam Bright-Thonney, Christina Reissel, Gaia Grosso et al.

NeurIPS 2025
2
citations
#74

CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling

Matthew Fortier, Mats L. Richter, Oliver Sonnentag et al.

ICLR 2025 · arXiv:2406.04940
carbon flux modelling, multimodal dataset, satellite imagery, meteorological predictors, +3
2
citations
#75

The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning

Toby Boyne, Juan Campos, Rebecca Langdon et al.

NeurIPS 2025
2
citations
#76

A Multiscale Frequency Domain Causal Framework for Enhanced Pathological Analysis

Xiaoyu Cui, Weixing Chen, Jiandong Su

ICLR 2025
2
citations
#77

ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

Daolang Huang, Xinyi Wen, Ayush Bharti et al.

NeurIPS 2025 · arXiv:2506.07259
amortized bayesian inference, active data acquisition, transformer architecture, reinforcement learning, +3
2
citations
#78

Towards Source-Free Machine Unlearning

Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.

CVPR 2025
2
citations
#79

AutoSciLab: A Self-Driving Laboratory for Interpretable Scientific Discovery

Saaketh Desai, Sadhvikas Addamane, Jeffrey Y. Tsao et al.

AAAI 2025
1
citations
#80

PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models

Alex Velez-Arce, Marinka Zitnik

ICML 2025
1
citations
#81

ML4CFD Competition: Results and Retrospective Analysis

Mouadh Yagoubi, David Danan, Milad Leyli Abadi et al.

NeurIPS 2025
1
citations
#82

LC-Opt: Benchmarking Reinforcement Learning and Agentic AI for End-to-End Liquid Cooling Optimization in Data Centers

Avisek Naug, Antonio Guillen-Perez, Vineet Kumar et al.

NeurIPS 2025
1
citations
#83

COGNATE: Acceleration of Sparse Tensor Programs on Emerging Hardware using Transfer Learning

Chamika Sudusinghe, Gerasimos Gerogiannis, Damitha Lenadora et al.

ICML 2025
1
citations
#84

Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models

Konstantin Donhauser, Kristina Ulicna, Gemma Moran et al.

ICML 2025
1
citations
#85

ADELA: Accelerating Evolutionary Design of Machine Learning Pipelines with the Accompanying Surrogate Model

Yang Gu, Jian Cao, Hengyu You et al.

AAAI 2025
1
citations
#86

Active Measurement: Efficient Estimation at Scale

Max Hamilton, Jinlin Lai, Wenlong Zhao et al.

NeurIPS 2025
1
citations
#87

The Robustness of Differentiable Causal Discovery in Misspecified Scenarios

Huiyang Yi, Yanyan He, Duxin Chen et al.

ICLR 2025
1
citations
#88

Towards Learning High-Precision Least Squares Algorithms with Sequence Models

Jerry Liu, Jessica Grogan, Owen Dugan et al.

ICLR 2025
1
citations
#89

Can Private Machine Learning Be Fair?

Joseph Rance, Filip Svoboda

AAAI 2025
1
citations
#90

GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and FWI

Shiqian Li, Zhi Li, Zhancun Mu et al.

NeurIPS 2025
1
citations
#91

AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation

Xinbiao Wang, Yuxuan Du, Zihan Lou et al.

NeurIPS 2025
1
citations
#92

Understanding Generalization in Physics Informed Models through Affine Variety Dimensions

Takeshi Koshizuka, Issei Sato

NeurIPS 2025
not collected
#93

ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs

Jiale Ma, Wenzheng Pan, Yang Li et al.

NeurIPS 2025
not collected
#94

Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving

Sohei Arisaka, Qianxiao Li

ICML 2024
gradient estimation technique, meta-learning techniques, hyperparameter selection, legacy numerical solvers, +3
not collected
#95

MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark

Junjie Xing, Yeye He, Mengyu Zhou et al.

NeurIPS 2025
not collected
#96

Position: Is machine learning good or bad for the natural sciences?

David W. Hogg, Soledad Villar

ICML 2024
causal inference, confounder representation, machine learning ontology, machine learning epistemology, +4
not collected
#97

EDBench: Large-Scale Electron Density Data for Molecular Modeling

Hongxin Xiang, Ke Li, Mingquan Liu et al.

NeurIPS 2025
not collected
#98

Semantic-KG: Using Knowledge Graphs to Construct Benchmarks for Measuring Semantic Similarity

Qiyao Wei, Edward R Morrell, Lea Goetz et al.

NeurIPS 2025
not collected
#99

CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research

Owen Queen, Harrison Zhang, James Zou

NeurIPS 2025
not collected
#100

Position: Mission Critical – Satellite Data is a Distinct Modality in Machine Learning

Esther Rolf, Konstantin Klemmer, Caleb Robinson et al.

ICML 2024
satellite data modality, multispectral imagery analysis, geospatial machine learning, temporal data modeling, +4
not collected