🧬Applications

Scientific Machine Learning

ML for scientific computing and discovery

100 papers · 1,601 total citations
Mar '24 to Feb '26 · 340 papers
Also includes: scientific machine learning, physics-informed, partial differential equations, pde, scientific computing

Top Papers

#1

Fast Machine Unlearning without Retraining through Selective Synaptic Dampening

Jack Foster, Stefan Schoepf, Alexandra Brintrup

AAAI 2024 · arXiv:2308.07707
machine unlearning, selective synaptic dampening, fisher information matrix, post hoc unlearning, +3 more
170
citations
#2

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Jun Shern Chan, Neil Chowdhury, Oliver Jaffe et al.

ICLR 2025
127
citations
#3

Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction

Xiang Fu, Brandon Wood, Luis Barroso-Luque et al.

ICML 2025
87
citations
#4

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

Renrui Zhang, Xinyu Wei, Dongzhi Jiang et al.

ICLR 2025
74
citations
#5

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

Sachin Goyal, Pratyush Maini, Zachary Lipton et al.

CVPR 2024
67
citations
#6

SWE-smith: Scaling Data for Software Engineering Agents

John Yang, Kilian Lieret, Carlos Jimenez et al.

NeurIPS 2025
64
citations
#7

CycleResearcher: Improving Automated Research via Automated Review

Yixuan Weng, Minjun Zhu, Guangsheng Bao et al.

ICLR 2025
62
citations
#8

DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?

Liqiang Jing, Zhehui Huang, Xiaoyang Wang et al.

ICLR 2025 · arXiv:2409.07703
data science agents, large language models, large vision-language models, data analysis tasks, +4 more
62
citations
#9

LLM-SR: Scientific Equation Discovery via Programming with Large Language Models

Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.

ICLR 2025 · arXiv:2404.18400
symbolic regression, scientific equation discovery, large language models, evolutionary search, +3 more
55
citations
#10

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

Xurui Li, Ziming Huang, Feng Xue et al.

ICLR 2024
55
citations
#11

STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?

Yun Li, Yiming Zhang, Tao Lin et al.

ICCV 2025
36
citations
#12

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal et al.

ICLR 2025 · arXiv:2407.01725
data-driven discovery, large language models, code generation, function calling, +4 more
36
citations
#13

Scaling Wearable Foundation Models

Girish Narayanswamy, Xin Liu, Kumar Ayush et al.

ICLR 2025
33
citations
#14

ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities

Ezra Karger, Houtan Bastani, Chen Yueh-Han et al.

ICLR 2025 · arXiv:2409.19839
forecasting accuracy evaluation, machine learning systems, dynamic benchmarking, future event prediction, +4 more
31
citations
#15

The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Filippo Bigi, Marcel Langer, Michele Ceriotti

ICML 2025
29
citations
#16

From Mechanistic Interpretability to Mechanistic Biology: Training, Evaluating, and Interpreting Sparse Autoencoders on Protein Language Models

Etowah Adams, Liam Bai, Minji Lee et al.

ICML 2025
28
citations
#17

Machine Unlearning Fails to Remove Data Poisoning Attacks

Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.

ICLR 2025
28
citations
#18

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025 · arXiv:2410.08196
mathematical reasoning, code generation, continued pretraining, synthetic data generation, +2 more
27
citations
#19

HyperFast: Instant Classification for Tabular Data

David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.

AAAI 2024 · arXiv:2402.14335
tabular data classification, hypernetwork architecture, meta-trained models, instant inference, +4 more
26
citations
#20

Learning to design protein-protein interactions with enhanced generalization

Anton Bushuiev, Roman Bushuiev, Petr Kouba et al.

ICLR 2024
25
citations
#21

ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World

Weixiang Yan, Haitian Liu, Tengxiao Wu et al.

NeurIPS 2025
22
citations
#22

Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish et al.

ICLR 2025 · arXiv:2406.16257
machine unlearning, exact unlearning, parameter-efficient fine-tuning, parameter isolation, +4 more
22
citations
#23

Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians

Ishan Amin, Sanjeev Raja, Aditi Krishnapriyan

ICLR 2025
21
citations
#24

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.

ICML 2025
19
citations
#25

Benchmarking Predictive Coding Networks -- Made Simple

Luca Pinchetti, Chang Qi, Oleh Lokshyn et al.

ICLR 2025 · arXiv:2407.01163
predictive coding networks, bio-plausible deep learning, scalability in pcns, benchmarking neural networks, +1 more
18
citations
#26

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Manuel Brenner, Elias Weber, Georgia Koppe et al.

ICLR 2025 · arXiv:2410.04814
dynamical systems reconstruction, hierarchical modeling, time series analysis, multi-domain learning, +4 more
16
citations
#27

Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming

Haoyang Liu, Jie Wang, Zijie Geng et al.

ICLR 2025 · arXiv:2503.01129
mixed-integer linear programming, neural solving framework, trust-region search, problem reduction, +4 more
15
citations
#28

Learning MDL Logic Programs from Noisy Data

Céline Hocquette, Andreas Niskanen, Matti Järvisalo et al.

AAAI 2024 · arXiv:2308.09393
inductive logic programming, minimal description length, noisy data learning, recursive program synthesis, +2 more
15
citations
#29

AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

Edan Toledo, Karen Hambardzumyan, Martin Josifoski et al.

NeurIPS 2025 · arXiv:2507.02554
ai research agents, automated machine learning, search policies, mcts algorithms, +4 more
15
citations
#30

BatteryML: An Open-source Platform for Machine Learning on Battery Degradation

Han Zhang, Xiaofan Gui, Shun Zheng et al.

ICLR 2024
11
citations
#31

LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content

Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh et al.

ICLR 2025 · arXiv:2410.10783
multi-modal models, visual question answering, test data contamination, scientific document understanding, +4 more
11
citations
#32

Adaptive Self-improvement LLM Agentic System for ML Library Development

Genghan Zhang, Weixin Liang, Olivia Hsu et al.

ICML 2025
10
citations
#33

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics

Jie Zhang, Cezara Petrui, Kristina Nikolić et al.

NeurIPS 2025 · arXiv:2505.12575
mathematical reasoning evaluation, research-level mathematics, language model benchmarking, automated evaluation, +2 more
10
citations
#34

Fast training and sampling of Restricted Boltzmann Machines

Nicolas Bereux, Aurélien Decelle, Cyril Furtlehner et al.

ICLR 2025 · arXiv:2405.15376
restricted boltzmann machines, markov chain monte carlo, parallel trajectory tempering, partition function computation, +4 more
10
citations
#35

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Weikang Qiu, Zheng Huang, Haoyu Hu et al.

ICML 2025
10
citations
#36

The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.

NeurIPS 2025 · arXiv:2505.06371
inference energy consumption, energy measurement benchmark, generative ai services, automated optimization recommendations, +2 more
10
citations
#37

Deep Nonlinear Sufficient Dimension Reduction

Yinfeng Chen, Yuling Jiao, Rui Qiu et al.

NeurIPS 2025
9
citations
#38

On Harmonizing Implicit Subpopulations

Feng Hong, Jiangchao Yao, Yueming Lyu et al.

ICLR 2024
8
citations
#39

Neural Auto-designer for Enhanced Quantum Kernels

Cong Lei, Yuxuan Du, Peng Mi et al.

ICLR 2024
8
citations
#40

Bridging the Semantic Latent Space between Brain and Machine: Similarity Is All You Need

Jiaxuan Chen, Yu Qi, Yueming Wang et al.

AAAI 2024
8
citations
#41

Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development

Daoyuan Chen, Haibin Wang, Yilun Huang et al.

ICML 2025
7
citations
#42

MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

Haoyang Fang, Boran Han, Nick Erickson et al.

NeurIPS 2025 · arXiv:2505.13941
multi-agent systems, machine learning automation, multimodal data processing, large language models, +4 more
7
citations
#43

MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement

Jaehyun Nam, Jinsung Yoon, Jiefeng Chen et al.

NeurIPS 2025 · arXiv:2506.15692
code generation agents, machine learning engineering, feature engineering, ablation study guidance, +4 more
7
citations
#44

Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

Zihan Yu, Jingtao Ding, Yong Li et al.

ICLR 2025 · arXiv:2411.03753
symbolic regression, minimum description length, neural network estimation, formula discovery, +4 more
7
citations
#45

FlashMD: long-stride, universal prediction of molecular dynamics

Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi et al.

NeurIPS 2025 · arXiv:2505.19350
molecular dynamics simulation, hamiltonian dynamics, thermodynamic ensembles, long-stride prediction, +3 more
7
citations
#46

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

Jing Yang

ICML 2025
6
citations
#47

Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods

Lise Le Boudec, Emmanuel de Bézenac, Louis Serrano et al.

ICLR 2025
6
citations
#48

Causal Discovery from Conditionally Stationary Time Series

Carles Balsells-Rodas, Xavier Sumba, Tanmayee Narendra et al.

ICML 2025
6
citations
#49

Learning-Augmented Search Data Structures

Chunkai Fu, Brandon G. Nguyen, Jung Seo et al.

ICLR 2025 · arXiv:2402.10457
learning-augmented algorithms, search data structures, skip lists, kd trees, +4 more
6
citations
#50

ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency

Shaocheng Yan, Pengcheng Shi, Jiayuan Li

ECCV 2024
6
citations
#51

Scalable Bayesian Learning with posteriors

Samuel Duffield, Kaelan Donatella, Johnathan Chiu et al.

ICLR 2025
6
citations
#52

GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning

Minghao Xu, Yunteng Geng, Yihang Zhang et al.

ICLR 2025 · arXiv:2405.16206
glycan property prediction, glycan function prediction, graph neural networks, multi-task learning, +4 more
6
citations
#53

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models

Daoyuan Chen, Yilun Huang, Xuchen Pan et al.

NeurIPS 2025
6
citations
#54

Understanding Generalization in Quantum Machine Learning with Margins

Tak Hur, Daniel Kyungdeock Park

ICML 2025
5
citations
#55

PINNsAgent: Automated PDE Surrogation with Large Language Models

Qingpo Wuwu, Chonghan Gao, Tianyu Chen et al.

ICML 2025
5
citations
#56

DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning

Aaditya Naik, Jason Liu, Claire Wang et al.

ICML 2025
5
citations
#57

cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process

Yihang Chen, Tsai Hor Chan, Guosheng Yin et al.

ECCV 2024 · arXiv:2407.11448
multiple instance learning, whole slide histopathology, bayesian nonparametric framework, dirichlet process, +3 more
5
citations
#58

In-Context Learning of Stochastic Differential Equations with Foundation Inference Models

Patrick Seifner, Kostadin Cvejoski, David Berghaus et al.

NeurIPS 2025
5
citations
#59

SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning

Hong Wang, Jie Wang, Minghao Ma et al.

NeurIPS 2025
5
citations
#60

Scaling Physical Reasoning with the PHYSICS Dataset

Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.

NeurIPS 2025
5
citations
#61

An LLM-Empowered Adaptive Evolutionary Algorithm for Multi-Component Deep Learning Systems

Haoxiang Tian, Xingshuo Han, Guoquan Wu et al.

AAAI 2025
4
citations
#62

Causal-StoNet: Causal Inference for High-Dimensional Complex Data

Yaxin Fang, Faming Liang

ICLR 2024
4
citations
#63

Towards Establishing Guaranteed Error for Learned Database Operations

Sepanta Zeighami, Cyrus Shahabi

ICLR 2024
4
citations
#64

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.

NeurIPS 2025
4
citations
#65

Online Continuous Generalized Category Discovery

Keon-Hee Park, Hakyung Lee, Kyungwoo Song et al.

ECCV 2024
4
citations
#66

X-Hacking: The Threat of Misguided AutoML

Rahul Sharma, Sumantrak Mukherjee, Andrea Šipka et al.

ICML 2025
4
citations
#67

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu Lai, Sixiang Chen, Yunlong Lin et al.

CVPR 2025
4
citations
#68

CellVerse: Do Large Language Models Really Understand Cell Biology?

Fan Zhang, Tianyu Liu, Zhihong Zhu et al.

NeurIPS 2025 · arXiv:2505.07865
single-cell data, large language models, cell biology understanding, multi-omics data, +4 more
4
citations
#69

Epistemic Monte Carlo Tree Search

Yaniv Oren, Viliam Vadocz, Matthijs T. J. Spaan et al.

ICLR 2025
4
citations
#70

Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring

Yuyan Chen, Nico Lang, B. Schmidt et al.

NeurIPS 2025
3
citations
#71

ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials

Pin Chen, Zexin Xu, Qing Mo et al.

ICLR 2025
electronic charge density, density functional theory, crystalline materials, machine learning prediction, +3 more
3
citations
#72

No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs

Krzysztof Kacprzyk, Mihaela van der Schaar

ICLR 2025
3
citations
#73

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.

NeurIPS 2025
3
citations
#74

Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks

Emanuel Sommer, Jakob Robnik, Giorgi Nozadze et al.

ICLR 2025
3
citations
#75

A machine learning approach that beats Rubik's cubes

Alexander Chervov, Kirill Khoruzhii, Nikita Bukhal et al.

NeurIPS 2025
pathfinding problem, diffusion distance estimation, beam search, rubik's cube solver, +2 more
3
citations
#76

AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise

Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.

NeurIPS 2025 · arXiv:2507.00310
autonomous scientific discovery, bayesian surprise, monte carlo tree search, hypothesis generation, +4 more
3
citations
#77

PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors

Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.

NeurIPS 2025 · arXiv:2507.15550
scientific discovery capabilities, interactive physics environments, prior knowledge control, hypothesis formulation, +4 more
2
citations
#78

Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms

Philippe Wyder, Judah Goldfeder, Alexey Yermakov et al.

NeurIPS 2025
2
citations
#79

Prices, Bids, Values: One ML-Powered Combinatorial Auction to Rule Them All

Ermis Soumalias, Jakob Heiss, Jakob Weissteiner et al.

ICML 2025
2
citations
#80

AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Sebastian Joseph, Syed M. Husain, Stella Offner et al.

NeurIPS 2025 · arXiv:2505.20538
astronomy data visualization, scientific computing workflows, llm-as-a-judge evaluation, astronomy research assistance, +2 more
2
citations
#81

CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning

Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.

NeurIPS 2025
2
citations
#82

AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing

Sam Bright-Thonney, Christina Reissel, Gaia Grosso et al.

NeurIPS 2025 · arXiv:2510.21935
novelty detection, contrastive learning, anomaly detection, dimensionality reduction, +3 more
2
citations
#83

CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling

Matthew Fortier, Mats L. Richter, Oliver Sonnentag et al.

ICLR 2025 · arXiv:2406.04940
carbon flux modelling, multimodal dataset, satellite imagery, meteorological predictors, +3 more
2
citations
#84

The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning

Toby Boyne, Juan Campos, Rebecca Langdon et al.

NeurIPS 2025
2
citations
#85

A Multiscale Frequency Domain Causal Framework for Enhanced Pathological Analysis

Xiaoyu Cui, Weixing Chen, Jiandong Su

ICLR 2025
2
citations
#86

PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows

Huaguan Chen, Yang Liu, Hao Sun

ICLR 2025 · arXiv:2504.06070
physics-informed learning, fluid dynamics prediction, spatiotemporal prediction, neural predictor, +3 more
2
citations
#87

ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

Daolang Huang, Xinyi Wen, Ayush Bharti et al.

NeurIPS 2025 · arXiv:2506.07259
amortized bayesian inference, active data acquisition, transformer architecture, reinforcement learning, +3 more
2
citations
#88

Towards Source-Free Machine Unlearning

Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.

CVPR 2025
2
citations
#89

PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models

Alex Velez-Arce, Marinka Zitnik

ICML 2025
1
citations
#90

AutoSciLab: A Self-Driving Laboratory for Interpretable Scientific Discovery

Saaketh Desai, Sadhvikas Addamane, Jeffrey Y. Tsao et al.

AAAI 2025
1
citations
#91

COGNATE: Acceleration of Sparse Tensor Programs on Emerging Hardware using Transfer Learning

Chamika Sudusinghe, Gerasimos Gerogiannis, Damitha Lenadora et al.

ICML 2025
1
citations
#92

LC-Opt: Benchmarking Reinforcement Learning and Agentic AI for End-to-End Liquid Cooling Optimization in Data Centers

Avisek Naug, Antonio Guillen-Perez, Vineet Kumar et al.

NeurIPS 2025 · arXiv:2511.00116
reinforcement learning control, liquid cooling optimization, multi-agent rl, thermal management, +4 more
1
citations
#93

ML4CFD Competition: Results and Retrospective Analysis

Mouadh Yagoubi, David Danan, Milad Leyli Abadi et al.

NeurIPS 2025
1
citations
#94

Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models

Konstantin Donhauser, Kristina Ulicna, Gemma Moran et al.

ICML 2025
1
citations
#95

ADELA: Accelerating Evolutionary Design of Machine Learning Pipelines with the Accompanying Surrogate Model

Yang Gu, Jian Cao, Hengyu You et al.

AAAI 2025
1
citations
#96

Active Measurement: Efficient Estimation at Scale

Max Hamilton, Jinlin Lai, Wenlong Zhao et al.

NeurIPS 2025
1
citations
#97

Selective Learning for Deep Time Series Forecasting

Yisong Fu, Zezhi Shao, Chengqing Yu et al.

NeurIPS 2025 · arXiv:2510.25207
time series forecasting, selective learning, uncertainty masking, anomaly detection, +4 more
1
citations
#98

The Robustness of Differentiable Causal Discovery in Misspecified Scenarios

Huiyang Yi, Yanyan He, Duxin Chen et al.

ICLR 2025
1
citations
#99

GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and FWI

Shiqian Li, Zhi Li, Zhancun Mu et al.

NeurIPS 2025
1
citations
#100

Can Private Machine Learning Be Fair?

Joseph Rance, Filip Svoboda

AAAI 2025
1
citations