🧬Applications

Scientific Machine Learning

ML for scientific computing and discovery

100 papers, 1,520 total citations

Feb '24 to Jan '26: 345 papers

Top Conferences

ICLR: 34, NeurIPS: 32, ICML: 20, AAAI: 8, CVPR: 3, ECCV: 2

Top Papers

#1

Fast Machine Unlearning without Retraining through Selective Synaptic Dampening

Jack Foster, Stefan Schoepf, Alexandra Brintrup

AAAI 2024, arXiv:2308.07707

machine unlearning, selective synaptic dampening, fisher information matrix, post hoc unlearning, +3

170 citations

#2

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Jun Shern Chan, Neil Chowdhury, Oliver Jaffe et al.

Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction

Xiang Fu, Brandon Wood, Luis Barroso-Luque et al.

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

Renrui Zhang, Xinyu Wei, Dongzhi Jiang et al.

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

Sachin Goyal, Pratyush Maini, Zachary Lipton et al.

SWE-smith: Scaling Data for Software Engineering Agents

John Yang, Kilian Lieret, Carlos Jimenez et al.

DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?

Liqiang Jing, Zhehui Huang, Xiaoyang Wang et al.

ICLR 2025, arXiv:2409.07703

data science agents, large language models, large vision-language models, data analysis tasks, +4

62 citations

#8

CycleResearcher: Improving Automated Research via Automated Review

Yixuan Weng, Minjun Zhu, Guangsheng Bao et al.

LLM-SR: Scientific Equation Discovery via Programming with Large Language Models

Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

Xurui Li, Ziming Huang, Feng Xue et al.

STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?

Yun Li, Yiming Zhang, Tao Lin et al.

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal et al.

ICLR 2025, arXiv:2407.01725

data-driven discovery, large language models, code generation, function calling, +4

36 citations

#13

Scaling Wearable Foundation Models

Girish Narayanswamy, Xin Liu, Kumar Ayush et al.

The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Filippo Bigi, Marcel Langer, Michele Ceriotti

From Mechanistic Interpretability to Mechanistic Biology: Training, Evaluating, and Interpreting Sparse Autoencoders on Protein Language Models

Etowah Adams, Liam Bai, Minji Lee et al.

Machine Unlearning Fails to Remove Data Poisoning Attacks

Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025, arXiv:2410.08196

mathematical reasoning, code generation, continued pretraining, synthetic data generation, +2

27 citations

#18

HyperFast: Instant Classification for Tabular Data

David Bonet, Daniel Mas Montserrat, Xavier Giró-i-Nieto et al.

AAAI 2024, arXiv:2402.14335

tabular data classification, hypernetwork architecture, meta-trained models, instant inference, +4

26 citations

#19

Learning to design protein-protein interactions with enhanced generalization

Anton Bushuiev, Roman Bushuiev, Petr Kouba et al.

ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World

Weixiang Yan, Haitian Liu, Tengxiao Wu et al.

Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish et al.

ICLR 2025, arXiv:2406.16257

machine unlearning, exact unlearning, parameter-efficient fine-tuning, parameter isolation, +4

22 citations

#22

Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians

Ishan Amin, Sanjeev Raja, Aditi Krishnapriyan

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.

Benchmarking Predictive Coding Networks -- Made Simple

Luca Pinchetti, Chang Qi, Oleh Lokshyn et al.

ICLR 2025, arXiv:2407.01163

predictive coding networks, bio-plausible deep learning, scalability in pcns, benchmarking neural networks, +1

18 citations

#25

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Manuel Brenner, Elias Weber, Georgia Koppe et al.

AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench

Edan Toledo, Karen Hambardzumyan, Martin Josifoski et al.

NeurIPS 2025, arXiv:2507.02554

ai research agents, automated machine learning, search policies, mcts algorithms, +4

15 citations

#27

Learning MDL Logic Programs from Noisy Data

Céline Hocquette, Andreas Niskanen, Matti Järvisalo et al.

AAAI 2024, arXiv:2308.09393

inductive logic programming, minimal description length, noisy data learning, recursive program synthesis, +2

15 citations

#28

Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming

Haoyang Liu, Jie Wang, Zijie Geng et al.

ICLR 2025, arXiv:2503.01129

mixed-integer linear programming, neural solving framework, trust-region search, problem reduction, +4

15 citations

#29

BatteryML: An Open-source Platform for Machine Learning on Battery Degradation

Han Zhang, Xiaofan Gui, Shun Zheng et al.

LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content

Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh et al.

ICLR 2025, arXiv:2410.10783

multi-modal models, visual question answering, test data contamination, scientific document understanding, +4

11 citations

#31

Adaptive Self-improvement LLM Agentic System for ML Library Development

Genghan Zhang, Weixin Liang, Olivia Hsu et al.

The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.

NeurIPS 2025, arXiv:2505.06371

inference energy consumption, energy measurement benchmark, generative ai services, automated optimization recommendations, +2

10 citations

#33

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Weikang Qiu, Zheng Huang, Haoyu Hu et al.

Fast training and sampling of Restricted Boltzmann Machines

Nicolas Bereux, Aurélien Decelle, Cyril Furtlehner et al.

Deep Nonlinear Sufficient Dimension Reduction

Yinfeng Chen, Yuling Jiao, Rui Qiu et al.

Neural Auto-designer for Enhanced Quantum Kernels

Cong Lei, Yuxuan Du, Peng Mi et al.

Bridging the Semantic Latent Space between Brain and Machine: Similarity Is All You Need

Jiaxuan Chen, Yu Qi, Yueming Wang et al.

On Harmonizing Implicit Subpopulations

Feng Hong, Jiangchao Yao, Yueming Lyu et al.

Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development

Daoyuan Chen, Haibin Wang, Yilun Huang et al.

FlashMD: long-stride, universal prediction of molecular dynamics

Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi et al.

NeurIPS 2025, arXiv:2505.19350

molecular dynamics simulation, hamiltonian dynamics, thermodynamic ensembles, long-stride prediction, +3

7 citations

#41

Learning-Augmented Search Data Structures

Chunkai Fu, Brandon G. Nguyen, Jung Seo et al.

ICLR 2025, arXiv:2402.10457

learning-augmented algorithms, search data structures, skip lists, kd trees, +4

6 citations

#42

Causal Discovery from Conditionally Stationary Time Series

Carles Balsells-Rodas, Xavier Sumba, Tanmayee Narendra et al.

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

Jing Yang

ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency

Shaocheng Yan, Pengcheng Shi, Jiayuan Li

Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods

Lise Le Boudec, Emmanuel de Bézenac, Louis Serrano et al.

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models

Daoyuan Chen, Yilun Huang, Xuchen Pan et al.

Scalable Bayesian Learning with posteriors

Samuel Duffield, Kaelan Donatella, Johnathan Chiu et al.

PINNsAgent: Automated PDE Surrogation with Large Language Models

Qingpo Wuwu, Chonghan Gao, Tianyu Chen et al.

Understanding Generalization in Quantum Machine Learning with Margins

Tak Hur, Daniel Kyungdeock Park

DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning

Aaditya Naik, Jason Liu, Claire Wang et al.

In-Context Learning of Stochastic Differential Equations with Foundation Inference Models

Patrick Seifner, Kostadin Cvejoski, David Berghaus et al.

SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning

Hong Wang, Jie Wang, Minghao Ma et al.

Scaling Physical Reasoning with the PHYSICS Dataset

Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.

X-Hacking: The Threat of Misguided AutoML

Rahul Sharma, Sumantrak Mukherjee, Andrea Šipka et al.

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.

Towards Establishing Guaranteed Error for Learned Database Operations

Sepanta Zeighami, Cyrus Shahabi

An LLM-Empowered Adaptive Evolutionary Algorithm for Multi-Component Deep Learning Systems

Haoxiang Tian, Xingshuo Han, Guoquan Wu et al.

Online Continuous Generalized Category Discovery

Keon-Hee Park, Hakyung Lee, Kyungwoo Song et al.

Epistemic Monte Carlo Tree Search

Yaniv Oren, Viliam Vadocz, Matthijs T. J. Spaan et al.

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu Lai, Sixiang Chen, Yunlong Lin et al.

Causal-StoNet: Causal Inference for High-Dimensional Complex Data

Yaxin Fang, Faming Liang

Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring

Yuyan Chen, Nico Lang, B. Schmidt et al.

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.

Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks

Emanuel Sommer, Jakob Robnik, Giorgi Nozadze et al.

ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials

Pin Chen, Zexin Xu, Qing Mo et al.

ICLR 2025

electronic charge density, density functional theory, crystalline materials, machine learning prediction, +3

3 citations

#66

No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs

Krzysztof Kacprzyk, Mihaela van der Schaar

AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise

Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.

NeurIPS 2025, arXiv:2507.00310

autonomous scientific discovery, bayesian surprise, monte carlo tree search, hypothesis generation, +4

3 citations

#68

PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors

Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.

Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms

Philippe Wyder, Judah Goldfeder, Alexey Yermakov et al.

AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Sebastian Joseph, Syed M. Husain, Stella Offner et al.

CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning

Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.

Prices, Bids, Values: One ML-Powered Combinatorial Auction to Rule Them All

Ermis Soumalias, Jakob Heiss, Jakob Weissteiner et al.

AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing

Sam Bright-Thonney, Christina Reissel, Gaia Grosso et al.

CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling

Matthew Fortier, Mats L. Richter, Oliver Sonnentag et al.

ICLR 2025, arXiv:2406.04940

carbon flux modelling, multimodal dataset, satellite imagery, meteorological predictors, +3

2 citations

#75

The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning

Toby Boyne, Juan Campos, Rebecca Langdon et al.

A Multiscale Frequency Domain Causal Framework for Enhanced Pathological Analysis

Xiaoyu Cui, Weixing Chen, Jiandong Su

ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

Daolang Huang, Xinyi Wen, Ayush Bharti et al.

NeurIPS 2025, arXiv:2506.07259

amortized bayesian inference, active data acquisition, transformer architecture, reinforcement learning, +3

2 citations

#78

Towards Source-Free Machine Unlearning

Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.

AutoSciLab: A Self-Driving Laboratory for Interpretable Scientific Discovery

Saaketh Desai, Sadhvikas Addamane, Jeffrey Y. Tsao et al.

PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models

Alex Velez-Arce, Marinka Zitnik

ML4CFD Competition: Results and Retrospective Analysis

Mouadh Yagoubi, David Danan, Milad Leyli Abadi et al.

LC-Opt: Benchmarking Reinforcement Learning and Agentic AI for End-to-End Liquid Cooling Optimization in Data Centers

Avisek Naug, Antonio Guillen-Perez, Vineet Kumar et al.

COGNATE: Acceleration of Sparse Tensor Programs on Emerging Hardware using Transfer Learning

Chamika Sudusinghe, Gerasimos Gerogiannis, Damitha Lenadora et al.

Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models

Konstantin Donhauser, Kristina Ulicna, Gemma Moran et al.

ADELA: Accelerating Evolutionary Design of Machine Learning Pipelines with the Accompanying Surrogate Model

Yang Gu, Jian Cao, Hengyu You et al.

Active Measurement: Efficient Estimation at Scale

Max Hamilton, Jinlin Lai, Wenlong Zhao et al.

The Robustness of Differentiable Causal Discovery in Misspecified Scenarios

Huiyang Yi, Yanyan He, Duxin Chen et al.

Towards Learning High-Precision Least Squares Algorithms with Sequence Models

Jerry Liu, Jessica Grogan, Owen Dugan et al.

Can Private Machine Learning Be Fair?

Joseph Rance, Filip Svoboda

GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and FWI

Shiqian Li, Zhi Li, Zhancun Mu et al.

AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation

Xinbiao Wang, Yuxuan Du, Zihan Lou et al.

Understanding Generalization in Physics Informed Models through Affine Variety Dimensions

Takeshi Koshizuka, Issei Sato

ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs

Jiale Ma, Wenzheng Pan, Yang Li et al.

Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving

Sohei Arisaka, Qianxiao Li

ICML 2024

gradient estimation technique, meta-learning techniques, hyperparameter selection, legacy numerical solvers, +3

Citations not collected

#95

MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark

Junjie Xing, Yeye He, Mengyu Zhou et al.

Position: Is machine learning good or bad for the natural sciences?

David W. Hogg, Soledad Villar

ICML 2024

causal inference, confounder representation, machine learning ontology, machine learning epistemology, +4

Citations not collected

#97

EDBench: Large-Scale Electron Density Data for Molecular Modeling

Hongxin Xiang, Ke Li, Mingquan Liu et al.

Semantic-KG: Using Knowledge Graphs to Construct Benchmarks for Measuring Semantic Similarity

Qiyao Wei, Edward R Morrell, Lea Goetz et al.

CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research

Owen Queen, Harrison Zhang, James Zou

Position: Mission Critical – Satellite Data is a Distinct Modality in Machine Learning

Esther Rolf, Konstantin Klemmer, Caleb Robinson et al.

ICML 2024

satellite data modality, multispectral imagery analysis, geospatial machine learning, temporal data modeling, +4

Citations not collected
