ScienceGuardians

Improving machine-learning models in materials science through large datasets

Authors: Jonathan Schmidt, Tiago F. T. Cerqueira, Aldo H. Romero, Antoine Loew, Fabian Jäger, Hai-Chen Wang, Silvana Botti, Miguel A. L. Marques
Publisher: Elsevier BV
Publish date: 2024-11
ISSN: 2542-5293 DOI: 10.1016/j.mtphys.2024.101560

As a researcher in this field, I see several serious problems with this study. The authors repeatedly overstate their findings.

1. The ML potentials are NOT “universal.” The authors call their force fields “universal,” yet their own benchmarks show systematic failures and large errors for some classes of materials. A truly universal model should not break down on parts of the chemical space it was trained to cover.
2. The data omit key physics. The database ignores temperature and entropy effects, which is a major problem for predicting stable materials: the “stable” crystals it identifies may only be stable at absolute zero (0 K), not under real laboratory conditions. This makes the stability predictions far less useful in practice.
3. Some accuracy claims are misleading. Reporting a “33% error” on band gaps sounds dramatic, but the aggregate error is kept artificially low because most materials in the dataset are metals with a band gap of exactly zero, which are trivial to predict. The model is likely much worse on actual insulators and semiconductors, where the band gap matters.
4. The 2D-material analysis is flawed. They apply the convex-hull construction, which is designed for 3D bulk crystals, to 2D materials. In reality, 2D materials are grown on substrates that stabilize them, so this approach could wrongly discard many real, synthesizable 2D materials.
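Point 3 is easy to demonstrate numerically. Below is a toy illustration (not the paper's data; the 80/20 metal/insulator split and the 1–5 eV gap range are made-up numbers): when a dataset is dominated by zero-gap metals, even a model that always predicts 0 eV gets a modest-looking average error, while being useless on the insulators that actually matter.

```python
# Toy example: why an aggregate band-gap error can look small
# when most database entries are metals. All numbers are invented.
import random

random.seed(0)

# Assume ~80% of entries are metals (gap exactly 0 eV) and ~20% are
# insulators/semiconductors with gaps between 1 and 5 eV.
true_gaps = [0.0] * 800 + [random.uniform(1.0, 5.0) for _ in range(200)]

# A trivial "model" that calls everything a metal (predicts 0 eV).
pred_gaps = [0.0] * len(true_gaps)

mae_all = sum(abs(t - p) for t, p in zip(true_gaps, pred_gaps)) / len(true_gaps)
insulators = [(t, p) for t, p in zip(true_gaps, pred_gaps) if t > 0]
mae_ins = sum(abs(t - p) for t, p in insulators) / len(insulators)

print(f"MAE over all materials:  {mae_all:.2f} eV")  # looks modest
print(f"MAE over insulators only: {mae_ins:.2f} eV")  # roughly the mean gap
```

The headline metric rewards the model for the easy majority class; any honest evaluation should report errors on the nonzero-gap subset separately.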

In short, the paper demonstrates the power of large datasets for training AI models. But the resulting models are better at reproducing their own DFT reference calculations than at predicting real-world material behavior. The biggest issue remains the neglect of temperature and entropy, which limits the practical use of the findings.
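The entropy objection can be stated in one line. At finite temperature, phase stability is governed by the free energy of formation rather than the 0 K energy used to build the convex hull (a standard textbook relation, not a formula from the paper):

```latex
\Delta G_f(T) \;\approx\; \Delta E_f \;-\; T\,\Delta S
```

Since the $-T\Delta S$ term can reach tens of meV/atom at synthesis temperatures, phases that sit only slightly above or below the 0 K hull can easily swap their stability ordering once entropy is included.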
