
Online Magazine

How ML simulates ice nucleation

How does water turn into ice? Our understanding of this process has long been limited because it is hard to observe directly. Pablo Piaggi and his team at Princeton University have now simulated the process on a computer with unprecedented accuracy. A conversation about machine learning simulations and how knowledge of ice formation could improve food processing and climate models.



Mr. Piaggi, you managed, with the help of a machine learning algorithm, to simulate the “nucleation of ice”. What exactly does this term mean?
Ice nucleation is the very first stage of the freezing of water. At first, water molecules move around relatively freely; at some point, they start interacting more strongly and form the network that is ice. Once the ice cluster has formed, it grows. That very first step is called nucleation, and that is what we were able to simulate.

How did you come up with the idea to use machine learning for this kind of simulation?
This idea has actually been present in the field for about 15 years. However, it only really took off around three or four years ago, when several software developments, such as libraries from Google and Meta, made the necessary machine learning (ML) tools more accessible. Furthermore, only now do we have hardware that is fast enough for this type of algorithm.

These two factors combined were a real revolution in the field, and many researchers doing molecular simulations are starting to use ML models. Some years ago, I wrote a project proposal on using them to simulate ice nucleation with quantum accuracy for the first time. Then I came to Princeton because a software library for this type of algorithm had been developed here.

And why did you decide on simulating the nucleation of ice specifically, and not another substance?
Well, first of all, water is one of the most common substances on our planet, so the formation of ice affects many processes that we deal with every day – things like precipitation and everything else that is connected to the water cycle. On top of that, water is an interesting subject of study because it exhibits certain anomalous behavior. And finally, there are several faculty members here at Princeton who have worked on water and ice throughout their careers.

How exactly do you produce these simulations using machine learning? And what is the concrete benefit of this approach?
First, we needed to obtain the forces acting between atoms and molecules, because these are the quantities that drive the dynamics. These forces were derived from first principles, meaning from the behavior of the electrons. After that, we trained a machine learning model to learn the force exerted on each atom based on the positions of its neighboring atoms. Using this model, we can then drive the dynamics and simulate the formation of ice.

The huge benefit of using an ML model is that it is much cheaper than solving the equations directly on the computer, which is what we had to do for a few thousand small configurations to create the training data for the model. Once we have a well-trained model, we can study much larger systems and simulate for longer periods of time.
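
The workflow described here (label small configurations with an expensive reference calculation, fit a model to the forces, then let the model drive the dynamics) can be illustrated with a toy Python sketch. It is not the method used in the study: the reference is a simple Lennard-Jones pair force standing in for the first-principles calculation, the model is a least-squares fit on Gaussian basis functions rather than a deep neural network, and all names and parameters (reference_force, CENTERS, WIDTH, train_force_model, simulate) are hypothetical.

```python
import numpy as np

def reference_force(r):
    # Stand-in for the expensive first-principles step (assumption): a
    # Lennard-Jones pair force in reduced units, not a quantum calculation.
    return 24.0 * (2.0 / r**13 - 1.0 / r**7)

CENTERS = np.linspace(0.7, 2.8, 50)   # hypothetical basis-function centers
WIDTH = 0.07                          # hypothetical basis-function width

def features(r):
    # Describe the "atomic environment" (here just one distance) with
    # Gaussian radial basis functions.
    r = np.atleast_1d(np.asarray(r, dtype=float))[:, None]
    return np.exp(-((r - CENTERS) ** 2) / (2.0 * WIDTH**2))

def train_force_model(n_samples=2000, seed=0):
    # Step 1: label a few thousand small configurations with the reference.
    rng = np.random.default_rng(seed)
    r_train = rng.uniform(0.95, 2.5, n_samples)
    f_train = reference_force(r_train)
    # Step 2: fit the model (linear least squares on the features).
    weights, *_ = np.linalg.lstsq(features(r_train), f_train, rcond=None)
    return lambda r: features(r) @ weights

def simulate(force_fn, r0=1.3, n_steps=2000, dt=0.002, reduced_mass=0.5):
    # Step 3: drive the dynamics with the learned force (velocity Verlet
    # for the separation of a two-atom system, starting at rest).
    r, v = r0, 0.0
    f = force_fn(r)[0]
    trajectory = np.empty(n_steps)
    for i in range(n_steps):
        v += 0.5 * dt * f / reduced_mass
        r += dt * v
        f = force_fn(r)[0]
        v += 0.5 * dt * f / reduced_mass
        trajectory[i] = r
    return trajectory

if __name__ == "__main__":
    model = train_force_model()
    r_test = np.linspace(1.0, 2.4, 200)
    error = np.max(np.abs(model(r_test) - reference_force(r_test)))
    trajectory = simulate(model)
    print(f"largest force error on test distances: {error:.2e}")
    print(f"separation stayed between {trajectory.min():.3f} and {trajectory.max():.3f}")
```

In this toy setting the reference force is cheap, so nothing is gained. The point is only the division of labor: the costly calculation is needed once to create training data, and afterwards every simulation step calls the fitted model instead.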



Larger systems and a longer period of time you say – but doesn’t nucleation take place very rapidly and at a very small size?
(Laughs) You’re right! The system size that one can simulate is on the order of tens of nanometers (a nanometer is a billionth of a meter), which is really small. However, the systems still consist of around 300,000 atoms. The time is in fact also rather short, only a few tens of nanoseconds (a nanosecond is a billionth of a second). Still, that is a thousand times longer than what we were able to simulate before. So, it’s very significant.
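
To put the quoted atom count into a length scale, here is a rough back-of-the-envelope estimate in Python. It assumes a cubic simulation box filled with water at a density of 1 g/cm^3; both the box shape and the density are assumptions for illustration and are not taken from the interview.

```python
AVOGADRO = 6.02214076e23      # molecules per mole (exact SI value)
MOLAR_MASS_WATER = 18.015     # g/mol
DENSITY = 1.0                 # g/cm^3, assumed value for liquid water
ATOMS_PER_MOLECULE = 3        # H2O: two hydrogen atoms, one oxygen atom

def cubic_box_edge_nm(n_atoms, density=DENSITY):
    """Edge length (nm) of a cubic box holding n_atoms atoms of water."""
    n_molecules = n_atoms / ATOMS_PER_MOLECULE
    volume_cm3 = n_molecules * MOLAR_MASS_WATER / (AVOGADRO * density)
    volume_nm3 = volume_cm3 * 1e21   # 1 cm = 1e7 nm, so 1 cm^3 = 1e21 nm^3
    return volume_nm3 ** (1.0 / 3.0)

print(f"{cubic_box_edge_nm(300_000):.1f} nm")   # prints 14.4 nm
```

Under these assumptions, 300,000 atoms correspond to 100,000 water molecules in a box roughly 14 nm across.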

Can you explain in more detail what is so revolutionary about your approach?
While we are not the first to simulate the nucleation of ice, we managed to do it for the first time using machine learning (a deep neural network, to be more exact) and what we call first principles calculations. This allowed us to achieve very high accuracy, which is indeed revolutionary: until now, simulating ice nucleation with quantum accuracy was thought to be impossible due to the huge computational cost of quantum-mechanical calculations. By letting an ML model do the calculations, these costs can be significantly reduced.

Moreover, since our predictions are derived from first principles, one doesn’t have to know anything in advance about how a substance behaves in the real world. That is true for what we studied (ice), but the method can be applied to other substances as well and in many cases achieves the same high quality of prediction.


First principles calculations – what exactly do you mean by that?
First principles, in physics, are the fundamental laws of nature – the most basic principles that don’t have to be broken down or proved any further. Starting from these, we calculated several properties that are connected to the phenomenon of nucleation. In particular, we were interested in a property called “nucleation rate” – the speed with which a cluster of ice forms. This property can be measured experimentally, so we could directly compare our predictions with experiments. Having the opportunity not only to predict this property with the computer but also to do it from first principles was very exciting.

Why are experiments not able to observe the phenomenon directly?
As we already discussed, nucleation takes place in a very short time and on a scale of only nanometers. This means that it is very hard to see what is going on during the actual process in an experiment. Thus, simulations are very popular in this context: they give insights into molecular mechanisms that are otherwise impossible to obtain. That is true in general for the study of crystallization and nucleation.

Figure 1: Simulation of an ice cluster surrounded by liquid water driven by the machine learning model. Water molecules are depicted as opaque if they have ice-like atomic environments and as semi-transparent if they have liquid-like atomic environments. Water molecules are composed of one oxygen atom (red) and two hydrogen atoms (white).

Now that you can simulate the nucleation of ice with quantum accuracy – how can this be used?
There are three main areas that may benefit from this. First, cryopreservation: the idea that one can preserve living cells and tissue by rapidly cooling the sample. Understanding how ice forms in this context could bring a technological advantage. Second, food processing, as food is often frozen in order to distribute it. Third, climate models: knowledge of the formation of ice could improve them, especially once we are able to simulate how ice forms in the atmosphere.

How will you achieve that?
We are currently trying to understand how ice forms in more realistic environments. So far, we have only studied how ice forms in pure water, but in nature, ice also forms, for example, at the surface of particles. That is thought to play a crucial role in the formation of ice in the atmosphere of our planet. And we could use the exact same method that we used to simulate the nucleation of ice in water.

So, what’s the next step?
We have just identified which particles are mainly responsible for the formation of ice in the atmosphere. It’s a mineral called feldspar, which is very abundant in the Earth’s crust. And now, we’re studying how ice forms at these surfaces. How exciting would it be to unravel the mystery of how ice forms in the atmosphere!

About Pablo Piaggi

Pablo Piaggi (born 1990) is currently a postdoctoral research associate in the Department of Chemistry at Princeton University. He received his Ph.D. in Materials Science and Engineering from EPFL (Switzerland) in 2019. Piaggi was awarded a postdoctoral mobility fellowship from the Swiss National Science Foundation, which brought him to Princeton. For his work on the use of advanced simulation methods to predict the crystal structure of materials, he has received several awards, including the 2021 IBM Research Award.

