Dependencies
- TinNet
- The Alloy-Theoretic Automated Toolkit (ATAT)
- PyTorch (v1.3.1)
- Pymatgen
- Python3
- Numpy
Background
Surface properties of materials are important for catalysis applications. Recent advances in computing power make density functional theory (DFT) to be possible
to gain insights into the reactions happening on the catalyst surfaces. However, the surface environments are way more complex and adsorbates on surfaces can have
many configurations. One possible way to solve this issue is by employing the cluster expansion to consider the adsorbate-adsorbate interactions.
Cluster expansion
Cluster expansion is developed based on the Ising model. With infinite terms of interactions considered, the exact energy of the system is obtained.
The Hamiltonian describing the energy of the system can be espressed as
\(
\begin{align}
\begin{split}
E = \pi * V
\end{split}
\end{align}
\)
π is the correlation matrix which is obtained through ATAT package and V is the interaction energy which can be fitted by using DFT true values. Although it works
well with pure metal surfaces[], the complexity increase dramatically when considering a alloy surface with various adsorbates. So a nonlinear CNN fitting model
is employed to consider not only the ads-ads interactions on the surface layer, but the local environment of the metal layer, which helps improve the accuracy of
predicting the formation energy of adsorbates.
DFT database pool
DFT database is provided by Bajpai et al. including pure Pt (111) with NO and O adsorbates.There are 648 images in total. The energy to train in this field is usually the formation energy per site, \(E_F(\sigma)\).
\(
\begin{align}
\begin{split}
Pt + \frac{\theta_O}{2}O_2 + \theta_{NO}NO \rightarrow PtO_{\theta_O}NO_{\theta_{NO}} \> \> \> (\sigma)
\end{split}
\end{align}
\)
\(
\begin{align}
\begin{split}
E_F(\sigma) = \frac{E_0(\sigma)-E_0(0)}{N} - \frac{\theta_O}{2}E_{0,O_2} - \theta_{NO}E_{0,NO}
\end{split}
\end{align}
\)
ASE db file or any other file format. Containing the input images and energy to train
Image format is pretty flexible as long as it can be recognized by Voronoi. The energy
Pure Pt(111) with NO and O adsorbates
From LCNN there are preprocessed data samples to be used. In structure
Cu dopant in Pt(111) with NO and O adsorbates
In this work, for convenience the VASP POSCAR format is used since it can be easily converted to other formats.
Also there is a function Cvrt_Cart_Frac.py which can convert the coordinates from Cartisan to fractional (Direct) coordinate for the convenience of next step since ATAT use fractional coordinate.
CM.txt file containing the needed correlation matrix for each image.
Prepare the dataset into the str.in format which can be recognized by the ATAT package.
The format of str.in can be simply obtained from the ATAT manual.
Once all str.in format structures are ready, one can use the Str_to_arr.py to generate the needed CM.txt file. The specific code to implement is called corrdump
which can read in the lat.in and str.in, and use criterion as the cut off to choose how many clusters to include.
Can choose to use clusters from literature or generate them from the ATAT package directly.
corrdump
requires several input options:
- reads the lattice file (-l option)
- find all symmetrically distinct clusters satisfying the conditions specified by two ways:
- distance between nodes: -2 through -6 (-2=2.1 means pairs shorter than 2.1 units is selected.)
- read in clusters directly from other sources: -c -cf=filename(usually clusters.out)
- reads the structure file (-s option)
With all inputs, it determines the corresponding correlation factor for each cluster for this specific structure read in. The output is in one line with the same order as the clusters.out.
While for a dataset with multiple images/strctures, multiple lines of correlation factors are generated and a m*n array can be generated where m is the number of images and n is the number of clusters included
Preparing the data images and energies
For convenience, the ASE db format is used to be the dataset which can have the images in various struture format and have the according energy for each image.
python Gen_formation_db.py
Formation energy per site which has been mentioned before is used as the training target.
For Cu1Pt(111) system, it was calculated by…
Training and Evaluating
Hyperparameter optimization
ray tune is employed to tune the hyperparameters in the model.
Tianyou Mou
Chemical Engineer and Data Scientist
Comments