Grand Canonical Monte Carlo (GCMC)

General info

Tip

Take a look at the GCMC tutorial and learn how to setup a GCMC calculation.

About Monte Carlo / the Grand Canonical Ensemble

It is best to read a bit about Monte Carlo and ensembles before working with the GCMC code. Almost every book or review text on molecular simulations will do, for example: Frenkel D, Smit B. Understanding molecular simulation: from algorithms to applications. Academic Press; 2002. 672 p.

Wikipedia also has some pages of interest:

It is important to note that this method heavily relies on random numbers, and simulations are thus non-repeatable in detail, but should converge to the same answer.

About the Reaxff GCMC code

The GCMC code for reaxff was originally developed by Thomas Senftle, working as a Graduate Student at Penn State University under the supervision of Dr. Adri van Duin. The original version was a wrapper code that called an external executable to perform the reaxff minimization step and energy calculation, and relied on file modification and parsing to steer the reaxff code and get the results back.

A rewrite of the code, made by Hans van Schoot (SCM) in close collaboration with Thomas Senftle, is now available in the ADF package. The rewrite directly integrates into the ADF-ReaxFF code, solving performance issues of the original code by removing the calling overhead of the reaxff executable and the relatively slow file management. It also merged several modifications of the original code to support the usage of whole molecules for Monte Carlo moves, and supports the usage of multiple atom/molecule types during the simulation. Other improvements were made on the input options, the accessible volume calculation, the MC acceptance prefactor calculation and the writing of logfiles.

The relevant papers are:

Thomas P. Senftle, Randall J. Meyer, Michael J. Janik and Adri C.T. van Duin, Development of a ReaxFF potential for Pd/O and application to palladium oxide formation, J. Chem. Phys. 139, 044109 (2013)

Thomas P. Senftle, Adri C.T. van Duin, Michael J. Janik, Determining in situ phases of a nanoparticle catalyst via grand canonical Monte Carlo simulations with the ReaxFF potential, Catalysis Communications volume 52, 5 July 2014, Pages 72–77

Input

A worked out GCMC exercise is available from a 2-day ReaxFF workshop.

Overview

The GCMC code in ADF-ReaxFF needs the following input files to run:

  • control_MC : The GCMC control file, which holds MC settings and the atoms/molecules to insert/move/delete

  • control : A reaxff control file, in which only a small number of parameters is of interest

  • ffield : A standard reaxff forcefield file

  • geo : A geometry file, preferably in biograph format (code not yet tested with xyz)

  • iopt : Text file that should only contain a “5” (without the quotes)

  • insertData_MC : (optional file) Table used when restarting GCMC simulations

the control_MC file

Lines in the control_MC file that start with ! or # will be ignored, so those can be used for comments. Empty lines are also ignored, so feel free to leave some in the file. Lines with keywords should have their value in the first 8 columns, followed by a couple of spaces (at least 1), followed by the 6-character keyword. The order is free, except for the nmols keyword, which should be the last one. The nmols keyword signals the parser that the next section of control_MC will define X new MC Molecule Types.

This is an example for the control_MC file:

! GCMC control file example
      0   iensmb !select MC ensemble (0=Mu-NVT with fixed volume, 1=Mu-NPT with variable volume)
   5000   niter  !number of MC iterations to do this simulation
      0   nstart !start the iteration counter with an offset, useful for restarts to avoid double files
  300.0   mctemp !Temperature of the simulation, affects acceptance rate for steps that increase the energy
    0.0   mcpres !NPT pressure in GPa (set to zero for incompressible solid systems unless at very high pressures)
    3.0   rmaxpl !Max radius for atom placement on insert/displace move
    1.2   rminpl !Min radius for atom placement on insert/displace move
   2000   nmctry !Maximum number of trials allowed when inserting or moving a molecule. If the
!                !  rmaxpl and rminpl variables are very strict, this number needs to be large
      1   igcfac !include GC prefactor in probabilities? 0 = no 1 = yes
      0   ivol   !select MC volume calculation technique:
!                !  0: vvacu needed! volume = total volume - occupied volume - specified vacuum volume (vvacu)
!                !  1:               volume = total cell volume
!                !  2: vacc needed!  volume = specified accessible volume (vacc)
!                !  3:               volume = total cell volume - occupied volume
!                !  4: vacc needed!  volume = specified accessible volume (vacc) - occupied volume
  435.0   vacc   !if ivol=2 or ivol=4, specify Vacc in angstroms^3
    0.0   vvacu  !if ivol=0 specify non-accessible (vacuum) volume Vvacu in angstroms^3
   0.25   ivlim  !volume change limit (value between between 0 and 1, Vnew = ((1+ivlim)*V1)
      1   resopt !write restart files: 0=no, 1=yes
      1   resfrq !frequency of writing restart files (MC code only writes files if the MC move is accepted)
      0   debug  !print debug output if set to 1, print even more debug output when set to 2
      5   nmols  !Number of MC molecule types, must match the number of molecule blocks that follow!
!
! Molecule Specific Data: C2H2 example
!    This part is fixed format!
!    We need cmpot on line 1,
!    possibly followed by the noinsr on line two,
!    and forced to be ended with nmatom on line 2 or 3, followed by nmatom lines of coordinates.
! the coordinates are FIXED FORMAT! (24d.15,1x,A2) x,y,z (24 wide, 15 after decimal), 1 space, 2chars symbol)
 -75.00   cmpot   !chemical potential of molecule
      4   nmatom  !number of atoms in molecule
      12.180480000000000       0.421696000000000       1.316689000000000 C
      13.124731000000000       0.376902000000000       0.568360000000000 C
      11.349475000000000       0.459560000000000       1.988208000000000 H
      13.957314000000000       0.335843000000000      -0.101000000000000 H

!Molecule Specific Data: C2H4 example
 -75.00   cmpot   !chemical potential of molecule
      6   nmatom  !number of atoms in molecule
      13.989222000000000       0.405391000000000       1.000150000000000 C
      13.316784000000000       0.399646000000000       0.885795000000000 C
      11.494513000000000       0.461837000000000       1.970612000000000 H
      11.335219000000000       0.353577000000000       0.129581000000000 H
      13.811701000000000       0.340224000000000      -0.084000000000000 H
      13.970561000000000       0.453325000000000       1.756236000000000 H

!Molecule Specific Data: H2O Example
 -75.00   cmpot   !chemical potential of molecule
      1   noinsr  !setting this to 1 disables insert/deletion moves. If it is set to 1 for all types, the ensemble becomes NVT/NPT
      3   nmatom  !number of atoms in molecule
      39.996720000000000      40.747660000000000      40.512210000000000 H
      40.000210000000000      39.999520000000000      39.934730000000000 O
      40.000030000000000      39.259880000000000      40.523700000000000 H

!Molecule Specific Data: H2 Example
 -75.00   cmpot   !chemical potential of molecule
      2   nmatom  !number of atoms in molecule
       5.025812000000000      0.0000000000000000       0.000000000000000 H
       5.774188000000000      0.0000000000000000       0.000000000000000 H

!Molecule Specific Data: Single atom example
 -75.00   cmpot   !chemical potential of molecule
      1   nmatom  !number of atoms in molecule
       0.000000000000000      0.0000000000000000       0.000000000000000 H

The Molecule Specific Data blocks define the molecules (or atoms) that can be inserted/moved/deleted with the MC code. The atoms named here should of course be in the forcefield files, and the coordinates should form a reasonable structure. The MC code uses these coordinates during the insertion step by giving them a random rotation, followed by a random translation to generate a random position of the molecule inside the box. Currently, there is no check to make sure the molecule stays inside the boundaries of the box, the code only checks that the rmaxpl/rminpl values are satisfied. If you plan on inserting large molecules, make sure there is enough room in the rmaxpl value, otherwise the code will stop with an error message.

The chemical potential (cmpot) keyword

The cmpot keyword sets the chemical potential of the molecule (or atom) reservoir, and is employed when calculating the Boltzmann accept/reject criteria after a MC move is executed. This value can be derived from first principles using statistical mechanics, or equivalently, it can be determined from thermochemical tables available in literature sources.

For example, the proper chemical potential for a GCMC simulation in which single oxygen atoms are exchanged with a reservoir of O2 gas, should equal 1/2 the chemical potential of O2 at the temperature and pressure of the reservoir:

cmpot = Mu_O(T,P) = 1/2*Mu_O2(T,P) = 1/2 * [Mu_ref(T,P_ref) + kT*Log(P/Pref) - E_diss]

where the reference chemical potential [Mu_ref(T,P_ref)] is the experimentally determined chemical potential of O2 at T and Pref, kT*Log(P/Pref) is the pressure correction to the free energy, and E_diss is the dissociation energy of the O2 molecule.

The no insert (noinsr) keyword

The noinsr keyword tells the GCMC code to keep the number of molecules/atoms of this type fixed. It will thus disable Insert/Delete moves on this type, meaning it can only do a displacement move, or volume change move (if the iensmb keyword is set to 1).

the control file

The control file is a regular reaxff control file and it influences the minimization step after an MC trial move. Because of this, only a small number of the reaxff keywords are used during the GCMC simulation.

An example of the control file:

# some of the parameters that influence the minimization step in the GCMC code
      1 icentr     Put the center of mass at the center of the cube
      1 igeofo     0:xyz-input geometry 1: Biograf input geometry 2: xmol-input geometry
2.50000 endmm      End point criterion for MM energy minimization
    500 imaxit     Maximum number of iterations
      0 icelop     Optimize cell parameters 0=no 1=yes
1.00050 celopt     Cell parameter change
      0 imaxmo     In this case: 0: POLAK_RIBIERE Conj.Grad method, 1: Limited-memory BFGS method

The code has been tested with various imaxit and endmm values, the other options have not been fully tested. Other reaxff keywords might also influence the minimization procedure, but those are best left to their default settings

the ffield file

The ffield file should be an normal reaxff forcefield file, as described in the reaxff documentation by A. van Duin (visit the documentation section on the SCM website to obtain this document).

the geo file

The GCMC code has been tested with biograph input files, but other input formats might work. The details of this file are also described in the original reaxff documentation by van Duin.

the iopt file

The iopt file is a text file with a single digit inside that selects the execution mode of the reaxff code. To run the GCMC code, this file should contain a “5” (without the quotes).

the insertData_MC file

The GCMC code can insert multiple atom/molecule types in a single simulation, so it needs to keep track of what atom belongs to which insert. This information is automatically stored and updated when insertion/deletion/moving of atoms or molecules during the simulation, but is by default unknown of the atoms of the starting geometry. The GCMC code will therefore by default not modify the atoms in the original input in the MC trial moves (keep in mind that they can move around during the minimization step). the insertData_MC file can be used to tell the GCMC code what atoms in the geo file belong to which molecule.

An example of the insertData_MC file:

#   atomNumber  MCInsMolType  MCInsertNmbr
      30       1       1
      40       2       1
      46       2       1
      47       1       2
      48       1       3

This example specifies 4 molecules/atoms that are modifiable by the GCMC code, belonging to 2 different GCMC molecules/atoms that are defined in the control_MC file. The first “molecule” in the control_MC file should thus consist of a single atom (if this doesn’t match, the code will most likely crash!). It was inserted three times (atom 30, 47 and 48) The second molecule has two atoms, and was inserted once.

The atoms do not have a fixed order, and not all atoms have to be defined. If an atom is not appointed to a certain MCInsMolType and MCInsertNmbr, if will simply not be modified during the MC moves. The insertData_MCXXXXXX files generated by the restart option of the code can be directly used as valid insertData_MC files, just remove the digits from the filename and replace the geo file with the corresponding geo_MCXXXXXX file.

Output

Overview

The GCMC code writes a couple of output files, each described in this section. It also produces a number of reaxff output files, and some of these are described in the original reaxff documentation by van Duin. Keep in mind that these files might not provide a complete or correct picture of the simulation, as they could also contain data originating from rejected MC trial moves.

geo_MCXXXXXX

This file is generated every X accepted MC moves and contains the current geometry of the system in biograph format (X is set with the resfrq keyword in the control_MC file).

insertData_MCXXXXXX

This file contains a table of all the atoms in the system with their MC Molecule Type and MC Insert Number. This data can be used to map atoms to an inserted molecule, and is needed if you want to restart your calculation from an accepted MC step. The table contains -1 values for atoms that were in the original input and did not get a manually assigned MCInsert Molecule Type and MC Insert Number, the GCMC code will not modify these atoms during the MC steps.

Also see the section on insertData_MC file.

MCstats

The MCstats file is a logfile that contains the statistics of the MC simulation. The GCMC code writes a single line to it after every MC step, containing the number of: Tried MC moves (tried), Accepted MC moves (accept), Rejected MC moves (reject), Accepted Insertion/Deletion/Moving/Volume change MC moves (addAcc/delAcc/mvAcc/volAcc), Rejected Insertion/Deletion/Moving/Volume change MC moves (addRej/delRej/mvRej/volRej)

An example of the MCstats file:

tried accept reject    addAcc delAcc  mvAcc volAcc    addRej delRej  mvRej volRej
    0      1      0         1      0      0      0         0      0      0      0
    1      1      1         1      0      0      0         0      0      1      0
    2      1      2         1      0      0      0         0      0      2      0
    3      1      3         1      0      0      0         0      1      2      0
    4      1      4         1      0      0      0         1      1      2      0
    5      1      5         1      0      0      0         1      1      3      0
    6      1      6         1      0      0      0         1      2      3      0
    7      1      7         1      0      0      0         1      3      3      0
    8      1      8         1      0      0      0         1      3      4      0
    9      1      9         1      0      0      0         1      4      4      0
   10      1     10         1      0      0      0         1      4      5      0
   11      2     10         2      0      0      0         1      4      5      0
   12      3     10         3      0      0      0         1      4      5      0

Elog

The Elog file contains the Volume and energies of the accepted MC steps. The energies in this logfile are the pure ReaxFF energy of the system (RxFFEnergy) and the MC corrected energy, which is used in determining if the step should be accepted or not (see the section on calculating energies for details).

An example of the Elog file:

Iteration Naccepted      Volume   MC Energy  RxFFEnergy
        0         1    15625.00    -3098.88    -3179.88
        2         2    15625.00    -3107.92    -3269.92
        3         3    15625.00    -3130.13    -3373.13
        4         4    15625.00    -3160.05    -3484.05
        6         5    15625.00    -3169.77    -3493.77
       11         6    15625.00    -3200.13    -3605.13

reaxout.kf

This is the binary logfile generated by the GCMC code. Its contents can be viewed with the KFBrowser utility, or it can be loaded into AMSmovie to view the geometries in the file. Only the data of successful MC moves is written to this file.

Code Details

Overview

The GCMC code will perform niter (control_MC file option) Grand Canonical Monte Carlo trial moves, and accept or reject them based on the Energy produced by the ReaxFF minimization step of the trial geometry. The Monte Carlo algorithm will always accept a step if it results in a decrease of the energy, and accept steps that go up in energy with a probability. This section will give some details about how the code works.

MC Moves (Insert/Delete/Move/Volume)

The GCMC code currently supports 4 types of MC Moves: Insert, Delete, Move (displace), Volume change. The first three moves always change a whole “molecule” of the system, as defined in the control_MC file (a molecule can of course contain only a single atom). Every MC iteration selects one MC Molecule Type from the defined molecules in control_MC at random, followed by a random MC Move (unless there are no molecules of the type in the system, in that case it will do an insert move).

The Insert and Displace (move) MC Moves will generate a random rotation and position for the molecule, and then check if the random positions are within the “RminPl” and “RmaxPl” boundaries (this means no atom in the molecule can be closer to any atom currently in the system than “RminPl”, and it should be within “RmaxPl” distance to an atom in the system). If the conditions are not satisfied, a new set of coordinates is generated and the code checks again. This is repeated a maximum number of “nmctry” times before stopping with an error.

The volume change is controlled by the “ivlim” variable in control_MC. The ivlim sets the volume change limit, and it should be a value between between 0 and 1. The new volume will be calculated like this: Vnew = (1+ivlim)*Vold.

Calculating energies

Because the GCMC simulation adds and deletes atoms or molecules during the runtime, it cannot directly compare the ReaxFF energies for the MC acceptance criteria: inserting a molecule will usually lower the total energy of the system, causing the MC to always accept it, and always reject a deletion. To balance this out, the GCMC code calculates a “corrected” MC energy to compare the trial reaxFF energy with, consisting of the previously accepted ReaxFF energy + the chemical potential (cmpot in control)MC) for the inserted molecule, or the ReaxFF energy - the chemical potential for the deleted molecule. The volume change energy is also corrected, using the following formula:

E_MC_Corr = E_reaxff_last_accept - (Pressure * 0.1439 * (newV-oldV)) + ((1.0/beta) * nInsertedMols * log(newVavail/oldVavail))

where newVavail and oldVavail are calculated from the MC available volume (see the section calculating volumes).

Calculating volumes

The GCMC code can calculate the available volume in a couple of different ways, depending on the ivol setting in control_MC:

  • ivol = 0: volume = total volume - occupied volume - specified vacuum volume (vvacu)

  • ivol = 1: volume = total cell volume

  • ivol = 2: volume = specified accessible volume (vacc)

  • ivol = 3: volume = total cell volume - occupied volume

  • ivol = 4: volume = specified accessible volume (vacc) - occupied volume

Where the occupied volume is calculated by summing up the volumes of the atoms in the geo file that are not specified to be part of an MC type molecule. The volume of an atom is calculated using the average of the covalent atomic radius and the vd Waals radius of the atom, which are found in the reaxff forcefield file (ffield).

the vacc and vvacu options can be specified in the control_MC file to get a more accurate available volume.

Acceptance criteria

An MC move is always accepted if the reaxff energy is lower than the corrected MC energy of the last accepted MC move, or if the energy increase is small enough. If the new energy is higher, the code generates a random number between 0 and 1, and accepts the move if the random number is bigger than:

prob = preFactor * exp(-Beta*deltaE)

Where the prefactor is calculated (for insert and delete moves) using the deBroglie wavelength of the inserted molecules, the number of inserted molecules and the available MC volume of the system.