A simple tutorial

Here we give a simple step-by-step guide through an example simulation, exploring some of the more generally useful options that i-PI offers and making no assumptions about previous experience with this code or other MD codes. Excerpts from the relevant input files are reproduced here, for explanation purposes, but to get the most out of this tutorial the user is strongly encouraged to work through it themselves. For this purpose, the input files have been included with the i-PI distribution, in the “demos/para-h2-tutorial/” directory.

The chosen problem is that of a small NVT simulation of para-hydrogen, using the Silvera-Goldman potential [SG78]. We will take (\(N\),\(P\),\(T\)) = (108, 0, 25 K).

Part 1 - NVT Equilibration run

In this tutorial, we consider the problem of how to use i-PI to run a NVT simulation of para-hydrogen. The first thing that is required is a client code that is capable of calculating the potential interactions of para-hydrogen molecules. Fortunately, one of the client codes distributed with i-PI has an appropriate empirical potential already hard-coded into it, and so all that is required is to create the “i-pi-driver” file compiling the code in the “drivers” directory, using the UNIX utility make.

We will first go over the creation of the i-PI input file and afterwards the commands of how to run i-PI and the client code. If you want to skip the first section, against our recommendation if you are a newcomer you can use the input file provided in the “tutorial-1” directory and skip directly to Running the simulation

This client code can be used for several different problems (see Built-in, fortran client), some of which are explored in the “examples” directory, but for the current problem we will use the Silvera-Goldman potential with a cut-off radius of 15 Bohr.

Creating the xml input file

Now that the client code is ready, an appropriate xml input file needs to be created from which the host server and the simulation data can be initialized. Here, we will go step by step through the creation of a minimal input file for a simple NVT equilibration run. Note that the working final version is held within the “tutorial-1” directory for reference.

Firstly, when reading the input file the i-PI xml functions look for a simulation tag as a sign to start reading data. For those familiar with xml jargon, we have defined simulation as the root tag, so all the input data read in must start and end with a tag, as shown below:

<simulation> Input data here... </simulation>

xml syntax requires a set of hierarchically nested tags, each of which contain data and/or more tags. Also, i-PI itself requires certain tags to be present, and keeps track of which tags are supposed to be where. More information about which tags are available can be found in input tags, more information on xml syntax can be found in Input file format and structure, and possible errors which can occur if the input file is not well-formed can be found in the troubleshooting section.

For the sake of this first tutorial however, we will simply discuss those tags which are needed for a single NVT equilibration run. The most important tags are initialize, ensemble, motion, dynamics, “total_steps”, and forces. These correspond to the tag to initialize the atom configurations (initialize), the tag ensemble defines the properties of the statistical ensemble considered (like temperature and pressure). While motion contains everything regarding the motion of the atoms in general and allows to specify which type of computation is expected (dynamics, geometry optimization, etc.), the tag dynamics contains all the specifics relative to the molecular dynamics (such as the barostat, thermostat and the time step, whose total number is defined in the total_step tags.). In theory it would be possible to join many different forces into the tag forces. Each of the forces needs a force field socket (ffsocket tag) that specify who is in charge of computing the forces and sending them to i-PI. Note that the attribute “forcefield” of the force tag must correspond to the attribute “name” of one of the ffsocket tag. All the tags that define the actual simulation must be within the system block. We will also discuss Output file tags, which is used to define what output data is generated by the code.

After this short introduction, let’s get down to work. We start with an input file that simply looks like this:

<simulation verbosity='high'>
   ...
</simulation>

and in the next subsections, we describe and show code snippets for all the other sections.

Initializing the configurations

First, we consider the initialize tag within the system block. As the name suggests, this initializes the state of the system, so this is where we will specify the atom positions and the cell parameters. Firstly, this takes an attribute which specifies the number of replicas of the system, called “nbeads”. An attribute is a particular type of xml syntax designed to specify a single bit of data, and has the following syntax:

<initialize nbeads=’4’> ... </initialize>

Note that an attribute forms part of the opening tag, and that the value being assigned to it is held within quotation marks. In this case, we have set the number of replicas, or beads, to 4. To run classical molecular dynamics, just set this value to one (nbeads=1).

Next, we must specify the atomic configuration. Rather than initialize the atom positions manually, we will instead use a separate configuration file for this purpose. Here we will discuss two of the input formats that are compatible with i-PI: xyz files and pdb files.

Note that, for the sake of this tutorial, we have included valid xyz and pdb input files in the “tutorial-1” directory called “our_ref.xyz” and “our_ref.pdb”, respectively.

The xyz format is the simplest input format for a configuration file that i-PI accepts, and has the following syntax:

natoms
# CELL(abcABC): a b c A B C cell{angstrom} postions{angstrom}
atom1 x1 y1 z1
atom2 x2 y2 z2
...

where “natoms” is replaced by an integer giving the total number of atoms, in this case 108, atom1 is a label for atom 1, in this case H2 (since we are simulating para-hydrogen), and (x1, y1, z1) are the x, y and z components of atom 1 respectively. The second line is the comment line, and can also contain the cell parameters (a,b, and c are the lattice vectors, and A, B,C the angles) In the example above we use the syntax “cell{angstrom}” and “postions{angstrom}” to indicate the cell parameters and the position coordinates are provided in angstroms.

Note that we are treating the para-hydrogen molecules isotropically here, i.e. as spherical psuedo-atoms. For the current system this is a good approximation, since at the state point under consideration every molecule is in its rotational ground state. For further details on this potential, and a demonstration of its application to quantum dynamics, see [SG78] and [MM05].

Other than its simplicity, the main advantage of this type of file is that it is free-formatted, and so there is no set precision to which each value must be written. This greatly simplifies both reading and writing these files.

The other file format that we can use is the pdb format. This has the following structure:

TITLE <insert title here> position{angstrom} cell{angstrom}
CRYST1 a b c  A B C P 1 1
ATOM 1 atom1 1 1 x1 y1 z1 0.00 0.00 0
ATOM 2 atom2 1 1 x2 y2 z2 0.00 0.00 0
...

where a, b and c are the cell vector lengths, A, B and C are the angles between them, atom1 and atom2 are the labels for atoms 1 and 2, and (x1, y1, z1) and (x2, y2, z2) give the position vectors of atoms 1 and 2.

Note that this is fixed-formatted, so the number of spaces matters. Essentially, the above format needs to be copied verbatim, using the same column widths and all the same keywords. For an exact specification of the file format (of which only a subset is implemented with i-PI) see https://www.wwpdb.org/documentation/file-format

Here we will show how to specify the xml input file in both of these cases, assuming that the user has already created the configuration file themselves. Note that these file formats can be read by visualization programs such as VMD, and so it is generally advised when making your own input files to use such software to make sure that the configuration is as expected.

To use a configuration file the file tag in initialize should be used. This will take an input file with a given name and use it to initialize all relevant data. Both of these formats have the atom positions and labels, so this will initialize the positions, labels and masses of all the particles in the system, with the masses being implicitly set based on the atom label. The pdb configuration file will also be used to set the cell parameters.

Let us take these two file types in turn, and form the appropriate input sections. First, the xyz file. There are two attributes which are relevant to the file tag for our current problem, “mode” and “units”. “mode” is used to describe what kind of data is being used to initialize from, and so in this case will be “xyz”. “units” specifies which units the file is given in, and so in this case is given by “angstrom”, which are the standard units of both xyz and pdb files. Note that if no units are specified then atomic units are assumed. For more information on the i-PI unit conversion libraries, and the available units, see Overriding default units.

The “units” attribute is now deprecated and will be removed in the future version of i-PI. The alternative, and the only one available in the future, is to specify the units within the comment line of the xyz or the TITLE line of the pdb formats (as shown in the examples above). It is also important to put the units only in one place: if the units are present in both, the configuration file with the tag “units” and in the input files (xyz or pdb) the conversion will be applied twice.

A further comment on the cell units and parameters

It is important to note that the units of the cell parameters and the units of the content of the files are specified separately (“positionunits” specify the units of the data and “cellunits” specify the units of the cell). This is necessary because the xyz format can be used to also store quantities which have a different dimension than length (velocities, forces, etc.). Even the cell parameters can now be specified directly within the xyz format. The comment line is parsed looking for a cell specification in the following format:

  • “CELL{abcABC}:” followed by six float numbers.

  • “CELL{H}:” followed by nine float numbers.

  • “CELL{GENH}:” followed by nine float numbers.

The “CELL{abcABC}” must be followed by the length of the vector cell and the three angles between them (as in the CRYST1 field of the pdb format -see above-). The other two must be followed by nine floats specifying, respectively, all the values of the cell matrix (flattened) or all the value of the inverse of the cell matrix (flattened).

Since the units are already specified into the xyz and pdb files, the config file will contain:

If the cell parameters are not specified in the xyz file, then, in the configuration file we must specify them separately. To initialize just the cell parameters, we use the cell tag. These could in theory be set using a separate file, but here we will initialize them manually. Taking a cubic cell with cell parameter 17.847 angstroms, we can specify this using the cell tag in three different ways:

<cell mode=’manual’ units=’angstrom’> [17.847, 0, 0, 0, 17.847, 0, 0,
0, 17.847] </cell>
<cell mode=’abcABC’ units=’angstrom’> [17.847, 17.847, 17.847, 90,
90, 90] </cell>
<cell mode=’abc’ units=’angstrom’> [17.847, 17.847, 17.847] </cell>

If the xyz already contains the cell parameters, i-PI will use those which are read the last in the config file (if the “cell” tag follows the “file” specification then the cell parameters are those defined in the “cell” tag. If, otherwise, the “cell” tag compares in the config file before the “file” specification, then the cell parameters of the xyz file are used).

Note the use of the different “mode” attributes, “manual”, “abcABC” and “abc”. The first creates the cell vector matrix manually, the second takes the length of the three unit vectors and the angles between them in degrees, and the last assumes an orthorhombic cell and so only takes the length of the three unit vectors as arguments. We will take the last version for brevity, giving as our final initialize section:

<system>
  <initialize nbeads='4'>
    <file mode='xyz'> our_ref.xyz </file>
    <cell mode='abc' units='angstrom'>
      [17.847, 17.847, 17.847]
    </cell>
    ...
  </initialize>
</system>

The pdb file is specified in a similar way, except that no cell tag needs to be specified and the “mode” tag should be set to “pdb” (the units should be specified into the pdb file as shown in the example above):

<system>
  <initialize nbeads='4'>
     <file mode='pdb'> our_ref.pdb </file>
     ...
   </initialize>
</system>

As well as initializing all the atom positions, this section can also be used to set the atom velocities. Rather than setting these manually, it is usually simpler to sample these randomly from a Maxwell-Boltzmann distribution. This can be done using the velocities tag by setting the “mode” attribute to “thermal”. This then takes an argument specifying the temperature to initialize the velocities. With this, the final initialize section is:

<system>
  <initialize nbeads='4'>
       <file mode='pdb'> our_ref.pdb </file>
       <velocities mode='thermal' units='kelvin'> 25 </velocities>
  </initialize>
</system>

Generating the correct dynamics

We continue within the system block and consider the motion tag, which determines the computation i-pi will perform. Since we wish to run molecular dynamics, the attribute “mode” of the “motion” tag must be equal to “dynamics”. The details of the dynamics integration are given within dynamics. Since we wish to do a NVT simulation, we set the “mode” attribute to “nvt” (note that we use lower case, and that the tags are case sensitive), and must specify the temperature using the appropriate tag:

<system>
  ...
  <motion mode=’dynamics’>
    <dynamics mode=’nvt’> ... </dynamics>
  </motion>
</system>

This defines the computation that will be performed. We also must decide which integration algorithm to use, and how large the time step should be. In general, the time step should be made as large as possible without there being a drift in the conserved quantity. Usually, we would take a few short runs with different time steps to try and optimize this, but for the sake of this tutorial we will use a safe value of 1 femtosecond, giving:

<system>
  ...
  <dynamics mode=’nvt’>
       ...
       <timestep units=’femtosecond’> 1 </timestep>
  </dynamics>
</system>

Finally, while the microcanonical part of the integrator is initialized automatically, there are several different options for the constant temperature sampling algorithm, specified by thermostat. For simplicity we will use (the global version of) the path-integral Langevin equation (PILE) algorithm [CPMM10], which is specifically designed for path integral simulations. This is specified by the “mode” tag “pile_g”. This integrator also has to be initialized with a time scale parameter, “tau”, which determines how strong the thermostat is, which we will set to 25 femtoseconds. Putting all of this together, we get:

<system>
  ...
 <dynamics mode='nvt'>
     <thermostat mode='pile_g'>
        <tau units='femtosecond'> 25 </tau>
     </thermostat>
     <timestep units='femtosecond'> 1 </timestep>
  </dynamics>
</system>

Now that we have decided on the time step, we will decide the total number of steps to run the simulation for. Equilibrating the system is likely to take around 5 picoseconds, so we will take 5000 time steps, using:

<total_steps> 5000 </total_steps>

The temperature must be specified within the ensemble:

<system>
    ...
    <ensemble>
        <temperature units=’kelvin’> 300 </temperature>
    </ensemble>
    ...
</system>

To recap, at this point the input file looks as follows

<simulation verbosity='high'>
   <total_steps> 5000 </total_steps>
   <system>
     <initialize nbeads='4'>
       <file mode='pdb'> our_ref.pdb </file>
       <velocities mode='thermal' units='kelvin'> 25 </velocities>
     </initialize>
     <motion mode='dynamics'>
       <dynamics mode='nvt'>
         <thermostat mode='pile_g'>
           <tau units='femtosecond'> 25 </tau>
         </thermostat>
         <timestep units='femtosecond'> 1 </timestep>
       </dynamics>
     </motion>
   </system>
   <ensemble>
       <temperature units='kelvin'> 25 </temperature>
   </ensemble>
</simulation>

Please make sure you understand all the lines in the input file before continuing.

Defining the forces

We continue within the the system block, and now consider the forces tag that defines each of (the possibly many) components of the forces. Within this tag, the user can specify many different “force” tags and the final force will be the sum of the contribution from each “force” tag. In this simple example, however, we consider only one force component, and the corresponding section of the input reads

<system>
  ...
  <forces>
     <force forcefield='driver'> </force>
  </forces>
<system>

The attribute “forcefield” of the tag force is simply a label that allows i-PI to match that particular force component with the corresponding server socket. Note that this apparent unnecessary-complicated syntax makes possible complex setups required by more advanced simulations.

Creating the server socket

Next let us consider the ffsocket which deals with communication with the client codes. In this example, we only need to specify a single ffsocket tag:

<ffsocket> ... </ffsocket>

A socket is specified with three parameters; the port number, the hostname and whether it is a unix or an internet socket. These are specified by the “port” and “address” tags and the “mode” attribute respectively. To match up with the client socket specified above, we will take an internet socket on the hostname localhost and use port number 31415.

This gives the final ffsocket section:

<ffsocket mode="inet" name="driver">
    <address> localhost </address>
    <port> 31415 </port>
</ffsocket>

and by adding it to the previous sections we have

<simulation verbosity='high'>
   <total_steps> 5000 </total_steps>
   <ffsocket mode="inet" name="driver">
     <address> localhost </address>
     <port> 31415 </port>
   </ffsocket>
   <system>
     <initialize nbeads='4'>
       <file mode='pdb'> our_ref.pdb </file>
       <velocities mode='thermal' units='kelvin'> 25 </velocities>
     </initialize>
     <motion mode='dynamics'>
       <dynamics mode='nvt'>
         <thermostat mode='pile_g'>
           <tau units='femtosecond'> 25 </tau>
         </thermostat>
         <timestep units='femtosecond'> 1 </timestep>
       </dynamics>
     </motion>
   </system>
   <ensemble>
       <temperature units='kelvin'> 25 </temperature>
   </ensemble>
</simulation>

Note that the ffsocket section lives outside the system block.

Customizing the output

So far, we have only considered how to set up the simulation, and not the data we wish to obtain from it. However, there are a wide variety of properties of interest that i-PI can calculate and a large number of different output options, so to avoid confusion let us go through them one at a time.

First, we have the standard output. For this output, the amount of data can be adjusted with the “verbosity” attribute of simulation:

<simulation verbosity=’high’> ... </simulation>

By default the verbosity is set to “low”, which only outputs important warning messages and information, and some statistical information every 1000 time steps. Here we will set it to “high”, which will tell i-PI to output the wall time required for the last step at each step and information related to the socket communication.

Second, we have output written to file(s). The content of such output(s) is specified by the output tag. There are three types of files; properties files, trajectory files and checkpoint files, which are specified with properties, trajectory and checkpoint tags respectively. For an in-depth discussion on these three types of output files see Output files, but for now let us just explain the rationale behind each of these output file types in turn.

checkpoint files:

These give a snapshot of the state of the simulation. If used as an input file for a new i-PI simulation, this simulation will start from the point where the checkpoint file was created in the old simulation.

trajectory files:

These are used to print out properties relevant to all the atoms, such as the velocities or forces, for each degree of freedom. These can be useful for calculating correlation functions or radial distribution functions, but possibly their most useful feature is that visualization programs such as VMD can read them, and then use this data to show a movie of how the simulation is progressing.

properties files:

These are usually used to print out system level properties, such as the timestep, temperature, or kinetic energy. Essentially these are used to keep track of a small number of important properties, either to visualize the progress of the simulation using plotting programs such as gnuplot, or to be used to get ensemble averages.

Now that we know what each input file is used for, let’s analyze the output section as provided in “tutorial-1/tutorial-1.xml” which reads

<output prefix='tut1'>
  <checkpoint filename='checkpoint' stride='1000' overwrite='True'> </checkpoint>
  <properties filename='md' stride='1'>
      [step, time{picosecond}, conserved{kelvin}, temperature{kelvin},
       potential{kelvin}, kinetic_cv{kelvin}]
  </properties>
  <trajectory filename='pos' stride='100' format='pdb' cell_units='angstrom'>
           positions{angstrom}
  </trajectory>
  <trajectory filename='forces' stride='100'> forces  </trajectory>
</output>

This setup will create 11 files:

checkpoint file: “tut1.checkpoint”

properties file: “tut1.md”

position trajectory files (1 file per bead): “tut1.pos_0.pdb”, “tut1.pos_1.pdb”, “tut1.pos_2.pdb”, and “tut1.pos_3.pdb”

forces trajectory files (1 file per bead): “tut1.forces_0.xyz”, “tut1.forces_2.xyz”, “tut1.forces_1.xyz”, and “tut1.forces_3.xyz”

The filenames are created using the syntax “prefix”.“filename”[_(file specifier)][.(file format)], where the file specifier is added to separate similar files. For example, in the above case the different position trajectories for each bead are given a file specifier corresponding to the appropriate bead index.

The “stride” attribute sets how often data is output to each file; so in the above case the properties are written out every 10 time steps, the trajectories every 100, and the checkpoints every 1000. The “format” attribute sets the format of the trajectory files, and the “overwrite” attribute sets whether each checkpoint file overwrites the previous one or not.

There are several options we can use to customize the output data. Firstly, the “prefix” attribute should be set to something which can be used to distinguish the files from different simulation runs. In the previous snippet we set it to “tut1”:

As for the input parameters, the default units are always atomic units. However, this can be modified by the user by specifying an appropriate unit in curly braces after the name of the property or trajectory of interest. In the previous snippet, we have for example set the temperature units to kelvin and position coordinates to angstroms

When using ‘pdb’ format, it is important to add the “cell_units” attribute to the trajectory tag, so that the cell parameters are consistent with the position output.

Finally, let us suppose that we wished to output another output property to a different file. One example of when this might be necessary is if there were an output property which was more expensive to calculate than the others, and so it would be impractical to output at every time step. With i-PI this is easy to do, all that is required is to add another properties tag with a different filename.

For demonstration purposes, we will choose to print out the forces acting on one tagged bead, since this requires an argument to be passed to the function that calculates it. The i-PI syntax for doing this is to have the arguments to be passed to the function between standard braces, separated by semi-colons.

To print out the forces acting on one bead we need the “atom_f” property, which takes two arguments, “atom” and “bead”, giving the index of the atom and bead tagged respectively. The appropriate syntax is then given below:

<output prefix='tut1'>
  ...
  <properties filename='force' stride='20'> [atom_f{piconewton}(atom=0;bead=0)] </properties>
</output>

This will print out the force vector acting on bead 0 of atom 0.

Input file tutorial-1

If you reached this point, you have been able to specify the input file from scratch, well done! Hopefully, this has helped you to understand the most important syntaxes of the i-PI input file. However, we recommend that next time you use one of the many input files provided within the “examples” and “demos” folder.

For the sake of completeness, we copy the input file we have just created:

<simulation verbosity='high'>
  <output prefix='tut1'>
    <checkpoint filename='checkpoint' stride='1000' overwrite='True'> </checkpoint>
    <properties filename='md' stride='1'>
        [step, time{picosecond}, conserved{kelvin}, temperature{kelvin},
         potential{kelvin}, kinetic_cv{kelvin}]
    </properties>
    <trajectory filename='pos' stride='100' format='pdb' cell_units='angstrom'>
             positions{angstrom}
    </trajectory>
    <trajectory filename='forces' stride='100'> forces  </trajectory>
    <properties filename='force' stride='20'> [atom_f{piconewton}(atom=0;bead=0)] </properties>
  </output>
  <total_steps> 5000 </total_steps>
  <ffsocket mode='inet' name='driver'>
    <address>localhost</address>
    <port> 31415 </port>
  </ffsocket>
  <system>
    <initialize nbeads='4'>
      <file mode='pdb'> our_ref.pdb </file>
      <velocities mode='thermal' units='kelvin'> 25 </velocities>
    </initialize>
    <forces>
      <force forcefield='driver'> </force>
    </forces>
    <ensemble>
      <temperature units='kelvin'> 25 </temperature>
    </ensemble>
    <motion mode='dynamics'>
      <dynamics mode='nvt'>
        <thermostat mode='pile_g'>
          <tau units='femtosecond'> 25 </tau>
        </thermostat>
        <timestep units='femtosecond'> 1 </timestep>
      </dynamics>
    </motion>
  </system>
</simulation>

Running the simulation

If you haven’t already, please check out the Installing i-PI section of this documentation to set up i-PI.

In the following, we assume that you are in the “demos/para-h2-tutorial/tutorial-1” folder. Now that we have a valid input file, we can run the first part of the tutorial using:

> i-pi tutorial-1.xml

This will start the i-PI simulation, creating the server socket and initializing the simulation data. This should at this point print out a header message to standard output, followed by a few information messages that end with “starting the polling thread main loop”, which signifies that the server socket has been opened and is waiting for connections from client codes.

At this point, the driver code is run in a new terminal from the “drivers” directory using the command:

> i-pi-driver -m sg -a localhost -o 15 -p 31415

The option “-m” is followed by the empirical potential required, in this case we use “sg” for Silvera-Goldman, “-a localhost” sets up the client hostname (address) as “localhost”, “-o 15” sets the cut-off to 15 Bohr, and “-p 31415” sets the port number to 31415.

The i-PI code should now output a message saying that a new client code has connected, and started running the simulation.

Output data

Once the simulation is finished (which should take about half an hour) it should have output “tut1.md”, “tut1.force”, “tut1.pos_0.xyz”, “tut1.pos_1.xyz”, “tut1.pos_2.xyz”, “tut1.pos_3.xyz”, “tut1.checkpoint”, “tut1.forces_1.xyz”, “tut1.forces_2.xyz”, “tut1.forces_3.xyz”, “tut1.forces_4.xyz”, and “RESTART”.

Firstly, we consider the checkpoint files, “tut1.checkpoint” and “RESTART”. As mentioned before, these files can be used as a means of restarting the simulation from a previous point. As an example, the last checkpoint should have been at step 4999, and so we could rerun the last step using

> i-pi tut1.checkpoint

followed by running “i-pi-driver” as before.

The difference between these two files is that, while “tut1.checkpoint” was specified by the user, “RESTART” is automatically generated at the end of every i-PI run. This file then is what we will need to initialize the NPT run, since it contains the state of the system after equilibration.

Next, let us look at the trajectory files. Since we have printed out the positions, these should tell us how the spatial distribution has equilibrated, and give us some insight into the atom dynamics. The easiest way to use these files, as discussed earlier, is to use the trajectory files as input to a visualization program such as VMD.

If we do this with these files, we see that the simulation started from a crystalline configuration and then over the course of the simulation began to melt. Since the state point considered here with the potential given is a liquid [SG78], this is what we would expect.

Finally, let us check the “tut1.md” file. For the current problem, i.e. checking that we have a suitably equilibrated system, we should do two tests. Firstly, we should check that the conserved quantity does not exhibit any major drift, and second we should check to see if the properties of interest have converged. Using gnuplot, we can plot the relevant graphs using:

> gnuplot -persist  -e "plot 'tut1.md' u 1:3" # step vs conserved quantity
> gnuplot -persist  -e "plot 'tut1.md' u 1:4" # step vs temperature
> gnuplot -persist  -e "plot 'tut1.md' u 1:5" # step vs potential energy
> gnuplot -persist  -e "plot 'tut1.md' u 1:6" # step vs kinetic  energy

This will show that the conserved quantity has only a small drift upwards, the kinetic and potential energies have equilibrated, and the thermostat is keeping the temperature at the specified value. We have therefore specified a sufficiently short time step, chosen the thermostat parameters sensibly, and have equilibrated the properties of interest. Therefore this stage of the simulation is done, and we are ready to continue with the second part and start the NPT run.

Part 2 - NPT simulation

Now that we have converged NVT simulation data, we can use this to initialize a NPT simulation. There are two ways of doing this, both of which involve using the RESTART file generated at the end of the NVT run as a starting point. Note that for simplicity we will again take \(N=108, T=25 K\), and use \(P=0\).

Modifying the RESTART file

Firstly, you can use the RESTART file directly, modifying it so that instead of continuing with the original NVT simulation it will instead start a new NPT simulation. We have included in the “tutorial-2” directory both a RESTART file from tutorial 1 and an adjusted file which will run NPT dynamics, “tutorial-2a.xml”

These adjustments start with resetting the “step” tag, so that it starts with the value 0. This can be done by simply removing the tag. Similarly, we can increase the total number of steps so that it is more suitable for collecting the necessary amount of NPT data, in this case we will set “total_steps” to 100000.

We will also update the output files, first by setting the filenames to start with “tut2a” rather than “tut1”, and secondly by adding the volume and pressure to the list of computed properties so that we can check that the ensemble is being sampled correctly. Putting this together gives:

<simulation verbosity='high'>
   <output prefix='tut2a'>
     <properties filename='md' stride='1'>
           [ step, time{picosecond}, conserved{kelvin}, temperature{kelvin}, potential{kelvin},
             kinetic_cv{kelvin}, pressure_cv{megapascal}, volume ]
     </properties>
     <properties filename='force' stride='20'> [atom_f{piconewton}(atom=0;bead=0)] </properties>
     <trajectory filename='pos' stride='100' format='pdb' cell_units='angstrom'>
             positions{angstrom}
     </trajectory>
     <checkpoint filename='checkpoint' stride='1000' overwrite='True'/>
   </output>
   <total_steps>100000</total_steps>
  ...
 <simulation verbosity='high'>

Finally, we must change the ensemble and dynamics the tags so that the correct ensemble is sampled. The first thing that must be done is adding a “pressure” tag in the ensemble:

<ensemble>
   <pressure> 0 </pressure>
    ...
</ensemble>

Then, we must also specify the constant pressure algorithm, using the tag barostat within the dynamics environment. Do not forget to change the mode attribute of the dynamics from “nvt” to “npt”. This example uses a stochastic barostat to apply pressure to an isotropic system, which can be specified with the option “isotropic”. See the documentation of the barostat object and the examples to see how to apply an anisotropic stress, or to allow for cell shape fluctuations.

The isotropic barostat also requires a thermostat to deal with the volume degree of freedom, which we will take to be a simple Langevin thermostat. This thermostat is specified in the same way as the one which does the constant temperature algorithms for the atomic degrees of freedom, and we will take its time scale to be 250 femtoseconds:

<system>
  ...
  <ensemble>
     <pressure> 0 </pressure>
  </ensemble>
  <motion mode=’dynamics’>
     <dynamics mode=’npt’>
         <barostat mode=’isotropic’>
              <thermostat mode=’langevin’>
                 <tau units=’femtosecond’> 250 </tau>
              </thermostat>
          </barostat>
          ...
     </dynamics>
     ...
   </motion>
</system>

Finally, we will take the barostat time scale to be 250 femtoseconds also, giving:

<system>
  ...
  <ensemble>
     <pressure> 0 </pressure>
  </ensemble>
  <motion mode=’dynamics’>
     <dynamics mode=’npt’>
         <barostat mode=’isotropic’>
              <thermostat mode=’langevin’>
                 <tau units=’femtosecond’> 250 </tau>
              </thermostat>
              <tau units='femtosecond'> 250 </tau>
          </barostat>
          ...
     </dynamics>
     ...
   </motion>
</system>

with the rest of the ensemble and dynamics tags being the same as before. Note that in a NPT simulation, we have two thermostats, one applied to the nuclear degrees of freedom and one applied to the volume degrees of freedom.

Initialization from RESTART

A different way of initializing the simulation is to use the RESTART file as a configuration file, in the same way that the xyz/pdb files were used previously.

Firstly, the original input file “tutorial-1.xml” needs to be modified so that it will do a NPT simulation instead of NVT. This involves modifying the “total_steps” output and ensemble tags as above. Next, we replace the tag initialize section with:

<system>
  <initialize nbeads='4'>
    <file mode='chk'> tutorial-1_RESTART </file>
  </initialize>
  ...
</system>

Note that the “mode” attribute has been set to “chk” to specify that the file is a checkpoint file. This will then use the RESTART file to initialize the bead configurations and velocities and the cell parameters.

Again, there is a file in the “tutorial-2” directory for this purpose, “tutorial-2b.xml”.

Running the simulation

Whichever method is used to create the input file, the simulation is run in the same way as before, using either “tutorial-2a.xml” or “tutorial-2b.xml” as the input file. Note how the volume fluctuates with time, as it is no longer held constant in this ensemble.

Part 3 - A fully converged simulation

As a final example, we note that at this state point 16 replicas and at least 172 particles are actually required to provide converged results. As a last tutorial then, you should repeat tutorials 1 and 2 with this number of replicas and atoms.

The directory “tutorial-3” contains NVT and NPT input files which can be used to do a fully converged NPT simulation from scratch, except that they are missing some of the necessary input parameters.

If these are chosen correctly and the simulation is run properly the volume will be 31 \(\textrm{cm}^3\)/mol and the total energy should be -48 K [MHT99].

With this number of beads and atoms, the force calculation is likely to take much longer than it did in either tutorial 1 or 2. To help speed this up, we will now discuss how to parallelize the calculation over the sockets, and how to speed up the data transfer.

Firstly, in this simple case where we are calculating an isotropic, pair-wise interaction, the data transfer time is likely to be a significant proportion of the total calculation time. To help speed this up, there is the option to use a unix domain socket rather than an internet socket. These are optimized for local communication between processes on a single computer, and so for the current problem they will be much faster than internet sockets.

To specify this, we simply set the “mode” attribute of the ffsocket tag to “unix”:

<ffsocket mode=’unix’ name="driver"> ... </ffsocket>

We then specify that the client code should connect to a unix socket using the -u flag:

> i-pi-driver -u -m sg -a localhost -o 15 -p 31415

Parallelizing the force calculation over the different replicas of the system is similarly easy, all that is required is to run the above command multiple times. For example, if we wish to run 4 client codes, we would use:

> for a in 1 2 3 4; do > i-pi-driver -u -m sg -a localhost -o 15 -p
31415 & > done

Using these techniques should help speed up the calculation considerably, at least in this simple case. Note however, that using unix domain sockets would give a negligible gain in speed in most simulations, since the force calculation usually takes much longer than the data transfer.