The input file
Setting up a ONETEP job involves creating a main input file with the suffix “.dat” which contains all the required information to describe both the system and the parameters of the job. The specification uses the “Electronic Structure Data Format” which some users may be familiar from the CASTEP code. This requires the user to provide input in the form of keywords and blocks.
Keywords are written in the form,
keyword : value [unit]
For example, to specify that the task we wish the code to perform is a Single-Point energy calculation, we would add:
task : SinglePoint
to our input file (note that capitalisation is irrelevant). If we wish to specify a cutoff energy of 500 eV for our standard grid, we would add:
cutoff_energy : 500 eV
The value in eV’s will be converted internally to atomic units (Hartree in this case). If a keyword is not specified in the input file, it is given a default value which is intended to work across a broad range of systems. A full list of keywords and blocks, giving their meaning, syntax and default values, can be found in the Documentation section of this website.
Blocks are used to define the values of input parameters which need to contain multiple records,
such as the definition of the unit cell. They take the form:
a1 a2 a3
b1 b2 b3
Most blocks tend not to have a meaningful default value, and must be specied if the related functionality is to be used. Comments can be added to input les using the `#’ or ` !’ characters. Anything after these characters on a given line will be ignored.
We assume that the code has been successfully compiled on your architecture. For the purpose of this tutorial we refer to the ONETEP root directory as ~/ONETEP/. The executable code is therefore to be find in the directory ~/ONETEP/bin/. Create a working directory in an appropriate location (eg the work drives on a cluster) which to run the present tutorial. This subdirectory might be called tutorial1.
We will start by running a simple job on a silane molecule SiH4. Again, by sake of clarity, you might want to create a subdirectory,
Create a new input file called silane.dat in your favourite text editor e.g.
nedit silane.dat &
If using nedit, it will say that the file does not exist – click on New file to create it. You might like to
put a comment at the top explaining what this input file is for e.g.
# Simple ONETEP input file for a silane molecule
The first thing is to specify the simulation cell. The simplest choice is a cubic box with sides of
about 40 bohr. Enter the 3-component cell vectors, one per line, within the lattice_cart block structure,
40.0 0.0 0.0
0.0 40.0 0.0
0.0 0.0 40.0
Second, the atomic species need to be specified, in this case silicon and hydrogen. This information
needs to be provided in the species block structure. This block contains five pieces of information per species, separated by spaces:
(i) your symbol for the atomic species (this can be the same as the element symbol)
(ii) the element symbol itself
(iii) the atomic number Z
(iv) the number of NGWFs per atom
(v) the NGWF radius.
Si Si 14 4 6.0
H H 1 1 6.0
The number of NGWFs required can usually be judged from the symmetry of the atomic orbitals involved. In this case four for silicon and one for hydrogen will be adequate (can you think why?). For this molecule, 6 bohr should be a reasonable starting point for the NGWF radii.
Each atomic species in our calculation needs a pseudopotential file. You can use the hydrogen.recpot and silicon.recpot files provided with ONETEP. Copy them to your working directory (or make a symbolic link),
cp ~/ONETEP/pseudo/hydrogen.recpot .
cp ~/ONETEP/pseudo/silicon.recpot .
and fill the species_pot block structure accordingly,
Next, we need to specify the atomic positions with the positions_abs block structure. There is one line per atom. Remember to use your symbol for the atomic species as defined in the species block. The coordinates are assumed to be given in bohr unless specified otherwise. Note that in version older than 4.9.3 it is a requirement in ONETEP that all the atoms should lie within the simulation cell. The Si-H bond length is about 2.76 bohr and silane is a tetrahedral molecule. The simplest way to work out the coordinates is to note that tetrahedral bonds can be chosen to lie along unit vectors (a, b, 0), (-a, b, 0), (0,-b, a) and (0,-b,-a) where a = (2/3)1/2 and b = (1/3)1/2. For example, the vector for the first Si-H bond is (2.2535, 1.5935, 0.0000) bohr. Add these offsets to the position of your silicon atom to create the SiH4 molecule,
Si 15.0000 15.00000 15.0000
H 17.2535 16.5935 15.0000
H 12.7465 16.5935 15.0000
H 15.0000 13.4065 17.2535
H 15.0000 13.4065 12.7465
The last essential parameter to specify is the kinetic energy cutoff parameter for the psinc basis
CUTOFF_ENERGY 300 eV
A reasonable value to start with is 300 eV. Use the cutoff_energy keyword and remember to specify the energy unit as well as the value.
For now, you have to run the code. By default, the name of the executable is onetep.config_name, where config.name refers to the configuration file used for the compilation. This could be onetep.hector
if you are running on the UK national supercomputer HECToR. The executable file can be found in the directory ~/ONETEP/bin/. The following command executed in the working directory will run ONETEP with the silane.dat input file.
~/ONETEP/bin/onetep.config_name silane > silane.out &
The calculation should take a few seconds. An output file named silane.out has been created in the working directory. If you have followed these instructions precisely the calculation should converge within 8 iterations to a total energy of around -6.1897 Hartree.
Just as in any form of traditional DFT, we must ensure that our calculation results are converged with respect to the size of the basis. In ONETEP, convergence with basis size is controlled by a small number of parameters, with respect to which the total energy is variational. In this context, that means the total energy at a given value of the parameter will be an upper bound to the true, converged total energy, and increasing the parameter will monotonically decrease the total energy, which asymptotically tends to its converged value.
31/03/2015 Update: A set of bash scripts is now available which assists in the task of generating convergence testing data by automating convergence test input-file generation, and collates the energies and total forces of completed calculations into a simple .csv file.
Cutoff Energy :
The first parameter will be familiar to anyone who has carried out plane-wave DFT calculations the cutoff energy. This specifies the kinetic energy of the maximum G-vector of the reciprocal-space grid, and therefore the spacing of the real-space grid. With a 40 bohr cell and a 300 eV cutoff, ONETEP will have chosen a 48×48×48 grid, hence a grid spacing of 0.833 bohr. This maybe too coarse: move your old output file to a new name (eg SiH4.out_Ec300) and try changing the cutoff energy to 500 eV, then re-run the job script. You may wish to turn the output_detail parameter to VERBOSE, to see exactly what grids are being used at each cutoff.
Comparing the two outputs, you should see that the total energy has decreased by around 0.03 Hartree (nearly 1 eV, or 0.2 eV/atom). This suggests 300 eV was too low initially. Try increasing the cutoff in steps of 100 eV (you may wish to automate this, by having a loop in your job script in which the input file is updated and the job run for each update, if you are sufficiently familiar with bash scripting).
Plot the total energy as a function of cutoff energy. You should see a monotonic decrease in ET as a function of the cutoff energy. Try to evaluate at what value you think the total energy is converged to about 0.1 eV/atom of its asymptotic limit. Note that the calculation time increases rapidly with the cutoff energy, because the number of grid points in each FFTbox is growing rapidly with the cutoff energy, and thus each FFT takes longer, so do not try going beyond around 1200 eV.
In few cases in reality do we require strict convergence of the total energy. It is more usual that we require convergence of some measurable quantity such as a binding energy. In that case, we do not require the total energy to be converged only the difference between total energies of very similar systems. This may converge much faster than the total energy itself, presuming the same species are present in both systems. Always consider what it is that you need converging before you start running enormous calculations! For more advice on this topic, check out the CASTEP tutorial available at http://www.castep.org.
NGWF radius :
Next, we will investigate convergence with respect to the NGWF radius. Pick a value of cutoff energy for which you can perform reasonably fast calculations (say, 500 eV) and try increasing the NGWF radius from 6.0 to 10.0 in 1.0 bohr steps. Plot the total energy against NGWF radius. Again, you should see a monotonic decrease. Note that for NGWF radii larger than 6.0 bohr, the FFT box is as large as the simulation cell. In a larger cell the size of the FFT box would keep growing, and the calculation time would increase rapidly. Also, you should notice that the number of NGWF Conjugate Gradients iterations grows with the size of the localisation region. This is natural since with larger spheres there are more NGWF coefficients to simultaneously optimise.
Eventually, you may also wish to converge the total energy with respect to the number of NGWFs per atom (e.g. try 9 NGWFs on the Silicon). In some systems, notably crystalline solids, this can be crucial to achieving good convergence of the NGWFs themselves.
This SiH4 system is too small to investigate convergence with respect to the cutoff of the density kernel. In larger systems, truncation of the density kernel can be a good way to speed up the calculation. Indeed, asymptotically it is only by truncating the kernel that true `linear-scaling’ behaviour of the computational effort will be observed. The kernel cutoff is controlled by the kernel cutoff keyword. Its default value is 1000 bohr (i.e. effectively infinite). Density kernel truncation should be used with a degree of caution: generally speaking, one would want to be able to run a full calculation for a fairly large system first, with an infinite cutoff, to establish a known baseline. Then, try decreasing the kernel cutoff from that point and see what the effect is on the total energy, on the level of NGWF convergence (as measured by the NGWF RMS gradient), and on the computation time. If significant time savings can be achieved without trading in too much accuracy, it may be worthwhile to bring down the cutoff for all similar calculations. Proceed with care, though calculations with a truncated kernel tend to converge in a less stable manner.