modified on 9 August 2010 at 14:49 ••• 143,800 views


From NWChem

Revision as of 14:49, 9 August 2010 by Marat (Talk | contribs)
Jump to: navigation, search


Combined quantum and molecular mechanics(QM/MM)

Combined or hybrid Quantum Mechanics and Molecular Mechanics (QM/MM) is a simulation methodology that is about 15 years old but in all the literature there are cautions that calibration computations must be done to validate the model for each particular chemical system studied. This is not a black box style computation and the NWChem users are advised that without calibration QM/MM may not give the appropriate results. Since both quantum-mechanical and classical molecular mechanics are involved in the calculation good working knowledge of the two methods is required to ensure meaningful results.

The QM/MM module is invoked with the following task directive.

 task qmmm <string qmtheory> <string operation> [numerical] [ignore]

where qmtheory specifies quantum method for the calculation of the quantum region. It is expected that most of QM/MM simulations will be performed with with HF or DFT theories, but any other QM theory supported by NWChem should also work. Currently the supported operations for QM/MM runs are energy, optimize, saddle, dynamics, numerical hessian, and numerical frequencies.

Unlike pure quantum mechanical calculations the information about the chemical system for QM/MM simulations is contained not in the geometry block but in the externally prepared topology and restart files. These files have to be present prior to any QM/MM simulation. The input file for QM/MM simulation can be divided into three major parts - specification of the molecular mechanics parameters for the classical region, specification of the quantum mechanical method for the quantum region, and the parameters of the interaction between quantum and classical methods. All this discussed in detail in the sections below.

Preparation of the restart and topology files

Generated by the prepare module restart and topology files contain information about the classical force field as well as the coordinates of quantum (QM) and molecular mechanics (MM) regions. In a typical setting this "preparation stage" would be run separately from main QM/MM simulation. This will require a properly formatted PDB file for the system. In more complex cases (e.g.non-standard residues or nucleotides) additional fragment and parameter files might have to be provided by the user. The definition of the quantum region in the input for the prepare module is specified by either modify atom directive:

modify atom <integer isgm>:<string atomname> quantum

or modify segment directive

modify segment <integer isgm> quantum

Here isgm and atomname refer to the residue number and atom name record as given in the PDB file. It is important to note that that the leading blanks in atom name record should be indicated with underscores. Per PDB format quidelines the atom name record starts at column 13. If, for example, the atom name record "OW" starts in the 14th column in PDB file, it will appear as "_OW" in the modify atom directive in the prepare block. In the current implementation only solute atoms can be declared as quantum. If part of the solvent has to be treated quantum mechanically then it has to redeclared to be solute. In addition to modify commands the prepare input block should also contain update lists and ignore directives. There are other options that can be used in the input block for the prepare module ( e.g. solvating the structure, etc ), those discussed in more details in Prepare. The successful run of the prepare module will result in generation of topology and restart files. Similar to classical MD, both files are required for QM/MM simulations and have to be placed in the same directory as the input file. Here is an example input file that will generate QM/MM restart and topology files for the ethanol molecule

Input example 1:

title "Prepare QM/MM calculation of ethanol"
start etl

#--name of the pdb file
   source etl0.pdb                    
#--generate new topology and sequence file
   new_top new_seq                    
#--generate new restart file
#--define quantum region (note the use of underscore)
   modify atom 1:_C1  quantum         
   modify atom 1:2H1  quantum         
   modify atom 1:3H1  quantum          
   modify atom 1:4H1  quantum         
 update lists
#--save restart file   
 write etl_ref.rst
#--generate pdb file
 write etl_ref.pdb
task prepare

These are contents of etl0.pdb file used in the above input file.

ATOM      1  O   etl     1       1.201  -0.271  -0.000  1.00  0.00           O
ATOM      2  H   etl     1       1.995   0.329  -0.000  1.00  0.00           H
ATOM      3  C1  etl     1      -1.180  -0.393   0.000  1.00  0.00           C
ATOM      4 2H1  etl     1      -2.128   0.155  -0.000  1.00  0.00           H
ATOM      5 3H1  etl     1      -1.130  -1.030   0.887  1.00  0.00           H
ATOM      6 4H1  etl     1      -1.130  -1.030  -0.887  1.00  0.00           H
ATOM      7  C2  etl     1       0.006   0.573   0.000  1.00  0.00           C
ATOM      8 2H2  etl     1      -0.042   1.220   0.890  1.00  0.00           H
ATOM      9 3H2  etl     1      -0.042   1.220  -0.890  1.00  0.00           H

Running the input shown above will produce (among other things) the topology file (, the restart file (etl_ref.rst), and the pdb file (etl_ref.pdb). The prefix for the topology file follows after the rtdb name specified in the start directive in the input (i.e. "start etl"), while the names for the restart and pdb files were specified explicitly in the input file. In the absence of the explicit write statement for the restart file, it would be generated under the name "etl_md.rst". The pdb file would only be written in the presence of the explicit write statement.

Tip: It is strongly recommended practice to check the correctness of the generated pdb file versus the original "source" pdb file to catch possible errors in the formatting of the pdb and fragment files.

Here is more complicated example where the entire ethanol molecule is declared quantum and solvated in a box of spce waters:

title "Prepare QM/MM calculation of solvated ethanol"
start etl
source etl0.pdb
new_top new_seq
#center and orient prior to solvation
#solvation in 1 nm by 2 nm by 3 nm box
solvate box 1.0 2.0 3.0
#the whole ethanol is declared quantum now 
modify segment 1 quantum

update lists
write etl_ref.rst
write etl_ref.pdb
task prepare

Fixing atoms outside a certain distance from the QM region can also be accomplished using prepare module. These constraints will then be permanently embedded in the resulting restart file, which may be advantageous for certain types of QM/MM simulations. The actual format for the constraint directive to fix whole residues is

fix segments beyond  <real radius>  <integer residue number>:<string atom name>

or to fix on atom basis

fix atoms beyond  <real radius>  <integer residue number>:<string atom name>

The following input file illustrated the use of fix segments directive

start etl 

source etl0.pdb
new_top new_seq
#solvation in 40 A cubic box
solvate cube 4.0
modify segment 1 quantum
#fix residues more than 20 A away from ethanol oxygen atom
fix segments beyond 2.0 1:_O
update lists
write etl_ref.rst
write etl_ref.pdb
task prepare

Molecular Mechanics Parameters

The molecular mechanics parameters are given in the form of standard MD input block as used by the MD module. This input block is required for QM/MM simulations. It specifies the restart and topology file that will be used in the calculation. It also contains information relevant to the calculation of the classical region (e.g. cutoff distances, constraints, optimization and dynamics parameters, etc) in the system. In this input block one can also set fixed atom constraints on both classical and quantum atoms. Continuing with our example for ethanol molecule here is a simple input block that may be used for this system.

# this specifies that etl_md.rst will be used as a restart file
#  and will be a topology file
  system etl_md
# if we ever wanted to fix C1 atom 
  fix solute 1 _C1

Quantum Mechanical Parameters

The parameters defining calculation of the QM region (including basis sets) must be present in the traditional NWChem input format except for the geometry block. The geometrical information will be constructed automatically using information contained in the restart file.

QM/MM interface parameters

The QM/MM interface parameters define the interaction between classical and quantum regions. The input follows standard NWChem format:

 [ eref <double precision default 0.0d0>]
 [ bqzone <double precision default 9.0d0>]
 [ mm_charges [exclude <(none||all||linkbond||linkbond_H) default none>]
          [ expand  <none||all||solute||solvent> default none]
          [ update  <integer default 0>] 
 [ link_atoms <(hydrogen||halogen) default hydrogen>]
 [ link_ecp  <(auto||user) default auto>]
 [ region   < [region1]  [region2]  [region3] > ]
 [ method   [method1]  [method2]  [method3]  ]
 [ maxiter  [maxiter1] [maxiter2] [maxiter3] ]
 [ ncycles  < [number] default 1 > ]
 [ density  [espfit] [static] [dynamical] default dynamical ]

Detailed explanation of the subdirectives in the QM/MM input block is given below:

eref <double precision default 0.0d0>

This directive sets the relative zero of energy for the QM component of the system. The need for this directive arizes from different definitions of zero energy for QM and MM methods. Most QM methods define the zero of energy for the system as vacuum. The zero of energy for the MM system is by definition of most parameterized force fields the separated atom energy. Therefore in many cases the energetics of the QM system will likely overshadow the MM component of the system. This imbalance can be corrected by suitably chosen value of eref

cutoff <double precision default 9.0d0>

This directive defines the radius of the zone (in angstroms) around the quantum region where classical residues/segments will be allowed to interact with quantum region both electrostatically and through Van der Waals interactions. It should be noted that classical atoms interacting with quantum region via bonded interactions are always included (this is true even if bqzone is set to $0.0$). In addition, even if one atom of a given charged group is in the bqzone (residues are typically treated as one charged group) then the whole group will be included.

mm_charges [exclude <(none||all||linkbond||linkbond_H) default none>]
           [expand  <none||all||solute||solvent> default none]
           [update  <integer default 0>]

This directive controls treatment of classical point (MM) charges that are interacting with QM region. For most QM/MM applications the use of directive will be not be necessary. Its absence would be simply mean that all MM charges within the cuttof distance ( as specified by cutoff ) as well those belonging to the charges groups directly bonded to QM region will be allowed to interact with QM region.

Keyword exclude specifies the subset MM charges that will be specifically excluded from interacting with QM region.

  • none default value reverts to the original set of MM charges as described above.
  • all excludes all MM charges from interacting with QM region ("gas phase" calculation).
  • linkbond excludes MM charges that are connected to a quantum region by at most two bonds,
  • linkbond_H similar to linkbond but excludes only hydrogen atoms.

Keyword expand expands the set MM charges interacting with QM region beyond the limits imposed by cutoff value.

  • none default value reverts to the original set of MM charges
  • solute expands electrostatic interaction to all solute MM charges
  • solvent expands electrostatic interaction to all solvent MM charges
  • all expands electrostatic interaction to all MM charges

Keyword update specifies how often list of MM charges will be updated in the course of the calculation. Default behavior is not to update.

link_atoms <(hydrogen||halogen) default halogen>

This directive controls the treatment of bonds crossing the boundary between quantum and classical regions. The use of hydrogen keyword will trigger truncation of such bonds with hydrogen link atoms. The position of the hydrogen atom will be calculated from the coordinates of the quantum and classical atom of the truncated bond using the following expression

\mathbf{R}_{hlink} = (1-g)\mathbf{R}_{quant} + g*\mathbf{R}_{class}

where g is the scale factor set at 0.709

Setting link_atoms to halogen will result in the modification of the quantum atom of the truncated bond to to the fluoride atom. This fluoride atom will typically carry an effective core potential (ECP) basis set as specified in link_ecp directive.

link_ecp  <(auto||user) default auto>

This directive specifies ECP basis set on fluoride link atoms. If set to auto the ECP basis set given by Zhang, Lee, Yang for 6-31G* basis.35.2 will be used. Strictly speaking, this implies the use of 6-31G* spherical basis as the main basis set. If other choices are desired then keyword user should be used and ECP basis set should be entered separatelly following the format given in section 8. The name tag for fluoride link atoms is F_L.

region  < [region1]  [region2]  [region3] >

This directive specifies active region(s) for optimization, dynamics, and frequency calculations. Up to three regions can be specified, those are limited to

  • "qm" - all quantum atoms
  • "qmlink" - quantum and link atoms,
  • "mm_solute" - all classical atoms excluding link atoms
  • "solvent" all solvent atoms
  • "solute" - all solute atoms quantum or not,
  • "mm" all classical solute and solvent atoms, excluding link atoms
  • "all" all atoms

Only the first region will be used in dynamics and frequency calculations. In the geometry optimizations, all three regions will be optimized in succession possibly with different optimization algorithms (see directives method, maxiter, ncycles below).

method   [method1]  [method2]  [method3]

This directive controls which optimization algorithm will be used for the regions as defined by regions directive. The allowed values are "bfgs" aka driver, "lbfgs" limited memory version of quasi-newton, and "sd" simple steepest descent algorithm. The "bfgs" is the most expensive algorithm and is not recomended for more than 100 particles (this essentially restricts its usage only to "qm" or "qmlink" regions). The "lbfgs" algorithm is recomended for both small and large regions and should be used whenever is possible (in many cases it outperforms "bfgs"). Finally "sd" the most ineficient and slow way to optimize regions, yet it is the only option available for the optimization of solvent regions. The default is to assign "sd" to optimization involving solvent region (if any), and "lbfgs" to all others.

maxiter  [maxiter1] [maxiter2] [maxiter3]

This directive controls maximum number of iterations for the optimizations of regions as defined by by regions directive.

density  [espfit] [static] [dynamical] default dynamical

This directive controls the way electron density of the quantum region is treated when calculating electrostatic interactions with classical charges. This directive affects QM/MM calculations where both quantum and link atoms are inactive (i.e. active region is "mm_solute","solvent", or "mm" ). If set to "espfit" electron density of the quantum region will be approximated to by point charges fitted by ESP procedure. This can dramatically speed up both dynamical and optimization tasks that involve purely classical regions. Note that "espfit" option does require movecs file which for example can be obtained by running qmmm energy calculation prior to optimization. The "static" option will freeze the electron density when quantum and link atoms are inactive. It is more accurate than "espfit" option but also more expensive. Finally, the default "dynamical" option means that the electron density is treated the normal way throught the solution of Schrodinger equation. All calculations that involve any geometry changes in quantum or links atoms will automatically use this option.

ncycles  < [number] default 1 >

This directive controls the number of optimization cycles where the defined regions will be optimized in succession. See also etol

load  < esp > [<filename>]

This directive instructs to load external file (located in permanent directory) containing esp charges for QM region. If filename is not provided it will be constructed from the name of the restart file by replacing ".rst" suffix with ".esp". Note that file containing esp charges is always generated whenever esp charge calculation is performed,

convergence  < double precision etol default 1.0d-4>

This directive controls convergence of geometry optimization. The optimization is deemed converged if absolute difference in total energies between consequitive optimization cycles becomes less than etol.