modified on 7 January 2016 at 17:29 ••• 447,741 views

Benchmarks

From NWChem


Revision as of 11:50, 10 September 2010

For each benchmark we have, add:

  • Short description
  • Input deck
  • Graph



Parallel performance of the CR-EOMCCSD(T) method (triples part)

Creomccsd t.png


An example of the scalability of the triples part of the CR-EOMCCSD(T) approach for the Green Fluorescent Protein Chromophore (GFPC), described with the cc-pVTZ basis set (648 basis functions), as obtained with NWChem. Timings were determined from calculations on the Franklin Cray-XT4 computer system at NERSC. See the Media:input_gfpc.nw input file for details.

Timings of the CCSD/EOMCCSD runs for the oligoporphyrin dimer

Input file: Media:input_p2ta.nw

CCSD/EOMCCSD timings for the oligoporphyrin dimer (942 basis functions, 270 correlated electrons, D2h symmetry). Excited-state calculations were performed for a state of b1g symmetry. In all test calculations the convergence threshold was relaxed, and 1024 cores were used.

 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
   1   0.7187071521175  -7.9406033677717   640.9   807.7
   2   0.2324364531569  -7.7250622086466   650.5   826.0
   3   0.1141748336279  -8.0072740512529   661.1   823.7
   4   0.0688913795193  -7.9503011202597   650.2   822.7
   5   0.0467548207575  -8.0036868822419   669.7   846.9
 MICROCYCLE DIIS UPDATE: 5 5
   6   0.0099626203484  -7.9968580114622   661.4   823.7
   7   0.0072165320866  -7.9945157146832   661.6   824.4
   8   0.0047936300464  -7.9945034979815   648.3   820.2
   9   0.0053957873651  -7.9949925734659   730.8   828.5
   10   0.0047996568854  -7.9950283121291   687.0   825.5
 MICROCYCLE DIIS UPDATE: 10 5
  11   0.0009737920958  -7.9953441809574   691.1   822.2
 --------------------------------------------------------
 Iterations converged
 CCSD correlation energy / hartree =        -7.995344180957357
 CCSD total energy / hartree       =     -2418.570838364838890

 EOM-CCSD right-hand side iterations
 --------------------------------------------------------------
      Residuum       Omega / hartree  Omega / eV    Cpu    Wall
 --------------------------------------------------------------

Iteration   1 using    5 trial vectors
  0.7254630898708   0.2656229931076    7.22797  4471.5  5151.3

Iteration   2 using    6 trial vectors
  0.1584284659595   0.0882389635508    2.40111   865.3  1041.2

Iteration   3 using    7 trial vectors
  0.0575982107592   0.0810948687618    2.20670   918.0  1042.2
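The listing above can be summarized with a few lines of Python. The following is a minimal sketch, not part of NWChem: the regular expression, the function name `ccsd_wall_times`, and the hartree-to-eV factor (the CODATA 2018 value, which may differ slightly from the constant NWChem uses internally) are all assumptions made for illustration. It extracts the wall time of each CCSD iteration, skipping the DIIS status lines, and checks that the printed excitation energies in eV agree with the hartree values.

```python
import re

# Rows copied verbatim from the CCSD listing above (iter, residuum,
# correlation energy, CPU seconds, wall seconds), including DIIS status lines.
CCSD_LISTING = """\
   1   0.7187071521175  -7.9406033677717   640.9   807.7
   2   0.2324364531569  -7.7250622086466   650.5   826.0
   3   0.1141748336279  -8.0072740512529   661.1   823.7
   4   0.0688913795193  -7.9503011202597   650.2   822.7
   5   0.0467548207575  -8.0036868822419   669.7   846.9
 MICROCYCLE DIIS UPDATE: 5 5
   6   0.0099626203484  -7.9968580114622   661.4   823.7
   7   0.0072165320866  -7.9945157146832   661.6   824.4
   8   0.0047936300464  -7.9945034979815   648.3   820.2
   9   0.0053957873651  -7.9949925734659   730.8   828.5
  10   0.0047996568854  -7.9950283121291   687.0   825.5
 MICROCYCLE DIIS UPDATE: 10 5
  11   0.0009737920958  -7.9953441809574   691.1   822.2
"""

# Assumed row pattern: integer iteration number followed by four decimals.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+)\s+(-?\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$"
)


def ccsd_wall_times(listing):
    """Return the wall time (last column) of every iteration row."""
    walls = []
    for line in listing.splitlines():
        match = ROW.match(line)
        if match:  # DIIS status lines do not match and are skipped
            walls.append(float(match.group(5)))
    return walls


# Assumption: CODATA 2018 conversion factor, not necessarily NWChem's constant.
HARTREE_TO_EV = 27.211386245988

# (Omega / hartree, Omega / eV) pairs from the EOM-CCSD iterations above.
EOM_ROWS = [
    (0.2656229931076, 7.22797),
    (0.0882389635508, 2.40111),
    (0.0810948687618, 2.20670),
]

walls = ccsd_wall_times(CCSD_LISTING)
mean_wall = sum(walls) / len(walls)
conversion_errors = [abs(h * HARTREE_TO_EV - ev) for h, ev in EOM_ROWS]

print(f"{len(walls)} CCSD iterations, mean wall time {mean_wall:.1f} s")
print(f"largest eV conversion mismatch: {max(conversion_errors):.1e}")
```

The same pattern could be adapted to the EOM-CCSD rows, which carry no iteration number in the first column.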

Development codes: Performance of the GPGPU implementation of the Reg-CCSD(T) method

Gpu scaling spiro.png

Scalability of the triples part of the Reg-CCSD(T) approach for the Spiro cation described by Sadlej's TZ basis set (POL1). The calculations were performed on the Barracuda cluster at EMSL.

Gpu speedup uracil.png

Speedup of GPU over CPU for the (T) part of the Reg-CCSD(T) approach as a function of the tilesize. The calculations were performed for the uracil molecule.

Development codes: iterative CCSD and EOMCCSD implementations based on alternative task schedulers

Ccsd eomccsd new.png

Comparison of the CCSD/EOMCCSD iteration times for BacterioChlorophyll (BChl) for various tilesizes. Calculations were performed with the 3-21G basis set (503 basis functions, C1 symmetry, 240 correlated electrons, 1020 cores).

Bchl 6 311G ccsd.png

Time per CCSD iteration for BChl in the 6-311G basis set (733 basis functions, C1 symmetry, 240 correlated electrons, 1020 cores) as a function of tilesize.

Ccsd scaling ic.png

Scalability of the CCSD code for BChl in the 6-311G basis set (733 basis functions, tilesize=40, C1 symmetry, 240 correlated electrons).
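Scalability plots of this kind are usually read in terms of speedup and parallel efficiency relative to the smallest run. The sketch below shows one common way to compute both from per-iteration wall times. The function name `scaling_table` is hypothetical, and the core counts and timings in the example are placeholders chosen for illustration only; they are not measurements from the BChl benchmark.

```python
def scaling_table(cores, wall_times):
    """Return (cores, speedup, efficiency) tuples relative to the first entry.

    speedup    = t_base / t
    efficiency = speedup / (cores / cores_base)
    """
    base_cores, base_time = cores[0], wall_times[0]
    rows = []
    for p, t in zip(cores, wall_times):
        speedup = base_time / t
        efficiency = speedup / (p / base_cores)
        rows.append((p, speedup, efficiency))
    return rows


# Placeholder inputs (NOT benchmark data): seconds per iteration at each core count.
example = scaling_table([256, 512, 1024], [400.0, 220.0, 125.0])

for p, speedup, efficiency in example:
    print(f"{p:5d} cores  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency close to 1 indicates near-ideal strong scaling; values well below 1 at higher core counts suggest that communication or load imbalance is starting to dominate.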