modified on 7 January 2016 at 17:29 ••• 452,133 views

Benchmarks

From NWChem


Revision as of 12:01, 10 September 2010

For each benchmark we have, add:

  • Short description
  • Input deck
  • Graph


Parallel performance of the CR-EOMCCSD(T) method (triples part)

Creomccsd t.png


An example of the scalability of the triples part of the CR-EOMCCSD(T) approach for the Green Fluorescent Protein Chromophore (GFPC), described by the cc-pVTZ basis set (648 basis functions), as obtained with NWChem. Timings were determined from calculations on the Franklin Cray-XT4 computer system at NERSC. See the Media:input_gfpc.nw input file for details.
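Scalability plots of this kind are usually summarized as speedup and parallel efficiency relative to the smallest run. The following is a minimal sketch of that arithmetic, not part of NWChem; the function name `parallel_efficiency` is hypothetical, and the core counts and wall times are placeholders, not measurements from the GFPC benchmark.

```python
def parallel_efficiency(cores, wall_times):
    """Return (speedups, efficiencies) relative to the first (smallest) run.

    speedup_i    = t_0 / t_i
    efficiency_i = speedup_i * cores_0 / cores_i
    """
    base_cores, base_time = cores[0], wall_times[0]
    speedups = [base_time / t for t in wall_times]
    efficiencies = [s * base_cores / c for s, c in zip(speedups, cores)]
    return speedups, efficiencies


# Placeholder values for illustration only, NOT taken from the benchmark.
cores = [1000, 2000, 4000]
wall_times = [800.0, 420.0, 230.0]

speedups, efficiencies = parallel_efficiency(cores, wall_times)
for c, s, e in zip(cores, speedups, efficiencies):
    print(f"{c:6d} cores  speedup {s:5.2f}  efficiency {e:6.1%}")
```

An efficiency close to 100% means that doubling the core count roughly halves the wall time.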

Timings of CCSD/EOMCCSD for the oligoporphyrin dimer

CCSD/EOMCCSD timings for the oligoporphyrin dimer (942 basis functions, 270 correlated electrons, D2h symmetry). Excited-state calculations were performed for a state of b1g symmetry. In all test calculations the convergence threshold was relaxed and 1024 cores were used. See the Media:input_p2ta.nw input file for details.

 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
   1   0.7187071521175  -7.9406033677717   640.9   807.7
   2   0.2324364531569  -7.7250622086466   650.5   826.0
   3   0.1141748336279  -8.0072740512529   661.1   823.7
   4   0.0688913795193  -7.9503011202597   650.2   822.7
   5   0.0467548207575  -8.0036868822419   669.7   846.9
 MICROCYCLE DIIS UPDATE: 5 5
   6   0.0099626203484  -7.9968580114622   661.4   823.7
   7   0.0072165320866  -7.9945157146832   661.6   824.4
   8   0.0047936300464  -7.9945034979815   648.3   820.2
   9   0.0053957873651  -7.9949925734659   730.8   828.5
  10   0.0047996568854  -7.9950283121291   687.0   825.5
 MICROCYCLE DIIS UPDATE: 10 5
  11   0.0009737920958  -7.9953441809574   691.1   822.2
 --------------------------------------------------------
 Iterations converged
 CCSD correlation energy / hartree =        -7.995344180957357
 CCSD total energy / hartree       =     -2418.570838364838890

 EOM-CCSD right-hand side iterations
 --------------------------------------------------------------
      Residuum       Omega / hartree  Omega / eV    Cpu    Wall
 --------------------------------------------------------------

Iteration   1 using    5 trial vectors
  0.7254630898708   0.2656229931076    7.22797  4471.5  5151.3

Iteration   2 using    6 trial vectors
  0.1584284659595   0.0882389635508    2.40111   865.3  1041.2

Iteration   3 using    7 trial vectors
  0.0575982107592   0.0810948687618    2.20670   918.0  1042.2
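The per-iteration timings in output like the above can be summarized with a short script. The sketch below assumes the five-column CCSD iteration layout shown here (iteration, residuum, correlation energy, CPU, wall). The helper names `parse_ccsd_iterations` and `mean_wall_time` are hypothetical and not part of NWChem. It also checks the hartree to eV conversion used in the EOM-CCSD table, taking 1 hartree as approximately 27.211386 eV.

```python
import re

HARTREE_TO_EV = 27.211386  # approximate conversion factor, eV per hartree

# One CCSD iteration line: iter, residuum, correlation energy, cpu, wall.
# EOM-CCSD rows and "MICROCYCLE DIIS UPDATE" lines do not match this pattern.
CCSD_LINE = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+)\s+(-?\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$"
)


def parse_ccsd_iterations(text):
    """Return a list of (iter, residuum, correlation, cpu, wall) tuples."""
    rows = []
    for line in text.splitlines():
        m = CCSD_LINE.match(line)
        if m:
            rows.append((int(m.group(1)),) + tuple(float(g) for g in m.groups()[1:]))
    return rows


def mean_wall_time(rows):
    """Average wall time per CCSD iteration, in the units of the log."""
    return sum(r[4] for r in rows) / len(rows)


# First two iterations copied from the output above.
sample = """\
   1   0.7187071521175  -7.9406033677717   640.9   807.7
   2   0.2324364531569  -7.7250622086466   650.5   826.0
 MICROCYCLE DIIS UPDATE: 5 5
"""

rows = parse_ccsd_iterations(sample)
print(f"{len(rows)} iterations, mean wall time {mean_wall_time(rows):.2f}")

# Omega from EOM-CCSD iteration 3, converted from hartree to eV.
print(f"Omega = {0.0810948687618 * HARTREE_TO_EV:.5f} eV")
```

Feeding the full CCSD table to `parse_ccsd_iterations` gives the average cost per iteration, which is the quantity most useful for comparing runs with different tilesizes or core counts.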

Current developments for high accuracy: GPGPU and alternative task schedulers

Various development efforts are currently underway for high accuracy methods that will be available in future releases of NWChem. The examples below show the first performance results for the triples part of Reg-CCSD(T) on GPGPUs (left two examples) and for alternative task schedulers applied to the iterative CCSD and EOMCCSD methods.