CCSD(T) Calculation with Quadruple Zeta Basis Set -- Memory Issue


Clicked A Few Times
Threads 1
Posts 8
Hello NWChem developers,

I am trying to run some CCSD(T) energy calculations with a quadruple-zeta basis set on a 5-atom system, but the memory requirement implied by the 2-e file size seems to be off the charts (> 100 GB). The input file reads:

Quote:username

echo
memory stack 1300 mb heap 200 mb global 1500 mb

start im

title "im"
charge 1

geometry units angstroms print xyz noautosym noautoz
C                  -2.23423902     0.59425408    -0.03224283
O                  -1.12129315     1.09129114    -0.09445519
O                  -3.30588587     0.19083810     0.02028232
Br                  1.41553615    -0.39477191     0.02227492
H                  -0.18608027     0.45084374    -0.04234683
end

basis
C  library aug-cc-pvqz
H library aug-cc-pvqz
O library aug-cc-pvqz
# BASIS SET: (15s,12p,13d,3f,2g) -> [7s,6p,5d,3f,2g]
Br S
 78967.5000000              0.0000280             -0.0000110
 11809.7000000              0.0002140             -0.0000860
  2687.1400000              0.0010560             -0.0004350
   760.0360000              0.0036880             -0.0014570
   241.8110000              0.0079340             -0.0033810
    38.4914000              0.1528680             -0.0576580
    24.0586000             -0.2786020              0.1123250
    14.3587000             -0.2188500              0.0756730
... (to keep it short)
end

ECP
Br nelec 10
Br ul
2 1.0000000 0.0000000
Br S
...
end

scf
 doublet
THRESH 1.0e-5
MAXITER 100
TOL2E 1e-7
end

tce
 ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
end

task tce


The error message reads
Quote:username

2-e (intermediate) file size =    106977507300
2-e (intermediate) file name = ./im.v2i
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
createfile: failed ga_create size/nproc bytes 5348875365
------------------------------------------------------------------------
------------------------------------------------------------------------


I could increase the memory options at the beginning of the file, but it seems unrealistic to have GA as large as 100 GB on the nodes I am using (two Intel Xeon E5-2680v2 "Ivy Bridge" 10-core, 2.8 GHz processors, i.e. 20 cores total, and 128 GB of memory, about 6.8 GB per core).
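
For scale, a rough bash sketch (treating the run as the 20 processes of one node is my assumption, and NWChem's mb as MiB):

# how the reported 2-e file size compares with the GA space on one 20-core node
file_size=106977507300                      # bytes, from the error output above
echo "scale=1; $file_size/1024^3" | bc      # ~99.6 GiB in total
echo "scale=1; 1500*20/1024" | bc           # ~29 GiB aggregate GA with global 1500 mb x 20 cores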

I have also tried different "IO" and "2emet" options, for example,
Quote:username

tce
 ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
2eorb
2emet 13
tilesize 10
attilesize 40
end
set tce:xmem 100

and
Quote:username
tce
 tilesize 2
io ga
2EORB
2EMET 15
idiskx 1
ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
end

but the job seems to hang after printing "v2 file size = ".

Any insight on this issue is greatly appreciated!

Thank you in advance,
Rui

Forum Vet
Threads 9
Posts 1522
createfile: failed ga_create size/nproc bytes          5348875365

5348875365 bytes / 1024 / 1024 / 1024 = 4.98 GB

Please change the memory line to


memory stack 1300 mb heap 200 mb global 6000 mb
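
The arithmetic behind this, as a quick bash check (treating NWChem's mb as MiB):

echo "scale=2; 5348875365/1024^3" | bc   # ~4.98 GiB of GA needed per process for the v2i array
echo $((1300 + 200 + 6000))              # 7500 MB total per process with the memory line above

With global 6000 mb each process can hold its ~5 GiB share of the 2-e intermediate file.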

Clicked A Few Times
Threads 1
Posts 8
Thank you for the prompt response, Edoapra.

I had to adjust the memory to
Quote:username
memory stack 1000 mb heap 100 mb global 5300 mb

so that it does not exceed the per-core memory (6.8 GB),

but now I run into the following error:
Quote:username
slurmstepd: error: Step 3840722.0 exceeded memory limit (123363455 > 122880000), being killed
slurmstepd: error: Step 3840722.0 exceeded memory limit (123618673 > 122880000), being killed
slurmstepd: error: Step 3840722.0 exceeded memory limit (123451708 > 122880000), being killed
slurmstepd: error: *** STEP 3840722.0 ON prod2-0143 CANCELLED AT 2018-06-20T04:05:00 ***
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: Exceeded job memory limit
srun: Job step aborted: Waiting up to 122 seconds for job step to finish.
srun: error: prod2-0148: tasks 100-119: Killed
srun: error: prod2-0150: tasks 140-159: Killed
srun: error: prod2-0149: tasks 120-139: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/156552/smaps
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/135960/smaps
srun: error: prod2-0145: tasks 41,43,45,47,49,51,53,55,57,59: Killed
srun: error: prod2-0146: tasks 63,65,69,71,75,77,79: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/234980/smaps
srun: error: prod2-0145: tasks 40,42,44,46,48,50,52,54,56,58: Killed
srun: error: prod2-0146: tasks 61,67,73: Killed
srun: error: prod2-0143: tasks 0-19: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/77821/smaps
srun: error: prod2-0146: tasks 60,62,64,66,68,70,72,74,76,78: Killed
srun: error: prod2-0144: tasks 20-39: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/17624/smaps
srun: error: prod2-0147: tasks 80-99: Killed
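
If the slurmstepd numbers are KiB, a rough per-node accounting seems to explain the kill (bash sketch; 20 tasks per node as in my job script):

echo $((1000 + 100 + 5300))     # 6400 MB per NWChem process (stack + heap + global)
echo $((6400 * 20))             # 128000 MB requested by 20 tasks on one node
echo $((122880000 / 1024))      # 120000 MB step memory limit reported in the error
# 128000 MB requested > 120000 MB allowed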


I have also tried another memory allocation
Quote:username
memory stack 400 mb heap 100 mb global 6000 mb


and it yielded a different error
Quote:username
2-e (intermediate) file size = 107432197225
2-e (intermediate) file name = ./vim.v2i
tce_ao2e: MA problem k_ijkl 18
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
For more information see the NWChem manual at
http://www.nwchem-sw.org/index.php/NWChem_Documentation


For further details see manual section:


Currently I am using 160 cores. Do you think I should try more cores, so that the GA allocation on each core is smaller?
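
A back-of-the-envelope split of the v2i file over different process counts (bash sketch; this ignores whatever else the TCE keeps in global memory):

file_size=107432197225          # bytes, from the output above
for nproc in 160 200 400; do
  echo "$nproc cores: $(echo "scale=2; $file_size/$nproc/1024^3" | bc) GiB per core"
done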

Thank you very much,
Rui

Clicked A Few Times
Threads 1
Posts 8
More CPUs, still failed
In the hope of reducing the memory requirement on each core, I tested the job with 200 cores (up from 160). However, it seems the code still could not make the requested amount of GA memory available. For example, the memory line reads:
Quote:username
memory stack 900 mb heap 200 mb global 4300 mb

but the error message shows:
Quote:username
tce_ao2e: fast2e=1
half-transformed integrals in memory

2-e (intermediate) file size =    107432197225
2-e (intermediate) file name = ./vim.v2i
Cpu & wall time / sec 214.7 266.1
available GA memory 211394680 bytes
------------------------------------------------------------------------
createfile: failed ga_create size/nproc bytes 3079838825
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
129: task tce

even though the input file clearly requests 4300 MB of global memory per process.

Would you please let me know how to fix this?

Thank you,
Rui

Forum Vet
Threads 9
Posts 1522
Please report the TCE input block you are currently using and the number of processors.

Clicked A Few Times
Threads 1
Posts 8
The TCE input block reads:
Quote:username

tce
 ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
end


I am currently using 10 nodes with 20 cores per node; the memory per core is 6 GB. The job script reads:
Quote:username
#!/bin/bash
#SBATCH --job-name=vim
#SBATCH --partition=kill.q
#SBATCH --exclusive
#SBATCH --nodes=10
#SBATCH --tasks-per-node=20
#SBATCH --cpus-per-task=1
#SBATCH --error=%A.err
#SBATCH --time=0-10:59:59 ## time format is DD-HH:MM:SS
#SBATCH --output=%A.out

export I_MPI_FABRICS=shm:tmi
export I_MPI_PMI_LIBRARY=/opt/local/slurm/default/lib64/libpmi.so

source /global/opt/intel_2016/mkl/bin/mklvars.sh intel64

module load intel_2016/ics intel_2016/impi

export NWCHEM_TARGET=LINUX64
# CHANGE TO THE CORRECT PATH
export ARMCI_DEFAULT_SHMMAX=8096
export MPIRUN_PATH="srun"
export MPIRUN_NPOPT="-n"
export INPUT="vim"

$MPIRUN_PATH $MPIRUN_NPOPT ${SLURM_NTASKS} $NWCHEM_EXECUTABLE $INPUT.nw


Thank you!

Forum Vet
Threads 9
Posts 1522
Please try the following input
echo
permanent_dir /global/cscratch1/sd/apra/arar
memory stack 1300 mb heap 200 mb global 7000 mb
start im
title "im"
charge 1
geometry #units angstroms print xyz noautosym noautoz
 C                  -2.23423902     0.59425408    -0.03224283
 O                  -1.12129315     1.09129114    -0.09445519
 O                  -3.30588587     0.19083810     0.02028232
 Br                  1.41553615    -0.39477191     0.02227492
 H                  -0.18608027     0.45084374    -0.04234683
end
basis spherical
 C  library aug-cc-pvqz
 H  library aug-cc-pvqz
 O  library aug-cc-pvqz
 Br  library aug-cc-pvqz-pp
end
ECP
 Br  library aug-cc-pvqz-pp
end
scf
  doublet
  THRESH 1.0e-5
  MAXITER 100
  TOL2E 1e-12
end
tce
  ccsd(t)
  FREEZE atomic
  tilesize 8
  attilesize 12
  thresh 1e-6
  maxiter 100
end
task tce

Clicked A Few Times
Threads 1
Posts 8
Thank you, Edoapra.

Just want to make sure I understand this correctly.

I should try to use 200 cores and each core should allocate the following amount of memory?
Quote:username
memory stack 1300 mb heap 200 mb global 7000 mb

Forum Vet
Threads 9
Posts 1522
Quote:Srhhh Jun 20th 6:41 pm
Thank you, Edoapra.

Just want to make sure I understand this correctly.

I should try to use 200 cores and each core should allocate the following amount of memory?
Quote:username
memory stack 1300 mb heap 200 mb global 7000 mb


You should use only 10 tasks per node, for a total of 100 cores, since you mentioned that you have 6 GB/core.
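
A minimal sketch of the corresponding job-script changes (only the SBATCH directives already in your script change; the 128 GB/node figure is from your earlier post):

#SBATCH --nodes=10
#SBATCH --tasks-per-node=10     # 10 tasks/node instead of 20 -> ~12.8 GB available per task
#SBATCH --cpus-per-task=1
# memory stack 1300 mb heap 200 mb global 7000 mb  => 8.5 GB per task, ~85 GB per node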

Clicked A Few Times
Threads 1
Posts 8
Thank you, Edoapra.

I managed to get more cores (20 nodes, 400 cores), so I was able to run without the MA allocation issue using the following memory line:
Quote:username
memory stack 1000 mb heap 200 mb global 4400 mb


Everything else in the input file is identical to your previous comment. It took about 5 minutes for the calculation to reach
Quote:username
tce_ao2e: fast2e=1
half-transformed integrals in memory

2-e (intermediate) file size = 105684005025
2-e (intermediate) file name = ./vim.v2i
Cpu & wall time / sec 144.8 184.8

tce_mo2e: fast2e=1
2-e integrals stored in memory

but the calculation has been hanging there for over eight hours -- nothing has been written to the folder or the output file at all. I also noticed some vim.aoints.x files that do not seem to have been cleaned up properly. Is this behavior normal for a calculation of this size, or is this QZ calculation pushing the limits of NWChem?

Thanks again.

Clicked A Few Times
Threads 1
Posts 8
Unstable CCSD iterations
The test run in the previous comment actually reached the CCSD iterations (each iteration takes about 1 hour of wall time), but the iterations seem unstable. Please see below:

Quote:username
t2 file handle = -995

CCSD iterations
-----------------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall    V2*C2
-----------------------------------------------------------------
    1     0.3745619466040   -1.0830661992146   1975.7   3034.0   759.3
    2     0.3338130779425   -1.0377329617715   1992.9   3058.0   760.8
    3     7.2614902105214   -1.0607684520852   1991.8   3049.5   762.0
    4    60.1400573985661   -1.0597624893767   1986.2   3038.5   759.7
    5  1384.5956104600380   -1.0695691959406   1993.2   3050.9   765.8
 MICROCYCLE DIIS UPDATE:                    5     5


The geometry for this calculation was optimized at the CCSD(T)/aug-cc-pVTZ level, so this error should not come from a bad geometry.

Thank you!

Forum Vet
Threads 9
Posts 1522
Quote:Srhhh Jun 21st 5:52 pm
The test run in the previous comment actually went to the CCSD iterations part (each iteration takes about 1 hour wall time) but the iterations seem unstable.

Thank you!


Did you use a spherical or cartesian basis?

Clicked A Few Times
Threads 1
Posts 8
I was using a Cartesian basis and have now changed to spherical. I will update you on how this test goes.

Another problem just happened:

Quote:username
[25] Received an Error in Communication: (-991) 25:nga_get_common:cannot locate region: ./vim.r1.d1 [18591:18511 ,1:1 ]:
[212] Received an Error in Communication: (-991) 212:nga_get_common:cannot locate region: ./vim.r1.d1 [18526:18511 ,1:1 ]:
application called MPI_Abort(comm=0x84000000, -991) - process 212
[173] Received an Error in Communication: (-991) 173:nga_get_common:cannot locate region: ./vim.r1.d1 [18721:18511 ,1:1 ]:
application called MPI_Abort(comm=0x84000000, -991) - process 173
application called MPI_Abort(comm=0x84000000, -991) - process 25
[179] Received an Error in Communication: (-991) 179:nga_get_common:cannot locate region: ./vim.r1.d1 [18656:18511 ,1:1 ]:
application called MPI_Abort(comm=0x84000000, -991) - process 179
srun: error: prod2-0101: task 212: Exited with exit code 33
srun: error: prod2-0029: task 25: Exited with exit code 33
srun: error: prod2-0096: tasks 173,179: Exited with exit code 33


From what I can find online, this also seems to be memory-related (even though the MA test passed and the CCSD iterations started). Does DIIS require additional memory?

Thank you

Forum Regular
Threads 45
Posts 216
This calculation is very hardware-demanding. I have tried it with NWChem 6.8 on a Mac using aug-cc-pvdz, which gave:

...
Iterations converged
CCSD correlation energy / hartree = ...       
CCSD total energy / hartree = ...

Singles contributions

Doubles contributions
...
CCSD[T]  correction energy / hartree =       ...
CCSD[T] correlation energy / hartree = ...
CCSD(T) correction energy / hartree = ...
CCSD(T) correlation energy / hartree = ...
CCSD(T) total energy / hartree = ...


 ...
Edited On 10:22:07 PM PDT - Sat, Jul 14th 2018 by Xiongyan21

Forum Regular
Threads 45
Posts 216
I have also tried aug-cc-pvtz, which I think is adequate for many practical purposes, with ROHF and the other keywords added to the proper input groups.
I am afraid that the original aug-cc-pvqz calculation may only succeed on a high-performance supercomputer with an officially built NWChem, such as at a US national laboratory.

NWChem 6.8 on the Mac gave
...

Iterations converged
CCSD correlation energy / hartree =  ...     
CCSD total energy / hartree = ...

Singles contributions

 ...

Edited On 4:57:02 AM PDT - Mon, Jul 9th 2018 by Xiongyan21

