Bug during b3lyp optimization

From NWChem


Clicked A Few Times
Threads 5
Posts 15
Hi...
I'm using NWChem-6.6 with all 20 patches applied in the order in which they appeared. I then ran a simple B3LYP optimization of tartaric acid, and this is what I got:



INPUT FILE:


echo

start molecule

title "Tartaric Acid"
charge 0

geometry units angstroms print xyz autosym
  O    -2.62014    -0.94784     0.24280
  C    -2.46312     0.24289     0.00934
  C    -1.15883     0.84386    -0.49410
  C    -0.01246     0.26157     0.34196
  C     1.29193     0.86272    -0.16104
  O     1.44858     2.05331    -0.39548
  O     2.25658    -0.07054    -0.34014
  H     1.85536    -0.95279    -0.13882
  O     0.12523    -1.16790     0.20133
  H    -0.75972    -1.56248     0.38148
  H    -0.10589     0.49775     1.40729
  O    -1.29640     2.27336    -0.35357
  H    -0.41151     2.66785    -0.53421
  H    -1.06561     0.60754    -1.55942
  O    -3.42756     1.17628     0.18891
  H    -3.02624     2.05849    -0.01236
end

basis
 * library 6-31G*
end

dft
 xc b3lyp
 mult 1
end

task dft optimize



OUTPUT FILE:

          Memory utilization after 1st SCF pass:
          Heap Space remaining (MW):       94.38            94384542
          Stack Space remaining (MW):     123.37           123366596

  convergence     iter       energy        DeltaE   RMS-Dens  Diis-err   time
---------------- ----- ----------------- --------- --------- --------- ------
d= 0,ls=0.0,diis     1   -607.3859414149 -1.18D+03  5.59D-05  4.26D-05  577.7
d= 0,ls=0.0          2               NaN       NaN  2.53D-03            582.7
d= 0,ls=0.0          3   -607.3759831778       NaN  2.82D-03            587.3
d= 0,ls=0.0,diis     4   -607.3844754988 -8.49D-03  5.73D-04  1.27D-02  592.1
d= 0,ls=0.0,diis     5   -607.3859506493 -1.48D-03  1.31D-05  1.71D-06  597.0
d= 0,ls=0.0,diis     6   -607.3859504144  2.35D-07  7.78D-06  4.98D-06  602.3

                         DFT ENERGY GRADIENTS

   atom             coordinates                        gradient
              x          y          z           x          y          z
   1 O   -2.722163   3.965332  -0.096206   -0.000242   0.000026        NaN
   2 C   -0.446187   3.626429  -0.151085    0.000184   0.000036        NaN
   3 C    0.710903   1.039053  -0.747258   -0.000400   0.000557  -0.000091
   4 C   -0.710903  -1.039053   0.747258    0.000400  -0.000557   0.000091
   5 C    0.446187  -3.626429   0.151085         NaN        NaN        NaN
   6 O    2.722163  -3.965332   0.096206         NaN        NaN        NaN
   7 O   -1.211068  -5.462902  -0.290552   -0.000279  -0.000203   0.000299
   8 H   -2.895660  -4.683507  -0.283496    0.000333  -0.000066  -0.000046
   9 O   -3.301454  -1.143834   0.124106    0.000086   0.000459   0.000076
  10 H   -3.915921   0.608361   0.051944   -0.000107  -0.000061  -0.000155
  11 H   -0.420588  -0.683736   2.779197   -0.000385   0.000178  -0.000100
  12 O    3.301454   1.143834  -0.124106   -0.000086  -0.000459  -0.000076
  13 H    3.915921  -0.608361  -0.051944    0.000107   0.000061   0.000155
  14 H    0.420588   0.683736  -2.779197    0.000385  -0.000178   0.000100
  15 O    1.211068   5.462902   0.290552    0.000279   0.000203  -0.000299
  16 H    2.895660   4.683507   0.283496   -0.000333   0.000066   0.000046

----------------------------------------
|  Time  |  1-e(secs)   |  2-e(secs)   |
----------------------------------------
|  CPU   |       0.02   |       4.04   |
----------------------------------------
|  WALL  |       0.05   |       4.13   |
----------------------------------------

 Step       Energy      Delta E   Gmax     Grms     Xrms     Xmax   Walltime
 ---- ---------------- -------- -------- -------- -------- -------- --------
@   9    -607.38596631 -2.4D-05      NaN      NaN  0.00724  0.01978   1358.3



AT THE END, THE OUTPUT FILE SHOWS:

    Type    Name     I    J    K    L    M       Value  Gradient
--- ------- ----  ---- ---- ---- ---- ---- ----------- ---------
  1 Stretch          1    2                    1.21802       NaN
  2 Stretch          2    3                    1.53268       NaN
  3 Stretch          2   15                    1.32972       NaN
  4 Stretch          3    4                    1.54947       NaN
  5 Stretch          3   12                    1.41105       NaN
  6 Stretch          3   14                    1.10233       NaN
  7 Stretch          4    5                    1.53268       NaN
  8 Stretch          4    9                    1.41105       NaN
  9 Stretch          4   11                    1.10233       NaN
 10 Stretch          5    6                    1.21802       NaN
 11 Stretch          5    7                    1.32972       NaN
 12 Stretch          7    8                    0.98224       NaN
 13 Stretch          9   10                    0.98333       NaN
 14 Stretch         12   13                    0.98333       NaN
 15 Stretch         15   16                    0.98224       NaN
 16 Bend             1    2    3             122.10493       NaN
 17 Bend             1    2   15             122.70731       NaN
 18 Bend             2    3    4             109.57005       NaN
 19 Bend             2    3   12             107.75300       NaN
 20 Bend             2    3   14             107.30628       NaN
 21 Bend             2   15   16             106.91872       NaN
 22 Bend             3    2   15             115.18741       NaN
 23 Bend             3    4    5             109.57005       NaN
 24 Bend             3    4    9             112.35556       NaN
 25 Bend             3    4   11             108.00757       NaN
 26 Bend             3   12   13             107.05434       NaN
 27 Bend             4    3   12             112.35556       NaN
 28 Bend             4    3   14             108.00757       NaN
 29 Bend             4    5    6             122.10493       NaN
 30 Bend             4    5    7             115.18741       NaN
 31 Bend             4    9   10             107.05434       NaN
 32 Bend             5    4    9             107.75300       NaN
 33 Bend             5    4   11             107.30628       NaN
 34 Bend             5    7    8             106.91872       NaN
 35 Bend             6    5    7             122.70731       NaN
 36 Bend             9    4   11             111.71909       NaN
 37 Bend            12    3   14             111.71909       NaN
 38 Torsion          1    2    3    4        -44.85103       NaN
 39 Torsion          1    2    3   12       -167.37631       NaN
 40 Torsion          1    2    3   14         72.17665       NaN
 41 Torsion          1    2   15   16        175.09818       NaN
 42 Torsion          2    3    4    5       -180.00000       NaN
 43 Torsion          2    3    4    9         60.25711       NaN
 44 Torsion          2    3    4   11        -63.41412       NaN
 45 Torsion          2    3   12   13        164.41952       NaN
 46 Torsion          3    2   15   16         -5.11401       NaN
 47 Torsion          3    4    5    6         44.85103       NaN
 48 Torsion          3    4    5    7       -135.35975       NaN
 49 Torsion          3    4    9   10        -43.62507       NaN
 50 Torsion          4    3    2   15        135.35975       NaN
 51 Torsion          4    3   12   13         43.62507       NaN
 52 Torsion          4    5    7    8          5.11401       NaN
 53 Torsion          5    4    3   12        -60.25711       NaN
 54 Torsion          5    4    3   14         63.41412       NaN
 55 Torsion          5    4    9   10       -164.41952       NaN
 56 Torsion          6    5    4    9        167.37631       NaN
 57 Torsion          6    5    4   11        -72.17665       NaN
 58 Torsion          6    5    7    8       -175.09818       NaN
 59 Torsion          7    5    4    9        -12.83447       NaN
 60 Torsion          7    5    4   11        107.61258       NaN
 61 Torsion          9    4    3   12        180.00000       NaN
 62 Torsion          9    4    3   14        -56.32877       NaN
 63 Torsion         10    9    4   11         77.95091       NaN
 64 Torsion         11    4    3   12         56.32877       NaN
 65 Torsion         11    4    3   14        180.00000       NaN
 66 Torsion         12    3    2   15         12.83447       NaN
 67 Torsion         13   12    3   14        -77.95091       NaN
 68 Torsion         14    3    2   15       -107.61258       NaN

!! There are insufficient internal variables: expected    56 got    42
!! Either AUTOZ failed or your geometry has changed so much that the
!! coordinates should be regenerated.
geom_binvr: #indep variables incorrect 5600042
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
There is an error related to the specified geometry
------------------------------------------------------------------------
For more information see the NWChem manual at http://www.nwchem-sw.org/index.php/NWChem_Documentation


For further details see manual section:
2:2:geom_binvr: #indep variables incorrect:: 5600042
(rank:2 hostnamec-0 pid:2240):ARMCI DASSERT fail. ../../ga-5-4/armci/src/common/armci.c:ARMCI_Error():208 cond:0
Last System Error Message from Task 2:: Numerical result out of range
------------------------------------------------------------------------
geom_binvr: #indep variables incorrect 5600042
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
36: task dft optimize
------------------------------------------------------------------------
------------------------------------------------------------------------
There is an error related to the specified geometry
------------------------------------------------------------------------
For more information see the NWChem manual at http://www.nwchem-sw.org/index.php/NWChem_Documentation


For further details see manual section:
0:0:geom_binvr: #indep variables incorrect:: 5600042
(rank:0 hostnamec-0 pid:2238):ARMCI DASSERT fail. ../../ga-5-4/armci/src/common/armci.c:ARMCI_Error():208 cond:0
Last System Error Message from Task 0:: Numerical result out of range
geom_binvr: #indep variables incorrect             5600042
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
There is an error related to the specified geometry
------------------------------------------------------------------------
For more information see the NWChem manual at http://www.nwchem-sw.org/index.php/NWChem_Documentation


For further details see manual section:
1:1:geom_binvr: #indep variables incorrect:: 5600042
(rank:1 hostnamec-0 pid:2239):ARMCI DASSERT fail. ../../ga-5-4/armci/src/common/armci.c:ARMCI_Error():208 cond:0
Last System Error Message from Task 1:: Numerical result out of range


MPI_ABORT was invoked on rank 2 in communicator MPI COMMUNICATOR 4 DUP FROM 0
with errorcode 5600042.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.


geom_binvr: #indep variables incorrect             5600042
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
There is an error related to the specified geometry
------------------------------------------------------------------------
For more information see the NWChem manual at http://www.nwchem-sw.org/index.php/NWChem_Documentation


For further details see manual section:
3:3:geom_binvr: #indep variables incorrect:: 5600042
(rank:3 hostnamec-0 pid:2241):ARMCI DASSERT fail. ../../ga-5-4/armci/src/common/armci.c:ARMCI_Error():208 cond:0
Last System Error Message from Task 3:: Numerical result out of range


mpirun has exited due to process rank 2 with PID 2240 on
node pc-0 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).


[pc-0:02237] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
[pc-0:02237] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
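
In a log as long as this one, it helps to know where the first NaN appears, since everything after it is contaminated. The following is only a minimal sketch in Python, not part of NWChem: the function names `find_nan_lines` and `scan_file` are hypothetical, and it assumes NaN is printed as a standalone token, as in the output above.

```python
import re

# Match "NaN" only as a standalone token (any letter case), so that
# ordinary words containing "nan" are not reported.
NAN_TOKEN = re.compile(r"(?<![A-Za-z0-9_])nan(?![A-Za-z0-9_])", re.IGNORECASE)


def find_nan_lines(text):
    """Return (line_number, line) pairs for every line containing a NaN token."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if NAN_TOKEN.search(line):
            hits.append((number, line.rstrip()))
    return hits


def scan_file(path):
    """Read an output file and return its NaN-containing lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        return find_nan_lines(handle.read())
```

Calling `scan_file` on the output file and looking at the first pair returned shows the earliest point where a NaN was printed; in the log above that would be the second SCF iteration, well before the geometry error is raised.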

Forum Vet
Threads 10
Posts 1643
I am not able to reproduce this bug.
Could you please re-run this case after adding the following section to your input file?

driver
  clear
end

Clicked A Few Times
Threads 5
Posts 15
An iteration does not show numbers
Hi...

I have added "clear", and this time NWChem completed the calculation successfully. However, one set of SCF iterations still shows the following:

          Memory utilization after 1st SCF pass:
          Heap Space remaining (MW):       94.38            94384542
          Stack Space remaining (MW):     123.37           123366596

  convergence     iter       energy        DeltaE   RMS-Dens  Diis-err   time
---------------- ----- ----------------- --------- --------- --------- ------
d= 0,ls=0.0,diis     1   -607.3600061220 -1.18D+03  2.02D-03  5.68D-02  316.6
d= 0,ls=0.0,diis     2   -607.3722989086 -1.23D-02  3.74D-04  1.63D-03  321.5
d= 0,ls=0.0          3               NaN       NaN  3.22D-03            325.4
d= 0,ls=0.0          4   -607.2664836937       NaN  6.41D-03            330.4
d= 0,ls=0.0,diis     5   -607.1845502595  8.19D-02  4.08D-03  2.10D+00  335.2
d= 0,ls=0.0,diis     6   -607.3724212559 -1.88D-01  1.79D-04  1.25D-03  339.8
d= 0,ls=0.0,diis     7   -607.3725381437 -1.17D-04  5.45D-05  1.71D-04  344.1
d= 0,ls=0.0,diis     8   -607.3725532933 -1.51D-05  1.65D-05  1.50D-05  349.0
d= 0,ls=0.0,diis     9   -607.3725545813 -1.29D-06  5.22D-06  1.61D-06  354.2
d= 0,ls=0.0,diis    10   -607.3725547231 -1.42D-07  1.68D-06  1.53D-07  358.6

TIME:
Total times  cpu:      996.5s     wall:     2067.2s

My CPU Information:

processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 19
model name : AMD A10-6800K APU with Radeon(tm) HD Graphics
stepping : 1
microcode : 0x6001119
cpu MHz : 2000.000
cache size : 2048 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 16
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bogomips : 8200.97
TLB size : 1536 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

(Processors 1, 2 and 3 report the same values as processor 0, except for core id 1, 2, 3; apicid 17, 18, 19; and initial apicid 1, 2, 3, respectively.)

My RAM: 16 GB

My OS: Ubuntu 14.04 64bit with gcc-4.8
I get the same result on Ubuntu 16.04 64bit with gcc-5.4

Is this a bug?...
Edited On 8:09:23 PM PDT - Mon, Oct 31st 2016 by Rintontin

Forum Vet
Threads 10
Posts 1643
Have you compiled NWChem yourself?
If this is the case, could you try to install the Ubuntu NWChem package and see if the NaN shows up with that binary, too?

Clicked A Few Times
Threads 5
Posts 15
For this particular tartaric acid case, the best result I get comes from the main source archive "Nwchem-6.6.revision27746-src.2015-10-20.tar.gz" with all 20 patches applied and "driver clear end" added to the input file, as you advised. With that I get only three NaN values and two empty entries in the Diis-err column, as shown above.


I also compiled the source archive "nwchem_6.6+r27746.orig.tar.bz2", and every calculation fails, even with "driver clear end" added to the input file.

Here is my "make.sh" file

export USE_NOFSCHECK=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export LARGE_FILES=TRUE
export ENABLE_COMPONENT=yes
export PYTHONHOME=/usr
export PYTHON_EXE=python
export PYTHONVERSION=2.7
export PYTHONPATH=/usr/lib/python2.7
export USE_PYTHONCONFIG=yes
export PYTHONCONFIGDIR=config-x86_64-linux-gnu
export PYTHONLIBTYPE=so
export USE_PYTHON64=yes
export USE_64TO32=yes
export CC=gcc
export FC=gfortran
export HAS_BLAS=yes
export BLAS_SIZE=4
export BLASOPT="-L/usr/lib -lopenblas"
export USE_MPI=yes
export USE_MPIF=yes
export USE_MPIF4=yes
export ARMCI_NETWORK=SOCKETS
export MPI_LOC=/usr/lib/openmpi
export MPI_LIB=$MPI_LOC/lib
export MPI_INCLUDE=$MPI_LOC/include
export LIBMPI="-L$MPI_LIB -lpthread -lmpi_f90 -lmpi_f77 -lmpi_cxx -lmpi"
export MRCC_THEORY=TRUE
export MRCC_METHODS=yes
cd $NWCHEM_TOP/src
make clean
make nwchem_config 2>&1 | tee make.nwchem_config.log
make 64_to_32
make
cd ../contrib
./getmem.nwchem
Edited On 1:35:34 PM PDT - Wed, Nov 2nd 2016 by Rintontin

Forum Vet
Threads 10
Posts 1643
I don't see any obvious problem with the settings you used to compile.

Let me try to repeat the message I was trying to convey yesterday.

To find out whether some component on your computer (hardware or software) is causing this failure, could you try the NWChem binary that is distributed with Ubuntu (you can install it with the command "sudo apt-get install nwchem") and see whether that binary shows the same problem?

Clicked A Few Times
Threads 5
Posts 15
I installed the binary with a sudo command and the problem persists.

Forum Vet
Threads 10
Posts 1643
Quote:Rintontin Nov 3rd 7:46 am
I installed the binary with a sudo command and the problem persists.

Could you upload the full input and output files to a public website?
Thanks

Clicked A Few Times
Threads 5
Posts 15
The following files were produced by the binary taken from Ubuntu's repository.

Input file:
tartaric-acid.nw

Output file:
tartaric-acid.out

Forum Vet
Threads 10
Posts 1643
Unfortunately, I am still not able to reproduce your failure.
I have built a new/clean Docker image of Ubuntu/14.04 and run the input file you provided with no problem.
The only suggestion I have at this point is to add the "direct" keyword to the dft input section. This might rule out (unlikely) I/O problems on your computer that could be causing the numerical problems you are experiencing.

Clicked A Few Times
Threads 5
Posts 15
Then my computer has the problem.

Thanks for your help...

Forum Vet
Threads 10
Posts 1643
Quote:Rintontin Nov 3rd 4:10 pm
Then my computer has the problem.

Thanks for your help...

That would be my conclusion, too.

Clicked A Few Times
Threads 5
Posts 15
The problem was in the RAM.

I was using ECC memory in a motherboard that does not support it.

Forum Vet
Threads 10
Posts 1643
Thanks for the feedback.
Memory errors could clearly explain the NaN results.
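
One cheap way to spot this kind of fault is to run the same deterministic computation several times and check that the results agree bit for bit. What follows is only a rough Python sketch of that idea under stated assumptions: the names `checksum_of_workload` and `repeated_runs_agree` are hypothetical, the workload is arbitrary, and the working set is tiny compared with 16 GB of RAM. A dedicated memory tester such as memtest86+ is a far more thorough check.

```python
import hashlib
import random
import struct


def checksum_of_workload(seed, size=200_000):
    """Run a deterministic floating-point workload and return a digest of it."""
    rng = random.Random(seed)  # same seed gives the same sequence of values
    values = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    digest = hashlib.sha256()
    total = 0.0
    for value in values:
        total += value * value
        # Hash the exact bit pattern of every partial sum.
        digest.update(struct.pack("<d", total))
    return digest.hexdigest()


def repeated_runs_agree(runs=5, seed=12345, size=200_000):
    """Return True when every repetition produces an identical digest."""
    digests = {checksum_of_workload(seed, size) for _ in range(runs)}
    return len(digests) == 1
```

On a healthy machine `repeated_runs_agree()` should always return True. A False result would point to unstable hardware, but a True result proves little, because a small test like this touches only a fraction of the memory.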

