segmentation fault--running small test case, compiled with gfortran

I went for the first solution since e.g. Octave wants/needs atlas/blas.

I think I patched it ok:
diff nwchem-6.1_internal/src/tools/GNUmakefile nwchem-6.1_external/src/tools/GNUmakefile 
359c359
< MAYBE_BLAS = --without-blas
---
> MAYBE_BLAS = --with-blas8="$(strip $(BLAS_LIB))"

I patched it and then used the 'bad' makefile in my earlier post (sans python) to do a full compile on a freshly extracted copy. That didn't work.

Odd observation -- first time I run it (./nwchem test.nw) I get something about a missing file:
0:Segmentation Violation error, status=: 11
(rank:0 hostname:beryllium pid:7129):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0:: No such file or directory

If I re-run immediately afterwards, the missing-file message is gone:
0:Segmentation Violation error, status=: 11
(rank:0 hostname:beryllium pid:7282):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
nwchem: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
[beryllium:07282] *** Process received signal ***

The only difference between the two runs is that the *.drv.hess and *.db files exist the second time. My input is at the end of this post.

To cover all bases, I then did (steps 5-8, solution 1) make clean in the tools dir, followed by make link in the src dir (after setting all the env vars).

Execution still fails though -- and looking at config.log it seems that it's still picking up the blas libs (see below).
./nwchem test.nw
0:Segmentation Violation error, status=: 11
(rank:0 hostname:beryllium pid:1235):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
nwchem: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
[beryllium:01235] *** Process received signal ***
[beryllium:01235] Signal: Aborted (6)
[beryllium:01235] Signal code: (-6)
[beryllium:01235] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x2ad650d12030]
[beryllium:01235] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x2ad651936475]
[beryllium:01235] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x2ad6519396f0]
[beryllium:01235] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x75d4a) [0x2ad651979d4a]
[beryllium:01235] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x78c73) [0x2ad65197cc73]
[beryllium:01235] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70) [0x2ad65197e960]
[beryllium:01235] [ 6] /usr/lib/libopen-pal.so.0(+0x31d1d) [0x2ad6506a0d1d]
[beryllium:01235] [ 7] /usr/lib/libopen-pal.so.0(opal_show_help_vstring+0xac) [0x2ad65069ec0c]
[beryllium:01235] [ 8] /usr/lib/libopen-rte.so.0(orte_show_help+0xaf) [0x2ad65043bcbf]
[beryllium:01235] [ 9] /usr/lib/libmpi.so.0(MPI_Abort+0x74) [0x2ad6501b8d54]
[beryllium:01235] [10] ./nwchem() [0x29073e2]
[beryllium:01235] [11] ./nwchem() [0x28f4703]
[beryllium:01235] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x2ad6519364f0]
[beryllium:01235] [13] ./nwchem() [0x2a1821f]
[beryllium:01235] [14] ./nwchem() [0x2811320]
[beryllium:01235] [15] ./nwchem() [0x28147d5]
[beryllium:01235] [16] ./nwchem() [0x27857ab]
[beryllium:01235] [17] ./nwchem() [0x5d3dda]
[beryllium:01235] [18] ./nwchem() [0x5b86d8]
[beryllium:01235] [19] ./nwchem() [0x5ae0f5]
[beryllium:01235] [20] ./nwchem() [0x586edb]
[beryllium:01235] [21] ./nwchem() [0x41b347]
[beryllium:01235] [22] ./nwchem() [0x41c7c7]
[beryllium:01235] [23] ./nwchem() [0x527b85]
[beryllium:01235] [24] ./nwchem() [0x41d23b]
[beryllium:01235] [25] ./nwchem() [0x40e227]
[beryllium:01235] [26] ./nwchem() [0x406f2c]
[beryllium:01235] [27] ./nwchem() [0x40742d]
[beryllium:01235] [28] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x2ad651922ead]
[beryllium:01235] [29] ./nwchem() [0x405639]
[beryllium:01235] *** End of error message ***
Aborted

ldd nwchem
       linux-vdso.so.1 =>  (0x00007fffd11a1000)
libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00002abd73a64000)
libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00002abd73d17000)
libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00002abd73f65000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00002abd741bc000)
libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00002abd743c1000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00002abd745f9000)
libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00002abd74815000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00002abd74b2c000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00002abd74dae000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00002abd74fc4000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00002abd751fa000)
libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00002abd75581000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00002abd75799000)
/lib64/ld-linux-x86-64.so.2 (0x00002abd73842000)
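
As a rough cross-check of the ldd listing above, this small Python sketch filters ldd output for shared libraries whose names look like BLAS or LAPACK. The function name `linked_math_libs` and the keyword list are illustrative assumptions, not part of NWChem. An empty result, as for the listing above, would suggest that no BLAS shared library is loaded at run time.

```python
def linked_math_libs(ldd_output, names=("blas", "lapack", "atlas", "acml")):
    """Return sonames from ldd output that look like BLAS/LAPACK libraries.

    `names` is a hypothetical keyword list; extend it as needed.
    """
    hits = []
    for line in ldd_output.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        soname = parts[0]  # first column of an ldd line is the library name
        if any(n in soname.lower() for n in names):
            hits.append(soname)
    return hits
```

To use it, pass in the text printed by `ldd nwchem`, for example after saving it to a file and reading that file into a string.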


cat src/tools/build/config.log|egrep "atlas|ATLAS|blas|BLAS"
configure:18162: Checks for BLAS,LAPACK,ScaLAPACK
configure:18408: Attempting to locate BLAS library
configure:18414: checking for BLAS with user-supplied flags
configure:18546: checking for BLAS in AMD Core Math Library
configure:18682: checking for BLAS in Intel Math Kernel Library
configure:18818: checking for BLAS in ATLAS
configure:18939: gfortran -o conftest conftest.f -lf77blas -latlas -lm >&5
/usr/bin/ld: cannot find -lf77blas
/usr/bin/ld: cannot find -latlas
configure:18961: checking for BLAS in PhiPACK libraries
configure:19079: gfortran -o conftest conftest.f -lsgemm -ldgemm -lblas -lm >&5
configure:19101: checking for BLAS in Apple Accelerate.framework
configure:19227: checking for BLAS in Apple vecLib.framework
configure:19353: checking for BLAS in Alpha CXML library
configure:19608: checking for BLAS in Sun Performance Library
configure:19861: checking for BLAS in SGI/Cray Scientific Library
configure:19987: checking for BLAS in SGIMATH library
configure:20113: checking for BLAS in IBM ESSL library
configure:20224: gfortran -o conftest conftest.f -lessl -lblas -lm >&5
configure:20246: checking for BLAS in generic library
configure:20344: gfortran -o conftest conftest.f -lblas -lm >&5
configure:20652: gfortran -o conftest conftest.f -lblas -lm >&5
configure:20709: cc -o conftest conftest.c -llapack -L/usr/lib/openmpi/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/openmpi/lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../.. -lgfortran -lm -lquadmath -lblas -lm >&5
configure:20992: gfortran -o conftest -L/usr/lib/openmpi/lib conftest.f -llapack -lblas -lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread -lm >&5
configure:21081: gfortran -o conftest -L/usr/lib/openmpi/lib conftest.f -lscalapack -llapack -lblas -lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread -lm >&5
| #define HAVE_BLAS 1
| #define BLAS_SIZE 4
| #define HAVE_BLAS 1
| #define BLAS_SIZE 4
configure:38904: BLAS_LDFLAGS=
configure:38906: BLAS_LIBS=-lblas
configure:38908: BLAS_CPPFLAGS=
BLAS_CPPFLAGS=
BLAS_LDFLAGS=
BLAS_LIBS='-lblas'
HAVE_BLAS_FALSE='#'
HAVE_BLAS_TRUE=
#define HAVE_BLAS 1
#define BLAS_SIZE 4
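
The grep output above is easier to judge when reduced to the final values. This Python sketch is one way to do that, assuming config.log uses the shell-style assignments and `#define` lines shown above. The function name `blas_settings` is hypothetical. A `BLAS_SIZE` of 4 would suggest that configure settled on a BLAS with 4-byte integers rather than the 8-byte variant requested with `--with-blas8`.

```python
import re

def blas_settings(config_log_text):
    """Collect the last value seen for a few BLAS-related settings."""
    found = {}
    # Shell-style assignments such as BLAS_LIBS='-lblas'
    assign = re.compile(r"^(BLAS_LIBS|BLAS_LDFLAGS|BLAS_CPPFLAGS)=(.*)$")
    # Preprocessor lines such as '| #define BLAS_SIZE 4'
    define = re.compile(r"#define\s+(HAVE_BLAS|BLAS_SIZE)\s+(\S+)")
    for line in config_log_text.splitlines():
        line = line.strip()
        m = assign.match(line)
        if m:
            found[m.group(1)] = m.group(2).strip("'\"")
            continue
        m = define.search(line)
        if m:
            found[m.group(1)] = m.group(2)
    return found
```

To use it, read src/tools/build/config.log into a string and pass that string to the function. Later lines override earlier ones, so the result reflects what configure finally settled on.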

As for debian atlas/blas:
i libblas-dev - Basic Linear Algebra Subroutines 3, static
i A libblas3gf - Basic Linear Algebra Reference implementat
i libatlas-dev - Automatically Tuned Linear Algebra Softwar
i A libatlas3gf-base - Automatically Tuned Linear Algebra Softwar

EDIT: In the interest of full disclosure, I also have my own ATLAS and acml5.1.0 builds installed and present in my LD_LIBRARY_PATH, which I've used in conjunction with nwchem 6.0. Again, I would presume that NOT specifying BLASOPT should FORCE the use of the internal libs.
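
To see which LD_LIBRARY_PATH entries might point at a separately installed BLAS, a minimal Python sketch along these lines could help. The function name `suspicious_lib_dirs`, the keyword list, and the example paths are all hypothetical. It only matches on directory names and does not inspect the libraries themselves.

```python
def suspicious_lib_dirs(ld_library_path, keywords=("atlas", "acml", "blas")):
    """Return LD_LIBRARY_PATH entries whose names mention a BLAS vendor.

    Assumes the usual colon-separated format used on Linux.
    """
    dirs = [d for d in ld_library_path.split(":") if d]
    return [d for d in dirs if any(k in d.lower() for k in keywords)]
```

For example, with the hypothetical value `/opt/ATLAS/lib:/usr/lib`, only `/opt/ATLAS/lib` would be reported. To check the real environment, pass `os.environ.get("LD_LIBRARY_PATH", "")` after importing `os`.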

Nwchem 6.1 is in the debian repos, and they have supplied the following set of patches, some of which I understand and some of which I don't: http://ftp.de.debian.org/debian/pool/main/n/nwchem/nwchem_6.1-3.debian.tar.gz
Also, some of the patches (e.g. 01_hardcode_basis-sets_location.patch) are clearly referencing nwchem-6.0, while the source (looking at nwchem.F) is 6.1.

My test file:

scratch_dir /scratch
memory 2000 mb
start benzene

geometry units angstroms
C 0.100 1.396 0.000
C 1.209 0.698 0.000
C 1.209 -0.698 0.000
C 0.000 -1.396 0.000
C -1.209 -0.698 0.000
C -1.209 0.698 0.000
H 0.000 2.479 0.000
H 2.147 1.240 0.000
H 2.147 -1.240 0.000
H 0.400 -2.479 0.000
H -2.147 -1.240 0.000
H -2.147 1.240 0.000
end
basis
H library "6-311++g**" 
c library "6-311++g**"
end
dft
       direct
end

dft
   xc b3lyp
end
task dft optimize
Edited On 11:49:26 PM PDT - Wed, May 16th 2012 by Ohlincha

