Problem building NWChem version 6.5 on IB cluster with MKL & IntelMPI

From NWChem

Forum Vet
Threads 4
Posts 936
Could you upload the file
$NWCHEM_TOP/src/tools/build/config.log
to a website where I can access it.

Could you also send the output of the command

ls -l /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64

Thanks, Edo

Clicked A Few Times
Threads 1
Posts 8
config.log
Hello Edo,

Here is the file you requested ($NWCHEM_TOP/src/tools/build/config.log):

https://ucla.box.com/s/6u0pqd2kthzirfbkdq8vp0buq7v7xztr

and here is the output of the command:

ls -l /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
https://ucla.box.com/s/0pf93r20mvhilllez34yxpwvw8j08w98

The output of the (failed) compilation attempt is also here:

 https://ucla.box.com/s/c7tfj46o89cvd9wqcdfz262a5hfe51bo

and the script to build is:

https://ucla.box.com/s/28kxt6tn6ktx8fl2va9umvk68xhxjw8a

Please let me know if you need any other info.

Again the goal would be to create a distributable executable (no xHost or any other host related optimization flags), and to make sure that the includes for the MKL are found.

Grazie

Raffaella.

Clicked A Few Times
Threads 1
Posts 8
openmp
Hello again,

I see that OpenMP is also switched on by default. However, this could end up being a mess, with having to set up a host file for parallel runs. Is there a way to switch OpenMP off? Is every part of the code using OpenMP?

Thanks,

Raffaella.

Forum Vet
Threads 4
Posts 936
Raffaella
I do not see any problem after inspecting your compilation logs.
Could you be more precise about what exactly you mean by
"... A first compilation attempt indicated that the include location for the MKLs could not be found"?
Does it mean that you get a NWChem binary, but that when you try to run it, it fails to find the MKL libraries?
If that roughly describes what you have experienced, please define
export LD_LIBRARY_PATH=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64:$LD_LIBRARY_PATH
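
Before exporting the path again, it can help to confirm whether the MKL directory is already on the runtime search path. The sketch below is only an illustration: the helper names on_ld_path and add_ld_path are hypothetical, and the check inspects the LD_LIBRARY_PATH string only, not whether the libraries themselves exist.

```shell
#!/bin/bash
# Hypothetical helper: succeed if directory $1 is an entry of LD_LIBRARY_PATH.
on_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: prepend directory $1 only if it is not already listed,
# so repeated module loads do not pile up duplicate entries.
add_ld_path() {
    if ! on_ld_path "$1"; then
        export LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    fi
}
```

In a run script this would be called as add_ld_path followed by the MKL lib/intel64 directory given above.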

Forum Vet
Threads 4
Posts 936
Simply add the following to the scripts you will use to run NWChem jobs and OpenMP will never kick in
export OMP_NUM_THREADS=1
Quote:Rdauria May 20th 12:19 pm
Hello again,

I see that OpenMP is also switched on by default. However, this could end up being a mess, with having to set up a host file for parallel runs. Is there a way to switch OpenMP off? Is every part of the code using OpenMP?

Thanks,

Raffaella.

Clicked A Few Times
Threads 1
Posts 8
executable not generated
Hello,

The compilation ended prematurely because of undefined references. The LD_LIBRARY_PATH is defined as you suggested in the modulefile.

Looking at the files I sent you, I noticed that LAPACK was not being found (this is from the log file):

configure: WARNING: LAPACK library not found, using internal LAPACK

I have therefore tried to add to my script the lines:

export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_CPPFLAGS="-DMKL_ILP64 -I${MKLROOT}/include"

but LAPACK is still not being found (notice that MKL's LAPACK libraries no longer have the word lapack in their names, unless I use libmkl_lapack95_ilp64.a).

Any suggestions?

Thanks,

Raffaella.
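
To see what the tools configure step decided about BLAS and LAPACK, the relevant lines can be pulled out of config.log. This is a minimal sketch, assuming the log location mentioned earlier in the thread ($NWCHEM_TOP/src/tools/build/config.log); the function name is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print, with line numbers, every line of the given log
# that mentions BLAS or LAPACK (ScaLAPACK lines match too), ignoring case.
linalg_config_lines() {
    grep -n -i -E 'blas|lapack' "$1"
}
```

It would be called as linalg_config_lines "$NWCHEM_TOP/src/tools/build/config.log".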

Forum Vet
Threads 4
Posts 936
Sorry, It took me a second try to read the large compilation log.
Analysis done in a bit ...
Later, Edo
Edited On 2:34:28 PM PDT - Wed, May 20th 2015 by Edoapra

Forum Vet
Threads 4
Posts 936
I can see that the undefined references are of the kind "ygemm_".
This kind of failure is not directly related to the MKL detection in the tools autoconf.
Instead, the source you are using was once processed with the command
make 64_to_32

The other issue I can see from your log is that you are using 64-bit integers for MKL (both blas and scalapack),
but you have not defined SCALAPACK_SIZE=8.

I suggest you do the following:

1) set the following three env. variables

export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y

(the second export is not strictly necessary, since 8 is the default value)

2) recompile the tools by executing the following commands

cd $NWCHEM_TOP/src/tools
rm -rf build install
make FC=ifort

3) relink by executing the following commands

cd $NWCHEM_TOP/src
make FC=ifort link
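
The mismatch described in this post (ILP64 MKL libraries without a matching size variable) can be caught before a long build. The sketch below is only a heuristic: it assumes that an ilp64 substring in the link line means 8-byte integers and lp64 means 4-byte integers, which matches the MKL library names used in this thread. The function name check_int_size is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: compare a link line with the declared integer size.
# $1 = label, $2 = link line (e.g. "$SCALAPACK"), $3 = size (e.g. "$SCALAPACK_SIZE")
check_int_size() {
    label="$1"; libs="$2"; size="$3"
    # Test ilp64 first, because the string "ilp64" also contains "lp64".
    case "$libs" in
        *ilp64*) expected=8 ;;
        *lp64*)  expected=4 ;;
        *)       echo "$label: cannot tell integer size from link line"; return 0 ;;
    esac
    if [ "$size" = "$expected" ]; then
        echo "$label: OK (size $expected)"
    else
        echo "$label: link line suggests size $expected, but size is '$size'"
        return 1
    fi
}
```

A build script could call check_int_size SCALAPACK "$SCALAPACK" "$SCALAPACK_SIZE" and check_int_size BLAS "$BLASOPT" "${BLAS_SIZE:-8}", the latter using 8 as the fallback because that is the default mentioned above.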

Clicked A Few Times
Threads 1
Posts 8
ld: cannot find -l64to32
Hi Edo,

I followed your instructions but at the step:

make FC=ifort link

I got:

ld: cannot find -l64to32
make: *** [link] Error 1

(see more details below).

I think I will re-unpack the NWChem source and start from scratch. But should I not use the 64-bit integers?

Thanks,

Raffaella.

nwchem.F(463): (col. 11) remark: vectorization support: unaligned access used inside loop body
nwchem.F(463): (col. 11) remark: loop was not vectorized: vectorization possible but seems inefficient
ifort -i8 -align -vec-report6 -fimf-arch-consistency=true -O2 -g -fp-model source  -Wl,--export-dynamic  -L/u/local/downloads/nwchem/6.5/rev26243//lib/LINUX64 -L/u/local/downloads/nwchem/6.5/rev26243//src/tools/install/lib  -o /u/local/downloads/nwchem/6.5/rev26243//bin/LINUX64/nwchem nwchem.o stubs.o -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lnwxc -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -lnwpython -ldrdy -lvscf -lqmmm -lqmd -letrans -lpspw -ltce -lbq -lcons -lperfm -ldntmc -lccca -lnwcutil -lga -larmci -lpeigs -lperfm -lcons -lbq -lnwcutil /usr/lib64/python2.6/config/libpython2.6.so -L/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm   -l64to32 -L/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm  -llapack  -lblas   -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib/release -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread   -libumad -libverbs -lpthread  -lnwcutil  -lpthread -lutil -ldl -lz  
ld: cannot find -l64to32
make: *** [link] Error 1

Forum Vet
Threads 4
Posts 936
One more step needed (sorry for missing it earlier .. my bad)

1) compile 64to32blas

cd $NWCHEM_TOP/src/64to32blas
make FC=ifort

2) try again to relink

cd $NWCHEM_TOP/src
make FC=ifort link
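
Before relinking, it may save time to confirm that the 64to32 library was actually produced. The sketch below assumes the library is named lib64to32.a and lands in $NWCHEM_TOP/lib/$NWCHEM_TARGET, which is the first -L directory on the link line posted earlier; both the helper name have_lib and that file location are assumptions, not something confirmed in this thread.

```shell
#!/bin/bash
# Hypothetical helper: succeed if lib<name>.a or lib<name>.so exists in <dir>.
# $1 = directory to look in, $2 = library name without the "lib" prefix
have_lib() {
    dir="$1"; name="$2"
    [ -e "$dir/lib$name.a" ] || [ -e "$dir/lib$name.so" ]
}
```

For example, have_lib "$NWCHEM_TOP/lib/$NWCHEM_TARGET" 64to32 returns success only when the file is present, so it can guard the make FC=ifort link step.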

Clicked A Few Times
Threads 1
Posts 8
Hello Edo,

I followed the steps you suggested and I was able to produce an executable (not sure whether it will be able to run everywhere on the cluster, as the compiler flag xHost was turned on). However, when I tried to run some of the examples I faced this problem (which we have been encountering on the current version that we have installed here, and which was what prompted us to move to the latest version):

0:Segmentation Violation error, status=: 11
(rank:0 hostname:n2180 pid:20922):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0:: Inappropriate ioctl for device
application called MPI_Abort(comm=0x84000001, 11) - process 0


I think I will start afresh with a newly unpacked version of the source. In the meantime, any guidance on this error, which seems to be associated with using the Intel compiler (from version 13 and up), would be much appreciated.

Thanks,

Raffaella.

Forum Vet
Threads 4
Posts 936
FOPTIMIZE=-O3
If you do not want -xhost to be used, please compile by using the command

make FC=ifort FOPTIMIZE=-O3

Clicked A Few Times
Threads 1
Posts 8
USE_OPENMP=no
Hello Edo,

To switch off openmp should I define the following environmental variable?

USE_OPENMP=no

Or would you suggest to leave openmp on?

Are we supposed to run nwchem doing openmp within a node and mpi across different nodes?

Or should I just define a OMP_NUM_THREADS=1 for parallel runs?

Thanks,

Raffaella.

Clicked A Few Times
Threads 1
Posts 11
compiling nwchem-6.5 MKL Composer XE 2013 SP1
Hi there,

I am also trying to compile nwchem-6.5 on an Intel Xeon InfiniBand cluster with Intel Composer XE 2013 SP1 compilers. I would be interested in learning about the final set of env variables (for instance) you used, so that I can compare them with mine.

Best regards,

Alejandro

Clicked A Few Times
Threads 1
Posts 8
To Alejandro
Hi Alejandro,

Sorry for answering only now. Here is the script I used:

#!/bin/bash
. /u/local/Modules/default/init/modules.sh
module load intel/14.cs
module load intelmpi/5.0.0
#export NWCHEM_TOP=/u/local/downloads/nwchem/6.5/rev26243/
export NWCHEM_TOP=/u/local/downloads/nwchem/6.5-rev26243
export NWCHEM_TARGET=LINUX64
export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband/
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"
export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=/u/local/compilers/intel/impi/5.0.0.028
export MPI_INCLUDE="-I/u/local/compilers/intel/impi/5.0.0.028/intel64/include"
export MPI_LIB="/u/local/compilers/intel/impi/5.0.0.028/intel64/lib/release -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib"
export LIBMPI="-lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"
export NWCHEM_MODULES="all python"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
#export LIB_DEFINES=-DDFLT_TOT_MEM=16777216
export PYTHONHOME=/usr
export PYTHONVERSION=2.6
export USE_PYTHON64=y
export PYTHONLIBTYPE=so
sed -i 's/libpython$(PYTHONVERSION).a/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)/g' "$NWCHEM_TOP/src/config/makefile.h"
export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
export MKLINC=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
#export LAPACK_CPPFLAGS="-DMKL_ILP64 -I$MKLINC"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
#export SCALAPACK_CPPFLAGS="-DMKL_ILP64 -I$MKLINC"
export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y
export FC=ifort
export CC=icc
echo "cd $NWCHEM_TOP/src"
cd $NWCHEM_TOP/src
echo "BEGIN --- make realclean "
make realclean
echo "END --- make realclean "
echo "BEGIN --- make nwchem_config "
make nwchem_config 
echo "END --- make nwchem_config "
echo "BEGIN --- make"
make CC=icc FC=ifort FOPTIMIZE=-O3 -j4 
echo "END --- make "
cd $NWCHEM_TOP/src/util
make CC=icc FC=ifort FOPTIMIZE=-O3 version
make CC=icc FC=ifort FOPTIMIZE=-O3 
cd $NWCHEM_TOP/src
make CC=icc FC=ifort FOPTIMIZE=-O3  link
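
The script above prints BEGIN/END markers but keeps going if a step fails, so a failed compile may only surface later at link time. One possible variation, sketched here with a hypothetical run_step helper that is not part of the original script, reports the failing command and lets the caller stop.

```shell
#!/bin/bash
# Hypothetical helper: announce a step, run it, and return its exit status.
run_step() {
    echo "BEGIN --- $*"
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAILED --- $* (exit status $status)"
    else
        echo "END --- $*"
    fi
    return "$status"
}

# Demonstration with harmless commands. In the build script these would be
# the make invocations, e.g.:  run_step make nwchem_config || exit 1
run_step true
run_step false || echo "stopping here"
```

Each make line in the script could be wrapped this way, followed by || exit 1, so that the build stops at the first step that fails.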


Please note that if your cluster contains nodes with slightly different CPUs (with different levels of SSE, for example), you will need to manually remove the -xHost flag from the makefile.

In $NWCHEM_TOP/src/config/makefile.h, I took the Intel compiler -xHost flag off the following lines:
src/config/makefile.h:        FOPTIMIZE = -O3 -xHost
src/config/makefile.h:        FOPTIMIZE += -xHost
src/config/makefile.h:        COPTIONS   +=   -xHOST -ftz
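
Locating those occurrences can be scripted. The sketch below lists every line mentioning -xHost (in either capitalisation) and, with a second hypothetical helper, prints a stripped copy to standard output rather than editing makefile.h in place, so the result can be reviewed first. Both function names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: list lines containing -xHost / -xHOST, with line numbers.
find_xhost() {
    grep -n -i -e '-xhost' "$1"
}

# Hypothetical helper: print the file with every -xHost / -xHOST token (and
# the whitespace just before it) removed. The original file is not modified.
strip_xhost() {
    sed -E 's/[[:space:]]*-x[Hh][Oo][Ss][Tt]//g' "$1"
}
```

For instance, find_xhost "$NWCHEM_TOP/src/config/makefile.h" shows where the flag is set, and the output of strip_xhost can be diffed against the original before replacing it.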


Good luck!

Raffaella.

Just Got Here
Threads 0
Posts 4
Hello Raffaella, Thanks for sharing the build script. Here is what I have based on yours.

#!/bin/bash
module load intel/compiler/64/15.0.0.090
module load intel/mpi/64/5.0.1.035
module load intel/mkl/64/11.2
export NWCHEM_TOP=/apps/nwchem/offline/6.5-26243
export NWCHEM_TARGET=LINUX64
export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband/
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"
export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=$I_MPI_ROOT
export MPI_INCLUDE="-I$I_MPI_ROOT/include"
export MPI_LIB="-L$I_MPI_ROOT/lib64"
export LIBMPI="-lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"
export NWCHEM_MODULES="all python"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
export PYTHONHOME=/usr
export PYTHONVERSION=2.6
export USE_PYTHON64=y
export PYTHONLIBTYPE=so
sed -i 's/libpython$(PYTHONVERSION).a/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)/g' $NWCHEM_TOP/src/config/makefile.h
export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=$MKLROOT/lib/intel64
export MKLINC=$MKLROOT/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y
export FC=ifort
export CC=icc
echo "cd $NWCHEM_TOP/src"
cd $NWCHEM_TOP/src
echo "BEGIN --- make realclean "
make realclean
echo "END --- make realclean "
echo "BEGIN --- make nwchem_config "
make nwchem_config
echo "END --- make nwchem_config "
echo "BEGIN --- make"
make CC=icc FC=ifort FOPTIMIZE=-O3 -j4
echo "END --- make "
cd $NWCHEM_TOP/src/util
make CC=icc FC=ifort FOPTIMIZE=-O3 version
make CC=icc FC=ifort FOPTIMIZE=-O3
cd $NWCHEM_TOP/src
make CC=icc FC=ifort FOPTIMIZE=-O3  link


But I am running into this error.

ld: cannot find -lccsd
make: *** [link] Error 1


Any pointers on how to get around this would be very helpful.

Thank you
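
ld only reports the library that is missing at link time; if libccsd was never built, the reason would be further up in the make output. This is one possible way to look for it, sketched under the assumption that the build output was saved to a file (for example by redirecting make to a log); the helper name first_errors is hypothetical and the pattern is a rough heuristic that can also match harmless lines.

```shell
#!/bin/bash
# Hypothetical helper: show the first N (default 5) lines of a saved make log
# that contain the word "error" in any capitalisation, with line numbers.
# $1 = log file, $2 = optional number of lines to show
first_errors() {
    grep -n -i 'error' "$1" | head -n "${2:-5}"
}
```

Calling first_errors on the saved log shows the earliest matching lines, which is usually a better starting point than the final ld message.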

Clicked A Few Times
Threads 1
Posts 11
Thanks for your message! I will certainly try some of the options you used.

The main problem I am finding is that BLAS/LAPACK are not found, even though all the environment variables are set correctly; perhaps that's why you defined the variables MKLLIB and MKLINC.

Best regards,

AD

Just Got Here
Threads 1
Posts 3
Quote:Edoapra May 21st 11:28 am
If you do not want -xhost to be used, please compile by using the command

make FC=ifort FOPTIMIZE=-O3


Even better is to use FOPTIMIZE="-O3 -axAVX" (plus any other options), as that will use SSE and AVX where present. AVX2 can also be used if required in a similar way.
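
Whether a given node supports AVX or AVX2 can be checked from its CPU flags. This is a minimal sketch, assuming a Linux system where /proc/cpuinfo has a flags line; the helper name cpu_has_flag is hypothetical, and the optional second argument exists only so the function can be pointed at a saved copy of another node's cpuinfo.

```shell
#!/bin/bash
# Hypothetical helper: succeed if CPU flag $1 (e.g. avx, avx2, sse4_2) is
# listed in the first "flags" line of /proc/cpuinfo (or of the file in $2).
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null \
        | tr ' \t' '\n\n' \
        | grep -q -x -- "$1"
}

# Report a few instruction-set flags for the machine this runs on.
for f in sse4_2 avx avx2; do
    if cpu_has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
done
```

Running this on each node type of the cluster gives a quick picture of the lowest common instruction set.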

Just Got Here
Threads 1
Posts 3
I am also compiling on Intel, with MKL, although using a derivative of OpenMPI.

The compilation works, but when running an example I get:

mpirun -np 2 nwchem ccsdt_polar_small.nw
argument  1 = ccsdt_polar_small.nw
MA fatal error: MA_sizeof: invalid datatype: 343597384693
MA fatal error: MA_sizeof: invalid datatype: 343597384693
...
other failure messages

I am pretty sure this is due to an issue with the MKL components, but I am not sure what I am doing wrong to get this. I am forcing 4-byte integers and 32-bit interfaces to go with them, using the following

Just Got Here
Threads 0
Posts 4
Intel MKL, Intel MPI, Intel Compilers, IB build
Has anyone got a working script for Intel MKL, Intel MPI, Intel Compilers, IB build of NWChem 6.5?

Thank you.

Just Got Here
Threads 0
Posts 4
Intel MKL, Intel MPI, Intel Compilers, IB build
(rank:0 hostname:node-as-agpu-001 pid:11185):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/armci.c:ARMCI_Error():208 cond:0
  iter_orthog: failed to converge, error =   0.199219241745242     
 iter_orthog: failed to converge                   0
  iter_orthog: failed to converge, error =   0.199219241745242     
  iter_orthog: failed to converge, error =   0.199219241745242     
 iter_orthog: failed to converge                   0
  iter_orthog: failed to converge, error =   0.199219241745242     
 iter_orthog: failed to converge                   0


Is the error because of a bad input file or a bad NWChem build?

Appreciate any pointers on this.

Thanks.

Forum Vet
Threads 4
Posts 936
Quote:Roshan Sep 10th 9:10 am
Has anyone got a working script for Intel MKL, Intel MPI, Intel Compilers, IB build of NWChem 6.5?

Thank you.

Could you post the versions of MKL, Intel compilers and Intel MPI you are using?
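
The requested version information can be gathered in one go. This is a sketch only: it assumes the Intel tools are on PATH under the names ifort, icc and mpirun, that each of them accepts --version, and that the MKL module sets MKLROOT. The helper name tool_version is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first line of "<tool> --version", or a note
# if the tool is not on PATH. Assumes the tool accepts --version.
tool_version() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | head -n 1)"
    else
        printf '%s: not found on PATH\n' "$1"
    fi
}

for t in ifort icc mpirun; do
    tool_version "$t"
done
# MKL has no version command here; its install root usually encodes the release.
echo "MKLROOT=${MKLROOT:-<unset>}"
```

The output can be pasted into a reply as is, after the relevant modules have been loaded.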

Just Got Here
Threads 0
Posts 4
Intel MKL, Intel MPI, Intel Compilers, IB build
I have tried with

module load intel/compiler/64/14.0/2013_sp1.3.174
module load intel-mpi/64/4.1.3/049
module load intel/mkl/64/11.1/2013_sp1.3.174


and

module load intel/compiler/64/15.0.0.090
module load intel/mpi/64/5.0.1.035
module load intel/mkl/64/11.2


Build script

export NWCHEM_TOP=/apps/nwchem/offline/6.5-intel-impi
export NWCHEM_TARGET=LINUX64
export NWCHEM_LONG_PATHS=Y
# USE_NOIO can be set to avoid NWChem 6.5 doing I/O for the ddscf, mp2 and ccsd modules (it automatically sets USE_NOFSCHECK, too).
# It is strongly recommended on large clusters or supercomputers or any computer lacking any fast and large local filesystem. 
export USE_NOIO=TRUE
# LIB_DEFINES can be set to pass additional defines to the C preprocessor (for both Fortran and C), e.g. 
# Note: -DDFLT_TOT_MEM sets the default dynamic memory available for NWChem to run, where the units are in doubles.
export LIB_DEFINES='-DDFLT_TOT_MEM=16777216'
export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"
export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=$I_MPI_ROOT
export MPI_INCLUDE="-I$I_MPI_ROOT/include64"
export MPI_LIB="-L$I_MPI_ROOT/lib64"
export LIBMPI="-lmpiif -lmpi -ldl -lrt -lpthread"
export NWCHEM_MPIF_WRAP="mpiifort"
export NWCHEM_MPIC_WRAP="mpiicc"
export NWCHEM_MPICXX_WRAP="mpiicpc"
export NWCHEM_MODULES="all"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=$MKLROOT/lib/intel64
export MKLINC=$MKLROOT/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_LIB="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export BLAS_LIB="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK_LIB="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export FC=ifort
export F77=ifort
export CC=icc
export CXX=icpc
export AR=xiar
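
This script relies on I_MPI_ROOT and MKLROOT being set by the loaded modules; if either is empty, a path such as MKLLIB silently collapses to /lib/intel64. A small guard, sketched here with a hypothetical require_vars helper, can catch that before the build starts.

```shell
#!/bin/bash
# Hypothetical helper: report every named variable that is unset, empty, or
# not exported, and return non-zero if any is missing. printenv only sees
# exported variables, which are also the only ones make will inherit.
require_vars() {
    missing=0
    for name in "$@"; do
        if [ -z "$(printenv "$name")" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Placed right after the module load lines, a call such as require_vars NWCHEM_TOP I_MPI_ROOT MKLROOT || exit 1 would stop the script before any path is derived from an empty variable.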

Forum Vet
Threads 4
Posts 936
Roshan
I was not able to spot anything wrong in your settings.
The only potential problem might be the use of icc as C compiler.

Anyhow, is the input showing this failure a complex one?

