Problem building NWChem version 6.5 on IB cluster with MKL & IntelMPI

Viewed 792 times, With a total of 24 Posts
Clicked A Few Times
Threads 1
Posts 8
Hello All,

Is there a way to modify the compilation options? I need to get rid of the -xHost flag, since the executable will be run on a cluster with heterogeneous CPUs.

A first compilation attempt indicated that the include location for the MKLs could not be found. Is there a way to pass the location of the MKL include directory? What is the name of the variable?

Below is the script I am currently using.

Thanks,

Raffaella.


####################################################
#!/bin/bash
. /u/local/Modules/default/init/modules.sh
module load intel/14.cs
module load intelmpi/5.0.0
export NWCHEM_TOP=/u/local/downloads/nwchem/6.5/rev26243/
export NWCHEM_TARGET=LINUX64
export ARMCI_NETWORK=OPENIB
export IB_HOME=/usr
export IB_INCLUDE=/usr/include/infiniband/
export IB_LIB=/usr/lib64
export IB_LIB_NAME="-libumad -libverbs -lpthread"
export USE_MPI=Y
export USE_MPIF=Y
export USE_MPIF4=Y
export MPI_LOC=/u/local/compilers/intel/impi/5.0.0.028
export MPI_INCLUDE="-I/u/local/compilers/intel/impi/5.0.0.028/intel64/include"
export MPI_LIB="/u/local/compilers/intel/impi/5.0.0.028/intel64/lib/release -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib"
export LIBMPI="-lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"
export NWCHEM_MODULES="all python"
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
#export LIB_DEFINES=-DDFLT_TOT_MEM=16777216
export PYTHONHOME=/usr
export PYTHONVERSION=2.6
export USE_PYTHON64=y
export PYTHONLIBTYPE=so
sed -i 's/libpython$(PYTHONVERSION).a/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)/g' config/makefile.h
export HAS_BLAS=yes
export USE_SCALAPACK=y
export MKLLIB=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
export MKLINC=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/include
export BLASOPT="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export SCALAPACK="-L$MKLLIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export FC=ifort
export CC=icc
echo "cd $NWCHEM_TOP/src"
cd $NWCHEM_TOP/src
echo "BEGIN --- make realclean"
make realclean
echo "END --- make realclean"
echo "BEGIN --- make nwchem_config"
make nwchem_config
echo "END --- make nwchem_config"
echo "BEGIN --- make CC=icc FC=ifort -j4"
make CC=icc FC=ifort -j4
echo "END --- make CC=icc FC=ifort -j4"
####################################################

Forum Vet
Threads 4
Posts 936
Could you upload the file
$NWCHEM_TOP/src/tools/build/config.log
to a website where I can access it.

Could you also send the output of the command

ls -l /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64

Thanks, Edo

Clicked A Few Times
Threads 1
Posts 8
config.log
Hello Edo,

Here is the file you requested (NWCHEM_TOP/src/tools/build/config.log):

https://ucla.box.com/s/6u0pqd2kthzirfbkdq8vp0buq7v7xztr

and here is the output of the command:

ls -l /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
https://ucla.box.com/s/0pf93r20mvhilllez34yxpwvw8j08w98

The output of the (failed) compilation attempt is also here:

 https://ucla.box.com/s/c7tfj46o89cvd9wqcdfz262a5hfe51bo

and the script to build is:

https://ucla.box.com/s/28kxt6tn6ktx8fl2va9umvk68xhxjw8a

Please let me know if you need any other info.

Again the goal would be to create a distributable executable (no xHost or any other host related optimization flags), and to make sure that the includes for the MKL are found.

Grazie

Raffaella.

Clicked A Few Times
Threads 1
Posts 8
openmp
Hello again,

I see that OpenMP is also switched on by default. However, this could become a mess when setting up a host file for parallel runs. Is there a way to switch OpenMP off? Does every part of the code use OpenMP?

Thanks,

Raffaella.

Forum Vet
Threads 4
Posts 936
Raffaella
I do not see any problem after inspecting your compilation logs.
Could you be more precise about what you exactly mean by
"... A first compilation attempt indicated that the include location for the MKLs could not be found"?
Does it mean that you get an NWChem binary, but that when you try to run it, it fails to find the MKL libraries?
If my analysis describes what you have experienced, please define
export LD_LIBRARY_PATH=/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64:$LD_LIBRARY_PATH
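
A minimal sketch of the LD_LIBRARY_PATH suggestion above, written so that the MKL directory is prepended only once even if the fragment is sourced repeatedly. The helper name `prepend_ld_path` is hypothetical; the directory is the one quoted in this thread.

```shell
#!/bin/bash
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# prepend_ld_path is a hypothetical helper, not part of NWChem or MKL.
prepend_ld_path() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
}

# MKL library directory as given earlier in the thread.
prepend_ld_path /u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64
```

This only affects how the shared libraries are located at run time; it does not change what the build itself detects.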

Forum Vet
Threads 4
Posts 936
Simply add the following to the scripts you use to run NWChem jobs, and OpenMP will never kick in:
export OMP_NUM_THREADS=1
Quote: Rdauria, May 20th 12:19 pm
Hello again,

I see that OpenMP is also switched on by default. However, this could become a mess when setting up a host file for parallel runs. Is there a way to switch OpenMP off? Does every part of the code use OpenMP?

Thanks,

Raffaella.
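
The OMP_NUM_THREADS advice above can be sketched as a job-script fragment. The process count `NP`, the input file name, and the fallback path are hypothetical placeholders; the binary location follows the `bin/LINUX64/nwchem` path used elsewhere in this thread. The launch line is only printed here, not executed.

```shell
#!/bin/bash
# Pin OpenMP to a single thread per MPI rank, as suggested above.
export OMP_NUM_THREADS=1

# Hypothetical values, for illustration only.
NP=${NP:-4}
INPUT=${INPUT:-input.nw}
NWCHEM_BIN="${NWCHEM_TOP:-.}/bin/LINUX64/nwchem"

# Assemble the launch line and print it instead of running it.
RUN_CMD="mpirun -np $NP $NWCHEM_BIN $INPUT"
echo "$RUN_CMD"
```

With this setting, parallelism comes only from the MPI ranks, so the host file needs to account for ranks alone.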

Clicked A Few Times
Threads 1
Posts 8
executable not generated
Hello,

The compilation ended prematurely because of undefined references. The LD_LIBRARY_PATH is defined as you suggested in the modulefile.

Looking at the files I sent you, I noticed that LAPACK was not being found (this is from the log file):

configure: WARNING: LAPACK library not found, using internal LAPACK

I have therefore tried to add to my script the lines:

export LAPACK_LIBS="-L$MKLLIB -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export LAPACK_CPPFLAGS="-DMKL_ILP64 -I$(MKLROOT)/include"

but LAPACK is still not being found (note that MKL's LAPACK libraries no longer have the word lapack in their names, unless I use libmkl_lapack95_ilp64.a).

Any suggestions?

Thanks,

Raffaella.
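
One detail in the two export lines above may be worth checking independently of the LAPACK detection: inside a bash script, `$(MKLROOT)` is command substitution rather than a variable reference, so it expands to nothing unless a program named MKLROOT exists. The sketch below illustrates only the shell behaviour; the MKLROOT value is a hypothetical placeholder, and whether the build consumes LAPACK_CPPFLAGS at all is not established in this thread.

```shell
#!/bin/bash
# Hypothetical location, for illustration only.
MKLROOT=/opt/intel/mkl

# Makefile-style $(MKLROOT): bash tries to run a command called MKLROOT.
# The error message is silenced here; the expansion is empty.
broken="-DMKL_ILP64 -I$(MKLROOT 2>/dev/null)/include"

# Shell-style ${MKLROOT}: expands to the value of the variable.
fixed="-DMKL_ILP64 -I${MKLROOT}/include"

echo "makefile-style: $broken"
echo "shell-style:    $fixed"
```

In the build script from the first post, the already defined `$MKLINC` could be used in the same way, for example `-I$MKLINC`.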

Forum Vet
Threads 4
Posts 936
Sorry, it took me a second try to read the large compilation log.
Analysis done in a bit ...
Later, Edo
Edited On 2:34:28 PM PDT - Wed, May 20th 2015 by Edoapra

Forum Vet
Threads 4
Posts 936
I can see that the undefined references are of the kind "ygemm_".
This kind of failure is not directly related to the MKL detection in the tools autoconf.
Instead, the source you are using was once processed with the command
make 64_to_32

The other issue I can see from your log is that you are using 64-bit integers for MKL (both blas and scalapack),
but you have not defined SCALAPACK_SIZE=8.

I suggest you do the following:

1) set the following three env. variables

export SCALAPACK_SIZE=8
export BLAS_SIZE=8
export USE_64TO32=y

(the second export is not strictly necessary, since 8 is the default value)

2) recompile the tools by executing the following commands

cd $NWCHEM_TOP/src/tools
rm -rf build install
make FC=ifort

3) relink by executing the following commands

cd $NWCHEM_TOP/src
make FC=ifort link
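
The integer-size mismatch described in this post can be sketched as a pre-build sanity check. The helper `check_int_size` is hypothetical; it relies on the MKL naming convention that `ilp64` libraries use 64-bit integers and `lp64` libraries use 32-bit integers, and it assumes a size of 4 is the matching setting for `lp64`.

```shell
#!/bin/bash
# Return 0 when a link line and the declared integer size agree, 1 otherwise.
# check_int_size is a hypothetical helper, not part of the NWChem build.
check_int_size() {
    local libs="$1" size="$2"
    case "$libs" in
        *_ilp64*) [ "$size" = "8" ] ;;  # 64-bit integer interface
        *_lp64*)  [ "$size" = "4" ] ;;  # 32-bit integer interface (assumed pairing)
        *)        return 0 ;;           # nothing recognisable to check
    esac
}

# SCALAPACK_SIZE has to be set explicitly; BLAS_SIZE defaults to 8 per this post.
check_int_size "${SCALAPACK:-}" "${SCALAPACK_SIZE:-}" \
    || echo "warning: SCALAPACK link line and SCALAPACK_SIZE disagree" >&2
check_int_size "${BLASOPT:-}" "${BLAS_SIZE:-8}" \
    || echo "warning: BLASOPT link line and BLAS_SIZE disagree" >&2
```

Run against the original build script, a check of this kind would flag the `-lmkl_scalapack_ilp64` link line because SCALAPACK_SIZE was never set.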

Clicked A Few Times
Threads 1
Posts 8
ld: cannot find -l64to32
Hi Edo,

I followed your instructions but at the step:

make FC=ifort link

I got:

ld: cannot find -l64to32
make: *** [link] Error 1

(see more details below).

I think I will re-unpack the NWChem source and start from scratch. But should I not use the 64-bit integers?

Thanks,

Raffaella.

nwchem.F(463): (col. 11) remark: vectorization support: unaligned access used inside loop body
nwchem.F(463): (col. 11) remark: loop was not vectorized: vectorization possible but seems inefficient
ifort -i8 -align -vec-report6 -fimf-arch-consistency=true -O2 -g -fp-model source  -Wl,--export-dynamic  -L/u/local/downloads/nwchem/6.5/rev26243//lib/LINUX64 -L/u/local/downloads/nwchem/6.5/rev26243//src/tools/install/lib  -o /u/local/downloads/nwchem/6.5/rev26243//bin/LINUX64/nwchem nwchem.o stubs.o -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lnwxc -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -lnwpython -ldrdy -lvscf -lqmmm -lqmd -letrans -lpspw -ltce -lbq -lcons -lperfm -ldntmc -lccca -lnwcutil -lga -larmci -lpeigs -lperfm -lcons -lbq -lnwcutil /usr/lib64/python2.6/config/libpython2.6.so -L/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm   -l64to32 -L/u/local/compilers/intel/cs-2013-SP1-u1/composer_xe_2013_sp1.3.174/mkl/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm  -llapack  -lblas   -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib/release -L/u/local/compilers/intel/impi/5.0.0.028/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread   -libumad -libverbs -lpthread  -lnwcutil  -lpthread -lutil -ldl -lz  
ld: cannot find -l64to32
make: *** [link] Error 1

Forum Vet
Threads 4
Posts 936
One more step needed (sorry for missing it earlier .. my bad)

1) compile 64to32blas

cd $NWCHEM_TOP/src/64to32blas
make FC=ifort

2) try again to relink

cd $NWCHEM_TOP/src
make FC=ifort link
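
Putting the steps from this thread in order (rebuild the tools, build 64to32blas, relink), a rough sketch of a wrapper that stops at the first failing step might look like this. The function name `relink_nwchem` is hypothetical; the directories and make targets are the ones quoted in the posts above.

```shell
#!/bin/bash
# Run the rebuild steps from this thread in order, stopping at the first failure.
# relink_nwchem is a hypothetical wrapper, not an NWChem-provided command.
relink_nwchem() {
    if [ -z "${NWCHEM_TOP:-}" ] || [ ! -d "$NWCHEM_TOP/src" ]; then
        echo "NWCHEM_TOP is not set or has no src directory" >&2
        return 1
    fi
    # Each step runs in a subshell so the caller's working directory is unchanged.
    ( cd "$NWCHEM_TOP/src/tools" && rm -rf build install && make FC=ifort ) || return 1
    ( cd "$NWCHEM_TOP/src/64to32blas" && make FC=ifort ) || return 1
    ( cd "$NWCHEM_TOP/src" && make FC=ifort link )
}
```

The function is only defined here; calling `relink_nwchem` after exporting SCALAPACK_SIZE, BLAS_SIZE and USE_64TO32 would run the three steps.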

Clicked A Few Times
Threads 1
Posts 8
Hello Edo,

I followed the steps you suggested and I was able to produce an output (not sure whether it will be able to run everywhere on the cluster as the compiler flag xHost was turned on). However when I tried to run some of the examples I faced this problem (which we have been encountering on the current version that we have installed here, and that was what prompted us to move to the latest version):

0:Segmentation Violation error, status=: 11
(rank:0 hostname:n2180 pid:20922):ARMCI DASSERT fail. ../../ga-5-3/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0:: Inappropriate ioctl for device
application called MPI_Abort(comm=0x84000001, 11) - process 0


I think I will start afresh with a newly unpacked version of the source. In the meantime, any guidance on this error, which seems to be associated with using the Intel compiler (version 13 and up), would be much appreciated.

Thanks,

Raffaella.

Forum Vet
Threads 4
Posts 936
FOPTIMIZE=-O3
If you do not want -xHost to be used, please compile by using the command

make FC=ifort FOPTIMIZE=-O3
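
One way to confirm that the flag is really gone is to capture the build output and count the lines that still mention it. This is only a sketch: the helper `count_xhost` and the log file name `build.log` are hypothetical.

```shell
#!/bin/bash
# Print the number of lines in a build log that mention -xHost (case-insensitive).
# count_xhost is a hypothetical helper; grep exits non-zero when the count is 0.
count_xhost() {
    grep -ci -e '-xhost' "$1"
}
```

For example, after `make FC=ifort FOPTIMIZE=-O3 2>&1 | tee build.log`, running `count_xhost build.log` should print 0 if no compile line carries the flag. A non-zero count would suggest the flag is still being added somewhere else in the build.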

Clicked A Few Times
Threads 1
Posts 8
USE_OPENMP=no
Hello Edo,

To switch off OpenMP, should I define the following environment variable?

USE_OPENMP=no

Or would you suggest leaving OpenMP on?

Are we supposed to run NWChem using OpenMP within a node and MPI across different nodes?

Or should I just define OMP_NUM_THREADS=1 for parallel runs?

Thanks,

Raffaella.

Clicked A Few Times
Threads 1
Posts 11
compiling nwchem-6.5 MKL Composer XE 2013 SP1
Hi there,

I am also trying to compile nwchem-6.5 on an Intel Xeon InfiniBand cluster with Intel Composer XE 2013 SP1 compilers. I would be interested in learning, for instance, the final set of env variables you used, so that I can compare them with mine.

Best regards,

Alejandro

