Compiling/Running NWChem 6.8.1 over Intel Omni-Path and NVidia P100 offload


Just Got Here
Dear Colleague,

I was trying to compile NWChem 6.8.1 for a cluster in which each node is equipped with dual Xeon Gold CPUs and 4x NVidia P100 GPUs, interconnected by Intel Omni-Path. I plan to run CCSD(T) calculations with offload to the GPUs. At first I wanted to use ARMCI_NETWORK=ARMCI according to Hammond's instructions (https://github.com/jeffhammond/HPCInfo/blob/master/ofi/NWChem-OPA.md), but I failed at the stage of compiling Casper, so I decided to try ARMCI_NETWORK=MPI-PR first. Here is the compilation setup:
 Currently Loaded Modulefiles:
  1) intel/2018_u1            3) hwloc/1.11.6
  2) mvapich2/gcc/64/2.2rc1   4) cuda/8.0.61

export NWCHEM_TOP=$HOME/src/nwchem-6.8.1.MPI-PR
export FC=ifort
export CC=icc
export CXX=icpc
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export NWCHEM_TARGET=LINUX64
export USE_PYTHONCONFIG=y
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L$MKLROOT/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm"
export USE_SCALAPACK=y
export SCALAPACK="-L$MKLROOT/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export NWCHEM_MODULES="all python"
export MRCC_METHODS=TRUE
export USE_OPENMP=1
export CUDA=nvcc
export TCE_CUDA=Y
export CUDA_LIBS="-L/pkg/cuda/8.0.61/lib64 -lcublas -lcudart"
export CUDA_FLAGS="-arch sm_60 "
export CUDA_INCLUDE="-I. -I/pkg/cuda/8.0.61/include"
export ARMCI_NETWORK=MPI-PR
export USE_MPI=y
export MPI_LOC=/usr/mpi/intel/mvapich2-2.2-hfi
export MPI_LIB=/usr/mpi/intel/mvapich2-2.2-hfi/lib
export MPI_INCLUDE=/usr/mpi/intel/mvapich2-2.2-hfi/include
export LIBMPI="-lmpichf90 -lmpich -lopa -lmpl -lpthread -libverbs -libumad -ldl -lrt"
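
The build itself followed the usual NWChem sequence, roughly as below (a reconstruction, assuming the variables above are already exported in the shell):

cd $NWCHEM_TOP/src
# configure the module list and generate the makefiles
make nwchem_config NWCHEM_MODULES="all python"
# compile everything with the Intel Fortran compiler
make FC=ifort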


These make commands completed successfully and I obtained the nwchem binary. However, I noticed in the build log that although I had set CUDA_FLAGS to "-arch sm_60", nvcc still used -arch=sm_35 during compilation, for example,

nvcc -c -O3 -std=c++11 -DNOHTIME -Xptxas --warn-on-spills -arch=sm_35 -I. -I/pkg/cuda/8.0.61/include -I/home/molpro/src/nwchem-6.8.1.MPI-PR/src/tce/ttlg/includes -o memory.o memory.cu

and some warnings also showed up such as:

nvcc -c -O3 -std=c++11 -DNOHTIME -Xptxas --warn-on-spills -arch=sm_35 -I. -I/pkg/cuda/8.0.61/include -I/home/molpro/src/nwchem-6.8.1.MPI-PR/src/tce/ttlg/includes -o sd_t_total_ttlg.o sd_t_total_ttlg.cu
Compiling ccsd_t_gpu.F...
./sd_top.fh(5): warning #7734: DEC$ ATTRIBUTES OFFLOAD is deprecated. [OMP_GET_WTIME]
cdir$ ATTRIBUTES OFFLOAD : mic :: omp_get_wtime

^
./sd_top.fh(5): warning #7734: DEC$ ATTRIBUTES OFFLOAD is deprecated. [OMP_GET_WTIME]
cdir$ ATTRIBUTES OFFLOAD : mic :: omp_get_wtime

^
.....
and so on..
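
To see where -arch=sm_35 is being injected (it is not in my CUDA_FLAGS), a simple check would be to grep the source tree for the hardcoded value, e.g.:

cd $NWCHEM_TOP/src
# look for a hardcoded Kepler arch in the build configuration and the TCE/TTLG code
grep -rn "sm_35" config tce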

Then I tried to run QA/tests/tce_cuda/tce_cuda.nw,

./runtests.mpi.unix procs 2 tce_cuda

but nwchem aborted with an error right after the SCF calculation, as shown in the attached output. May I ask:

1. Does the error come from an incorrect compilation, from the -arch flag used by nvcc, or do I need to adjust tce_cuda.nw to fit my running environment?
2. How can I set CUDA_FLAGS so that nvcc actually uses -arch=sm_60 for the P100 GPUs? The make command did not seem to honor my CUDA_FLAGS setting (a workaround sketch I have in mind follows below).
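
The workaround I have in mind for question 2 (only a sketch, not yet verified) is to switch any hardcoded sm_35 found by the grep above to sm_60 and rebuild the TCE code:

cd $NWCHEM_TOP/src
# rewrite the hardcoded Kepler arch to Pascal (P100) wherever it appears
grep -rl "sm_35" config tce | xargs -r sed -i 's/sm_35/sm_60/g'
# rebuild the TCE directory and relink the nwchem binary
cd tce && make FC=ifort && cd ..
make FC=ifort link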

Thanks a lot for your kind help.

Kenny

The errors from the attached tce_cuda.out follow.

argument  1 = /home/chem/src/nwchem-6.8.1.MPI-PR/QA/tests/tce_cuda/tce_cuda.nw
NWChem w/ OpenMP: maximum threads = 1

...

!! The overlap matrix has   2 vectors deemed linearly dependent with
eigenvalues:
0.00D+00 0.00D+00


Superposition of Atomic Density Guess
-------------------------------------

Sum of atomic energies: -75.76222910
------------------------------------------------------------------------
ga_orthog: hard zero 1
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
41: task tce energy
------------------------------------------------------------------------
------------------------------------------------------------------------
This error has not yet been assigned to a category
------------------------------------------------------------------------
For more information see the NWChem manual at
http://www.nwchem-sw.org/index.php/NWChem_Documentation


For further details see manual section: 
No section for this category



[0] Received an Error in Communication: (1) 0:ga_orthog: hard zero:
[cli_0]: aborting job:
application called MPI_Abort(comm=0x84000004, 1) - process 0

=======================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 151045 RUNNING AT glogin1
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=======================================================================

Forum Vet
I would try the following:

export BLAS_SIZE=8
export SCALAPACK_SIZE=8
cd $NWCHEM_TOP/src/tools
make clean
rm -rf build install
make FC=ifort
cd ..
make FC=ifort link
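
(Presumably the point is that the ilp64 MKL libraries in BLASOPT and SCALAPACK expect 8-byte integers, so the GA tools have to be rebuilt with a matching integer size.) After the relink, rerunning the failing QA test should show whether the ga_orthog failure was tied to that mismatch:

cd $NWCHEM_TOP/QA
./runtests.mpi.unix procs 2 tce_cuda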

