Enabling TCE CUDA

Hi,

We have a setup consisting of several blades, each equipped with two 8-way CPUs (Intel Xeon E5-2620 v2) and two Tesla K40 GPUs. In the following environment:
Currently Loaded Modules:
  1) StdEnv    2) intel/13.1.0    3) mkl/11.0.2    4) intelmpi/4.1.0    5) cuda/5.5
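For reference, reproducing this environment amounts to something like the lines below (StdEnv and the exact module versions are site-specific, so treat this only as a sketch):

    # Load the toolchain shown by "module list" above
    module purge
    module load StdEnv intel/13.1.0 mkl/11.0.2 intelmpi/4.1.0 cuda/5.5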

I compiled the following version:
    source          = /wrk/runeberg/taito_wrkdir/gpu/nwchem-src-2014-01-28
    nwchem branch   = Development
    nwchem revision = 25178
    ga revision     = 10467

using the following setup:
export NWCHEM_TOP=$PWD/nwchem-src-2014-01-28
export NWCHEM_TARGET=LINUX64
export USE_MPI=y
export USE_MPIF=y
export NWCHEM_MODULES=all
export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
export BLAS_LIB="-L$MKLROOT/lib/intel64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm"
export BLASOPT="$BLAS_LIB"
export BLAS_SIZE=8
export SCALAPACK_SIZE=8
export SCALAPACK="-L$MKLROOT/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm"
export SCALAPACK_LIB="$SCALAPACK"
export USE_SCALAPACK=y
export TCE_CUDA=y
export CUDA_LIBS="-L$CUDA_PATH/lib64 -lcudart -lcublas"
export CUDA_FLAGS="-arch sm_35"
export CUDA_INCLUDE="-I. -I$CUDA_PATH/include"
export CUDA=nvcc
export PATH=$PATH:$CUDA_PATH/bin
cd $NWCHEM_TOP
./contrib/distro-tools/build_nwchem | tee build_nwchem.log
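For completeness, the GPU code is then requested in the input deck's TCE block with the "cuda" keyword (number of CUDA devices per node). The deck below is only a minimal sketch, with an illustrative geometry, basis set and tile size:

    cat > ccsdt_gpu.nw <<'EOF'
    echo
    start ccsdt_gpu
    geometry units angstrom
      O  0.000  0.000  0.000
      H  0.000  0.757  0.587
      H  0.000 -0.757  0.587
    end
    basis
      * library cc-pvdz
    end
    tce
      ccsd(t)
      tilesize 24
      # CUDA devices per node
      cuda 1
    end
    task tce energy
    EOF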


The build went smoothly, but for some reason I can't get the binary to cooperate with
our setup/queuing system, Slurm 2.6.7: I can only launch one MPI process per GPU.
I can launch several MPI processes, even across several blades (though the
performance is quite bad), but each MPI process only engages one GPU. Any ideas?
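For context, the batch scripts are along these lines (partition name, wall time and paths are placeholders rather than our exact script):

    #!/bin/bash
    #SBATCH --nodes=2                 # two blades
    #SBATCH --ntasks-per-node=16      # one MPI rank per core
    #SBATCH --gres=gpu:2              # both K40s on each blade
    #SBATCH --time=02:00:00
    #SBATCH --partition=gpu           # placeholder

    module load StdEnv intel/13.1.0 mkl/11.0.2 intelmpi/4.1.0 cuda/5.5

    # Intel MPI launch; srun also works where Intel MPI is built against Slurm's PMI
    mpirun -np $SLURM_NTASKS $NWCHEM_TOP/bin/LINUX64/nwchem ccsdt_gpu.nw > ccsdt_gpu.out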

Cheers,
             --Nino

  • Huub (NWChem developer)
Hi Nino,

At present we have only implemented the logic to control one GPU card per node. We simply set aside a single MPI process to control the GPU, and the other MPI processes on the same node work in the same way as in the code without GPU support. There are various potential options for extending this to multiple GPUs per node, but we haven't established a general approach to doing this yet.
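Just to illustrate what per-rank GPU binding looks like in general (this is not something the current TCE code does), the usual trick on the launcher side is a small wrapper keyed off the node-local rank that srun exports as SLURM_LOCALID:

    #!/bin/bash
    # gpu_bind.sh - generic per-rank GPU binding sketch, not part of NWChem
    # With two GPUs per blade, ranks alternate between device 0 and device 1.
    export CUDA_VISIBLE_DEVICES=$(( SLURM_LOCALID % 2 ))
    exec "$@"

It would be launched as, e.g., "srun ./gpu_bind.sh nwchem input.nw", but again the TCE GPU code itself only drives one device per node at the moment.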

Huub

