We are having problems with the parallel scaling of NWChem on our supercomputer. We measured little or no parallel speedup with small test cases running on 1-128 CPUs.

For example, the timings for examples/qmd/3carbo.nw:

NCPU   time [s]
   1    401.509
   2    276.147
   4    193.832
   8    119.013
  16    118.552
  32    228.424
  64    331.187
 128   1362.602
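
To make the trend in the table above easier to read, the timings can be turned into speedup (T1/Tn) and parallel efficiency (speedup/n). This is only a minimal sketch: it assumes the values are wall times in seconds with three decimal places, and the function name `scaling` is made up for illustration.

```python
# Wall times in seconds for examples/qmd/3carbo.nw, copied from the table
# above (NCPU -> time). Assumption: the values are seconds with a fractional part.
times = {
    1: 401.509,
    2: 276.147,
    4: 193.832,
    8: 119.013,
    16: 118.552,
    32: 228.424,
    64: 331.187,
    128: 1362.602,
}


def scaling(times):
    """Return (ncpu, speedup, efficiency) tuples relative to the 1-CPU run."""
    t1 = times[1]
    rows = []
    for ncpu in sorted(times):
        speedup = t1 / times[ncpu]      # how many times faster than 1 CPU
        efficiency = speedup / ncpu     # 1.0 would be ideal scaling
        rows.append((ncpu, speedup, efficiency))
    return rows


print("NCPU  speedup  efficiency")
for ncpu, speedup, efficiency in scaling(times):
    print(f"{ncpu:>4d} {speedup:8.2f} {efficiency:10.1%}")
```

With these numbers the speedup levels off at about 3.4 between 8 and 16 CPUs and falls below 1 at 128 CPUs, i.e. the 128-CPU run is slower than the serial one.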

Other tests show worse results. For example, when calculating the CCSD energy of (H2O)2 (H: aug-cc-pVDZ, O: aug-cc-pVTZ), the parallel calculation always takes longer than the run on a single CPU.

The computer is a Linux x86_64 cluster with InfiniBand (Anselm). The NWChem version is 6.3-rev2, compiled with the Intel compilers. We tested both Intel MPI and Open MPI.

Is there some problem with our configuration, or are the input systems just too small for NWChem parallelization?

Any help would be appreciated,

Martin Stachon
VSB-TU Ostrava & IT4Innovations

The input you are using is very small, so it will never scale.
Please have a look at the benchmark section on the NWChem web page for beefier input files.

Dear Stachon, can you test parallel scaling with input from

I think it is sufficiently big to be scalable.

You need to set the size of "memory global" to at least 12 GB/n, where n is the number of CPUs (NCPU).
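
As a rough illustration of the 12/n rule above, the per-process value can be tabulated for the CPU counts used in this thread. This is only a sketch under assumptions: it takes the 12 to mean gigabytes, uses 1 GB = 1024 MB, rounds up, and the helper name `global_memory_mb` is made up here. The resulting number still has to be placed in the memory directive of the input file by hand.

```python
import math

# Total global memory suggested in the post, assumed to be gigabytes.
TOTAL_GLOBAL_GB = 12


def global_memory_mb(ncpu, total_gb=TOTAL_GLOBAL_GB):
    """Minimum per-process 'memory global' in MB (assuming 1 GB = 1024 MB)."""
    return math.ceil(total_gb * 1024 / ncpu)


# Print a candidate value for each CPU count tested in this thread.
for ncpu in (1, 2, 4, 8, 16, 32, 64, 128):
    print(f"NCPU={ncpu:<4d} memory global >= {global_memory_mb(ncpu)} mb")
```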

For 1 CPU, you must increase DEFAULT_MAX_NALLOC, but for 2 or more CPUs this is not needed.

Please apply the patch for all tasks.

Thanks, Vladimir

