CreateSharedRegion: kr_malloc Numerical result out of range

A colleague in my lab has been trying to run some calculations with NWChem, and I've been helping him get it running with MPI. It has been a bumpy ride. At first, we were getting errors about not being able to allocate a shared block of memory. SHMMAX was already plenty high (as large as all of the physical memory), so I created a swap file, restarted the calculations, and waited.
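For anyone who wants to repeat the SHMMAX check, here is a rough sketch of how it can be done in Python. It assumes a Linux system that exposes `/proc/sys/kernel/shmmax` and `/proc/meminfo`; the function names are my own and have nothing to do with NWChem or ARMCI.

```python
import re


def parse_mem_total_bytes(meminfo_text):
    """Return the MemTotal value from /proc/meminfo contents, in bytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    # /proc/meminfo reports this value in units of 1024 bytes.
    return int(match.group(1)) * 1024


def shmmax_covers_ram(shmmax_text, meminfo_text):
    """True if the SHMMAX limit is at least as large as physical memory."""
    shmmax = int(shmmax_text.strip())
    return shmmax >= parse_mem_total_bytes(meminfo_text)


def check_local_machine():
    """Read the two /proc files on this machine and print the comparison."""
    with open("/proc/sys/kernel/shmmax") as f:
        shmmax_text = f.read()
    with open("/proc/meminfo") as f:
        meminfo_text = f.read()
    print("SHMMAX (bytes):  ", int(shmmax_text.strip()))
    print("MemTotal (bytes):", parse_mem_total_bytes(meminfo_text))
    print("SHMMAX >= RAM:   ", shmmax_covers_ram(shmmax_text, meminfo_text))
```

Calling `check_local_machine()` on the compute node prints both values side by side. This only confirms the kernel limit; it says nothing about how much memory NWChem itself is configured to request.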

Now they are crashing again, but this time with a very different error message. From what I can tell, it is another problem with allocating shared memory, but it looks like NWChem is passing an invalid (i.e., negative) size to the allocation function.

Here is the relevant error message:
 0:CreateSharedRegion:kr_malloc failed KB=: -772361
(rank:0 hostname:vivaldi.chem.utk.edu pid:3128):ARMCI DASSERT fail. ../../ga-5-1/armci/src/memory/shmem.c:Create_Shared_Region():1188 cond:0
Last System Error Message from Task 0:: Numerical result out of range
application called MPI_Abort(comm=0x84000007, -772361) - process 0
rank 0 in job 2 vivaldi.chem.utk.edu_35229 caused collective abort of all ranks
exit status of rank 0: killed by signal 9
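One guess at where a negative size could come from is a byte count that overflows a signed 32-bit integer somewhere along the way. I have not confirmed this in the ARMCI source, so treat it purely as a hypothesis. The sketch below just shows the arithmetic; `requested_bytes` is a value I picked so that the result matches the log, not something taken from NWChem or ARMCI.

```python
def wrap_int32(value):
    """Interpret an integer as a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Hypothetical request of roughly 3.26 GiB, chosen for illustration only.
requested_bytes = 3_504_069_632

wrapped = wrap_int32(requested_bytes)
print(wrapped)          # -790897664
print(wrapped // 1024)  # -772361, the same KB figure as in the error above
```

If something like this is happening, the request would be a bit over 3 GiB, which is larger than a signed 32-bit byte count can hold.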

And just in case, here is a link to the full output file:
http://web.eecs.utk.edu/~dbauer3/nwchem/macrofe_full631f.out

Running under Ubuntu 11.04 with MPICH2.

