
Run time error on multinode runs.

From NWChem


Hello: I am trying to get NWChem to run correctly on multiple nodes. My installation seems to
run correctly on a single node (8 cores). Whenever I request more than one node (say, 16 cores), the job aborts
with the following error before it completes. Note that it finishes the SCF and CCSDt steps correctly and then fails
during the EOM-CCSDt step. My input is one of the examples from the test suite (nwchem-6.1/QA/tests/tce_active_ccsdt/tce_active_ccsdt.nw). Maybe someone in this forum has run into the same problem and
found a solution.

Thanks, Ajith Perera

mpirun: killing job...



mpirun noticed that process rank 0 with PID 2125 on node r11a-s20.ufhpc exited on signal 0 (Unknown signal 0).


0:Terminate signal was sent, status=: 15
(rank:0 hostname:r11a-s20.ufhpc pid:2125):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigTermHandler():472 cond:0
Last System Error Message from Task 0:: Inappropriate ioctl for device
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
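
When sending a full output file for diagnosis, it can help to first summarize which rank and host reported the ARMCI failure and how many processes were killed. The following is a minimal Python sketch of that, assuming the failure lines always look like the single example in the log above; the function names `find_armci_failures` and `count_sigterm` are hypothetical helpers, not part of NWChem or Global Arrays. Since the log shows a terminate signal (status 15) being delivered, the lines it extracts are likely a symptom, and the underlying cause may appear earlier in the output.

```python
import re

# Layout assumed from the one ARMCI line in the log above:
# (rank:N hostname:H pid:P):ARMCI DASSERT fail. FILE:FUNC():LINE cond:...
ARMCI_FAIL = re.compile(
    r"\(rank:(?P<rank>\d+) hostname:(?P<host>\S+) pid:(?P<pid>\d+)\):"
    r"ARMCI DASSERT fail\. (?P<file>\S+):(?P<func>\w+)\(\):(?P<line>\d+)"
)


def find_armci_failures(log_text):
    """Return one dict per ARMCI DASSERT line found in log_text."""
    failures = []
    for match in ARMCI_FAIL.finditer(log_text):
        info = match.groupdict()
        # Convert the numeric fields so they can be sorted or compared.
        for key in ("rank", "pid", "line"):
            info[key] = int(info[key])
        failures.append(info)
    return failures


def count_sigterm(log_text):
    """Count the Intel Fortran runtime 'process killed (SIGTERM)' messages."""
    return log_text.count("process killed (SIGTERM)")
```

For example, passing the contents of the job's output file to `find_armci_failures` gives a list of rank, host, and PID entries that can be compared against the node list the scheduler assigned to the job.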

  • Karol Forum:Admin, Forum:Mod, NWChemDeveloper, bureaucrat, sysop
Hi Ajith,
Would you please send me the whole output (karol.kowalski@pnnl.gov)? The failure seems to be linked to the large number of initial vectors used in the first iteration of the EOMCCSDt method.

Best,
Karol

