ARMCI ONESIDED SIZEOF IREQ

Just Got Here
Threads 1
Posts 3
Hello,

I'm not sure whether this is a compile-time or a run-time problem. We compiled NWChem on a Cray XE6m-200 following the instructions in the documentation. When we try to run it, we get an error:

ARMCI configured for 3 cluster nodes. Network protocol is 'Cray Onesided'.
ARMCI_ONESIDED_SIZEOF_IREQ is not sized correctly.
ARMCI_ONESIDED_SIZEOF_IREQ = 21016
sizeof(armci_ireq_t) = 22040
Application 828140 exit codes: 134

Does anyone know what the problem might be, and whether there is a workaround?

Thanks,
Alex.
Edited On 4:08:55 AM PDT - Fri, Apr 10th 2015 by Sazs

Forum Vet
Threads 10
Posts 1584
Please edit the following file
$NWCHEM_TOP/src/tools/ga-5-3/armci/src-gemini/armci.h
and change the following line from
#define ARMCI_ONESIDED_SIZEOF_IREQ 21016
to
#define ARMCI_ONESIDED_SIZEOF_IREQ 22040

Then, recompile and relink by typing


cd $NWCHEM_TOP/src/tools/build
make FC=ftn install
cd ../..
make FC=ftn link
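
The header edit described above can also be scripted. This is only a sketch: `set_ireq_size` is a hypothetical helper name, not part of NWChem or GA, and it assumes the macro sits on a single `#define` line as quoted in this post. The value to pass is whatever `sizeof(armci_ireq_t)` reports in the startup error (22040 here).

```shell
# Hypothetical helper: set ARMCI_ONESIDED_SIZEOF_IREQ in a header file.
# Assumes the macro is defined on one line, as quoted in the post above.
set_ireq_size() {
    header="$1"
    new_size="$2"
    # Keep a backup copy, then rewrite only the matching #define line.
    cp "$header" "$header.orig" &&
    sed "s/^#define[[:space:]]\{1,\}ARMCI_ONESIDED_SIZEOF_IREQ[[:space:]].*/#define ARMCI_ONESIDED_SIZEOF_IREQ $new_size/" \
        "$header.orig" > "$header"
}

# Example call (path taken from the post above):
# set_ireq_size "$NWCHEM_TOP/src/tools/ga-5-3/armci/src-gemini/armci.h" 22040
```

The original header is kept as `armci.h.orig`, so the change is easy to revert. The recompile and relink steps above are still needed afterwards.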

Just Got Here
Threads 1
Posts 3
Thank you! Now it starts, but I get:
nwchem: ../../ga-5-3/armci/src-gemini/buffers.c:625: _armci_buf_get: Assertion `ar->req.active == 0' failed.

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x357A4CD in _gfortrani_backtrace at backtrace.c:258
#1  0x3562C20 in _gfortrani_backtrace_handler at compile_options.c:129
#2  0x342F87F in system
#3  0x3759C7C in __libc_fork at fork.c:188
#4  0xFFFFFFFFFFFFFFFF


This is my input file:
start caca

title "caca"
charge 0

memory total 1 gb

geometry units ang  print xyz noautoz
 C                    -0.69678829     1.73570628    -0.00142243
 C                    -0.01620581     1.54099739     1.21131057
 C                     1.32902041     1.15171656     1.19964001
 C                     2.00018884     0.95527676    -0.01379517
 C                     1.31155734     1.15232665    -1.22005238
 C                    -0.03685172     1.54253226    -1.20631979
 H                    -0.52673449     1.71951829     2.15748547
 H                     1.85815361     1.03470211     2.14251901
 H                     3.05396468     0.68571506    -0.02008303
 H                     1.82534835     1.04031559    -2.17156523
 H                    -0.57103646     1.73534382    -2.13423958
 O                    -2.02300658     2.11936078    -0.05844134
 H                    -2.31951336     2.43887107     0.81230826
end

basis spherical
#  * library cc-pVTZ
  * library cc-pVDZ
end


scf
 rhf
 singlet
 maxiter 80
end



mp2
 freeze core atomic
end


task direct_mp2



What could be the problem?

Thanks,
Alex.

Forum Vet
Threads 10
Posts 1584
I'm not quite sure what the problem could be.
I would try the following:
1) decrease the memory requested in the input file
2) use only half of the cores on each node
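
As a rough sketch of suggestion 2, the snippet below builds an `aprun` command that places processes on half of each node's cores. The node count of 3 comes from the ARMCI message earlier in the thread. The value of 32 cores per node and the input file name `input.nw` are assumptions, so substitute the values for your machine. For suggestion 1, the `memory total 1 gb` line in the input file is the one to lower.

```shell
# Assumed values: adjust to your system and job.
nodes=3              # from "ARMCI configured for 3 cluster nodes"
cores_per_node=32    # assumption; check your node type

# Run on half of the cores of each node.
pes_per_node=$((cores_per_node / 2))
total_pes=$((nodes * pes_per_node))

# aprun: -n is the total number of processes, -N the number per node.
cmd="aprun -n $total_pes -N $pes_per_node nwchem input.nw"
echo "$cmd"
```

The command is only printed here, so it can be checked before it is pasted into a batch script.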

Just Got Here
Threads 1
Posts 3
I tried adjusting the memory per core, but it didn't help. I then compiled a version with DMAPP and the Cray version of GA (as recommended in the docs), and it appears to be working. I suspect it is not as good as the Gemini ARMCI network for the XE6, though. Do you know whether one of the two is preferred on the Gemini interconnect, and whether there is a performance difference between them?

Thanks,
Alex.

Forum Vet
Threads 10
Posts 1584
Alex,
We are aware of the current issues with both the DMAPP and GEMINI ports on Cray machines.
We plan to release a fix for this within the next few months.
If you are interested in testing this new version before it is released, please drop me a PM:
http://www.nwchem-sw.org/index.php/Special:AWCforum/member_options/pminbox

