From NWChem
8:56:20 AM PST - Tue, Nov 20th 2012
Hello, I'm experiencing a strange issue, and I'm not sure whether it's related to MPI, the SGE scheduler, or NWChem itself.
When running with 1, 2, 4, or 8 procs on a single node, it runs fine. But when I run with 6 or 12 procs, it fails with the error messages below. For certain input files, I get the same errors when running with a particular number of procs. Can someone explain this and point me in a direction to troubleshoot it, please?
symmetry adapt  = T 
 
 
Here is a snippet from the output:
 
Forming initial guess at       1.1s 
 
 
Error in pstein5. eval  is different on processors 0 and 1  
Error in pstein5.  me = 0 exiting via pgexit.  
Error in pstein5. eval  is different on processors 1 and 0  
Error in pstein5.  me = 1 exiting via pgexit.  
 
Last System Error Message from Task 1:: Inappropriate ioctl for device 
Last System Error Message from Task 0:: Inappropriate ioctl for device 
 ME =                      0  Exiting via  
 
0:0: peigs error: mxpend:: 0 
(rank:0 hostname:node13 pid:13469):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0 
 ME =                      1  Exiting via  
 
1:1: peigs error: mxpend:: 0 
(rank:1 hostname:node13 pid:13470):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0 
  
MPI_ABORT was invoked on rank 1 in communicator MPI COMMUNICATOR 4 DUP FROM 0  
with errorcode 0. 
 
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. 
You may or may not see output from other processes, depending on 
exactly when Open MPI kills them. 
  
forrtl: error (78): process killed (SIGTERM) 
Image              PC                Routine            Line        Source              
libmkl_sequential  00002AD009ED6150  Unknown               Unknown  Unknown 
Last System Error Message from Task 2:: Inappropriate ioctl for device 
forrtl: error (78): process killed (SIGTERM) 
Image              PC                Routine            Line        Source              
nwchem             0000000002FF271E  Unknown               Unknown  Unknown 
nwchem             0000000002FF11B6  Unknown               Unknown  Unknown 
nwchem             0000000002F939B2  Unknown               Unknown  Unknown 
nwchem             0000000002F4135B  Unknown               Unknown  Unknown 
nwchem             0000000002F46E53  Unknown               Unknown  Unknown 
nwchem             0000000002EB5F3F  Unknown               Unknown  Unknown 
nwchem             0000000002E9108F  Unknown               Unknown  Unknown 
libc.so.6          000000309A432920  Unknown               Unknown  Unknown 
libmpi.so.1        00002B3007E26A99  Unknown               Unknown  Unknown 
libmpi.so.1        00002B3007D594C2  Unknown               Unknown  Unknown 
mca_coll_tuned.so  00002B300DB4F8EE  Unknown               Unknown  Unknown 
mca_coll_tuned.so  00002B300DB58618  Unknown               Unknown  Unknown 
libmpi.so.1        00002B3007D680FD  Unknown               Unknown  Unknown 
nwchem             0000000002E125A0  Unknown               Unknown  Unknown 
nwchem             0000000002E72CB2  Unknown               Unknown  Unknown 
nwchem             0000000002E4471B  Unknown               Unknown  Unknown 
nwchem             00000000009AA79E  Unknown               Unknown  Unknown 
nwchem             00000000009C7347  Unknown               Unknown  Unknown 
nwchem             00000000009ACA49  Unknown               Unknown  Unknown 
nwchem             00000000005B944A  Unknown               Unknown  Unknown 
nwchem             0000000000501C57  Unknown               Unknown  Unknown 
nwchem             000000000050118B  Unknown               Unknown  Unknown 
nwchem             000000000064BE1F  Unknown               Unknown  Unknown 
nwchem             00000000005049C1  Unknown               Unknown  Unknown 
nwchem             00000000004F17A2  Unknown               Unknown  Unknown 
nwchem             00000000004E639B  Unknown               Unknown  Unknown 
nwchem             00000000004E5E7C  Unknown               Unknown  Unknown 
libc.so.6          000000309A41ECDD  Unknown               Unknown  Unknown 
nwchem             00000000004E5D79  Unknown               Unknown  Unknown 
Last System Error Message from Task 3:: Inappropriate ioctl for device 
forrtl: error (78): process killed (SIGTERM)
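
In case it helps anyone comparing runs at different process counts, here is a minimal sketch that pulls the rank pairs out of the `pstein5` lines in an output like the one above. The function name `mismatched_pairs` and the `sample` text are only illustrative (the sample is copied from the snippet above), and the script does nothing beyond matching the message text; it says nothing about the cause.

```python
import re

# Matches lines such as:
#   Error in pstein5. eval  is different on processors 0 and 1
PSTEIN5 = re.compile(
    r"Error in pstein5\.\s+eval\s+is different on processors\s+(\d+)\s+and\s+(\d+)"
)


def mismatched_pairs(log_text):
    """Return the sorted, de-duplicated rank pairs named in pstein5 'eval' errors."""
    pairs = set()
    for line in log_text.splitlines():
        m = PSTEIN5.search(line)
        if m:
            a, b = int(m.group(1)), int(m.group(2))
            # Store each pair in a fixed order so (0, 1) and (1, 0) collapse.
            pairs.add((min(a, b), max(a, b)))
    return sorted(pairs)


# Illustrative input: the four pstein5 lines from the output above.
sample = """\
Error in pstein5. eval  is different on processors 0 and 1
Error in pstein5.  me = 0 exiting via pgexit.
Error in pstein5. eval  is different on processors 1 and 0
Error in pstein5.  me = 1 exiting via pgexit.
"""

print(mismatched_pairs(sample))
```

Running it over the full output of each failing job (6 and 12 procs here) would show whether the same ranks disagree every time or whether it varies between runs.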