[Beowulf] Explanation of error message in MPICH-1.2.7

Jeffrey B. Layton laytonjb at charter.net
Fri Oct 6 15:50:49 EDT 2006


Afternoon cluster fans,

  I'm working with a CFD code built with the PGI 6.1 compilers and
MPICH-1.2.7. The code runs fine for a while, but then I get an error
message that I've never seen before:


[2] MPI Internal Aborting program Deep nest in Check_incoming
[2] Deep nest in Check_incoming

This error message is in the PBS error file. The output file from
the code contains the following:


p2_15458:  p4_error: : 1
p5_21530:  p4_error: net_recv read:  probable EOF on socket: 1
p7_21548:  p4_error: net_recv read:  probable EOF on socket: 1
p6_21539:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_6_21544: (95.492188) net_send: could not write to fd=5, errno = 32
rm_l_2_15464: (95.835938) net_send: could not write to fd=5, errno = 32
rm_l_5_21535: (95.574219) net_send: could not write to fd=5, errno = 32
rm_l_7_21553: (95.410156) net_send: could not write to fd=5, errno = 32
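
For reference, errno = 32 in the net_send lines is EPIPE (broken pipe)
on Linux, meaning the write went to a socket whose other end had already
closed. That would be consistent with the remaining processes reacting
to process 2 aborting, rather than being a separate failure, though that
reading is an assumption and not a diagnosis. A minimal Python sketch,
assuming a Linux node, that decodes the number (describe_errno is a
hypothetical helper name, not part of MPICH or PBS):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numeric values to names such as "EPIPE".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's human-readable message for the code.
    return name, os.strerror(code)


if __name__ == "__main__":
    # 32 is the value reported by the rm_l_* listener processes above.
    name, message = describe_errno(32)
    print(f"errno 32 = {name}: {message}")
```

Errno numbering is platform-specific, so this should be run on the same
kind of node that produced the log.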


  The code runs fine with other MPI implementations (Scali,
MVAPICH, etc.). My googling efforts haven't yielded anything.
Does anyone have any input on this?

Thanks!

Jeff