[Beowulf] Errors on IBM e325
Jeff Layton
jeffrey.b.layton at lmco.com
Fri Jun 25 11:21:11 EDT 2004
Good morning,
We've got a shiny new IBM cluster with e325 nodes (Opteron).
However, we're having some trouble with a number of nodes.
We keep getting 'GART' errors showing up in the logs, along with
some northbridge and ECC messages. Here is an example:
Jun 21 07:07:42 c3n32.cluster kernel: Lost an northbridge error
Jun 21 07:40:52 c1n4.cluster kernel: Lost an northbridge error
Jun 21 07:07:42 c3n32.cluster kernel: GART error 3
Jun 21 07:40:52 c1n4.cluster kernel: GART error 3
Jun 21 14:03:49 c1n2.cluster kernel: extended error chipkill ecc error
Jun 21 14:03:50 c1n2.cluster kernel: corrected ecc error
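To see which nodes are affected and how often, the messages can be tallied per node. Below is a minimal Python sketch, assuming syslog-style lines in the same format as the excerpt above. The function name tally_errors and the regular expression are illustrative assumptions, not part of any existing tool, and the sample data is just the six lines quoted here.

```python
import re
from collections import Counter

# Assumed line layout: "<Mon> <day> <HH:MM:SS> <host> kernel: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+kernel:\s+(?P<msg>.*)$"
)


def tally_errors(lines):
    """Count occurrences of each (host, kernel message) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # silently skip lines that do not match the assumed layout
            counts[(m.group("host"), m.group("msg"))] += 1
    return counts


# The six log lines quoted in this message.
SAMPLE = """\
Jun 21 07:07:42 c3n32.cluster kernel: Lost an northbridge error
Jun 21 07:40:52 c1n4.cluster kernel: Lost an northbridge error
Jun 21 07:07:42 c3n32.cluster kernel: GART error 3
Jun 21 07:40:52 c1n4.cluster kernel: GART error 3
Jun 21 14:03:49 c1n2.cluster kernel: extended error chipkill ecc error
Jun 21 14:03:50 c1n2.cluster kernel: corrected ecc error
"""

if __name__ == "__main__":
    # Print one row per (node, message) pair: host, count, message.
    for (host, msg), n in sorted(tally_errors(SAMPLE.splitlines()).items()):
        print(f"{host}\t{n}\t{msg}")
```

In practice the input would be the lines of the real log file rather than SAMPLE; sorting by host makes it easy to spot whether the errors cluster on a few nodes.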
Does anybody have any ideas what the cause might be?
Thanks!
Jeff
--
Dr. Jeff Layton
Aerodynamics and CFD
Lockheed-Martin Aeronautical Company - Marietta
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf