Machine Check Exception

Derek Richardson derek.richardson at pgs.com
Mon Apr 14 13:46:05 EDT 2003


All,
Does anyone know if a power supply can cause a machine check exception?
I would think that the VRM would stop it from affecting the processor,
but what about the rest of the system? It seems odd that the machine
wouldn't fail in other ways. I have a cluster node that keeps
crashing w/ one, and I've looked it up in the Intel ia32 manual, and
it's not specific to processor and RAM (which I have already changed
out), so I've just been swapping parts out. So far I've swapped CPU0,
where the exception took place, all the RAM, all the fibre, network, and
RSA cards, the motherboard, etc. Basically the only things that are
the same as the original node are the chassis, power supply, SCSI disk
(but not controller), CPU1, and CPU1's VRM. I just changed out the VRM
for CPU0 and am putting the node back into use once its fibre disk
fscks; this might fix the problem.
Does anyone have any thoughts on this?  I'd hate to throw the entire
scenario out and just replace the entire node (since I'll eventually
have to find and replace the faulty hardware and I've already done so
much, I'd like to finish it).
Thanks,
Derek R.

-- 

Linux Administrator
derek.richardson at pgs.com
derek.richardson at ieee.org
Office 713-781-4000
Cell 713-817-1197
bureaucracy, n:
	A method for transforming energy into solid waste.


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


