Strange hardware (was Re: custom hardware (was: Xbox clusters?))

Mark at MarkAndrewSmith.co.uk Mark at MarkAndrewSmith.co.uk
Thu Nov 29 08:50:58 EST 2001


 
Yep, we have seen this problem many times in our computer hire range of
Windows 2000 Pro machines.  The strange thing is that we only see it on
Slot 1 Pentium II machines, across various motherboard models.  All our
Pentium III range is Socket 370 and has no problems.  So we came to suspect
that the problem is the way the Slot 1 Pentium II sits on the motherboard.
After months of clients returning equipment to base under warranty, we
issued instructions on how to open the case and remove and reseat the
Pentium II Slot 1 processor package.  The machines then boot every time
after switch-on.
 
How many of you who have this problem see it with Slot 1 Pentium II and
Slot 2 Pentium III processors in your clusters?  I bet none of you see it
with a Socket 370 or other "flat" socket type of CPU package.  We're
fortunate that our development cluster is based on "old" ex-hire
Pentium MMX 233 MHz equipment, so we don't have this problem on the
cluster.  Yet!
 
Regards, 
     Mark. 
 
-----Original Message----- 
From:		Felix Rauch [SMTP:rauch at inf.ethz.ch] 
Sent:		Thursday 29 November 2001 12:00 
To:		beowulf at beowulf.org 
Subject:	Strange hardware (was Re: custom hardware (was: Xbox
clusters?)) 
 
On Thu, 29 Nov 2001, Daniel Pfenniger wrote: 
> I have seen similar strange behavior of some boxes in a set of 66's,
> and the way to restart is also rather odd.
[...] 
 
We recently had strange problems with a Dell box which has been
working without problems for several years in our small research
cluster. It's a dual PII 400 MHz box, but suddenly the Linux kernel
was unable to start the second CPU. It could see the second CPU, but
when it tried to start it up during boot, it got a timeout and so
continued with only one CPU.
 
So we thought that one of the CPUs had died and replaced both CPUs. Still
the same problem. Next we replaced the motherboard (including the
power supply). Still the same problem. Maybe the disk had corrupted the
kernel, so we installed a fresh version of the same kernel onto the
box. Still the same problem. Only after physically replacing the SCSI
hard disk was everything working properly again.
 
We are still wondering why a disk could cause a CPU to time out during
boot...
 
- Felix 
--  
Felix Rauch                      | Email: rauch at inf.ethz.ch
Institute for Computer Systems   | Homepage: http://www.cs.inf.ethz.ch/~rauch/
ETH Zentrum / RZ H18             | Phone: ++41 1 632 7489
CH - 8092 Zuerich / Switzerland  | Fax:   ++41 1 632 1307
 
 
 
 