Tyan Tiger 2460
math at velocet.ca
Thu Apr 25 12:18:04 EDT 2002
On Thu, Apr 25, 2002 at 03:19:01AM -0400, Robert G. Brown's all...
> Dear List,
> We've had problems (as have others on this list) getting our 2U
> rackmount Tyan Tiger 2460 motherboards to boot/install/run reliably and
> stably. Seth (our systems guy) and I worked on a couple of the boxes
> today armed with a 32 bit riser, a 64 bit riser, and an ATI rage video
> card and a 3c905m NIC.
We RMA'd a 2466 that had somehow fried, and a new replacement board came
back. The new BIOS reports "V4.0 rel 6" and also "Phoenix 4.01".
I saw this change from previous versions and decided to try the Tbirds in it
that we had tried before under previous BIOS versions (and I can't remember the
version #s from before and I can't reboot any nodes to find out :)
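For what it's worth, on a Linux kernel that exposes DMI data through sysfs, the
BIOS version of a running node can usually be read without a reboot. A minimal
sketch, assuming the file /sys/class/dmi/id/bios_version exists on the node
(older kernels will not have it; the helper name read_bios_version is made up
for illustration):

```python
from pathlib import Path
from typing import Optional

# Assumption: the running kernel exposes DMI strings under this sysfs path.
DMI_BIOS_VERSION = Path("/sys/class/dmi/id/bios_version")


def read_bios_version(path: Path = DMI_BIOS_VERSION) -> Optional[str]:
    """Return the BIOS version string, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file, no permission, or no DMI support on this kernel.
        return None


if __name__ == "__main__":
    print(read_bios_version() or "BIOS version not available via sysfs")
```

If the file is missing, the helper returns None rather than guessing, so a
node that cannot report its version is easy to tell apart from one that can.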
Well, something has changed, because it warns that the processors are non-MP
and so it will operate uniprocessor, as SMP is unsupported with non-MPs.
Can't flash back to a previous BIOS version either. So Tyan musta struck some
deal with AMD on this. :) I'm wondering why they bothered, really, since
Tbirds are almost out of production anyway.
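A quick way to confirm whether a node actually came up SMP or fell back to
uniprocessor is to count the processor entries the kernel reports. A rough
sketch, assuming a Linux node where /proc/cpuinfo lists one "processor : N"
line per CPU (the function name count_processors is hypothetical):

```python
def count_processors(cpuinfo_text: str) -> int:
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        # Only lines whose key is exactly "processor" mark a new CPU entry.
        if sep and key.strip() == "processor":
            count += 1
    return count


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(f"kernel sees {count_processors(f.read())} processor(s)")
    except OSError:
        print("/proc/cpuinfo not readable on this system")
```

On a dual board, a count of 1 would suggest the BIOS or kernel has dropped to
uniprocessor mode.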
We still have a few test boards running happily with dual Tbird 1.33GHz
on both 2460s and 2466s, I assume on the older BIOS.
No major problems with either type of board, except those weird Addtron
GBE cards which y'all should stay away from. :)
> We took the PCI cards off of their frames so we could mount them
> vertically directly in the slots for testing. We also dismounted the
> risers so we could try them in different slots as well. The following is
> a summary of our findings.
> a) Only the video card would work in slot 1. Period. If we put the
> 3c905 in slot one all by itself (using the BIOS console), the system
> would behave erratically, actually mistaking the number and speed of
> processors during boot and crashing under heavy network loads if and
> when it booted.
> b) If slot one had video or was empty, the system would work fine for
> all other vertical configurations. That is, video in 1, net in 6, video
> in 2, net in 3 or vice versa, video in 5, NIC in 2, etc. I don't know
> that we tested every combination but we didn't find another that failed
> in all our tests. Slot 1 alone seems to be the ringer.
> It is not a 64 vs 32 bit slot question or a power question per se, as
> far as we can tell. Slots 1-4 are all apparently identical 32 bit, five
> volt slots, slots 5+ are 32 bit five volt slots, and both the 3c905 and
> ATI are slotted for 3.3/32 bit slots with the extra notch near the
> back. There is no reason that we can see for the 3c905 to work in slot
> 2, 3, 4, 5, 6, 7 but not in slot 1.
> This is further verified by the fact that we had a 2466 to play with as
> well, which has two 64/66 3.3 volt slots, and the cards worked perfectly
> in them in any order.
> c) Our real torment comes from the riser. Most riser cards are
> designed so they HAVE to plug into slot 1 so that their physical
> framework can hold the cards sideways in the remaining room over the PCI
> bus. Plugged into slot 2, there isn't generally room to fit a full
> height card (or the support frame) into the remaining space to the side.
> With the riser in slot 1, no combination of cards in the riser that
> included the NIC would work, and even the video alone in the slot that
> should have been a "straight through" connection appeared to have
> problems, although a system without a NIC is useless to us so the issue
> is moot. Again, the most common symptom was that the system wouldn't
> even get the CPU info correct at the bios level before any boot is even
> initiated, and if the boot/install succeeded at all the system was
> highly unstable under any kind of load.
> The problem persisted, identically, when we put the 64 bit riser (which
> we were really counting on to fix things) into slot 1 and plugged the
> NIC and video into it, in either order. We had hoped that the problem
> was just the 32 bit riser not correctly connecting lines needed for the
> power/clock to automatically set to the needs of the card and that the
> 64 bit card would "fix" this. As noted above, the problem is all slot
> 1, though, in any card orientation even without the riser at all.
> HOWEVER, being clever little beasties, we put the dismounted (32 bit)
> riser in slot 2 with the extra cabled keys in slots 3 and 4, added the
> dismounted PCI cards to any slots we felt like and voila! The system,
> she work perfectly. Right number of CPUs, flawless boot/install, still
> running under heavy load for ten hours or so now.
> Since the 3c905 is a highly reliable NIC (and the ATI rage is ditto a
> reliable video card and for that matter we also saw the problem earlier
> with other NICs, e.g. tulips) that work perfectly in many, many
> systems, one has to be at least tempted to conclude that this is a
> reproducible BUG in the 2460 Tiger motherboard, either in the BIOS or
> (worse) in the physical wiring of slot 1. We are reporting it to Tyan as
> such to see if they are aware of it (couldn't find it on their website
> if they are) and if they know of any fix. In the meantime, we are
> testing a workaround consisting of a riser with a flexible ribbon
> connecting the primary slot, so that it can be installed offset from
> where it is plugged into the PCI bus. We hypothesize that if we mount
> this riser in the framework (so it sits physically above slot 1 and can
> take full height cards) but plug it into slots 2-4, it will work fine
> and the systems will stabilize.
> Of course the RIGHT solution would be to keep our perfectly good cards
> and risers and get Tyan to replace the 2460's (if there isn't a bios
> upgrade that fixes the ones we have). Given the frustration and
> downtime and lost productivity we have suffered, giving us 2466
> replacements seems reasonable to me:-).
> Anyway, this explains to at least some extent why such a wide range of
> experiences has been reported for these motherboards on the list.
> People who rackmounted them probably had problems, although I'm willing
> to believe that there are riser cards out there or particular card
> combinations that would "fix" the problem, possibly without the owner
> ever knowing it existed. People who tower mounted them probably did not
> have problems, especially if they used an AGP video card or put their
> video and NIC into the regular 32 bit slots (or in any event
> "accidentally" avoided putting something into slot 1 that wouldn't work
> there). The discussion above may help anybody out there who is still
> having problems -- rearrange your cards as described above and all
> SHOULD be well and/or replace your riser and/or get Tyan to make it right.
> BTW, so far the 2466 runs fine, as noted by many listvolken.
> Robert G. Brown http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
Ken Chase, math at velocet.ca * Velocet Communications Inc. * Toronto, CANADA