Scyld 27Z-8 Gig Net - HELP!
calvert at scyld.com
Thu Sep 26 11:35:50 EDT 2002
I know you said you modified all of the files, but just to review: under
27Z-8, you need to modify the file /etc/beowulf/config.boot to add the
device and vendor information for the newer e1000 card. So you'll need
to add the following line:
pci 0x8086 0x100E e1000
In addition, make sure you have a 'bootmodule' entry for "e1000" near
the beginning of the file. Next, rebuild your node boot floppy and
beoboot images, then try rebooting.
If you've already done all of that (which it sounds like you have), then
attached are some directions for building an e1000 driver under Scyld.
Hopefully, this solves your problem.
Stanley, Matthew D. wrote:
>I have several clusters running the public release of 27Z-8. Up until now they have been exclusively via-rhine and 3c59x based 100 Mbit clusters. We wanted to upgrade to gigabit Ethernet and decided to upgrade our 4-machine cluster using D-Link DGE-500T cards (ns820/ns83820 based). I compiled the latest netdrivers.tgz file, and the ns820 driver appeared to work fine as a link to the outside world, but it did not function on the beoboot floppy even though I compiled for that kernel and even did a full kernel set rebuild (rpm -bb) including the new netdrivers.tgz file. Right after it found the card, found the master server and assigned the IP address, it would just sit at the line where it requests /var/beowulf/boot.img.
>OK, so I gave up on the D-Link cards and purchased 4 Intel PRO/1000MT cards, the new version that requires the new release of the drivers since its PCI ID is 8086:100E and not 8086:1000. I again compiled the drivers and tested the card on the Internet side with zero problems. I then created my boot images and tried to boot. It gets a little farther than the D-Link: it actually starts to boot the net boot image, then locks up and never completes.
>Am I missing something here? I've modified all of the files, it finds the cards, and it even works for days on the Internet if I switch my card to eth0 instead of eth1. It appears to be a driver issue, yet I have similar problems with two completely different sets of cards. I have even tried using a 100 Mbit hub instead of a gigabit switch, with identical results. I can also just take out the cards and put in 3c59x cards, and the problem is fixed!
>We use our clusters for NAMD only. Is there a way to just install full versions of Scyld and then execute bpslave? If so, what modifications need to be made to node_up and the other scripts to make that work? I realize this means more administration, but at this point I have spent weeks trying to make this work, whereas I can install and update 4 machines in a matter of a couple of hours.
>Are there settings in beoboot that change the way it gets the information from the master node, maybe making it more reliable, like broadcast/multicast, etc.?
>Any help would be appreciated,
>Structural Biology Core
>University of Missouri - Columbia