[Beowulf] comparing MPI HPC interconnects: manageability?
joachim at ccrl-nece.de
Fri Feb 20 05:10:38 EST 2004
> AFAIK all interconnects would allow the swap of a NIC without bringing down
> the whole network - but in all cases any parallel job running on that node
> would need to be aborted, since in general high-speed interconnect PCI cards
> are not hot-swappable - that node would need to be power-cycled.
AFAIK, this is the same for SCI, but I would need to check to be sure.
In any case, the application using the adapter to be swapped would have to be
restarted, as its resources are gone. Avoiding this would be very hard,
if at all possible.
> As for the cables and switches, I can't speak for other vendors - but for
> example a line card in a Quadrics Switch can be hot-swapped even while
> there are running MPI jobs that are sending data through that line card at
> the time - the jobs simply pause until the cables are reconnected. I would
> expect that other interconnects are the same in this respect?
SCI typically uses no external switches. Concerning the exchange of
adapters or cables, there are two strategies: either the applications have to
wait until transfers succeed again, or the driver recognizes the
problem and changes the routing. Of course, these can be combined into a
two-phase strategy; I guess this is how Scali does it.
Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
Beowulf mailing list, Beowulf at beowulf.org