[Beowulf] evaluating FLOPS capacity of our cluster

Rahul Nabar rpnabar at gmail.com
Mon May 11 20:34:20 EDT 2009

On Mon, May 11, 2009 at 6:22 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
> Oops, I misunderstood what you said.
> I see now. You are bonding channels on your nodes' dual GigE
> ports to double your bandwidth, particularly for MPI, right?

Yes. Each node has dual gigabit Ethernet cards.

> I am curious about your results with channel bonding.
> OpenMPI claims to work across two or more networks without the need
> for channel bonding.
> What MPI do you use?

We use OpenMPI. I've never really found a good way to measure the
performance gain. I've tested raw bandwidth and it definitely
improves, as do file transfer times, but I haven't tried any
computation-relevant benchmarks. The gain also seems very much a
function of which channel bonding mode is used. The relevant modes
seem to be able to split traffic across the two links only when a
node is talking to two different hosts at the same time, so strict
peer-to-peer communication is not helped at all. At least that is as
far as I understood those intricacies.
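
For what it's worth, here is my understanding of why a single pair
never goes faster. The Linux bonding driver's "layer2" transmit hash
picks the outgoing slave link from the endpoints' MAC addresses, so
every frame between the same two hosts lands on the same physical
link. A minimal sketch of that policy (assuming the documented
formula, source MAC XOR destination MAC modulo the slave count; the
MAC addresses below are made up):

#include <stdio.h>

/* Sketch of the bonding driver's "layer2" transmit hash policy.
 * Assumption: slave = (src_mac[5] ^ dst_mac[5]) % n_slaves, as in the
 * kernel bonding documentation (newer kernels also XOR in the
 * Ethernet packet type field). */
static int layer2_hash(const unsigned char src_mac[6],
                       const unsigned char dst_mac[6],
                       int n_slaves)
{
    return (src_mac[5] ^ dst_mac[5]) % n_slaves;
}

int main(void)
{
    /* Hypothetical MAC addresses for three nodes. */
    unsigned char a[6] = {0x00, 0x1e, 0x4f, 0x00, 0x00, 0x01};
    unsigned char b[6] = {0x00, 0x1e, 0x4f, 0x00, 0x00, 0x02};
    unsigned char c[6] = {0x00, 0x1e, 0x4f, 0x00, 0x00, 0x03};

    /* Every A->B frame hashes to the same slave, so one flow can
     * never exceed a single GigE link... */
    printf("A->B uses slave %d\n", layer2_hash(a, b, 2));
    /* ...while a concurrent A->C flow can land on the other slave,
     * which is why two simultaneous peers are needed to see the
     * doubled throughput. */
    printf("A->C uses slave %d\n", layer2_hash(a, c, 2));
    return 0;
}

That would be consistent with bonding helping aggregate throughput
but not a single MPI pair.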

Whether or not to set up LAG groups on the switch was another
factor. A lot of that configuration seemed to be manufacturer
specific.

> In any case, the single 24-48 port GigE switches (if of good brand)
> should have a single flat latency time between any pair of ports, right?

I think bonding improves bandwidth but does not touch latency. My
brand is Dell PowerConnect; not sure if that fits "good brand".
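
If I wanted to check exactly that, a simple MPI ping-pong between one
rank on each of two nodes should show it: small messages give the
latency, large messages give the point-to-point bandwidth, and with
bonding only the latter could change. A rough sketch (plain MPI
point-to-point calls, nothing specific to our cluster):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Ping-pong between ranks 0 and 1: small messages measure latency,
 * large messages measure point-to-point bandwidth.  With channel
 * bonding the latency should be unchanged, and the large-message
 * bandwidth only improves if the mode can split a single flow. */
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "run with at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    const int reps = 100;
    for (int bytes = 1; bytes <= (1 << 22); bytes *= 4) {
        char *buf = malloc(bytes);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double dt = MPI_Wtime() - t0;
        if (rank == 0) {
            double one_way = dt / (2.0 * reps);   /* seconds per hop */
            double mbps = bytes / one_way / 1e6;  /* MB/s */
            printf("%8d bytes: %10.1f us one-way, %8.1f MB/s\n",
                   bytes, one_way * 1e6, mbps);
        }
        free(buf);
    }
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with something like "mpirun -np 2
-host node1,node2 ./pingpong" (node1/node2 being placeholder host
names), the small-message lines show the latency and the largest
messages approach the usable bandwidth.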


