[Beowulf] choosing a high-speed interconnect
landman at scalableinformatics.com
Tue Oct 12 17:23:53 EDT 2004
Chris Sideroff wrote:
>On Tue, 2004-10-12 at 16:48, Joe Landman wrote:
>>First questions first:
>> Why do you think you need a faster network, and what aspect of fast do
>>you think you need? Low latency? High bandwidth?
> To tell you the truth I can't answer that with more than, "I have a
>gut feeling". I am in the process of profiling the performance of our
>current cluster with our programs. Any suggestions ???
Yes: measure the performance as a function of the number of CPUs, and then
try the same runs on another, similar cluster with the faster interconnect.
Do this for "real" runs. Contact me offline if you would like to discuss.
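
As a rough illustration of "performance as a function of number of CPUs",
here is a minimal Python sketch that turns wall-clock timings into speedup
and parallel efficiency. The function name and the timings are made up for
illustration; substitute the numbers from your own runs.

```python
def scaling_table(timings):
    """Return (cpus, seconds, speedup, efficiency) rows.

    timings: dict mapping CPU count -> wall-clock seconds for the SAME job.
    Speedup and efficiency are relative to the smallest CPU count measured.
    """
    base_cpus = min(timings)
    base_time = timings[base_cpus]
    rows = []
    for cpus in sorted(timings):
        speedup = base_time / timings[cpus]
        # Efficiency of 1.0 means perfect scaling from the baseline run.
        efficiency = speedup * base_cpus / cpus
        rows.append((cpus, timings[cpus], speedup, efficiency))
    return rows


# Hypothetical timings (seconds), NOT measurements from any real cluster.
example_timings = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 175.0}

for cpus, secs, speedup, eff in scaling_table(example_timings):
    print(f"{cpus:4d} CPUs  {secs:8.1f} s  speedup {speedup:5.2f}  efficiency {eff:4.2f}")
```

If efficiency falls off quickly as CPUs are added on the current network
but holds up better on the faster one, that is evidence (not proof) that
the interconnect is the limit.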
>> What codes are you running? Across how many CPUS? Have you done a
>>performance analysis on your system to observe "slow" runs in progress,
>>and are you convinced that the network is the issue?
> We run exclusively computational fluid dynamics on it. One program is
>Fluent, the other is an in-house turbo-machinery code. My experiences so
>far have led me to believe Fluent is much more sensitive to the
>network's performance than the in-house program. Thus my inquiry into a
>higher performance network.
I haven't run Fluent in the last few months, but it is a latency-sensitive
code. It would be worth exploring your model's performance on a faster
(i.e. lower latency) net.
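
To see why latency can matter more than bandwidth for a code that sends
many small messages, here is a sketch using the common simple model
time = latency + size / bandwidth. The model is an assumption, and the
interconnect numbers below are placeholders rather than measurements of
any particular product.

```python
def message_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message under the simple linear model."""
    return latency_s + nbytes / bandwidth_bytes_per_s


def latency_fraction(nbytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the message time that is pure latency."""
    return latency_s / message_time(nbytes, latency_s, bandwidth_bytes_per_s)


# Hypothetical interconnects: (latency in seconds, bandwidth in bytes/second).
networks = {
    "slow-net": (50e-6, 100e6),
    "fast-net": (5e-6, 250e6),
}

for size in (1_000, 1_000_000):
    for name, (lat, bw) in networks.items():
        t = message_time(size, lat, bw)
        frac = latency_fraction(size, lat, bw)
        print(f"{name:9s} {size:9d} bytes  {t * 1e6:10.1f} us  latency share {frac:4.2f}")
```

In this model small messages are dominated by the latency term, so a
lower-latency net helps them far more than extra bandwidth would.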
>>We have done lots of tuning bits for customers where the issues wound up
>>being something other than what they had thought. It is worth at least
>>looking into for your code/problems, and identifying the bottleneck (if
>>you haven't already done so).
> Do you have more information on this 'tuning for customers'? I am
>interested in your results. Again any suggestions on how to go about
>this are welcomed.
Get atop (http://freshmeat.net/projects/atop/); it is your friend.
Profile your code with the profiling tools. If you see lots of time spent
in "do_writ" and similar, as well as high IO percentages in run times
from sar, atop, and other tools, you might want to look at IO tuning.
The important aspect of this is to gather real data about where your
program spends its time. That is invaluable in deciding how to speed it up.
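
If you want a quick number without installing anything, here is a minimal
Python sketch that estimates the IO-wait percentage from two snapshots of
/proc/stat on Linux. It is not how sar or atop work internally, the
function names are my own, and it assumes a kernel that reports the
iowait field (the fifth value on the aggregate "cpu" line).

```python
import time


def parse_cpu_line(stat_text):
    """Return the integer counters from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    if len(before) < 5 or len(after) < 5:
        raise ValueError("this kernel does not report an iowait field")
    deltas = [a - b for a, b in zip(after, before)]
    # Use user..steal only (at most 8 fields); guest time is already
    # included in user time on kernels that report it.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval_s=5.0, path="/proc/stat"):
    """Take two live snapshots interval_s apart (Linux only)."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return iowait_percent(before, after)
```

Call sample_iowait() on a compute node while a "slow" run is in progress.
A consistently high value points toward IO rather than the network.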
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 612 4615