AMD Opteron memory bandwidth (was Re: CPUs for a Beowulf)
Keith D. Underwood
kdunder at sandia.gov
Wed Sep 10 10:18:14 EDT 2003
> That's an excellent design decision.
> Putting a check word on each packet means
> - the physical encoding layer needs to know about packetization
> - a packet must be held until the check passes
> - the tiny packets grow
> - to do anything with the per-packet info, packet copies must be kept
> These all add complexity and latency to the highest speed path.
> By putting the check on fixed block boundaries you can still detect and
> fail an unreliable link.
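To make the trade-off in the quoted text concrete, here is a minimal sketch of the two schemes. It is an illustration only, not the HT wire format: the use of CRC32 from zlib, the 8-byte packets, and the 64-byte window (BLOCK) are all assumptions chosen for the example. It shows that a check over fixed windows detects a flipped bit but only narrows it down to a window, while a per-packet check word names the packet at the cost of 4 extra bytes on every packet.

```python
import zlib

def flip_bit(data, bit_index):
    """Return a copy of data with one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def block_crcs(stream, block_size):
    """CRC32 over fixed-size windows of the raw stream, ignoring packet boundaries."""
    return [zlib.crc32(stream[i:i + block_size])
            for i in range(0, len(stream), block_size)]

def frame_per_packet(packets):
    """Append a 4-byte CRC32 check word to every packet."""
    return [p + zlib.crc32(p).to_bytes(4, "big") for p in packets]

def bad_packets(framed):
    """Indices of framed packets whose check word does not match the payload."""
    return [i for i, f in enumerate(framed)
            if zlib.crc32(f[:-4]) != int.from_bytes(f[-4:], "big")]

# Hypothetical traffic: sixteen 8-byte packets, checked in 64-byte windows.
packets = [bytes([i]) * 8 for i in range(16)]
BLOCK = 64  # assumed window size, not taken from any spec

# Scheme 1: check on fixed block boundaries.
stream = b"".join(packets)
sent = block_crcs(stream, BLOCK)
received = block_crcs(flip_bit(stream, 8 * 20 + 3), BLOCK)  # hit byte 20 (packet 2)
bad_blocks = [i for i, (a, b) in enumerate(zip(sent, received)) if a != b]

# Scheme 2: check word on each packet.
framed = frame_per_packet(packets)
framed[2] = flip_bit(framed[2], 3)  # same kind of error, inside packet 2
bad_pkts = bad_packets(framed)

print("windows with a mismatch:", bad_blocks)  # one window, spanning 8 packets
print("packets with a mismatch:", bad_pkts)    # the single damaged packet
```

With the windowed check, the receiver knows the link misbehaved somewhere in a window covering eight packets, which is enough to fail the link but not to say which packet to distrust.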
All very true when you have 1, 2, 4, or even 8 HT links that could cause a
system to crash. And I'm not suggesting that ECC would be better (that
was Greg's statement), but if you had 10,000 HT links running at their
maximum distance (say, if you used HT links to build a mesh at Red Storm
scale), and any bit error on any of them caused an app to fail because
you don't know which packet had the error, that would be bad.
Supposedly, this is going to be fixed in HT 2.0 (I wouldn't know since
the spec isn't freely available).
Beowulf mailing list, Beowulf at beowulf.org