[Beowulf] ...Re: Benchmark results
josip at lanl.gov
Thu Jan 8 13:13:54 EST 2004
Jim Lux wrote:
> From: "Josip Loncaric" <josip at lanl.gov>
>>NTP may need to be streamlined and tuned for clusters...
> Longer integration times in the smoothing algorithm?
Better yet, extended Kalman filtering (given my control theory background).
BTW, there was a highly relevant paper 9 years ago:
"On Efficiently Implementing Global Time for Performance Evaluation on
Multiprocessor Systems," Eric Maillet and Cecile Tron, Journal of
Parallel and Distributed Computing, No. 28, pp. 84-93 (1995).
Their experimental analysis of the relationship between the system clock
and the reference clock over a 50-minute period showed a good fit to a
linear model, within about 50-100 microseconds, with slow 16-minute
periodicity apparently related to thermal swings in the computer room.
These measurements were done on their T-Node "Meganode" system (128
T800 transputers), over the Transputer network. The hardware is of
pre-1995 vintage, probably with a clock frequency of about 3.8 MHz.
Current networks are much faster, although clock drift behavior may
still be similar.
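
For illustration, a linear clock model of the kind described above could be fitted roughly as follows. This is only a sketch under my own assumptions, not the procedure from the paper: the function names are made up, timestamps are assumed to be paired (reference, local) samples in seconds, and the fit is ordinary least squares.

```python
def fit_linear_clock(ref_times, local_times):
    """Least-squares fit of local_time ~= offset + rate * ref_time.

    Hypothetical helper: returns (offset, rate), where rate - 1 is the
    fractional drift of the local clock relative to the reference.
    """
    n = len(ref_times)
    if n != len(local_times) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_ref = sum(ref_times) / n
    mean_loc = sum(local_times) / n
    sxx = sum((r - mean_ref) ** 2 for r in ref_times)
    if sxx == 0.0:
        raise ValueError("reference times must not all be equal")
    sxy = sum((r - mean_ref) * (l - mean_loc)
              for r, l in zip(ref_times, local_times))
    rate = sxy / sxx
    offset = mean_loc - rate * mean_ref
    return offset, rate


def local_to_global(local_time, offset, rate):
    """Invert the fitted model to map a local timestamp to reference time."""
    return (local_time - offset) / rate


def max_residual(ref_times, local_times, offset, rate):
    """Largest absolute deviation of the samples from the fitted line."""
    return max(abs(l - (offset + rate * r))
               for r, l in zip(ref_times, local_times))
```

The residual is the quantity to watch: if it stays within the 50-100 microsecond range over tens of minutes, the linear model is doing its job, and any slow periodic pattern in it would hint at thermal effects.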
Presumably, clock drift coefficients could be correlated to motherboard
temperature measurements. Over time, one should be able to construct a
linearized model of clock dynamics, linking the local clock drift,
motherboard temperature, and possibly even power supply voltages. This
should overcome the main sources of timing uncertainty.
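
A rough sketch of such a linearized model: regress the measured drift (in ppm) on deviations of motherboard temperature and supply voltage from nominal values. Everything here is an assumption for illustration (the function names, the choice of regressors, the units); the fit uses plain normal equations in pure Python.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_drift_model(temps, volts, drifts_ppm, t_nominal, v_nominal):
    """Fit drift_ppm ~= c0 + c1*(T - t_nominal) + c2*(V - v_nominal).

    Hypothetical model: returns [c0, c1, c2] from the normal equations
    (A^T A) c = A^T d.
    """
    rows = [[1.0, t - t_nominal, v - v_nominal]
            for t, v in zip(temps, volts)]
    k = 3
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    atd = [sum(r[i] * d for r, d in zip(rows, drifts_ppm))
           for i in range(k)]
    return solve_linear(ata, atd)


def predict_drift(coeffs, temp, volt, t_nominal, v_nominal):
    """Predicted drift in ppm for the given temperature and voltage."""
    c0, c1, c2 = coeffs
    return c0 + c1 * (temp - t_nominal) + c2 * (volt - v_nominal)
```

Subtracting the nominal values keeps the regressors small and the normal equations reasonably well conditioned; with real sensor data one would also want to check that temperature and voltage are not nearly collinear.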
P.S. Internally, clusters often have a single broadcast domain. Even
if the head node broadcast an augmented NTP packet every second, this
would still not impact the network much -- but may generate enough
timing data so that each node's filter would have at least some good
measurements to work with...
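
To make the per-node filter idea concrete, here is a minimal sketch of a plain linear Kalman filter (not the extended variant) tracking clock offset and drift from one offset measurement per broadcast. The class name, the noise variances, and the assumption that each broadcast yields a direct offset measurement in seconds are all mine, chosen only for illustration.

```python
class ClockOffsetFilter:
    """Two-state Kalman filter: state = [offset (s), drift (s/s)].

    Hypothetical sketch. Model: offset grows by drift * dt between
    measurements; each measurement observes the offset plus noise.
    """

    def __init__(self, meas_var=1e-8, q_offset=1e-14, q_drift=1e-18):
        self.offset = 0.0
        self.drift = 0.0
        # Initial covariance: offset essentially unknown, drift ~1000 ppm.
        self.p = [[1.0, 0.0], [0.0, 1e-6]]
        self.meas_var = meas_var    # measurement noise variance (assumed)
        self.q_offset = q_offset    # process noise on offset (assumed)
        self.q_drift = q_drift      # process noise on drift (assumed)

    def predict(self, dt):
        """Propagate state and covariance forward by dt seconds."""
        self.offset += self.drift * dt
        (p00, p01), (p10, p11) = self.p
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.p = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q_offset,
             p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q_drift],
        ]

    def update(self, measured_offset):
        """Fold in one offset measurement (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.meas_var           # innovation variance
        k0, k1 = p00 / s, p10 / s         # Kalman gain
        innovation = measured_offset - self.offset
        self.offset += k0 * innovation
        self.drift += k1 * innovation
        # P <- (I - K H) P
        self.p = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
```

Each node would call predict(1.0) followed by update(...) once per broadcast; a measurement that looks bad (for example, a very large innovation) could simply be skipped, leaving the filter to coast on its drift estimate until the next good one.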