Usage Examples
The following are simple examples of taskmaster/task usage. Don't worry about the fact that parallel speedup seems modest. Next month we'll explore parallel speedup with taskmaster and show how even this very simple example, with the crudest possible task communication mechanism, can still yield excellent parallel scaling. Or you can play with arguments on your own for "homework" in the meantime and see if you can discover this yourself!
rgb@lilith|T:114>./taskmaster hostfile 1 10 1
Spawning host threads
Host lucifer thread running.
rand[0] = 0.840188
rand[1] = 0.394383
rand[2] = 0.783099
rand[3] = 0.798440
rand[4] = 0.911647
rand[5] = 0.197551
rand[6] = 0.335223
rand[7] = 0.768230
rand[8] = 0.277775
rand[9] = 0.553970
Results:
nhosts nrands delay time
1      10     1     13
Note that one host takes thirteen seconds to do ten seconds worth of work! Not too good!
rgb@lilith|T:121>./taskmaster hostfile 5 10 1
Spawning host threads
Host lucifer thread running.
Host caine thread running.
Host uriel thread running.
Host abel thread running.
Host archangel thread running.
rand[0] = 0.840188
rand[1] = 0.394383
rand[2] = 0.700976
rand[3] = 0.809676
rand[4] = 0.561380
rand[5] = 0.224983
rand[6] = 0.916458
rand[7] = 0.133982
rand[8] = 0.274746
rand[9] = 0.046468
Results:
nhosts nrands delay time
5      10     1     5
Better! Five hosts now take 5 seconds to do 10 seconds worth of work. However, that is a lot of computers for only a factor of two speedup! One more try:
rgb@lilith|T:122>./taskmaster hostfile 5 10 10
Spawning host threads
Host lucifer thread running.
Host caine thread running.
Host uriel thread running.
Host abel thread running.
Host archangel thread running.
rand[0] = 0.840188
rand[1] = 0.394383
rand[2] = 0.700976
rand[3] = 0.809676
rand[4] = 0.561380
rand[5] = 0.224983
rand[6] = 0.916458
rand[7] = 0.133982
rand[8] = 0.274746
rand[9] = 0.046468
Results:
nhosts nrands delay time
5      10     10    30
Much better. Now five hosts take 30 seconds to do 100 seconds worth of work. This might turn out to be worthwhile after all!
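If you'd like to check the arithmetic behind these three runs, here is a minimal sketch in Python. It is not part of taskmaster or task; the function names speedup and efficiency are made up for illustration. It assumes, as the text above does, that the "work" in a run is nrands times delay seconds, and it measures speedup against that figure rather than against the measured single-host time.

```python
# Each tuple is one "Results:" line from the runs above:
# (nhosts, nrands, delay, time), with delay and time in seconds.
runs = [
    (1, 10, 1, 13),
    (5, 10, 1, 5),
    (5, 10, 10, 30),
]


def speedup(nrands, delay, elapsed):
    """Seconds of work (assumed to be nrands * delay) per wall-clock second."""
    return nrands * delay / elapsed


def efficiency(nhosts, nrands, delay, elapsed):
    """Speedup divided by host count; 1.0 would be perfect scaling."""
    return speedup(nrands, delay, elapsed) / nhosts


for nhosts, nrands, delay, elapsed in runs:
    s = speedup(nrands, delay, elapsed)
    e = efficiency(nhosts, nrands, delay, elapsed)
    print(f"{nhosts} host(s), delay {delay}: speedup {s:.2f}, efficiency {e:.2f}")
```

By this measure the single-host run comes out at a "speedup" of about 0.77 (slower than just doing the work), the second run at 2.0, and the third at about 3.3. Dividing by the number of hosts shows how far each run is from the ideal of 1.0 per host.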
Conclusion
Building a cluster (or discovering that your existing LAN is a cluster) is apparently pretty easy, really. Using a fairly simple "master" script and a "worker" application we can clearly do work in parallel and can already see a significant speedup (a factor of three using five hosts) which should be quite reproducible on just about any LAN. Between now and next month, you can play with taskmaster and task and see if you can discover settings that yield really good parallel speedup (where running on 5 hosts completes in close to 1/5 the time of one host).
In future columns we'll explore themes like Amdahl's Law and parallel speedup, parallel libraries, "the standard Linux cluster design", and more, using this basic cluster (and taskmaster/task) as a starting point. We'll also learn better ways of installing and managing a cluster, how (and when) to go beyond the simple NOW-style cluster and into a GRID or a Beowulf, and how to compare shelved towers, rackmounts, and different processor types and networks.
Our goal will be to achieve a sufficient level of experience that you are ready to be "handed off" to my brother columnists, whose dedicated columns will take you from being a neophyte to a perfect master in the specific areas of clustering that benefit you most.
Hope to see you there.
(source code is on the next page)