[Beowulf] Newbie Question: Batching Versus Custom RPC

Ryan Adams radams at csail.mit.edu
Thu Feb 19 14:03:38 EST 2004

Please forgive the length of this email.

I have a problem that divides nicely (embarrassingly?) into
parallelizable chunks.  Each chunk takes approximately 2 to 5 seconds to
complete and requires no communication during that time.  Essentially
there is a piece of data, around 500KB, that must be processed and a
result returned.  I'd like to process as many of these pieces of data as
possible.  I am considering building a small heterogeneous cluster to do
this (at home, basically), and am trying to decide exactly how to
architect the task distribution.  

The network will probably be Fast Ethernet.  Initially there will be
four machines processing the data, but I could imagine as many as ten in
the near term.  My back-of-the-envelope math (assuming 2.0s per job,
500KB transferred per job, and ten nodes) puts the aggregate load on the
network at 2.5MB/s, about a fifth of Fast Ethernet's 12.5MB/s theoretical
peak, so it would seem that 100BASE-T can get the job done without adding
much delay relative to the 2.0s execution time.  Perhaps I am doing this
math wrong, but I was also thinking that, since downloading the data is
I/O-bound, it would be reasonable to do it in a separate thread from the
floating-point calculations.  That way I could work on one chunk while
the socket read for the next is blocking.
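
As a rough sketch of the two ideas in the paragraph above, the Python
below first redoes the bandwidth arithmetic and then overlaps fetching
with computing through a small bounded queue.  The names
aggregate_load_mb_per_s, run_pipelined, fetch and process are all
hypothetical, chosen for illustration; fetch(i) stands in for the socket
read and process(data) for the floating-point work.  In CPython a
blocking socket read releases the interpreter lock, so the fetch thread
can wait on the network while the main thread computes.

```python
import queue
import threading


def aggregate_load_mb_per_s(nodes, chunk_kb, seconds_per_job):
    """Steady-state network load if every node pulls one chunk per job."""
    return nodes * chunk_kb / 1000.0 / seconds_per_job


def run_pipelined(fetch, process, n_chunks, depth=2):
    """Fetch chunk i+1 in a background thread while chunk i is processed.

    fetch and process are caller-supplied callables (assumptions of this
    sketch, not part of any library).
    """
    pending = queue.Queue(maxsize=depth)  # holds at most `depth` fetched chunks
    done = object()                       # sentinel marking the end of input

    def fetcher():
        try:
            for i in range(n_chunks):
                pending.put(fetch(i))     # blocks on I/O or when the queue is full
        finally:
            pending.put(done)             # always unblock the consumer

    threading.Thread(target=fetcher, daemon=True).start()

    results = []
    while True:
        data = pending.get()
        if data is done:
            break
        results.append(process(data))
    return results
```

With the figures from this message, aggregate_load_mb_per_s(10, 500, 2.0)
gives 2.5.  Note that if fetch raises, this sketch simply stops early
with the results gathered so far; a real version would want to report
the error to the caller.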

My question is basically this: is 2-5 seconds too small a job to
justify a batching system like *PBS or Grid Engine?  The scheduling
overhead would seem insignificant for a job that runs for a few hours,
but what about one that runs for a few seconds?  Certainly, one option
would be to bundle sets of these chunks together into a larger effective
job.  Am I wasting my time thinking about this?
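
One way to put numbers on the overhead question is to treat each
submission as costing a fixed overhead and see what fraction of
wall-clock time is left for useful work.  The sketch below does that and
also shows the bundling idea.  The function names are hypothetical, and
the overhead value is something you would have to measure on your own
setup; nothing here is a measured figure for PBS or Grid Engine.

```python
def efficiency(job_seconds, overhead_seconds, jobs_per_bundle=1):
    """Fraction of time spent computing when each submitted bundle
    pays one fixed scheduling overhead."""
    useful = job_seconds * jobs_per_bundle
    return useful / (useful + overhead_seconds)


def bundles(items, size):
    """Group work items into lists of at most `size` items each."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

For example, with a placeholder overhead of 1.0s per submission, a single
2.0s chunk gives an efficiency of about 67%, while a bundle of 100 such
chunks gives about 99.5%.  The trade-off is coarser load balancing across
a heterogeneous set of machines, since a slow node holds a whole bundle.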

I've been considering rolling my own scheduling system using some kind
of RPC, but I've been around software development long enough to know
that it is better to use something off-the-shelf if at all possible.
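
For comparison, here is a minimal sketch of what the roll-your-own option
could look like, using XML-RPC from the Python standard library.  It is a
pull model: each worker asks the master for the next chunk id, computes,
and reports back, so faster machines naturally take more work.  The
Dispatcher class, serve, worker_loop and the get_work/put_result method
names are all hypothetical.  It has no retry, timeout or authentication,
and it passes only chunk ids rather than the 500KB payloads (those could
be sent with xmlrpc.client.Binary or fetched separately).

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class Dispatcher:
    """Hands out chunk ids to workers and collects their results."""

    def __init__(self, chunk_ids):
        self._todo = list(chunk_ids)
        self._results = {}
        self._lock = threading.Lock()

    def get_work(self):
        # Return the next chunk id, or -1 when nothing is left.
        with self._lock:
            return self._todo.pop(0) if self._todo else -1

    def put_result(self, chunk_id, value):
        with self._lock:
            self._results[chunk_id] = value
        return True  # XML-RPC needs a non-None return value by default

    def results(self):
        # Local-only helper; not registered as an RPC method.
        with self._lock:
            return dict(self._results)


def serve(dispatcher, host="127.0.0.1", port=0):
    """Start the master in a background thread; port 0 picks a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(dispatcher.get_work)
    server.register_function(dispatcher.put_result)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def worker_loop(url, compute):
    """Pull chunk ids from the master until it reports no more work."""
    master = xmlrpc.client.ServerProxy(url)
    while True:
        chunk_id = master.get_work()
        if chunk_id < 0:
            break
        master.put_result(chunk_id, compute(chunk_id))
```

A worker would be started on each node with something like
worker_loop("http://master-host:8000", compute), where the host name,
port and compute function are placeholders.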

Thanks in advance...

