ashley at quadrics.com
Tue Apr 12 09:52:49 EDT 2005
On Tue, 2005-04-12 at 02:29 -0700, Rita Zrour wrote:
> Hello I have a question,
> when i do many MPI_Alltoall in my program always the
> first MPI_Alltoall take too much time to be done.
> I don't know where the first communication is always
> expensive. Is that a problem of memory???????
Many MPI implementations allocate resources such as comms buffers and
descriptors lazily, so it's not unusual for the first iteration of a
loop to have to allocate these on the fly; later iterations simply
re-use cached descriptors/handles as needed. This isn't unique to MPI
but happens nearly everywhere in the software world. Perhaps alltoall
exposes it more as it has more simultaneous pending send/recvs than
Plus, of course, I assume you are actually initialising your data before
you send it. Far too many people write "benchmarks" that just send
uninitialised mmap()ed memory and end up measuring page-fault
performance rather than network bandwidth.
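
To make the page-fault point concrete, here is a minimal sketch in plain
C (no MPI involved) that maps fresh anonymous memory and times two
consecutive passes over it. The names now_sec() and time_first_touch()
are made up for this illustration, and it assumes a POSIX system with
mmap() and clock_gettime().

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* Wall-clock seconds from a monotonic clock (hypothetical helper). */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Map `bytes` of untouched anonymous memory, write to all of it twice and
 * report how long each pass took. Returns 0 on success, -1 on failure. */
static int time_first_touch(size_t bytes, double *first, double *second)
{
    assert(bytes > 0 && first != NULL && second != NULL);

    unsigned char *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    double t0 = now_sec();
    memset(buf, 1, bytes);       /* first touch: pages get faulted in here */
    double t1 = now_sec();
    memset(buf, 2, bytes);       /* same pages again, already resident */
    double t2 = now_sec();

    /* Read the data back so both passes are observably performed. */
    int ok = (buf[0] == 2 && buf[bytes - 1] == 2);

    *first = t1 - t0;
    *second = t2 - t1;
    munmap(buf, bytes);
    return ok ? 0 : -1;
}
```

Calling time_first_touch(64 << 20, &a, &b) and printing a and b would
typically show the first pass taking noticeably longer than the second,
though the exact ratio depends on the kernel and page size. A benchmark
that sends such a buffer without touching it first folds that cost into
its "network" number.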
Proper benchmarks (for the most part) zero all data before they send it
and do a handful of warmup laps before doing any measurements. Even
without extra allocation/faulting, simply having the data cache-hot can
make a difference to measured performance.