Performance Variations using MPI/Myrinet
lindahl at conservativecomputer.com
Fri Apr 27 05:41:59 EDT 2001
> I have some ideas, but nothing I would bet on. Mainly cache thrashing: the
> memory copy operation is improved with SSE by using the prefetching
> support, and this prefetch bypasses the L2 cache. Without SSE, the L2 cache
> is happily flushed as a processor is doing a copy. As the FFT code
> includes a copy step, who knows... :-)
> Greg: are your numbers for FT on Alpha or x86?
x86, a dual PIII. I wasn't using "enterprise edition" but this was
long enough ago that it's hard to believe that whatever kernel I used
had any SSE acceleration. I think it was vanilla RH 6.2.
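
For readers who have not seen the kind of copy the quoted text describes, here is a rough sketch in C of a copy loop that uses SSE prefetch hints and non-temporal stores so the copied data is less likely to evict the working set from cache. This is only an illustration under stated assumptions, not the kernel's actual copy routine: the name copy_nontemporal, the 256-byte prefetch distance, and the requirement of 16-byte aligned, non-overlapping buffers are all made up for the example, and it assumes an x86 compiler with SSE enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE: _mm_prefetch, _mm_load_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical helper (not from the original post): copy n bytes with
 * non-temporal stores. Assumes dst and src are 16-byte aligned and do
 * not overlap. */
static void copy_nontemporal(void *dst, const void *src, size_t n)
{
    assert(((uintptr_t)dst % 16) == 0 && ((uintptr_t)src % 16) == 0);

    float *d = (float *)dst;
    const float *s = (const float *)src;
    size_t blocks = n / 16;            /* number of full 16-byte chunks */

    for (size_t i = 0; i < blocks; i++) {
        /* Hint that data 256 bytes ahead will be read once (assumed
         * distance); only issued while the address is still in range. */
        if (i + 16 < blocks)
            _mm_prefetch((const char *)(s + 4 * (i + 16)), _MM_HINT_NTA);

        __m128 v = _mm_load_ps(s + 4 * i);   /* aligned 16-byte load */
        _mm_stream_ps(d + 4 * i, v);         /* non-temporal store */
    }

    _mm_sfence();  /* make the streaming stores visible before returning */

    /* Copy any tail shorter than 16 bytes the ordinary way. */
    memcpy((char *)dst + blocks * 16, (const char *)src + blocks * 16, n % 16);
}
```

Calling copy_nontemporal(dst, src, n) on two aligned buffers should give the same result as memcpy. Whether it actually helps something like the FT copy step depends on the CPU and on the buffer sizes involved, so treat it as something to measure rather than a fix.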