[Beowulf] MorphMPI based on fortran itf

Robert G. Brown rgb at phy.duke.edu
Wed Oct 12 11:20:32 EDT 2005


Toon Knapen writes:

> So if it is technically feasible to have an ABI, the interface of the
> MorphMPI library can be the ABI. The MorphMPI ABI will then translate
> the calls to the real MPI in whatever (incompatible) format the MPI
> library is encoded in.

This is perfectly reasonable, although there are several ways to arrive
at the same point.  I'm also less cynical than you about the ability of
a task-force group to arrive at an ABI fairly quickly.  There are
numerous examples in linux and networking code and hardware devices and
so on where such a task force has functioned very efficiently and worked
amazingly well.  Every time you use an ethernet card you should be
thanking the IEEE; every time you use a TCP/IP application you should be
thanking the IETF.  It doesn't always work, sure, but it HAS BEEN
WORKING for MPI already, just not YET at the ABI level.  I think it is a
natural progression whose time has come, from the sound of many people
on the list who are more knowledgeable than I.

> 
> If the MorphMPI library catches on, MPI vendors *have* an interest in
> matching the ABI as specified by MorphMPI because this would mean that
> MorphMPI would have to do *no* conversion anymore at all (and can thus
> actually be skipped) and thus the translation (read: the MPI library) is
> faster. OTOH vendors that are not convinced of having an ABI can wait if
> the MorphMPI approach becomes popular before taking a decision if they
> want to align with the ABI or not.

Or you remove any possible motivation they might have for moving to the
ABI, as you've done all the work for them, so they are able to assert to
clients that they are ABI-compliant "enough".  I don't think the
existence of MorphMPI alone is enough to create even a de facto standard
-- I think that you'd need to co-develop it hand in hand with actual MPI
maintainers who are simultaneously making their products ABI-compliant
-- otherwise there are all kinds of real-world problems you'll be trying
to "hide" in a lowest-common-denominator ABI instead of fixing them by
making the ABI forward-thinking and a bit proactive wrt typing/conversion
etc.

> I'm not so much for 'mandating' a standard, de facto standards are way
> more interesting and usually end up in something superior imo.

I don't think we're disagreeing, except that perhaps I intend the word
"mandating" to mean that a number of groups announce fairly publicly
that they're going to work towards and develop an open standard that
will (one presumes) become a de facto standard, and that the "mandate"
is either come play with us and help both make the standard and make
your product compliant with it, or ultimately lose a lot of business to
the ones that do.

In any case it is ultimately market forces and self-interest that drive
the acceptance of the standard; the question is largely how one goes
about arriving AT the standard -- in a monolithic way or as part of a
cooperative consortium that lets interested parties ensure that truly
critical features (to their particular product, or device, or
application) ARE addressed by the standard.  That is, do you personally
create the ABI or do a collective group of MPI producers and device
manufacturers and MPI users create the ABI?  Some things ARE better done
by just one enlightened person or group; however many things, especially
complex things with many interested parties with sine qua non issues,
are better done in a group and better still in a group with some funding
and persistence in time.

I also think that I agree with the remarks that have already been made
in this thread that there is enough ugliness in a MorphMPI library
conceptually (really in the tools it will be fronting, but ugliness
nonetheless) that you'll a) have some difficulty selling it to end users
as it is NOT a real ABI and IS an ugly and inefficient solution to a
problem they only have because you tell them they have it; b) find that
the problem is more difficult than you think -- at the very least a
pretty major hassle both to develop and to maintain and a thankless task
at that; c) it isn't at all clear to me that a toplevel library can
accomplish all of the things one would like to accomplish with an ABI -- in
particular, providing a reliable and predictable set of hooks for
"portable" advanced network device support, managing datatype
incompatibilities, etc.  I rather think that you're looking at only
doing a fairly shallow conversion, not the bigger picture.
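
To make "shallow conversion" concrete, here is a minimal sketch of the easy part: a fixed handle type that gets translated to whatever the native library uses. Every name in it (morph_comm, vendor_a_comm, vendor_b_comm and the functions) is hypothetical and invented for illustration. The only real-world assumption is that some MPI implementations use integer handles while others use pointers to structs.

```c
/* Hypothetical sketch: none of these names come from a real MPI library
 * or from any actual MorphMPI code. */
#include <assert.h>
#include <stddef.h>

/* Stand-ins for two incompatible native handle styles. */
typedef int vendor_a_comm;                     /* handle is an integer */
struct vendor_b_comm_s { int size; };
typedef struct vendor_b_comm_s *vendor_b_comm; /* handle is a pointer  */

/* The one fixed handle type an application would be compiled against. */
typedef int morph_comm;
#define MORPH_COMM_WORLD 0
#define MORPH_MAX_COMMS  16

/* One translation table per back end, indexed by the fixed handle. */
static vendor_a_comm a_table[MORPH_MAX_COMMS];
static vendor_b_comm b_table[MORPH_MAX_COMMS];

static int morph_valid(morph_comm c)
{
    return c >= 0 && c < MORPH_MAX_COMMS;
}

/* Record which native handle a fixed handle stands for. */
void morph_register_a(morph_comm c, vendor_a_comm native)
{
    assert(morph_valid(c));
    a_table[c] = native;
}

void morph_register_b(morph_comm c, vendor_b_comm native)
{
    assert(morph_valid(c));
    b_table[c] = native;
}

/* Translate on every call: this lookup is the cost of the extra layer. */
vendor_a_comm morph_to_vendor_a(morph_comm c)
{
    assert(morph_valid(c));
    return a_table[c];
}

vendor_b_comm morph_to_vendor_b(morph_comm c)
{
    assert(morph_valid(c));
    return b_table[c];
}
```

This much is a table lookup per call and not much else. It says nothing about datatypes, device hooks, or Fortran, which is where the real work would be.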

Some of these things aren't just a matter of writing wrapper functions
or using function calls to set things that are usually set in macros --
they involve a whole middleware translation layer, and that WILL make
the code ugly, very ugly, and VERY difficult to debug.  To put it
another way, you could probably make something that kinda sorta works
for some MPIs fairly quickly and easily, but then you'd hit a wall for
the MPIs that are really different, and (from the sound of it, given
that I haven't used fortran for a decade or two thank god:-) for fortran
in general.  But this is just my opinion based on a moderate amount of
experience managing sources across incompatible libraries -- it is at
BEST a cosmic PITA; even VALIDATING the hacks -- I mean "solutions" --
to make things work, per MPI, is a cosmic PITA because there are so many
edge cases (they arise from the PERMUTATIONS of all of the ways the MPIs
differ, and it will NOT be easy to validate your library code) and
other things you didn't think about until they turn up as showstopper
bugs.  Most of which "aren't your fault" and that will have you
screaming at the MPI developers and aging prematurely as you hack away.

I can only shudder to think of trying to make this work for
eighty-zillion MPI sources that I didn't even write and don't control
and that contain who knows what pointer-driven type-sloppy evil.  Even
NON-MPI packages that I build from sources snatched from the net are
typically full of type mismatches that have to be there in plain sight
of the developers as they build with pretty much the same tools that I
do -- it's just that there are obviously a whole lot of developers out
there who don't care enough to change or cast integer to pointer types
appropriately when they "know" that the pointer is really an int anyway.
So when you get a bug report from some butt-head whose code is written
this sloppily and thus runs up against a real type-conversion problem on
the back end, is it your fault or his?

Noting of course that it doesn't matter.  It will cost you time even to
say "you butt-head, use proper typing so that MorphMPI can IDENTIFY your
packed vector of X and convert it..."  Pain, pain, pain...
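
A small sketch of the kind of type sloppiness meant here, assuming a platform where pointers are 64 bits wide and the code "knows" they fit in a 32-bit int. The function names and the sample addresses are made up for illustration, and plain 64-bit integers stand in for real pointers so the behavior is well defined everywhere.

```c
/* Illustrative only: the addresses used with these functions are
 * made-up numbers, not real pointers. */
#include <assert.h>
#include <stdint.h>

/* Squeeze a 64-bit address through a 32-bit integer, as code does when
 * it "knows" that a pointer is really an int.  Conversion to an unsigned
 * type is well defined in C: it keeps the low 32 bits. */
uint64_t through_32bit_int(uint64_t address)
{
    uint32_t squeezed = (uint32_t)address;

    /* Sanity check documenting what the conversion keeps. */
    assert(squeezed == (address & 0xFFFFFFFFu));
    return (uint64_t)squeezed;
}

/* Nonzero if the address comes back unchanged, zero if bits were lost. */
int survives_32bit_int(uint64_t address)
{
    return through_32bit_int(address) == address;
}
```

A low address survives the round trip and a high one silently does not, which is why this sort of code can run for years on one machine and fail on the next. Storing pointers in pointer types (or in intptr_t/uintptr_t from <stdint.h> when an integer really is needed) avoids the problem.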

Pain which doesn't occur with an actual MPI ABI.  The bug is clearly
"owned" by the programmer, and one product of the consortium that
develops the ABI is the simultaneous development of programming
guidelines and the proper documentation (!) required to help the
programmer do the last major port of their MPI code that they're ever
likely to have to make (and to clean up their code in ways they should
have done when writing it in the first place, if it weren't for the fact
that they were ignorant physics graduate students who took one whole
course in programming as undergrads and are learning C in real time as
they work:-).

I do think that the MorphMPI idea works a lot better (and may even be a
good idea:-) as a CO-development project.  Suppose the Open Source MPIs
sit down together with the network device people and some high end users
and MPI consumers and agree on an ABI, with some ever-healthy debate about
how to properly manage data types in a way that is portable across
architectures past and present and how to build a "universal device
interface" for MPI as a transparent transport layer (so that code
doesn't have to be literally recompiled to change to using even advanced
interfaces, at least in linux-based clusters).  Then SIMULTANEOUSLY the
OSMPI groups release ABI-compliant versions AND you make the MorphMPI
idea work for the commercial holdouts.  This kind of
co-development gives you a real world testbed in the OSMPIs, a large base
of MPI users that will almost instantly convert/recompile to ABI
compliance, and heck, you might even make some money selling your
conversion tool and support services as a pretested fait accompli to the
commercial vendors to help guide their own conversion to ABI compliance
if/when they decide to go for it.  Money is always nice...:-)
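
One way to picture the "universal device interface" idea is a table of function pointers selected at run time, so that switching transports needs no recompile. This is only a sketch under that assumption: transport_ops, select_transport, and the "tcp" and "fastnet" entries are all hypothetical, and both back ends just copy memory as a stand-in for real device code.

```c
/* Hypothetical sketch of a run-time selectable transport layer. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct transport_ops {
    const char *name;
    /* Moves len bytes from src to dst; returns the number of bytes moved. */
    size_t (*send)(void *dst, const void *src, size_t len);
} transport_ops;

/* Stand-in back ends: real ones would drive a socket or a fast NIC. */
static size_t tcp_like_send(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return len;
}

static size_t fastnet_like_send(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return len;
}

static const transport_ops transports[] = {
    { "tcp",     tcp_like_send     },
    { "fastnet", fastnet_like_send },
};

/* Look a transport up by name, e.g. from an environment variable or a
 * config file.  Returns NULL if no such transport is built in. */
const transport_ops *select_transport(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof transports / sizeof transports[0]; i++) {
        if (strcmp(transports[i].name, name) == 0)
            return &transports[i];
    }
    return NULL;
}
```

The application only ever calls through the table, so which device sits underneath becomes a run-time choice and not a compile-time one.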

  rgb