[Beowulf] SMP Nodes [still] Freezing w/ Scyld (follow-up, long)

Thu Jan 8 08:01:16 EST 2004

On Wed, 7 Jan 2004, Timothy R. Whitcomb wrote:

> Something else I have noticed is that if I run a 2-processor run on 2
> nodes, it goes fairly quickly.  However, if I switch it to 2
> processors on 1 machine, it stalls for very long periods of time.
> From the memory usage, it looks like there's quite a bit in the swap
> memory (there's 512MB/node of RAM and 1GB of swap) but there is little
> to no CPU access on either processor for very long periods of time (on
> the order of several minutes and up) whereas in the 2-node case there
> are brief pauses, but CPU usage jumps up to 100% fairly quickly.  On
> the 1-node case, one processor will go up a little but then fall back
> down quickly.
> 
> Are there any other ideas as to what would be causing these nodes to
> freeze up?  Thanks again for all the help I've received.

I'm assuming that you've already looked for memory leaks or for just
plain running out of memory.  It is difficult for me to think offhand
why a dual CPU job would leak and a single CPU job wouldn't (except that
it would leak faster and maybe crash sooner).  It is pretty easy to believe that one
job uses (at some point) 60% of available memory and the second job (at
the same point) uses 60% of available memory -- or whatever -- enough to
push it over a critical threshold.  IF your system ever really runs out
of memory it will definitely die shortly after the kernel requests a
page and can't get it.  Often die horribly.
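
To make the arithmetic concrete, here is a minimal sketch (Python, not
anything shipped with Scyld) of the "two jobs at 60% each" scenario on a
512MB node, plus a helper that reads swap usage out of
/proc/meminfo-style text.  The function names and the 60% figure are
illustrative assumptions, not measurements from the system in question.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def overcommit_kb(mem_total_kb, job_percents):
    """kB by which the jobs' combined peak demand exceeds physical RAM.

    job_percents holds each job's assumed peak footprint as a whole
    percentage of RAM.  Returns 0 if everything fits.
    """
    demand = sum(mem_total_kb * p // 100 for p in job_percents)
    return max(0, demand - mem_total_kb)


def swap_used_kb(info):
    """Swap currently in use, from a dict produced by parse_meminfo."""
    return info["SwapTotal"] - info["SwapFree"]


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        info = parse_meminfo(fh.read())
    print("swap in use (kB):", swap_used_kb(info))
    # Hypothetical: one job vs. two jobs, each peaking at 60% of RAM.
    print("1 job  overcommit (kB):", overcommit_kb(info["MemTotal"], [60]))
    print("2 jobs overcommit (kB):", overcommit_kb(info["MemTotal"], [60, 60]))
```

With 512MB (524288 kB) of RAM, one such job fits while two of them
overshoot by roughly 100MB, all of which has to live in swap.  Running
something like this periodically on the node while the job runs would
show whether swap usage climbs just before the stalls.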

If this were 1997 or 1998, I'd also suspect an SMP kernel deadlock on
some interrupt controlled resource.  In 2003 that's a bit tougher to
imagine, but it still isn't impossible to imagine either a memory
management issue with your particular kernel snapshot (the one in the
Scyld version you bought) or something LIKE a deadlock associated with
your particular hardware combination.  Over the years, I have very
definitely seen certain motherboard/memory combinations that were just
poison to linux kernels -- sometimes in common enough usage that they'd
get fixed; in a few cases they took their instability with them to their
early graves.  SMP kernels are generally marginally less stable than UP
kernels, although honestly it has been maybe 2-3 years since I last had
an issue with any linux kernel at all (and I run lots of SMP systems).

Still, two things to try are:

   a) Try a different version of Scyld, either earlier or later (one
with a different kernel).  See if problems magically vanish.  Yes, even
earlier kernels can be more stable -- some problems don't exist, then
appear in a particular snapshot but make it through the "stable"
selection process to make it out into the world as a plague, then vanish
with a later kernel as somebody puts the offending variable or struct or
algorithm back in order.  Oh, and the issue isn't just kernel -- there
have been significant variations in e.g. glibc over the last year plus
-- any linked library could be the offender, although some bugs would
cause more mundane/recoverable crashes in userspace.

   b) Try a non-Scyld linux on one or two boxes, e.g. a vanilla RH 9.
Run your job to see if it crashes.  This gives you the ability to play a
LOT more with kernels, including building new stable or unstable
kernels, and gives you a base that is in active use with patch feedback
on a LOT more platforms and hence a lot less likely to contain unpatched
critical bugs.
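
When comparing runs across Scyld versions or distributions as in a) and
b), it helps to record exactly which kernel and libc each test box was
running.  A minimal sketch, assuming Python is available on the node;
the function names are hypothetical, and platform.libc_ver() may return
empty strings on some systems:

```python
import platform


def software_fingerprint():
    """Collect the kernel release, machine type and libc of this box."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "kernel": platform.release(),
        "machine": platform.machine(),
        "libc": f"{libc_name} {libc_version}".strip(),
    }


def differences(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Print this box's fingerprint; save it alongside the job's outcome.
    for key, value in software_fingerprint().items():
        print(f"{key}: {value}")
```

Saving one fingerprint per box next to a note on whether the job froze
makes it easier to see whether the freezes track the kernel, the
libraries, or neither.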

Editorial comment:

The ONE THING that I think really hurts packages like Scyld and Mosix is
the fact that they tend to significantly lag commercial linux kernel
update streams, and the kernel update streams are one of the best of the
various reasons to consider commercial linuces in the first place.
Kernel updates in the commercial streams tend to fairly rapidly address
critical instabilities that sneak into a released kernel snapshot, and
they also address the rare but real kernel-level security holes that
sneak through as well.

Over the years I've played the roll-your-own kernel game (quite
extensively) and although it is fairly straightforward on a per-system
basis, it isn't terribly easy to come up with a configuration schema for
a good general purpose/portable/installable kernel that can be built
"identically" for smp, amd, i386, i686, etc and installed across a LAN
AND to keep it regularly rebuilt and updated.  In fact it is a lot of
work -- work that is worth paying somebody else to do for you.

It is perhaps a silly vision, but I still dream of the day when linux
comes prewrapped to be automagically built, top to bottom, by fully
automated tools so that updating a particular package is a matter of
dropping in a new e.g. src rpm and doing nothing.  I'm hoping that
future revisions of yum and/or caosity eventually move it in this
direction.  Of course, this still leaves open the question of testing and
validation for commercial application and won't remove the raison d'etre
of the big commercial linuces, but it will mean that perhaps the smaller
commercial linuces like Scyld can lock their offerings more tightly to
the current patch levels of the larger linuces.

I mean, these are COMPUTERS we're talking about, for God's sake.  They
run PROGRAMS.  PROGRAMS are these nifty things that do work for you in
an automated and moderately reproducible way.  So why does building
stuff still require so much handiwork?

<moan>

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
