[Beowulf] SWAP management
Robert G. Brown
rgb at phy.duke.edu
Fri Dec 12 07:25:03 EST 2003
On Thu, 11 Dec 2003, Mark Hahn wrote:
> > It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
> > tries to avoid that the percentage of memory used by a single process
> > becomes higher than 60-70 %.
> I don't believe there is any such heuristic. it wouldn't have anything to do
> with the distribution, of course, only with the kernel.
To add to Mark's comment, it is not exactly easy to see what's going on
with a system's memory usage. Use top and/or vmstat for starters --
vmstat 5 will let you differentiate "swap" events from other paging and
disk activity (possibly associated with applications) while letting you
see memory consumption in real time. top will give you a lovely picture
of the active process space that auto-updates every (interval) seconds.
If you enter M, it will toggle into a mode where the list is sorted by
memory consumption instead of run queue (which I find often misses
problems, or rather flashes them up only rarely). You can then look at
Size (full virtual memory allocation of process) and RSS (space the
process is actually using in memory at the time) while looking at total
memory and swap usage in the header.
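
As a rough illustration of picking swap events out of vmstat output, here is a minimal Python sketch. It assumes the usual procps layout with "si" (swap in) and "so" (swap out) columns; the sample text and the helper names parse_vmstat and swap_activity are made up for the example, and column layout can differ between versions.

```python
# Sketch only: parses text shaped like `vmstat 5` output.
# VMSTAT_SAMPLE is invented illustration data, not a real capture.

def parse_vmstat(text):
    """Return one dict per numeric data row, keyed by the column header."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name line is the one that carries both swap columns.
        if "si" in fields and "so" in fields:
            header = fields
            continue
        # Keep only all-numeric rows that match the header width.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def swap_activity(rows):
    """Return (si, so) pairs for the intervals that had any swap traffic."""
    return [(r["si"], r["so"]) for r in rows if r["si"] or r["so"]]


VMSTAT_SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0   1024  20480   8192 131072    0    0    12     5  101  200  3  1 96  0
 2  1   4096   2048   4096  65536  256  512   300   700  450  900 20 10 30 40
"""

if __name__ == "__main__":
    for si, so in swap_activity(parse_vmstat(VMSTAT_SAMPLE)):
        print("swap in:", si, "swap out:", so)
```

In the sample, only the second interval shows nonzero si/so; the bi/bo columns are ordinary block I/O and are deliberately ignored.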
Note well that the "used/free" part of memory is not an accurate
reflection of the system's available memory in this display -- to get
that you have to subtract buffer and cached memory from the used
component. This yields the memory that CAN be made available to
a process if all the cached pages are paged out and all the buffers
flushed and freed. Linux does NOT like to run in a mode with no cache
and buffer space as it is not efficient -- one reason Linux generally
appears so smooth and fast is that a rather large fraction of the time
"I/O" from slow resources is actually served from the cache and "I/O" to
slow resources is actually written into a buffer so that the task can
continue unblocked. If you do suck up all the free memory, it will then
fuss a bit and try paging things out to free up at least a small bit of
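
The subtraction described above (used memory minus buffers and cache) can be sketched in Python as follows. This assumes /proc/meminfo-style "Key: value kB" lines with MemTotal, MemFree, Buffers and Cached fields; the numbers in MEMINFO_SAMPLE are invented, and the function names are hypothetical.

```python
# Sketch only: estimates memory held by applications versus memory that
# could be freed by dropping cache and flushing buffers.
# MEMINFO_SAMPLE is invented illustration data.

def parse_meminfo(text):
    """Parse 'Key:   12345 kB' lines into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_estimate(info):
    """Return (used_by_apps_kb, potentially_free_kb)."""
    used = info["MemTotal"] - info["MemFree"]
    # Buffers and cache count as "used" in the display but are reclaimable.
    used_by_apps = used - info["Buffers"] - info["Cached"]
    potentially_free = info["MemTotal"] - used_by_apps
    return used_by_apps, potentially_free


MEMINFO_SAMPLE = """\
MemTotal:  1048576 kB
MemFree:     65536 kB
Buffers:     32768 kB
Cached:     524288 kB
"""

if __name__ == "__main__":
    apps, avail = memory_estimate(parse_meminfo(MEMINFO_SAMPLE))
    print("used by applications (kB):", apps)
    print("could be made available (kB):", avail)
```

The second figure is an upper bound, since the kernel will not actually give up all of its cache and buffer space willingly.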
Note that a small amount of swap space usage is fairly normal and
doesn't mean that your system is "swapping". A small number of swap-out
events is likewise normal. It's the swap-ins that are more of a
One problem that can be very difficult to detect is a problem with a
daemon or networking stack. A runaway forking daemon can consume large
amounts of resources and clutter your system with processes. A runaway
networking application that is trying to make connections on a "bad"
port or networking connection can sometimes contain a loop that e.g.
tries to make a socket and succeeds, whereupon the connection breaks and
the socket has to terminate, which takes a timeout. I've seen loops
that would leave you with a - um - "large number" of these dying
sockets, which again suck up resources and may or may not eventually
cause problems. There used to be a similar problem with zombie
processes and I suppose there still is if you write just the right code,
but I haven't seen an actual zombie for a long time.
Note also that top and to a less detailed extent vmstat give you a way
of seeing whether or not an application is leaking. If a system
"suddenly" starts paging/swapping, chances are really, really good that
one of your applications is leaking sieve-like. Having written a number
of applications myself which I proudly acknowledge leaked like a
sumbitch until I finally tracked them down with free plumber's putty, I
know just how bone-simple it is to do, especially if you use certain
libraries (e.g. libxml*) where nearly everything you handle is a pointer
to space malloc'd by a called routine that has to be freed before you
reuse it. top with M can help a bit -- watch that Size and if it grows
while RSS remains constant, suspect a leak.
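
The rule of thumb above (Size grows while RSS stays put) can be written down as a small Python check over samples you have noted from top. This is only a sketch: the function name suspect_leak and both thresholds are hypothetical choices for illustration, not anything top itself provides.

```python
# Sketch only: flags a process whose virtual size keeps growing while its
# resident set stays roughly constant. Thresholds are arbitrary assumptions.

def suspect_leak(samples, min_growth_kb=1024, rss_tolerance_kb=512):
    """samples: list of (size_kb, rss_kb) tuples in time order."""
    if len(samples) < 2:
        return False
    sizes = [size for size, _ in samples]
    rss = [r for _, r in samples]
    # Size never shrinks and has grown by at least min_growth_kb overall.
    never_shrinks = all(b >= a for a, b in zip(sizes, sizes[1:]))
    grew_enough = sizes[-1] - sizes[0] >= min_growth_kb
    # RSS wanders by no more than rss_tolerance_kb across all samples.
    rss_flat = max(rss) - min(rss) <= rss_tolerance_kb
    return never_shrinks and grew_enough and rss_flat


if __name__ == "__main__":
    watched = [(10000, 4000), (12000, 4100), (15000, 4050)]
    print("suspect leak:", suspect_leak(watched))
```

A hit here is a reason to go looking, not proof; a program that legitimately allocates a big buffer it rarely touches would look the same.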
Finally, a few programs may or may not leak, but they constitute a big
sucking noise when run on your system. Open Office, for example, is
lovely but uses more memory than X itself (which is also rather a pig).
Some of the gnome apps are similarly quite large and tend to have RSS
close to SIZE. In general, if you are running a GUI, it is not at all
unlikely that you're using 100 MB or more and might be using several
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu