[Beowulf] query: aggregate cluster performance monitoring without multicast

Lombard, David N david.n.lombard at intel.com
Fri Jan 9 10:52:26 EST 2004

From: Robert G. Brown; Friday, January 09, 2004 5:29 AM
> On Thu, 8 Jan 2004, Chris Dagdigian wrote:
> >
> > {Forwarded to this list on behalf of a friend with some email
> > troubles...}
> >
> > > I am in the process of trying to get a stopgap performance
> > > system going on a 64 CPU Linux cluster with LSF.  Ultimately, I
> > > want to use PCP for data collection, but since nobody seems to be
> > > doing this yet, we are going to be rolling our own solution.  To
> > > meet some of our needs, management has asked for an interim
> > > solution that gives them a web page with aggregate usage
> > > statistics and such.

I'm not sure what is meant by "nobody seems to be doing this yet".  Is
"this" read as "using pcp" or "using pcp with LSF"?

At any rate, CluMon http://clumon.ncsa.uiuc.edu/ uses PCP to collect
data, along with Apache, PHP, and MySQL to manage the data.  The only
issue with the specific request is that it's tied to PBS as the queuing
system.  That shouldn't be too hard to eliminate or port to LSF.

Alternatively, just use the data collection facilities of LSF directly,
and while you're at it, use the GUI facilities of LSF to display the
data.  Or, if the LSF GUI doesn't work for you, just build
the GUI you want using the LSF data.  Is there a compelling reason to
collect the data twice?
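
To make the "build the GUI you want" idea concrete, here is a minimal
sketch of the aggregation step.  It assumes the load data arrives as a
header row plus one whitespace-separated row per host, roughly the shape
of `lsload` output; the column names, the sample rows, and the helper
names (parse_load, summarize, render) are illustrative assumptions, not
taken from a real cluster or from any LSF API.

```python
# Sketch only: aggregate per-host load figures into a one-line summary.
# ASSUMPTION: input is a header row followed by one whitespace-separated
# row per host. SAMPLE is made-up data, not real cluster output.

SAMPLE = """\
HOST_NAME  status   r15s  r1m  r15m  ut
node01     ok       0.5   0.4  0.3   25%
node02     ok       1.5   1.6  1.2   75%
node03     unavail  -     -    -     -
"""

def parse_load(text):
    """Turn the table into a list of dicts keyed by the header names."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]

def summarize(hosts):
    """Average the 1-minute load and CPU utilization over hosts marked ok."""
    up = [h for h in hosts if h["status"] == "ok"]
    n = len(up)
    mean_r1m = sum(float(h["r1m"]) for h in up) / n if n else 0.0
    mean_ut = sum(float(h["ut"].rstrip("%")) for h in up) / n if n else 0.0
    return {"hosts": len(hosts), "up": n,
            "mean_r1m": mean_r1m, "mean_ut": mean_ut}

def render(summary):
    """Format the summary as one plain-text line for a status page."""
    return ("hosts: {hosts}  up: {up}  mean 1-min load: {mean_r1m:.2f}  "
            "mean CPU utilization: {mean_ut:.0f}%").format(**summary)

if __name__ == "__main__":
    print(render(summarize(parse_load(SAMPLE))))
```

In practice the text would come from whatever the batch system already
reports rather than from a hard-coded string, and the rendered line
could be dropped into a page served by any web server.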

> > > Unfortunately, ganglia is a non-starter because our networking
> > > can't enable multicast for the private network the cluster lives
> > > on (it would break almost everything else).

I guess the network isn't that private to the cluster then.  Otherwise,
how would activities on a private network, at best accessible only via
the routing on the headnode, impact the rest of the network?  But, I
digress, challenging IT policy  ;^)

> One day I'll have LOTS of time on my hands I'm sure and will even
> port wulfstat to a gtk "real" GUI form, but the nice thing about a
> tty based display is that it works on a basic tty console if no X is
> running at all (which may well be), and everybody has a vast range of
> xterm choices under X, so it is definitely the common denominator of
> monitor interfaces.

I'd argue the web interface is the common denominator.  Viewable via
text browser, plus you get out of the OS wars...

Yes, I know that real sys admins *never* use a web interface...

David N. Lombard
My comments represent my opinions, not those of Intel Corporation.