[Beowulf] profiling memory bandwidth of a job
landman at scalableinformatics.com
Fri Jan 9 15:06:04 EST 2004
On Fri, 2004-01-09 at 04:54, Michael Arndt wrote:
> I.e., if two single-CPU jobs run on the dual-CPU node, the total run
> time is far more than twice the time of one job, plus e.g. a 30% penalty
> for resource competition between the two jobs on an SMP node.
So you believe that you are running out of memory bandwidth?
> Question: Is it possible to show, with standard Linux profiling commands,
> that memory bandwidth really is the issue?
You would need to use tools like OProfile, or Troy Baer's lperfex (which
works with perfctr, which requires a kernel patch, last I looked).
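Short of a counter-based profiler, one crude cross-check (not something
from the original exchange, just a sketch in the spirit of a STREAM-style
test) is to time a memory-bound kernel running alone, then time two copies
running at once. The function names, the 64 MB buffer size, and the repeat
count below are all assumptions for illustration; the buffer only needs to
be much larger than the CPU caches. Nothing here pins processes to CPUs.

```python
import multiprocessing as mp
import time


def copy_kernel(size_mb=64, repeats=10):
    """Copy a large buffer repeatedly; return elapsed wall-clock seconds."""
    src = bytearray(size_mb * 1024 * 1024)
    dst = bytearray(len(src))
    start = time.perf_counter()
    for _ in range(repeats):
        dst[:] = src  # bulk copy; working set is assumed to exceed the caches
    return time.perf_counter() - start


def _worker(queue, size_mb, repeats):
    # Runs in a child process and reports its own elapsed time.
    queue.put(copy_kernel(size_mb, repeats))


def timed_copies(n_procs, size_mb=64, repeats=10):
    """Run n_procs copy kernels concurrently; return each one's elapsed time."""
    queue = mp.Queue()
    procs = [mp.Process(target=_worker, args=(queue, size_mb, repeats))
             for _ in range(n_procs)]
    for p in procs:
        p.start()
    times = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return times


def slowdown(t_alone, t_together):
    """Ratio of the slowest concurrent run to the single run."""
    return max(t_together) / t_alone


if __name__ == "__main__":
    alone = timed_copies(1)[0]
    together = timed_copies(2)
    print(f"alone:    {alone:.3f} s")
    print(f"together: {max(together):.3f} s (slowest of two)")
    print(f"slowdown: {slowdown(alone, together):.2f}x")
```

A slowdown near 1.0 would suggest the two copies are not contending for
memory on this node; a value approaching 2.0 would be consistent with a
shared memory bottleneck. It says nothing definite about the commercial
code itself, whose access pattern may be quite different.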
> (Memory requirements and disk IO rate of the jobs are moderate, as far
> as iostat / top shows)
1) 'vmstat 1' is your friend. Use it to see, at a gross level, what is
going on with the system. Is it possible that you are swapping? Is it
possible that you are taking a lot of interrupts (I don't know why you
would be), or doing lots of context switching?
2) The standard Linux tools won't give you a fine-grained view; for that
you will need to go to something like OProfile.
3) Before you do the two-job runs, it would be advisable to create a
baseline run with a single job, and profile it. Look for the usual
suspects (paging, memory, interrupts, ...). Gather profile data. Unless
you have source for your commercial code, you probably won't be able to
get the code itself to generate profile data, so you are left with the
other tools.
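To make the baseline-versus-dual-run comparison in the points above
concrete, here is a minimal sketch that samples the kernel counters behind
the interrupt, context-switch, and swap columns of vmstat. It assumes a
kernel that exposes /proc/vmstat (2.6 or later); the function names are
made up for illustration. Note that pswpin/pswpout count pages, so they
only roughly correspond to vmstat's si/so columns.

```python
import os
import time


def parse_counters(stat_text, vmstat_text):
    """Extract cumulative counters from /proc/stat and /proc/vmstat text."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # "ctxt N" is total context switches; on the "intr" line the first
        # number is the total interrupt count.
        if len(fields) >= 2 and fields[0] in ("ctxt", "intr"):
            counters[fields[0]] = int(fields[1])
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Pages swapped in and out since boot.
        if len(fields) == 2 and fields[0] in ("pswpin", "pswpout"):
            counters[fields[0]] = int(fields[1])
    return counters


def rates(before, after, seconds):
    """Per-second rate of change for every counter present in both samples."""
    return {k: (after[k] - before[k]) / seconds for k in before if k in after}


def read_counters():
    """Read the live counters from /proc (Linux only)."""
    with open("/proc/stat") as f:
        stat_text = f.read()
    with open("/proc/vmstat") as f:
        vmstat_text = f.read()
    return parse_counters(stat_text, vmstat_text)


if __name__ == "__main__" and os.path.exists("/proc/vmstat"):
    interval = 1.0
    before = read_counters()
    time.sleep(interval)
    after = read_counters()
    for name, per_sec in sorted(rates(before, after, interval).items()):
        print(f"{name:8s} {per_sec:10.1f}/s")
```

Run it once during the single-job baseline and once while both jobs are
running. If swap, interrupt, and context-switch rates look about the same
in both cases, those suspects are less likely to explain the slowdown.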
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 612 4615