

netperf is the third of the network benchmarking tools reviewed in this column. It was once my favorite, and now that its owner is actively looking after it again it may become my favorite once more. To use netperf one proceeds more or less the same way as for the previous two tools: download, build, start a server/daemon, point a client tool at it with suitable arguments, and wait for the numbers to roll in. It remains only to fill in the details.

Retrieve netperf 2.3 from the netperf site, following links to the download page and reading as you go. Unpack the tarball and change to the netperf-2.3 (e.g.) directory. Edit the makefile -- minimally you'll want to change the default netperf directory, and you must remove the -DNEED_MAKEFILE_EDIT definition from CFLAGS or the build will complain. On my own Linux systems, it then just plain builds.

Once built, you'll note two binaries (netserver and netperf) and a bunch of scripts. The scripts are great for generating a whole suite of tests at once, and they also serve as good examples of some of the many command line arguments netperf takes.

Visit your remote system and start up a netserver (daemon) on its default port or some other:

Starting netserver at port 12865

This daemon will typically run until killed (like the lmbench daemon, unlike the netpipe daemon). Multiple tests can be run to the target host once the daemon is running.

Then return to your source host and run netperf as shown in Figure Three.

$netperf -l 60 -H g01 -t TCP_STREAM  -- -m 1436 -s 57344 -S 57344
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

114688 114688   1436    60.01      93.60   
Figure Three: Example netperf results

This particular invocation says to test for 60 seconds continuously, sending a TCP stream of data to the target g01 with a message size of 1436 (recall, the largest message that will fit in an MTU) using a large send and receive buffer on both ends. Many other options are possible -- netperf is a powerful command with many distinct ways of running.
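Since the interesting comparisons come from varying the message size, it can help to generate the command lines rather than type them. The following is only a sketch: it uses just the flags that appear in Figure Three, the helper name `build_netperf_cmd` is my own invention, and the intermediate message sizes (64 and 512) are arbitrary illustrations rather than values from the tests in this column. It prints the commands instead of running them.

```python
import shlex


def build_netperf_cmd(host, seconds, msg_size, sock_buf, test="TCP_STREAM"):
    """Assemble a netperf argument list from the flags shown in Figure Three.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return [
        "netperf",
        "-l", str(seconds),   # test length in seconds
        "-H", host,           # target host running netserver
        "-t", test,           # test type
        "--",                 # test-specific options follow
        "-m", str(msg_size),  # send message size in bytes
        "-s", str(sock_buf),  # local socket buffer size
        "-S", str(sock_buf),  # remote socket buffer size
    ]


# Message sizes 1 and 1436 are the ones used in this column; 64 and 512 are
# arbitrary in-between values added purely for illustration.
for size in (1, 64, 512, 1436):
    print(shlex.join(build_netperf_cmd("g01", 60, size, 57344)))
```

Each printed line can be pasted into a shell on the source host, or the list can be handed to `subprocess.run` if you prefer to drive the whole sweep from Python.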

Note the excellent agreement between netperf and lmbench for maximum bandwidth. Observe, however, what happens when we use a message size of 1 byte (the minimum). In this limit, the "bandwidth" is entirely dominated by the minimum packet latency as shown in Figure Four.

rgb@ganesh|B:1357>./netperf -l 60 -H g01 -t TCP_STREAM -- -m 1 -s 57344 -S 57344
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

114688 114688      1    60.00       1.30   
Figure Four: Single byte netperf results
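The throughput in Figure Four can be restated as a message rate, which makes it easier to compare against a latency. The sketch below pulls the throughput column out of the text of Figure Four and converts it. The function names are hypothetical, and the parser assumes the five-column TCP_STREAM layout shown in the figure, not every output format netperf can produce.

```python
FIGURE_FOUR = """\
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

114688 114688      1    60.00       1.30
"""


def stream_throughput_mbps(output):
    """Return the throughput (10^6 bits/sec) from TCP_STREAM output.

    Assumes the five-column layout of Figure Four: the result row is the
    first five-field line whose last field parses as a number.
    """
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 5:
            try:
                return float(fields[4])
            except ValueError:
                continue  # a header row such as "Size Size Size Time Throughput"
    raise ValueError("no result row found")


def messages_per_second(mbps, msg_bytes):
    """Convert throughput in 10^6 bits/sec to messages per second."""
    return mbps * 1e6 / 8 / msg_bytes


mbps = stream_throughput_mbps(FIGURE_FOUR)
rate = messages_per_second(mbps, 1)
print(f"{mbps:.2f} Mbps with 1-byte messages = {rate:.0f} messages/s, "
      f"{1e6 / rate:.2f} microseconds per message")
```

For the numbers in Figure Four this works out to roughly 162,500 one-byte messages per second, or about 6 microseconds apiece.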

1.3 Mbps is obviously much, much lower than the 93 or so Mbps that the interface can manage for large packets. Still, it is more optimistic than one might have expected from the latency measurements above because TCP aggregates packets where possible, reducing apparent latency. netperf also permits one to more or less "directly" measure latency via a request-response (ping-like) operation shown in Figure Five.

$netperf -l 60 -H g01 -t TCP_RR 
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate         
bytes  Bytes  bytes    bytes   secs.    per sec   

16384  87380  1        1       60.00    6747.70   
16384  87380 
Figure Five: Latency test with netperf

Converting the transaction rate of 6747.70 per second into a time, one gets about 148 microseconds per transaction (one request plus its response), in good agreement with lmbench.
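The arithmetic behind that conversion, and behind the earlier remark that the single-byte stream result is optimistic, can be sketched as follows. The two input numbers come straight from Figures Four and Five; the function names and the "one byte per round trip" scenario are my own illustrative assumptions, not something netperf reports.

```python
def usec_per_transaction(trans_per_sec):
    """Time per request/response transaction, in microseconds."""
    return 1e6 / trans_per_sec


def one_at_a_time_mbps(trans_per_sec, msg_bytes):
    """Hypothetical throughput if every message waited for its own round trip."""
    return trans_per_sec * msg_bytes * 8 / 1e6


TRANS_RATE = 6747.70   # transactions per second, from Figure Five
STREAM_MBPS = 1.30     # 1-byte TCP_STREAM throughput, from Figure Four

latency = usec_per_transaction(TRANS_RATE)
floor = one_at_a_time_mbps(TRANS_RATE, 1)

print(f"{latency:.1f} microseconds per transaction")
print(f"one byte per round trip would give {floor:.3f} Mbps")
print(f"measured stream rate is {STREAM_MBPS / floor:.0f}x higher")
```

The result is about 148 microseconds per transaction. If each one-byte message had to wait out a full round trip, the stream test would manage only about 0.054 Mbps, so the measured 1.30 Mbps is roughly 24 times higher, which is the effect of TCP aggregating small writes.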


To fully understand the results obtained with any of these three programs, it is essential to have the actual source code being timed. Network connections have many options and ways of being created and used, and different code can have very different timing. Fortunately, they are all fully open source so the code they use is open for your inspection.

All three programs produce "reasonable" results and have options that permit one to explore network performance in a number of contexts. lmbench appears to be the simplest (as is expected of a microbenchmark tool). netpipe allows one to explore rates in a number of distinct settings such as in boilerplate PVM or MPI code in addition to plain old TCP. netperf has quite good built in statistics and updated interfaces.

We started this whole discussion by noting that clusters are quite often designed and built around the network, not the processors or memory, because the network frequently determines (as we have seen in our studies of Amdahl's law and scaling) whether a parallelized task exhibits good speedup over a large number of nodes. By using any or all of these three tools, one can finally perform measurements of certain fundamental rates that give you a decent chance of estimating network-bound parallel performance for a given application on different kinds of networking hardware. I hope you find them as useful as I have.

Sidebar: Networking and Testing Resources

This article was originally published in ClusterWorld Magazine. It has been updated and formatted for the web. If you want to read more about HPC clusters and Linux, you may wish to visit Linux Magazine.

Robert Brown, Ph.D., has written extensively about Linux clusters. You can find his work and much more on his home page.



This work is licensed under CC BY-NC-SA 4.0

©2005-2023 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.