Regarding the benchmarking of switches, I'll allow myself to point to
an upcoming paper from our group. It will be published at the Workshop
on Communication Architecture for Clusters (CAC 03):
"Cost/Performance Tradeoffs in Network Interconnects for Clusters of
Commodity PCs" (http://www.cs.inf.ethz.ch/CoPs/publications/).

In the paper we are mainly interested in bandwidth (not latency) and
examine a few different networks on our 128-node cluster "Xibalba" [1]:
- A cheap Fast Ethernet network with one small switch per cabinet and
  a Fast Ethernet uplink (only a maintenance network)
- A mid-range Fast Ethernet switch with poor performance (Matrix E7)
- A high-end Fast Ethernet switch with near-perfect performance
  (Matrix ER16)
- Myrinet as a high-performance Gigabit network

As benchmarks we use, for example, all-to-all personalized
communication and Dolly (our cloning tool [2]). With Dolly, every
participating node sends and receives data at full speed for a long
period of time, which puts a maximal load on the switch. Interestingly,
the Matrix E7 switch fails miserably despite its promising data
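
For readers unfamiliar with the all-to-all personalized pattern, here
is a minimal Python sketch of one common way to schedule it (a linear
shift schedule, where in round r node i sends to node (i + r) mod n).
This is only an illustration and not necessarily how the benchmark in
the paper or Dolly is implemented. The function names shift_schedule
and aggregate_load_mbps are made up for this example, and 100 Mbit/s
is simply the nominal Fast Ethernet line rate.

```python
def shift_schedule(num_nodes):
    """Return the rounds of an all-to-all personalized exchange.

    Hypothetical helper: in round r (1 <= r < num_nodes), node i sends
    its block destined for node (i + r) % num_nodes. Every node sends
    exactly once and receives exactly once per round.
    """
    rounds = []
    for r in range(1, num_nodes):
        rounds.append([(i, (i + r) % num_nodes) for i in range(num_nodes)])
    return rounds


def aggregate_load_mbps(num_nodes, link_mbps):
    """Upper bound on the traffic offered to the switch (in Mbit/s)
    if every node sends at its nominal line rate at the same time."""
    return num_nodes * link_mbps


if __name__ == "__main__":
    # Small example with 4 nodes so the schedule is easy to read.
    for r, pairs in enumerate(shift_schedule(4), start=1):
        print(f"round {r}: {pairs}")
    # 128 nodes on Fast Ethernet (nominal 100 Mbit/s per link).
    print(aggregate_load_mbps(128, 100), "Mbit/s offered load")
```

Because every port is busy sending and receiving in every round, a
switch whose backplane cannot carry the sum of all port rates will
show up as a throughput drop in this kind of test.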

So, if you want to measure the throughput of your switch under heavy
load, you might want to try out Dolly. Mail me if you want
the source of the most recent version (I should finally update that web

Later in the paper we relate the network performance to more
realistic benchmarks such as HPL and a car traffic simulation.

[1] http://www.xibalba.inf.ethz.ch/
[2] http://www.cs.inf.ethz.ch/CoPs/patagonia/index.html#dolly

