
The Beowulf list discusses rsh vs. ssh (again), using Raw Ethernet (a Neandertal approach?), and Cluster Computing Courses

The Beowulf mailing list provides detailed discussions about issues concerning Linux HPC clusters. In this article I review some postings to the Beowulf list on rsh vs ssh, using raw Ethernet, and cluster computing courses.

rsh Without Passwords

To run parallel jobs on a cluster you need some way to access compute nodes without a password. On July 7, 2004, Sandeep Krishnan asked how to configure rsh so that it does not prompt for passwords. He could log in to the compute nodes from the head node without a password, but could not log in from one compute node to another without first supplying one. Andrew Cater was the first to respond, saying that you should use ssh instead of rsh. He mentioned that there was an ssh "hack" floating around that allows ssh to behave just like rsh.

Sean Dilda also responded that he found ssh host-based authentication to be very nice. Daniel Pfenniger took issue with the comments that ssh was better than rsh and wanted to know why it was better (details rather than generalities). This discussion was a very nice opening for ClusterMonkey's Robert Brown to explain why ssh is better than rsh. Robert has written a great deal about why he believes ssh is better and has also performed some tests comparing the two packages. His first comment was to explain what is wrong with rsh: 1) no security at all, 2) no environment passing, 3) no tunneling/port forwarding, 4) no intrinsic X11 support, 5) archaic and easily spoofed/snooped authentication mechanism, 6) terrible control for "no password" login, and 7) more or less frozen, unsupported code. He thought the only good thing about rsh was that it was relatively fast.

On the other hand, Robert thought that ssh has a number of good points: 1) strong security, 2) environment passing, 3) port tunneling/forwarding, 4) intrinsic X11 support, 5) strong host authentication, 6) strong personal authentication, 7) bidirectional encryption, not easily snooped, 8) good control for "no password" login, and 9) active code support.

In fairness, there were some things Robert does not like about ssh: 1) relatively slow, 2) cannot select "no encryption" as an option even on secure networks, and 3) poor tty disconnect "feature" that requires ~. escapes (nested yet) to leave a job in the background from an ssh session.

Sidebar One: No Password RSH Hints

For more information on passwordless logins, consult Passwordless SSH (and RSH) Logins on the Cluster Agenda.

The brief outline below was taken from a Beowulf mailing list post by Joe Landman of Scalable Informatics on Friday, July 29, 2005.

You will need to make sure your pam configuration enables rhost authentication. You will need this in your /etc/pam.d/rsh file

   auth   sufficient 

and you will need to either add "rsh" to your /etc/securetty, or simply remove that file. There are other good reasons to have the file, so you might wish to go with adding it rather than removing it.

Then make sure your .rhosts file has mode 600 and that the nodes are listed in it.

   chmod 600 ~/.rhosts 

Note: if you are trying to do this as root, you might need to use

   auth   required 

in your /etc/pam.d/rsh as well. Note: This is complex and painful to debug (many interacting systems). If you use ssh, it is much simpler. You create a shared key

   ssh-keygen -t dsa 

(don't enter a passphrase for the key or you are going to run into prompting issues, just press enter).

You will have a new key pair in ~/.ssh/ . Copy the public key (the file ending in .pub) to all the machines you wish to log in to without passwords, and append it to the ~/.ssh/authorized_keys file on each one. Now you should be able to log in without a password.
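The "append it to authorized_keys" step is easy to get subtly wrong (duplicate entries, a missing trailing newline, or permissions that sshd rejects). The following is a minimal sketch of that one step, not part of Joe's post: the helper name install_public_key and its arguments are hypothetical, and it assumes the public key line has already been copied to the target machine.

```python
import os
from pathlib import Path


def install_public_key(pubkey_line: str, ssh_dir: Path) -> bool:
    """Append one public-key line to ssh_dir/authorized_keys if it is absent.

    Hypothetical helper for illustration. Returns True if the key was added
    and False if an identical line was already present.
    """
    key = pubkey_line.strip()
    if not key:
        raise ValueError("empty public key line")

    # With StrictModes enabled (the sshd default), keys stored in group- or
    # world-writable locations are ignored, so keep the permissions tight.
    ssh_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(ssh_dir, 0o700)

    auth_file = ssh_dir / "authorized_keys"
    text = auth_file.read_text() if auth_file.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False  # already installed, do not duplicate it

    with auth_file.open("a") as fh:
        # Guard against a previous entry that lacks a trailing newline.
        if text and not text.endswith("\n"):
            fh.write("\n")
        fh.write(key + "\n")
    os.chmod(auth_file, 0o600)
    return True
```

On a real node this would be called with something like Path.home() / ".ssh" as the directory. Where the ssh-copy-id script is installed, it performs the copy and the append in a single step.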

Robert went on to say that he thought, in most cases, the speed difference was negligible (you can search for some tests he did a few years ago showing the speed difference between rsh and ssh). He also pointed out that ssh is only used to start the jobs on remote nodes and that MPI/PVM is used once the program is running.

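One of the strong features of ssh that Robert did not mention is its scalability. Because of its design, rsh often has problems when the node count exceeds 256. The limit is commonly attributed to rsh's use of reserved ports: each session takes two ports from the range 512 to 1023 (one for the command and one for stderr), so a single host runs out after roughly 256 concurrent sessions.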

Trent Piepho responded that, in his opinion, for cluster jobs on a private network, the only bad things about rsh that really mattered were the lack of supported code and the lack of environment passing. He also posted that when using Gigabit Ethernet he found rsh/rcp to be about 4 times faster than ssh/scp, depending upon the CPU. Robert thought that Trent's numbers, while interesting, didn't have a big impact on most applications because the actual traffic over ssh/scp was fairly small.

Andrew Cater then added some comments to Robert's first post. In particular, he said passing the environment, being able to port forward or tunnel, and intrinsic X11 were very important reasons for not using rsh. He also thought the authentication was good because once it's configured you don't need to update it despite account password changes.

Jakob Oestergaard posted that he thought MPI itself was vulnerable because of the lack of network security. He thought that the lack of environment passing, intrinsic X11, and tunneling/port forwarding within rsh could all be overcome by some simple scripting or were not needed for most clusters. Jakob said he used NFS and NIS for file access and authentication. He felt that on private "trusted" networks these were very reasonable alternatives (he failed to mention that NIS can be a huge drain on your network resources and tax your NIS master as well). He did mention that having frozen code was not such a bad thing for rsh because it works across many platforms seamlessly. Jakob concluded that he still preferred rsh for his closed private network with trusted users.

Raw Ethernet

There have been many efforts aimed at developing alternative ways of communicating over Ethernet networks besides using the IP, TCP, and UDP protocols. Several of these methods continue to be developed. On July 20, 2004, Simone Saravalli posted that he was interested in studying raw Ethernet communication on Intel P4s. Robert Brown then asked if Simone was interested in using raw Ethernet packets on a flat bridged network for point-to-point communications. He thought such a project was beneficial as a learning tool for understanding networks and networking protocols. He suggested that Simone could probably skip the IP protocol, but would have to deal with some of the functions that TCP provides, such as reliable packet transmission, sequencing, and checksumming.
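To make "the functions that TCP provides" concrete, here is a minimal sketch of a raw Ethernet frame that carries its own sequence number and payload checksum. It is not taken from the thread. The header layout, the function names, and the choice of EtherType 0x88B5 (one of the values IEEE reserves for local experimental use) are all assumptions made for illustration.

```python
import struct
import zlib

ETH_P_EXPERIMENTAL = 0x88B5          # IEEE 802 local experimental EtherType
ETH_HEADER = struct.Struct("!6s6sH")  # destination MAC, source MAC, EtherType
SHIM = struct.Struct("!IHI")          # sequence number, payload length, CRC-32
MAX_PAYLOAD = 1500 - SHIM.size        # stay inside a standard 1500-byte MTU


def build_frame(dst: bytes, src: bytes, seq: int, payload: bytes) -> bytes:
    """Pack an Ethernet frame with a small sequencing/checksum shim."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be exactly 6 bytes")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload does not fit in one frame")
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return (ETH_HEADER.pack(dst, src, ETH_P_EXPERIMENTAL)
            + SHIM.pack(seq, len(payload), crc)
            + payload)


def parse_frame(frame: bytes):
    """Return (source MAC, sequence number, payload) or raise ValueError."""
    _dst, src, ethertype = ETH_HEADER.unpack_from(frame, 0)
    if ethertype != ETH_P_EXPERIMENTAL:
        raise ValueError("frame does not use our EtherType")
    seq, length, crc = SHIM.unpack_from(frame, ETH_HEADER.size)
    start = ETH_HEADER.size + SHIM.size
    # The explicit length lets us discard the padding that Ethernet adds
    # to frames shorter than the minimum size.
    payload = frame[start:start + length]
    if len(payload) != length or (zlib.crc32(payload) & 0xFFFFFFFF) != crc:
        raise ValueError("truncated or corrupted payload")
    return src, seq, payload
```

On Linux, frames like this can be sent and received through a socket created with socket.AF_PACKET and socket.SOCK_RAW and bound to an interface, which requires root privileges. Retransmission and reordering logic keyed on the sequence number would still have to be built on top; the sketch only shows where that information would live in the frame.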

Several people posted ideas for getting started, including taking a look at the GAMMA and HyperSCSI projects.

Simone responded that he was indeed studying the effects of raw Ethernet on point-to-point communication. He reported that throughput was good, but that reliability was an issue because of the lack of the things that TCP provides. Robert Brown responded that the project may have the potential to provide some good GPL drivers that may reduce latency for network communication. He also thought Simone would have to develop a library that emulates the standard socket libraries and transport layer, perhaps using the arp tables to provide a mapping between hostname and Ethernet (MAC) address on the flat network.
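As a rough illustration of the arp-table idea, the sketch below parses text in the format of Linux's /proc/net/arp and looks up the MAC address for a hostname. This is an assumption about how such a mapping might be built, not something posted in the thread. The function names are hypothetical, and the lookup still depends on a resolver to turn the hostname into the IP address that the ARP table is keyed on.

```python
import socket
from typing import Callable, Dict, Optional

ATF_COM = 0x2  # flag bit meaning the ARP entry is complete


def parse_arp_table(text: str) -> Dict[str, str]:
    """Map IP address -> MAC address from /proc/net/arp style text."""
    table: Dict[str, str] = {}
    for line in text.splitlines()[1:]:       # skip the column header line
        fields = line.split()
        if len(fields) < 6:
            continue                          # ignore malformed lines
        ip, _hw_type, flags, mac = fields[:4]
        if (int(flags, 16) & ATF_COM) == 0:
            continue                          # entry not resolved yet
        table[ip] = mac.lower()
    return table


def mac_for_host(hostname: str,
                 arp: Dict[str, str],
                 resolve: Callable[[str], str] = socket.gethostbyname,
                 ) -> Optional[str]:
    """Return the MAC address for hostname, or None if it is not in the table."""
    return arp.get(resolve(hostname))
```

On a Linux node the table text would come from reading /proc/net/arp. The resolve argument is there so the lookup can be swapped for a static host list on a network that does not run DNS.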

Daniel Ridge suggested encoding the node number in the MAC address of the NIC (most NICs can be programmed with a specific MAC) so that you need not worry about the hostname-to-MAC mapping (this ignores the IANA rules). Gerry Creager thought that Simone should just use IPv6 and encode the node number. Robert thought that this would not be a good idea because it would limit the networking to a private flat network. He went into some depth to explain why, including some old war stories.

Tim Mattox posted to suggest that if you want to change the MAC address of a NIC, you should make sure you set the Locally Administered bit in the MAC. If you do this and clear the Group bit, then you will never conflict with a factory-set MAC. Tim went on to say that as long as you have a firewall or router between your modified MACs and the outside world, you will never have to worry about an outside conflict.
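Putting Daniel's and Tim's suggestions together, a node number can be packed into a MAC address whose first octet has the Locally Administered bit (0x02) set and the Group bit (0x01) cleared. The sketch below is one way to do that. The function names and the two-byte prefix are hypothetical choices for illustration, not a scheme from the thread.

```python
def node_mac(node: int, prefix: bytes = b"\x00\x00") -> str:
    """Return a locally administered, unicast MAC that encodes `node`.

    The address is prefix (2 bytes) followed by the node number (4 bytes,
    big-endian), with the two special bits of the first octet forced.
    """
    if not 0 <= node < 2 ** 32:
        raise ValueError("node number must fit in 32 bits")
    if len(prefix) != 2:
        raise ValueError("prefix must be exactly 2 bytes")
    raw = bytearray(prefix + node.to_bytes(4, "big"))
    raw[0] |= 0x02   # set the Locally Administered bit
    raw[0] &= 0xFE   # clear the Group (multicast) bit
    return ":".join(f"{octet:02x}" for octet in raw)


def is_local_unicast(mac: str) -> bool:
    """True if mac is locally administered and not a group address."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02) and not (first_octet & 0x01)


if __name__ == "__main__":
    print(node_mac(5))                            # 02:00:00:00:00:05
    print(is_local_unicast("00:1a:2b:3c:4d:5e"))  # False
```

Because the node number sits in the low four bytes, a receiver can recover it from the source address of any frame without consulting a lookup table.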

Daniel Ridge jumped back in to say that he thought these types of projects were good because they take commodity networking components and then ask the question, "I wonder what you get if you think about them differently." He also thought that this kind of project was perfect for clusters that have a private network, and pointed out that the ifconfig command can change the MAC address of a NIC as well.

John Hearns posted that if you wanted to do this sort of thing with clusters that use PXE booting, then you might have to do some fancy footwork to boot first and then change the MAC address for the private network. Tim Mattox posted that he has been experimenting with changing MAC addresses for some time with his FNN (Flat Network Neighborhood) cluster configuration. He had some very good insights into changing MACs and then rebooting. He also pointed out that the next version of the Warewulf Cluster Distribution will support multiple MACs per node (very useful in this situation).

Beowulf: Good Grid Computing Classes?

There seems to be a definite lack of formal courses where one can learn cluster design, administration, and programming. On August 11, 2004, Michael Jastram asked about courses that would address grid computing and other aspects of clustering. His employer had a tuition assistance program and Michael wanted to take advantage of that program.

Brent Clements suggested the Linux Institute to see what they might offer. Dean Johnson suggested the old-fashioned approach of "just do it." This method is quite simple: buy or find a couple of machines, connect them, and start trying the software and asking questions. Dean had some very good ideas about trying things to learn as much as you can. Robert agreed with Dean's ideas (and comments about getting hardware past the "management").

Michael Hanulec mentioned that the Cornell Theory Center taught courses on Windows clusters. He went on to suggest using OSCAR, Rocks, or Warewulf to get started with cluster management tools.

Robert thought that subscribing to ClusterWorld would help quite a bit (shameless plug). [Since ClusterWorld is no longer in publication, the next best thing (or better thing) is ClusterMonkey of course - Ed.] He also thought that both Scyld and Clemson University might teach some courses. Don Becker joined in to say that Scyld does indeed offer cluster computing courses.

Finally, Glen Gardner echoed the sentiment of others that not many courses existed for this kind of thing. Sounds like a business opportunity to me! Update: The Advanced Research Computing (ARC) team at Georgetown University are running cluster courses. They have successfully run Introduction to Beowulf Design, Planning, Building and Administering trainings several times.

Sidebar Two: Technology Mentioned In This Column

GAMMA Project



This article was originally published in ClusterWorld Magazine. It has been updated and formatted for the web. If you want to read more about HPC clusters and Linux you may wish to visit Linux Magazine.

Jeff Layton has been a cluster enthusiast since 1997 and spends far too much time reading mailing lists. He can be found hanging around the Monkey Tree at (don't stick your arms through the bars though).