Switched Hub Configuration

The switched-hub topology, also known as the star configuration, is the simpler of the two network geometries to implement. The project used a single twenty-four-port switch to connect all of the nodes to each other, so that each node can communicate directly with any other node in the cluster.
  
Figure 2: An example configuration with 8 nodes. Each node can communicate directly with any other node in the cluster. [figure: sw_hub.eps]


Setting up the Network to be a Beowulf

A detailed description of the construction and configuration of a Beowulf-class computer is an enormous topic beyond the scope of this paper; interested readers should consult the references at the end of this paper for further reading. What follows is a minimal description of the key aspects of constructing a Beowulf cluster. Listed below are five key steps in the configuration process:
1. Determine the specifications of the cluster.
2. Acquire and configure the nodes.
3. Set up the network using the selected communications medium.
4. Configure security to allow remote execution.
5. Install miscellaneous programs and system tools to maintain and monitor the cluster.

The first step had already been completed: the specifications are described in section 2, and the computers were already present and accounted for. At this point the machines needed to be configured, which included installing the operating system along with configuring the file systems. The operating system selected to run on these machines was Linux, because of its versatility and its compatibility with the Unix platform used by the rest of TJNAF. Linux and other open-source Unix derivatives are commonly chosen as the operating system for Beowulfs, with Linux being the most popular [4, p. 19]. Since the internal nodes were not equipped with an internal CDROM drive, the operating system was installed from a MicroSolutions backpack external CDROM drive. One notable dilemma was how to get a computer to boot from an external CDROM drive when it has neither a driver for it nor an operating system installed; an installation kernel with the driver for this CDROM drive had to be used.

Preliminary System Setup

Before the cluster can be used, a few issues have to be resolved: one is host addressing and the other is security. The addressing issue is discussed first, since security is not a concern while the computers cannot yet communicate.

For the nodes to cooperate, they must be able to recognize each other. This involves assigning IP addresses and establishing a network. For this project a reserved class C network sufficed, so the network address 192.168.68.0 was selected. The front-end computer, now named hydra after the many-headed mythological monster, was given the address 192.168.68.1. The rest of the nodes were given a generic name of `b0' prepended to the last digit of the IP address. For example, node 2 had the address 192.168.68.2 and was given the name b02, while node 6 had the IP address 192.168.68.6 and was named b06. Each of the internal nodes had all of the development libraries and a full complement of available language support installed, along with compilers and any other software that seemed useful.
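
As an illustration of this addressing scheme, the network interface on a node could be brought up by hand with a command along the following lines. This is only a sketch, assuming the node's network card appears as eth0 and the standard ifconfig tool is available; in practice the address would be made permanent in the distribution's network configuration files rather than typed at the prompt.

    > ifconfig eth0 192.168.68.2 netmask 255.255.255.0 up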

A key step in setting up the nodes was to configure each node to be its own name server (i.e., adding the line nameserver 127.0.0.1 to the file /etc/resolv.conf, as sketched after the host table below) and to add each of the nodes to the /etc/hosts file:

127.0.0.1       localhost               localhost.localdomain loopback
192.168.68.1    hydra.beowulf.trial     hydra
192.168.68.2    b02.beowulf.trial       b02
192.168.68.3    b03.beowulf.trial       b03
192.168.68.4    b04.beowulf.trial       b04
192.168.68.5    b05.beowulf.trial       b05
192.168.68.6    b06.beowulf.trial       b06
192.168.68.7    b07.beowulf.trial       b07
192.168.68.8    b08.beowulf.trial       b08
192.168.68.9    b09.beowulf.trial       b09
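
The name-server change mentioned above is a one-line edit; from a root shell on a node it amounts to something like the following (a sketch only, appending the line rather than editing the file by hand):

    > echo "nameserver 127.0.0.1" >> /etc/resolv.conf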

RSH Security Setup

At this point the machines were connected to the hub and the network was easily tested with ping to verify that each of the nodes was accessible. The next issue is login security. All users are added on the front-end machine and then /etc/passwd and /etc/shadow are copied to each of the nodes (a sketch of pushing these files out to the nodes is given after the listing below). As each user is added, a home directory has to be created and a .rhosts file is put in the home directory so commands can be remotely executed with rsh without requiring password authentication. This setup can also be accomplished with ssh and an authorized_keys file as described in appendix A, but rsh is a little more lightweight and easier to set up. The required .rhosts file for this project was:

hydra
b02
b03
b04
b05
b06
b07
b08
b09

The critical point is that the .rhosts file must be in the user's home directory on the remote machine.
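
As a concrete illustration of the account setup described above, the account and host files can be pushed from the front end to every node with a short shell loop. The following is only a sketch under the rsh/rcp trust already configured; it assumes root is trusted for rsh on the nodes, and the file list is merely an example:

    for node in b02 b03 b04 b05 b06 b07 b08 b09; do
        rcp /etc/passwd /etc/shadow /etc/hosts ${node}:/etc/
    done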

Final Setup

At this point the cluster was a trusting network on the verge of being a parallel machine. What was still needed was software to explicitly parallelize execution. The first piece of software was the prsh package developed by researchers at the California Institute of Technology, which allows users to execute standard commands in parallel on a few or all of the nodes in a cluster. It is a wrapper for the standard rsh, except that it does not provide login capabilities. After setting the environment variable PRSH_HOSTS with
    > export PRSH_HOSTS='b02 b03 b04 b05 b06 b07 b08 b09'
a command like
    > prsh -- mkdir -p ~/tmp/data_store/
would create the directory tmp/data_store/ in the user's home directory on each of the listed nodes. With prsh installed, the MPI libraries were installed from an RPM package and the cluster was ready for testing.
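
As an example of the kind of job the finished cluster can run, the following sketch compiles and launches a trivial MPI program across eight nodes. It assumes an MPICH-style installation providing the mpicc and mpirun wrappers along with a machines file listing the node names; the program hello.c and the file ~/machines are illustrative only.

    > mpicc -o hello hello.c
    > mpirun -np 8 -machinefile ~/machines ./hello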


