Building a Linux Cluster: Example of a Self-Made Cluster with GigE
As an interesting example of a self-made Linux cluster with a
Gigabit Ethernet interconnect, we show the
numbers that Michael S. Warren of Los Alamos National Laboratory
presented at the Clusterworld Conference 2003 (San Jose, June 2003) in
his talk
"The Space Simulator". The
Space Simulator is a 294-processor Beowulf Linux cluster with one
2.53 GHz Pentium 4 processor and 1 GB of memory per node.
Together with his colleagues Chris Fryer and Patrick Goda, he
was asked to build a
Linux cluster in one month on a
budget of 500,000 US dollars. Here is what they came up with
in September 2002 (prices are in USD):
| Quantity | Unit price (USD) | Description | Total (USD) | Share |
|---|---|---|---|---|
| 294 | 280 | Chassis: Shuttle SS51G mini system | 82320 | 17% |
| 294 | 254 | Intel P4/2.53 GHz, 533 MHz FSB, 512 KB cache | 74676 | 15% |
| 588 | 118 | 512 MB DDR333 SDRAM / node | 69384 | 14% |
| 294 | 83 | 80 GB Maxtor hard disk | 24402 | 5% |
| 1 | 3300 | Wire shelving / switch rack | 3300 | 1% |
| 294 | 35 | Assembly labor / extended warranty | 10290 | 2% |
| 1 | 1378 | Power cable | 1378 | 0% |
| 294 | 95 | Gigabit Ethernet PCI card | 27930 | 6% |
| 1 | 4000 | Cat6 Ethernet cable | 4000 | 1% |
| 1 | 186175 | Switches: Foundry FastIron 1500+800, 304 Gigabit ports | 186175 | 38% |
| | | SUM | 483855 | |
The total cost per node was $1646, of which $742 (45%)
was spent on the GigE networking shown in the last three rows above
the sum. The accompanying graphs show the cost distribution as bar and pie charts,
as well as the smoothed temperature distribution in the racks, measured by
lm_sensors, which reflects the effect of the
air flow through the cluster.
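The per-node figures can be rechecked directly from the table. The following is a minimal sketch in Python; the names `ITEMS` and `summarize` are hypothetical and chosen only for illustration, and the flag marking a row as networking follows the statement above that the last three rows are the GigE networking.

```python
# Line items copied from the table: (quantity, unit price in USD, description, is_network).
# The is_network flag is an illustrative addition marking the last three rows.
ITEMS = [
    (294, 280, "Chassis: Shuttle SS51G mini system", False),
    (294, 254, "Intel P4/2.53 GHz", False),
    (588, 118, "512 MB DDR333 SDRAM", False),
    (294, 83, "80 GB Maxtor hard disk", False),
    (1, 3300, "Wire shelving / switch rack", False),
    (294, 35, "Assembly labor / extended warranty", False),
    (1, 1378, "Power cable", False),
    (294, 95, "Gigabit Ethernet PCI card", True),
    (1, 4000, "Cat6 Ethernet cable", True),
    (1, 186175, "Foundry FastIron switches", True),
]


def summarize(items, nodes=294):
    """Return total cost, cost per node, and the networking portion."""
    total = sum(qty * price for qty, price, _, _ in items)
    network = sum(qty * price for qty, price, _, is_net in items if is_net)
    return {
        "total": total,
        "per_node": total / nodes,
        "network_per_node": network / nodes,
        "network_share": network / total,
    }


if __name__ == "__main__":
    s = summarize(ITEMS)
    print(f"Total:            ${s['total']}")
    print(f"Per node:         ${s['per_node']:.0f}")
    print(f"Network per node: ${s['network_per_node']:.0f}")
    print(f"Network share:    {s['network_share']:.0%}")
```

Run as a script, this reproduces the sum of $483855, the $1646 per node, and the $742 (45%) networking portion quoted above.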
What about Reliability?
During the installation of the cluster and the initial large Linpack
benchmark runs they identified the following defective hardware:
- 3 power supplies
- 6 disk drives
- 4 motherboards
- 6 sticks of DRAM
- 1 ethernet card
During the six months following these initial failures, the following
hardware failed:
- 2 power supplies
- 5 disk drives
- 1 motherboard
- 3 sticks of DRAM
- 1 loose fan connector
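To put these counts in proportion, they can be expressed as a percentage of the installed parts. The sketch below is illustrative only: the installed counts for disks, DRAM sticks and Ethernet cards come from the price table, while one power supply and one motherboard per chassis is an assumption, and the names `INSTALLED`, `INITIAL`, `SIX_MONTHS` and `failure_rates` are hypothetical. The loose fan connector is left out because no installed count is given for it.

```python
# Installed parts. Disk, DRAM and Ethernet card counts are from the price table;
# one power supply and one motherboard per chassis is an assumption.
INSTALLED = {
    "power supply": 294,
    "disk drive": 294,
    "motherboard": 294,
    "DRAM stick": 588,
    "ethernet card": 294,
}

# Defects found during installation and the initial Linpack runs.
INITIAL = {
    "power supply": 3,
    "disk drive": 6,
    "motherboard": 4,
    "DRAM stick": 6,
    "ethernet card": 1,
}

# Failures in the following six months.
SIX_MONTHS = {
    "power supply": 2,
    "disk drive": 5,
    "motherboard": 1,
    "DRAM stick": 3,
}


def failure_rates(failures, installed):
    """Return failures as a percentage of installed parts, per part type."""
    return {
        part: 100.0 * failures.get(part, 0) / count
        for part, count in installed.items()
    }


if __name__ == "__main__":
    for label, failures in (("initial", INITIAL), ("six months", SIX_MONTHS)):
        for part, pct in failure_rates(failures, INSTALLED).items():
            print(f"{label:10s} {part:14s} {pct:.2f}%")
```

Under these assumptions, every part type stays at roughly 2% or below in both periods.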
Their advice?
"Premature optimization is the root of all evil." -Hoare
More Information about building a Linux cluster
We have compiled more detailed information about How to
successfully build a Linux Compute Cluster.