SiCortex Technical Summary

A White Paper by Matt Reilly, Lawrence C. Stewart, Judson Leonard, and David Gingold
December 2006/revised April 2008
16 pages

In recent years, the use of clusters for high-performance technical computing (HPTC) has grown phenomenally. But users of current cluster systems encounter a number of unpleasant realities.

  • Commodity-based clusters seldom deliver more than a small fraction of their peak compute rate because real HPTC applications spend most of their time waiting for data from memory.
  • Applications on commodity-based clusters have not scaled well to large numbers of processors. As long as interprocessor communication is viewed as an I/O function, message operations will take more time than they should.
  • Commodity-based clusters are unreliable. While a single node in a cluster might offer mean-time-to-crash figures of a year or more, this is inadequate when systems are built from hundreds or thousands of nodes.
  • Commodity-based clusters use too much power.
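
The reliability point above follows from simple arithmetic. As a rough sketch, assuming node crashes are independent and occur at a constant rate (a simplifying assumption that this summary does not state), the expected time between crashes anywhere in the system is the per-node figure divided by the node count. The function name and the node counts below are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def system_mean_time_to_crash(node_mttc_hours: float, node_count: int) -> float:
    """Expected hours between crashes anywhere in the system.

    Assumes independent nodes with constant crash rates, so the
    system crash rate is the sum of the per-node rates.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_mttc_hours / node_count


if __name__ == "__main__":
    # Hypothetical cluster sizes, each node crashing about once a year.
    for nodes in (1, 100, 1000, 5000):
        hours = system_mean_time_to_crash(HOURS_PER_YEAR, nodes)
        print(f"{nodes:>5} nodes: one crash every {hours:,.2f} hours")
```

Under these assumptions, a 1,000-node cluster whose nodes each crash about once a year would see a crash somewhere in the system roughly every 8.8 hours.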