This section contains descriptions of SiCortex technical papers published at conferences. Click the link to get more information about how to obtain the paper.
Authors: Godiwala, Leonard, Reilly
Publication date: August 2008
Abstract:
Much of high performance technical computing has moved from shared memory architectures to message-based cluster systems. The development and wide adoption of the MPI parallel programming model have hastened this transition. Parallel scaling, however, is frequently limited by the inefficient communication hardware commonly found in commodity-based clusters. This paper describes a new communication network (the SiCortex fabric) employed in the SiCortex SC5832 integrated cluster system. The fabric switch and communications controller are integrated with a single-chip multiprocessor node and provide three point-to-point links per node chip. The resulting design provides low-latency, high-bandwidth, reliable communication between the 972 nodes of the SiCortex system.
For more information, see Network Fabric
Authors: Stewart, Leonard, Gingold, Watkins
Publication date: October 2007
Abstract:
The SiCortex cluster systems implement a high-bandwidth, low-latency interconnect supporting direct communication between processes running on the cluster’s nodes. Within a node, this network interface (NI) hardware is closely coupled to the node’s processors and cache, all of which are contained within a single chip. We describe how the SiCortex systems implement RDMA, in particular: the hardware and software mechanisms that support zero-copy data transfers and user-level networking (OS bypass), how an optimistic virtual memory registration scheme eliminates the need for page locking, and how these mechanisms support an efficient MPI implementation. Finally, we provide preliminary performance results.
For more information, see DMA Registration
Authors: Jud Leonard
Publication date: March 2007
Abstract:
The Kautz digraph [1,2] has been described as the ideal communication network for parallel computers, but it is generally unknown in the engineering community, and has never previously been used in a commercial product. We will define and characterize it in relation to HPC clusters, then discuss some of the implementation issues encountered in developing it as a large-scale cluster interconnect.
For more information, see Kautz Digraph
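As a rough illustration of the Kautz digraph named in the abstract above, the sketch below builds one using the standard textbook construction: each node is a string over an alphabet of degree + 1 symbols in which no two adjacent symbols are equal, and each node links to the strings obtained by dropping its first symbol and appending a new one. The function names (`kautz_nodes`, `kautz_successors`, `diameter`) are hypothetical, and the choice of degree 3 with labels of length 6 is an assumption made only because it matches the three links per node and 972 nodes mentioned in the fabric abstract; the papers may label or route differently.

```python
from collections import deque
from itertools import product


def kautz_nodes(degree, length):
    """All strings of `length` symbols over an alphabet of degree + 1
    symbols in which no two adjacent symbols are equal."""
    alphabet = range(degree + 1)
    return [
        s
        for s in product(alphabet, repeat=length)
        if all(a != b for a, b in zip(s, s[1:]))
    ]


def kautz_successors(node, degree):
    """Out-neighbors of `node`: shift left by one symbol and append any
    symbol that differs from the node's current last symbol."""
    return [node[1:] + (c,) for c in range(degree + 1) if c != node[-1]]


def diameter(nodes, degree):
    """Longest shortest path in the digraph, found by a breadth-first
    search from every node (fine for graphs of this size)."""
    worst = 0
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in kautz_successors(u, degree):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    # Assumed parameters: degree 3, label length 6.
    nodes = kautz_nodes(3, 6)
    print("nodes:", len(nodes))
    print("out-links per node:", len(kautz_successors(nodes[0], 3)))
    print("diameter:", diameter(nodes, 3))
```

With these assumed parameters the construction yields (3 + 1) x 3^5 = 972 nodes, each with three outgoing links, which is why the graph scales to many nodes while keeping the number of hops between any pair small.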
Authors: P. Mucci, T. Mohan
Publication date: June 2007
Abstract:
As Linux/MPI clusters have continued to replace more traditional, closed-source HPC systems, the quality of the performance tool suite available to the application engineer has suffered. Linux and the open source HPC community have enabled a new wave of innovation in performance tool development, creating tools that match and improve upon their closed-source predecessors. Unfortunately, this work has been hindered by the "last mile" problem that affects many open source efforts. Excellent work has been done, but open source performance tools have rarely reached the state of maturity that would allow mainstream production use. As a result, no major Linux distribution or supported installation contains a performance tools suite. Recognizing that the power of Open Source lies in innovation, not necessarily in software engineering, SiCortex has taken the unique approach of integrating a set of high quality Open Source performance tools into a single suite, enhancing their usability, stability and performance while providing additional functionality. The result is a comprehensive Open Source performance analysis suite commensurate with the best offered by its closed-source competitors. Consistent with the Open Source community support model, the original developers have been contacted and every effort has been made to feed the changes upstream. In this paper, we present an overview of the SiCortex performance tools suite as well as the challenges and successes we had in the process of realizing it.
For more information, see Performance Tool Suite
Authors: Leonard, Purkayastha, Reilly, Mohan
Publication date: September 2007
Abstract:
The Kautz digraph [1,2] has been described as the ideal communication network for parallel computers, but it is generally unknown in the engineering community, and has never previously been used in a commercial product. We will define and characterize it in relation to HPC clusters, then discuss some of the implementation issues encountered in developing it as a large-scale cluster interconnect. The software interface to this interconnect is explained, along with some initial experience with performance benchmarks.
For more information, see Software Interface
Authors: Oleg Petlin and Wilson Snyder
Publication date: June 2007
Abstract:
This paper discusses functional verification of the SiCortex multiprocessor compute node. It is shown that the implementation of a reusable verification methodology applicable at the block and chip levels, combined with a flexible SystemC testbench design, increases the level of verification productivity. It is also demonstrated how verification productivity can be improved by using open source verification tools. The simulation approach described in the paper provides a powerful mechanism for controlling the simulation speed, accuracy, and overall verification cost. As a result, the SiCortex verification team was able to find more bugs faster and to start co-verification in the early stages of project development.
For more information, see Functional Verification