Quadrics Linux Software
As the dominant interconnect technology in the world's top 10 supercomputers, QsNet is recognized as the most capable high-performance network. To consolidate and extend this position, and to promote the development of high-performance commodity clusters, Quadrics has released the core components of the QsNet software for Linux under the GNU General Public License. OEM customers and end users can use these sources to support new versions of the Linux kernel, port to new architectures, add functionality, and install modified versions.
Quadrics Linux Open Source Release
The following components form the Quadrics open source release for Linux:
QsNet kernel device modules (including IP).
QsNet user libraries.
QsNet MPI libraries.
QsNet diagnostic utilities.
QsNet co-processor development kit.
This software is distributed via the Quadrics website, www.quadrics.com; sources are provided under the GNU General Public License. Quadrics supports this software for OEM customers and end users.
QsNet kernel device modules
The kernel device drivers and modules that support the QsNet hardware are provided as pre-built kernel kits based on the current Red Hat distributions for each of the supported architectures (Pentium®, Itanium®, and Alpha) and on the SuSE distribution for Opteron. Sources are provided for the benefit of sites that use other releases of Linux or wish to port to new architectures. The following modules are provided:
elan3
The Elan3 driver. Manages device configuration, interrupts and network error correction protocols. Provides support for direct user access to QsNet (OS bypass).
elan4
The Elan4 driver. Manages device configuration, interrupts and network error correction protocols. Provides support for direct user access to QsNet (OS bypass).
elan
Device-independent Elan operations, including Elan capability management and statistics collection.
ep
Elan kernel communications. Provides kernel message passing and cluster membership services for QsNet. Includes support for multiple network rails.
eip
Provides IP over QsNet using Elan kernel communications.
rms
Maintains information about the processes that are part of each parallel program. Supports signal delivery and scheduling operations on parallel programs rather than individual processes.
jtag
Parallel port driver for the QsNet switch control interface. Required for systems with directly connected switches.
qsnet
Utility module containing code common to the other Quadrics modules.
QsNet user libraries
The qsnetlibs package provides user libraries supporting direct access to the QsNet hardware. The libraries include support for DMAs, message passing, synchronization and collectives (see the Elan Programming Manual for details). A wide range of optimisations are supported:
Elan DMA for remote put/get
Elan queuing DMA for message passing
Local shared memory for message passing within a node
Support for multiple QsNet data networks, including rail striping and optimized collectives
QsNet hardware broadcast and synchronization support.
The qsnetlibs package includes an optimized implementation of Shmem for QsNet.
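As a rough illustration of the rail striping idea listed above, the following sketch splits one message into fixed-size chunks, assigns them to rails in round-robin order, and reassembles them at the destination. It is not the qsnetlibs implementation: the function names `stripe` and `reassemble`, the round-robin policy, and the 4096-byte chunk size are all assumptions made for this example.

```python
def stripe(message: bytes, num_rails: int, chunk_size: int = 4096):
    """Assign consecutive chunks of `message` to rails in round-robin order.

    Hypothetical helper: returns one list per rail, where each entry is an
    (offset, chunk) pair so the receiver can place the data correctly.
    """
    if num_rails < 1 or chunk_size < 1:
        raise ValueError("num_rails and chunk_size must be positive")
    rails = [[] for _ in range(num_rails)]
    for index, offset in enumerate(range(0, len(message), chunk_size)):
        chunk = message[offset:offset + chunk_size]
        rails[index % num_rails].append((offset, chunk))
    return rails


def reassemble(rails, total_length: int) -> bytes:
    """Rebuild the original message from the per-rail (offset, chunk) lists."""
    buffer = bytearray(total_length)
    for rail in rails:
        for offset, chunk in rail:
            buffer[offset:offset + len(chunk)] = chunk
    return bytes(buffer)
```

Because every chunk carries its own offset, the rails can deliver their chunks in any order; a real implementation would also have to handle rail failure and completion notification, which this sketch leaves out.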
QsNet MPI libraries
Quadrics provides an enhanced version of the ANL MPICH reference implementation of MPI, implemented directly on the low-latency libraries provided as part of the QsNet product. MPI-2 features include one-sided communication and MPI-IO. Quadrics MPI includes support for TotalView message queue debugging.
QsNet diagnostic utilities
The qsnetdiags package contains utilities used to check the configuration and integrity of the QsNet network.
QsNet co-processor development kit
The QsNet PCI adaptor card (Elan) incorporates a programmable RISC-based co-processor. Kernel and user-level programmers can write and load new code for this unit using the GNU-based development tools. Note: this kit is required only by user and kernel developers who will be recompiling the qsnetlibs libraries or writing new co-processor code. It is not needed to run parallel jobs or to compile against and use MPI on a QsNet system.
QsNet documentation
QsNet manuals are provided in .html and .pdf format on the Quadrics website. The following manuals are available:
Elan Kernel Communications Manual.
Elan Programming Manual.
QsNet Installation and Diagnostics Manual.
RMS Reference Manual.
RMS User Guide.
Shmem Programming Manual.
Resource Management System
The Quadrics Resource Management System (RMS) provides centralized management of a Linux cluster connected by QsNet or Ethernet. RMS features include scheduling of parallel jobs, scalable job startup, parallel run-time services (stdio forwarding, signal delivery, etc.) and job clean-up, together with cluster-wide access control and accounting. The RMS scheduler provides priority-based gang scheduling and timeslicing of parallel jobs. RMS also includes subsystems for switch network management, node status monitoring (with automatic configuring out of failed nodes) and event handling. RMS provides integration with the LSF® and OpenPBS batch systems and the TotalView™ parallel debugger. RMS is licensed on a per-CPU basis.
Linux Software Availability
Quadrics software is available for Pentium® and Alpha systems running Red Hat Linux, Itanium® systems running Red Hat Advanced Server, and Opteron systems running SuSE Linux. Support is also provided for the vanilla Linux kernel. See www.quadrics.com/linux for up-to-date information on supported versions. Installed systems range from small clusters to some of the world's largest Linux supercomputers at Lawrence Livermore and Pacific Northwest National Labs.
Quadrics provides third-level software support of QsNet products to OEM customers and end users. Additional services include system configuration, installation planning, performance tuning, second-level support and custom software development. See www.quadrics.com/support for our web-based fault reporting system and for details of our support products.
Linpack performance of QsNet Linux systems
| System | CPU type | CPUs | Linpack (GF) |
|---|---|---|---|
| PNNL | 1.5 GHz Itanium 2 | 1936 | 8633 |
| LLNL - MCR | 2.4 GHz Pentium 4 | 2304 | 7634 |
| LLNL - ALC | 2.4 GHz Pentium 4 | 1920 | 6585 |
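To compare these systems on a per-processor basis, the Linpack figures above can be divided by the CPU counts. The short sketch below does only that arithmetic; the CPU counts and GF values are copied from the table, while the names `RESULTS` and `gflops_per_cpu` are introduced here purely for illustration.

```python
# (system, CPUs, Linpack result in GF) copied from the table above.
RESULTS = [
    ("PNNL", 1936, 8633),
    ("LLNL - MCR", 2304, 7634),
    ("LLNL - ALC", 1920, 6585),
]


def gflops_per_cpu(cpus: int, linpack_gf: float) -> float:
    """Return the sustained Linpack rate per CPU, in GF."""
    return linpack_gf / cpus


# Print one line per system with the per-CPU figure to two decimals.
for system, cpus, linpack_gf in RESULTS:
    rate = gflops_per_cpu(cpus, linpack_gf)
    print(f"{system:<12}{cpus:>6} CPUs {rate:6.2f} GF per CPU")
```

This works out to roughly 4.46 GF per CPU for PNNL, 3.31 for LLNL - MCR and 3.43 for LLNL - ALC. These are sustained Linpack rates, not peak figures, and they reflect the whole system (processor, memory and interconnect) rather than QsNet alone.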
"Cluster File Systems, Inc. (CFS) maintains the Lustre file system, with optimized support for the Quadrics QsNet high performance interconnect. Lustre is a scalable cluster file system with POSIX semantics, no single points of failure and high performance (see www.lustre.org). Lustre is currently being installed on clusters with QsNet of approximately 1,000 nodes. Lustre is layered on the Portals package (see www.sandiaportals.org). Portals has pluggable device support, in the form of Portals Network Abstraction Layers (NAL's) which Lustre uses to run over many different and heterogeneous networks, including QsNet. Lawrence Livermore National Laboratory and Cluster File Systems have written a Portals NAL which is used by Lustre, and included in the Lustre and Sandia portals software. Initial indications of performance are that, in optimal situations, Lustre can write up to 220MBytes/s from a single client over QsNet and that I/O throughput scales linearly when more Object Storage Targets are added."
Peter J Braam, CEO, Cluster File Systems Inc.
"The type of Science and Engineering calculations required by Livermore's national security mission require a cluster of this size and a very high bandwidth and low-latency interconnect that provides demonstrable and scalable performance. The 11.2 teraflop/s system will significantly expand the computing resources available to Livermore's unclassified researchers."
Mark Seager, Asst. Dept. Head for TeraScale Systems, Lawrence Livermore National Laboratory
QsNet performance is dependent upon the host PCI interface. Performance figures given in this document are indicative of what can be achieved, but do not represent a commitment for any particular system.
TotalView is a trademark of Etnus LLC.
LSF® is a trademark of Platform Computing Inc.
Pentium® and Itanium® are trademarks of Intel Corporation.
Opteron is a trademark of Advanced Micro Devices.
Linux is a trademark of Linus Torvalds.
Red Hat is a trademark of Red Hat Inc.
All other trademarks are the property of their respective owners.