System overview
The Hoffman2 Cluster is a Linux compute cluster currently consisting of more than 800 64-bit nodes and over 26,000 cores, with an aggregate of over 174 TB of memory. Each node has a 1 Gb Ethernet connection and a DDR, QDR, FDR, or EDR InfiniBand interconnect. Several nodes have one or more GPU cards, with CUDA compute capabilities ranging from 4 to 8.
Batch jobs and interactive sessions on the compute nodes are dispatched via a job scheduler. The Hoffman2 Cluster has a wide variety of software installed and routinely updated, including compilers for C, C++, and Fortran 77/90/95; software libraries; programming languages; many applications (including several licensed packages) specific to various fields; and visualization, rendering, and an array of miscellaneous tools. License-manager services are also available for groups who have purchased licensed software, and software installation support is likewise provided.
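As a rough illustration of how batch jobs are dispatched through the scheduler, the sketch below shows a minimal Grid Engine-style job script. The resource values, module name, and program name are assumptions for illustration only; consult the cluster documentation and `module avail` for the actual options available to you.

```shell
#!/bin/bash
#$ -cwd                        # run the job from the submission directory
#$ -l h_rt=1:00:00,h_data=2G   # example request: 1 hour runtime, 2 GB memory
#$ -o joblog.$JOB_ID           # write scheduler output to joblog.<job id>

# Load a compiler or application module (name is hypothetical; check `module avail`)
module load gcc

# Run your program (hypothetical executable)
./my_program
```

A script like this would typically be submitted with `qsub`, while interactive sessions are usually requested through the scheduler's interactive command (e.g. `qrsh`) rather than by logging into compute nodes directly.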
The Hoffman2 Cluster is open, free of charge, to the entire UCLA campus, with a base amount of computational and storage resources. Researchers can purchase additional computational and storage resources to increase their capacity. Computational resources owned by research groups can be accessed in a preferential mode (in which each group accesses its own resources with higher priority and for extended run times) or in a shared/condominium mode (in which unused resources from one group can be accessed, for run times of up to 24 hours, by any other group that has purchased computational resources into the cluster). The advantage of the shared model is that researchers can access a much wider set of resources than they have contributed. Additional benefits for researchers include complete system administration for contributed cores, cluster access through dual, redundant 100 Gb/s network interconnects to the campus backbone, the ability to run large parallel jobs over the cluster's InfiniBand interconnect, and access to multi-node NetApp and VAST storage systems. Current HPC storage capacity is six petabytes of NetApp storage and two petabytes of flash-based VAST storage for home, project, and scratch directories.
The cluster is also an endpoint on the Globus Online service via the 100 Gb/s network backbone, providing researchers a facility for fast and reliable data movement between Hoffman2 and most leadership-class facilities across the USA.
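To give a sense of how the Globus endpoint might be used, the commands below sketch a transfer with the Globus CLI. The endpoint search string, UUIDs, and paths are placeholders, not the cluster's actual identifiers; look up the real endpoint through the Globus web interface or `globus endpoint search`.

```shell
# Locate the cluster's endpoint (search string is an assumption)
globus endpoint search "UCLA Hoffman2"

# Transfer a file between endpoints; both UUIDs and paths are placeholders
globus transfer SRC_ENDPOINT_UUID:~/data/results.tar.gz \
                DST_ENDPOINT_UUID:/project/results.tar.gz \
                --label "hoffman2-transfer"
```

Transfers submitted this way run asynchronously under Globus's management, which handles retries and integrity checking, so large data movements do not depend on keeping a login session open.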