System overview

Hoffman2 is a Linux compute cluster currently consisting of more than 1,300 64-bit nodes and over 21,000 cores, with an aggregate of over 50 TB of memory. Each node has a 1 Gb/s Ethernet connection and a DDR, QDR, FDR, or EDR InfiniBand interconnect. Over 200 of the nodes have 36 cores and 192 GB of memory. Several nodes have one or more GPU cards with compute capability ranging from 4 to 7.5. The current peak CPU performance of the cluster is approximately 150 trillion double-precision floating-point operations per second (TFLOPS), plus another 200 TFLOPS from GPUs. The Hoffman2 Cluster is currently the largest and most powerful shared cluster in the University of California system.

Batch jobs and interactive sessions on the compute nodes are dispatched via a job scheduler. A variety of software is installed on the Hoffman2 Cluster and routinely updated, including compilers for C, C++, and Fortran 77, 90, and 95; software libraries; programming languages; many field-specific applications (including several licensed packages); and visualization, rendering, and miscellaneous software. License manager services are also available for groups that have purchased licensed software. Software installation support is likewise provided.

The Hoffman2 Cluster is open, free of charge, to the entire UCLA campus with a base amount of computational and storage resources. Researchers can purchase additional computational and storage resources to increase their capacity. Computational resources owned by a research group can be accessed in preferential mode (in which the group accesses only its own resources, with higher priority and for extended run times) or in shared/condominium mode (in which a group's unused resources can be used by any other group that has purchased computational resources in the cluster). The advantage of the shared model is that researchers can access a much wider set of resources than they have contributed. Additional benefits for researchers include complete system administration for contributed cores, cluster access through dual, redundant 100 Gb/s network interconnects to the campus backbone, the capability to run large parallel jobs that take advantage of the cluster’s InfiniBand interconnect, and access to a multi-node NetApp storage system. Current HPC storage capacity is 2.5 PB, augmented by 250 TB of flash-based storage for home and scratch directories and over 2 PB of backup storage.

The cluster is also an endpoint on the Globus Online service, using the 100 Gb/s network backbone, which gives researchers a fast and reliable way to move data between Hoffman2 and most leadership-class facilities across the United States.