The Cluster

The cluster is made up of 1 head node and 10 compute nodes. Of the compute nodes, 8 are AMD-based systems and the other 2 are Intel-based systems. The hardware in each node is listed below:

Repairs: The DGX1 node is currently unavailable due to a faulty GPU board. There is no ETA for the repair.

| Node Type | Node Name | Processor | RAM | HDD (/tmp) | Extra Components |
| --- | --- | --- | --- | --- | --- |
| Head Node | robotarium | AMD Opteron 6320 (8 cores) | 64 GB | 916 GB | |
| AMD Compute Node | gpu01, gpu02, gpu03, gpu04, gpu05, gpu06 | 4x AMD Opteron 6376 (64 cores) | 512 GB | 40 GB | NVIDIA K20 GPU (2x in gpu06) |
| AMD Compute Node (Large RAM) | gpu07, gpu08 | 4x AMD Opteron 6376 (64 cores) | 1024 GB | 40 GB | NVIDIA K20 GPU (only in gpu07), 2x NVIDIA Titan Xp GPUs (only in gpu08) |
| Intel Compute Node | mic01, mic02 | 2x Intel Xeon E5-2650v2 (16 cores) | 128 GB | 40 GB | |
| NVIDIA DGX1 Node | dgx01 | 2x Intel Xeon E5-2698v4 (40 cores) | 512 GB | 440 GB (7 TB at /raid/scratch) | 8x NVIDIA Tesla P100 |
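
The aggregate capacity implied by the table above can be tallied with a short script. This is only a sketch: it assumes the core counts in parentheses are per-node totals, it leaves out the head node, and it counts dgx01 alongside the ten AMD and Intel compute nodes even though that node is currently out for repair. The `NodeGroup` class and `totals` function are illustrative names, not part of any cluster tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One row of the hardware table (hypothetical helper type)."""
    label: str
    names: tuple
    cores_per_node: int
    ram_gb_per_node: int


# Figures copied from the hardware table; the head node is omitted.
GROUPS = (
    NodeGroup("AMD", ("gpu01", "gpu02", "gpu03", "gpu04", "gpu05", "gpu06"), 64, 512),
    NodeGroup("AMD (large RAM)", ("gpu07", "gpu08"), 64, 1024),
    NodeGroup("Intel", ("mic01", "mic02"), 16, 128),
    NodeGroup("DGX1", ("dgx01",), 40, 512),
)


def totals(groups):
    """Return (node count, total cores, total RAM in GB) across all groups."""
    nodes = sum(len(g.names) for g in groups)
    cores = sum(len(g.names) * g.cores_per_node for g in groups)
    ram_gb = sum(len(g.names) * g.ram_gb_per_node for g in groups)
    return nodes, cores, ram_gb


if __name__ == "__main__":
    nodes, cores, ram_gb = totals(GROUPS)
    print(f"{nodes} nodes, {cores} cores, {ram_gb} GB RAM")
```

Under these assumptions the script reports 11 nodes, 584 cores, and 5888 GB of RAM.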

In addition to the nodes listed above, the cluster has eight special-purpose Intel MIC compute units:

| Node Type | Node Name | Processor | Onboard Memory | Extra Components |
| --- | --- | --- | --- | --- |
| Intel MIC Compute Units | mic01-0, mic01-1, mic01-2, mic01-3, mic02-0, mic02-1, mic02-2, mic02-3 | Intel Xeon Phi 5120D | 8 GB | |

All of the nodes are connected by both a 1 Gbps network and an InfiniBand 4x 10 Gbps network. We support the full OFED (OpenFabrics Enterprise Distribution) stack.

The nodes are made available through the SLURM resource management system. More information on queues can be found here.
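
As a rough illustration of requesting one of these nodes through SLURM, the sketch below generates a batch script that can be passed to `sbatch`. The helper `build_sbatch_script` and its default values are hypothetical. The script sets no partition, because queue names are not given on this page, and the `--gres=gpu:N` line assumes that GPUs are exposed as a generic resource named `gpu` on this cluster, which should be checked against the queue documentation.

```python
def build_sbatch_script(job_name, command, node=None, cpus=1,
                        mem_gb=4, gpus=0, time_limit="01:00:00"):
    """Return the text of a SLURM batch script (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    if node:
        # Pin the job to a specific host, e.g. gpu08 for the Titan Xp GPUs.
        lines.append(f"#SBATCH --nodelist={node}")
    if gpus:
        # Assumption: GPUs are configured as a generic resource named "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines += ["", command]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: ask for both Titan Xp GPUs on gpu08 and list them.
    print(build_sbatch_script("gpu-check", "nvidia-smi",
                              node="gpu08", cpus=4, mem_gb=16, gpus=2),
          end="")
```

Saving the output to a file such as `job.sh` and running `sbatch job.sh` on the head node would submit it, provided the requested resources match how the cluster is actually configured.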

Details about the OS and available software packages can be found on the software information page.