Basic IT Infrastructure

CIIRC Computational Cluster Description

The CIIRC computational cluster serves CIIRC researchers, students, and their collaborators for intensive batch computations. It also offers acceleration on graphics cards. The cluster is continuously being extended and improved.

The CIIRC cluster is an XSEDE Compatible Basic Cluster based on the OpenHPC project. It runs the CentOS 7 operating system and uses Slurm as its workload manager and job scheduler. Cluster management and orchestration are handled by the Warewulf toolkit.

User software is managed by EasyBuild, a software build and installation framework, and is made available through the Lmod environment module system. The container platform in the CIIRC cluster is Singularity.

The current cluster comprises 12 nodes, divided into GPU and compute partitions. See the table below, in the form provided by Slurm:

PARTITION  NODES  NODELIST         CPUS  MEMORY(MB)  GRES
gpu        1      node-01          56    257,552     gpu:1080Ti:1
gpu        4      node-[14-17]     72    192,088     gpu:1080Ti:4
gpu        2      node-[11-12]     56    257,552     gpu:K40:2
compute*   5      node-[02-05,09]  56    257,552     (null)

Nodes are interconnected by 10 Gbit Ethernet; 100 Gbit EDR InfiniBand (IB) is also available. All nodes contain a local SSD scratch disk. The whole cluster is connected to the 600 TB Isilon NAS storage. In the table, "compute*" marks the default partition, intended for computations without GPU acceleration. "GRES" means generic resources in Slurm; for example, gpu:1080Ti:4 denotes four 1080Ti GPUs per node.
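
The node lists in the table use Slurm's compressed host-list notation. The following minimal Python sketch expands that notation and tallies nodes and GPUs from the table rows. It assumes only the simple single-bracket form that appears in the table; the function names (expand_nodelist, gpus_per_node, summarize) are illustrative and not part of Slurm.

```python
import re

# Rows copied from the partition table: (partition, node count, node list, GRES).
PARTITIONS = [
    ("gpu", 1, "node-01", "gpu:1080Ti:1"),
    ("gpu", 4, "node-[14-17]", "gpu:1080Ti:4"),
    ("gpu", 2, "node-[11-12]", "gpu:K40:2"),
    ("compute", 5, "node-[02-05,09]", "(null)"),
]


def expand_nodelist(nodelist):
    """Expand a single-bracket node list such as 'node-[02-05,09]'."""
    match = re.fullmatch(r"(.*)\[(.*)\]", nodelist)
    if match is None:
        return [nodelist]  # plain host name, nothing to expand
    prefix, body = match.groups()
    hosts = []
    for item in body.split(","):
        if "-" in item:
            low, high = item.split("-")
            width = len(low)  # preserve zero padding, e.g. 2 -> "02"
            for n in range(int(low), int(high) + 1):
                hosts.append(f"{prefix}{n:0{width}d}")
        else:
            hosts.append(prefix + item)
    return hosts


def gpus_per_node(gres):
    """Read the trailing count from a GRES string like 'gpu:1080Ti:4'."""
    if not gres.startswith("gpu:"):
        return 0  # '(null)' means the nodes have no generic resources
    return int(gres.split(":")[-1])


def summarize(rows):
    """Return total node and GPU counts, checking each row's stated node count."""
    nodes = 0
    gpus = 0
    for partition, count, nodelist, gres in rows:
        hosts = expand_nodelist(nodelist)
        if len(hosts) != count:
            raise ValueError(f"{partition}: expected {count} nodes, got {len(hosts)}")
        nodes += len(hosts)
        gpus += len(hosts) * gpus_per_node(gres)
    return {"nodes": nodes, "gpus": gpus}


if __name__ == "__main__":
    print(summarize(PARTITIONS))
```

On the cluster itself, Slurm's own `scontrol show hostnames` command performs the same expansion and handles the full notation.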

The present Slurm configuration is fairly standard (simplified):

  • SelectType=select/cons_res
  • SelectTypeParameters=CR_CPU_Memory
  • SchedulerType=sched/backfill
  • PriorityType=priority/multifactor
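
With select/cons_res and CR_CPU_Memory, Slurm treats CPUs and memory as consumable resources, so several jobs can share one node and each job should state how many CPUs and how much memory it needs. A time limit also gives the backfill scheduler what it needs to fit shorter jobs into gaps. The sketch below builds a batch script with such requests. The helper name make_batch_script, its default values, and the example job are hypothetical; only the #SBATCH options are standard sbatch options, and the site may require additional ones.

```python
def make_batch_script(job_name, partition, cpus, mem_gb, gpus=0,
                      time_limit="01:00:00", command="hostname"):
    """Return the text of a Slurm batch script with explicit resource requests."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --cpus-per-task={cpus}",  # CPUs are a consumable resource
        f"#SBATCH --mem={mem_gb}G",         # memory is a consumable resource
        f"#SBATCH --time={time_limit}",     # a time limit helps backfill scheduling
    ]
    if gpus > 0:
        # Request GPUs through generic resources (GRES).
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines.append("")
    lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: one GPU, four CPUs, 16 GB of memory on the gpu partition.
    print(make_batch_script("demo", "gpu", cpus=4, mem_gb=16, gpus=1,
                            command="nvidia-smi"))
```

The generated text can be saved to a file and submitted with `sbatch`. Omitting the gpus argument and using the compute partition yields a CPU-only job.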

Future extension and plans:

  • At least three nodes will be added soon.
  • BeeGFS distributed scratch space
  • Five DGX-1 nodes by the end of the year 2019
  • Improved cluster throughput through better QOS settings

Funding acknowledgment:
Building and continuously extending the CIIRC computational cluster requires substantial financial resources. The funding comes mainly from national and European Commission projects, e.g. the Czech Government investment that created CIIRC; Josef Urban’s ERC Consolidator project AI4REASON and his project AI & Reasoning; Josef Šivic’s project IMPACT; Robert Babuška’s project R4I; etc.

Responsible: Jan Kreps; Last update 2019-08-08