Original post is here: eklausmeier.goip.de
SLURM is a job scheduler for Linux. It is most useful if you have a cluster of computers, but even with only two or three machines it can be put to good use. SLURM is an acronym for Simple Linux Utility for Resource Management.
SLURM is used by Sunway TaihuLight with 10,649,600 computing cores, and by Tianhe-2 with 3,120,000 Intel Xeon E5 2.2 GHz cores (not CPUs) -- Tianhe-2 is in effect a combination of 16,000 compute nodes. These two Chinese machines were the two most powerful supercomputers in the world until November 2017; they have since been surpassed by Oak Ridge National Laboratory's Summit and LLNL's Sierra. TaihuLight is located in Wuxi, Tianhe-2 in Guangzhou, China. Europe's fastest supercomputer, Piz Daint, uses SLURM, as does the fastest supercomputer in the Middle East, at KAUST. SLURM is also used by the Odyssey supercomputer at Harvard with 2,140 nodes equipped with AMD Opteron 6376 CPUs. Frankfurt's supercomputer with 862 compute nodes, each equipped with AMD Opteron 6172 processors, also uses SLURM. Munich's SuperMUC-NG uses SLURM, as does Barcelona's supercomputer. Many other HPC sites listed in the Top500 use SLURM.
What is the point of a job scheduler? While a single operating system, like Linux, can manage jobs on a single machine, the SLURM job scheduler can shuffle jobs around lots of machines, thereby evening out the load on all of them.
Installing the software is pretty simple, as it is part of Ubuntu, Debian, Arch Linux, and many other distros.
apt-get install munge
apt-get install slurm-llnl
These packages also add the users munge and slurmd, respectively.
Configuring munge goes like this:
create-munge-key
This command creates /etc/munge/munge.key, which is readable by the user munge only. This file is copied to all machines to the same place, i.e., /etc/munge/munge.key. Once this cryptographic key is created for munge, one can start the munge daemon munged and test whether it works:
$ echo Hello, world | munge | unmunge
STATUS: Success (0)
ENCODE_HOST: chieftec (127.0.1.1)
ENCODE_TIME: 2014-11-07 22:37:30 +0100 (1415396250)
DECODE_TIME: 2014-11-07 22:37:30 +0100 (1415396250)
TTL: 300
CIPHER: aes128 (4)
MAC: sha1 (3)
ZIP: none (0)
UID: klm (1000)
GID: klm (1000)
LENGTH: 13

Hello, world
It is important to use one unique /etc/munge/munge.key file on all machines, and not to run create-munge-key on each machine.
The munged daemon is started by
/etc/init.d/munge start
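Whether authentication also works across machine boundaries can be checked once the key is distributed and munged runs on the remote side as well; a minimal check, assuming the node nuc is reachable via ssh:
munge -n | ssh nuc unmunge    # credential is encoded locally, decoded remotely
If the remote unmunge reports STATUS: Success, the two machines trust each other.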
Once munged is up and running, one configures SLURM by editing /etc/slurm-llnl/slurm.conf.
ControlMachine=nuc
AuthType=auth/munge
CacheGroups=0
CryptoType=crypto/munge
JobCheckpointDir=/var/lib/slurm-llnl/checkpoint
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=2
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
FastSchedule=1
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/linear
AccountingStorageLoc=/var/log/slurm-llnl/slurm_jobacct.log
AccountingStorageType=accounting_storage/filetxt
AccountingStoreJobComment=YES
ClusterName=cluster
JobCompLoc=/var/log/slurm-llnl/slurm_jobcomp.log
JobCompType=jobcomp/filetxt
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
NodeName=chieftec CPUs=8 State=UNKNOWN
NodeName=nuc CPUs=4 State=UNKNOWN
PartitionName=p1 Nodes=chieftec,nuc Default=YES MaxTime=INFINITE State=UP
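The CPUs= values in the NodeName lines have to match the real hardware. If unsure, run slurmd -C on each node; it prints that node's hardware in slurm.conf syntax, ready to be pasted into the file:
slurmd -C    # prints NodeName=..., CPUs=..., RealMemory=... for this machine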
The SLURM daemons are started by
/etc/init.d/slurm-llnl start
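If everything came up, sinfo should list both nodes as idle; the output should look roughly like this (exact values will differ):
$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
p1*          up   infinite      2   idle chieftec,nuc
The asterisk marks p1 as the default partition, matching Default=YES above.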
As a further test, run
srun who
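Real work is usually submitted as a batch job rather than run interactively. A minimal job script, here called hello.sh (the name and the #SBATCH values are just placeholders), could look like this:
#!/bin/bash
#SBATCH --job-name=hello        # name shown by squeue
#SBATCH --output=hello-%j.out   # %j is replaced by the job id
#SBATCH --ntasks=1              # a single task suffices for this demo
srun hostname                   # prints the node the job landed on
Submit it with sbatch hello.sh; squeue shows the job while it is pending or running.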
Added 05-Apr-2017: Quote from Quick Start Administrator Guide:
1. Make sure the clocks, users and groups (UIDs and GIDs) are synchronized across the cluster.
2. Install MUNGE for authentication. Make sure that all nodes in your cluster have the same munge.key. Make sure the MUNGE daemon, munged, is started before you start the Slurm daemons.
Added 09-Dec-2017: If you compile SLURM yourself, beware of these compiler flags, or see the build script in the AUR. SLURM also runs on ARM. My personal cluster contains a mixture of Intel, AMD, and ARM CPUs. Make sure to use the same SLURM version on all nodes.
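A quick way to compare versions is to let every node print its locally installed one; a sketch, assuming a three-node cluster:
srun -N3 sinfo --version    # one task per node, each prints its local version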
Added 22-Jun-2019: From the Release Notes for SLURM 19.05: 32-bit builds have been deprecated. Use --enable-deprecated to continue building on 32-bit systems. This is relevant, for example, for 32-bit ARM SLURM builds.
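When building from source on such a system, the flag is passed to configure; a sketch of the usual autotools sequence, not a complete build recipe:
./configure --enable-deprecated
make
make install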
Added 10-Oct-2021: Referenced in Building a Home HPC Computer Cluster (Beowulf?!) Using Ubuntu 14.04, old PC's and Lots of Googling.
Added 16-Jun-2022: You need to create a file cgroup.conf in /etc/slurm-llnl with the following content:
###
# Slurm cgroup support configuration file.
###
CgroupAutomount=yes
CgroupMountpoint=/sys/fs/cgroup
ConstrainCores=yes
ConstrainDevices=yes
ConstrainKmemSpace=no #avoid known Kernel issues
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
The above file was copied from cgroup.conf. It is required if you have
ProctrackType=proctrack/cgroup
configured in your slurm.conf. In that case also add
TaskPlugin=task/cgroup,task/affinity
there.
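Changes to slurm.conf and cgroup.conf only take effect once the daemons have re-read them; two common ways, sketched:
/etc/init.d/slurm-llnl restart    # on every node
scontrol reconfigure              # asks running daemons to re-read slurm.conf (covers most settings)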