Containers¶
Singularity¶
Singularity is a container runtime designed for high-performance computing. It supports Message Passing Interface (MPI) and GPU applications.
Cache¶
Working with Singularity images may require a large amount of storage space. By default, ~/.singularity serves as the cache directory, so keep an eye on your quota with the myquota command. The following environment variables allow you to control the Singularity cache directories. If you share images on a project basis, it's best to set the cache directory inside the project directory.
export SINGULARITY_CACHEDIR=/gpfs/space/projects/<myproject>/SINGULARITY_CACHE
export SINGULARITY_TMPDIR=/tmp
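If the project cache directory doesn't exist yet, you may need to create it first; a minimal sketch, assuming the same project path as above:
mkdir -p /gpfs/space/projects/<myproject>/SINGULARITY_CACHE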
Pulling an image¶
Singularity uses .sif or .simg images and stores them in its cache. You can convert Docker images to Singularity images.
Singularity pull command differs…
Pulling the image below produces the file alpine_3.16.2.sif in the current working directory.
Important
In order to pull with Singularity, the squashfs module has to be loaded. In case of an error, check out the FAQ section.
module load squashfs
singularity pull docker://alpine:3.16.2
In some cases you can use the build command to create the image. Unlike pull, build converts the image to the latest Singularity image format after downloading it:
singularity build <name-of-image.sif> <URI>
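For example, the Alpine image used above could equally be fetched with build instead of pull; a sketch, reusing the tag from earlier with an arbitrary output filename:
module load squashfs
singularity build alpine_3.16.2.sif docker://alpine:3.16.2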
Dockerfile into Singularity image¶
If some software is distributed as a Dockerfile instead of a pre-built image, there are two ways to deal with that.
1. If your local machine has Docker¶
If you have Docker installed on your local machine, for example on a laptop, you can create the Docker image yourself and then transfer it to the cluster, where a Singularity image can be built from it.
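If you haven't built the Docker image yet, a minimal sketch of that build step on your laptop, assuming the Dockerfile sits in the current directory and the tag myimage is just a placeholder:
docker build -t myimage .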
- Having created the Docker image on your local machine, get the image ID by running this command:
docker images
- Save that image as a tar file. For example, if the image ID is 80e2283122ee:
docker save 80e2283122ee -o myimage.tar
- Copy myimage.tar to the UTHPC cluster and then create the Singularity image:
rsync -av myimage.tar <your_username>@rocket.hpc.ut.ee:/gpfs/space/home/<your_username>/
ssh <your_username>@rocket.hpc.ut.ee
cd ~
module load squashfs/4.5.1
singularity build myimage.sif docker-archive://myimage.tar
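A quick sanity check that the converted image works; a sketch, assuming a Linux-based image that ships /etc/os-release:
singularity exec myimage.sif cat /etc/os-release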
2. Converting Dockerfile into Singularity recipe¶
Singularity Python converts a Dockerfile into a Singularity recipe or vice versa. You can install it with pip.
- Load the Python module that provides the spython binary:
module load any/python/3.9.9
- Perform the conversion:
# print in the console
spython recipe Dockerfile
# save in the *.def file
spython recipe Dockerfile &> Singularity.default
Running¶
To run a container's default command with the Singularity image:
singularity run ./myimage.sif <arg-1> <arg-2> ... <arg-N>
To run a specific command, use exec:
singularity exec ./myimage.sif <command> <arg-1> <arg-2> ... <arg-N>
singularity exec ./myimage.sif python3 myscript.py 42
To open a shell inside the container, use shell:
singularity shell ./myimage.sif
Singularity> cat /etc/os-release
Singularity> cd /
Singularity> ls -l
Singularity> exit
Files, storage and mounts¶
A running container doesn't automatically bind mount any paths. There are two ways to mount a path: the --bind/-B option or the environment variable $SINGULARITY_BIND (or $SINGULARITY_BINDPATH). The bind argument is a comma-delimited string in the format src[:dest[:opts]], where src is a path outside the container and dest is the corresponding path inside the container. If dest isn't provided, it's set equal to src. opts may be either ro, meaning read-only, or rw, meaning read/write.
Here are a few examples:
singularity shell --bind /tmp,/gpfs/space/home/<username>/<input files>:/inputs:ro myimage.sif
export SINGULARITY_BIND="/tmp,/gpfs/space/home/<username>/<input files>:/inputs:ro"
singularity shell myimage.sif
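For example, to confirm that a project directory is visible inside the container under a chosen mount point (the /data destination here is purely illustrative):
singularity exec --bind /gpfs/space/projects/<myproject>:/data myimage.sif ls -l /data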
Submitting a job to Slurm¶
- Serial
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --cpus-per-task=1     # number of cores (>1 if multi-threaded tasks)
#SBATCH --mem=4G
#SBATCH --time=00:05:00       # run time limit (HH:MM:SS)

module load any/singularity/3.5.3
singularity run </path/to>/myimage.sif <arg-1> <arg-2> ... <arg-N>
- Parallel MPI Codes
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --nodes=1             # node count
#SBATCH --ntasks=4            # total number of tasks across all nodes
#SBATCH --cpus-per-task=1     # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G      # memory per cpu-core (4G per cpu-core is default)
#SBATCH --time=00:05:00       # total run time limit (HH:MM:SS)

module load broadwell/openmpi/4.1.0
module load any/singularity/3.5.3
srun singularity exec myimage.sif myscript.py
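Either script is submitted like any other Slurm batch script; a sketch, assuming it was saved as myjob.sh:
sbatch myjob.sh
squeue -u $USER    # check that the job is queued or running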
GPUs¶
The container is large, so it's best to build or pull the Docker image into a Singularity Image File (SIF) before starting a job.
singularity pull docker://pytorch/pytorch:latest
singularity pull docker://tensorflow/tensorflow:latest-gpu
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH -c 4
#SBATCH --mem=40G             # total memory for the job (not per core)
#SBATCH --time=01:00:00       # total run time limit (HH:MM:SS)
#SBATCH --gres=gpu:tesla:1 # number of gpus per node
module load any/singularity/3.5.3
singularity exec --nv tensorflow_latest-gpu.sif python3 myscript.py
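To verify that a GPU is actually visible inside the container, a quick check run on a GPU node; nvidia-smi is exposed inside the container by the --nv flag:
singularity exec --nv tensorflow_latest-gpu.sif nvidia-smi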
FAQ¶
Why Singularity, not Docker?
Docker isn't secure because it provides a means to gain root access to the system it's running on. Singularity is a secure alternative to Docker designed for HPC workflows. Singularity is compatible with all Docker images and you can use it with GPUs and MPI applications.
How to deal with FATAL: While making image from oci registry
If at the end of an image pull you are greeted with FATAL: While making image from oci registry: error fetching image to cache: while building SIF from layers: conveyor failed to get: initializing source oci, ensure you have loaded the squashfs module and check that $HOME/.singularity/cache/blob doesn't have root ownership. If it does, write to support@hpc.ut.ee; otherwise try to pull the image again.
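A quick way to inspect that ownership; a sketch, assuming the default cache location under your home directory:
ls -ld $HOME/.singularity/cache/blob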