Introduction¶

Welcome to the sixth lab. We will cover environments and containers. You can consider this lab to be somewhat of an advanced approach to using software

In this lab we will be going over the practical side of creating environments and containers.

Complete

Before we start with the lab, create a directory called lab6 in your course directory. All of the following will happen inside of that directory.

Conda environments¶

In your lab6 directory create another one called conda. We will be installing our environments there.

To work with conda environments, we actually need to have it. Run module load miniconda3 . Do note that when using conda environments in jobs, you need to load the module before you can activate them.

Creating an environment¶

The syntax to create an environment is quite simple: conda create -n <name> package1=version package2=version. To get more options you can run conda create --help.

Complete

Create your virtual environment with conda create --prefix=/gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/conda/firstenv python=3.6 conda

There will be a prompt that shows every python package that will be installed in that environment, accept that.

To activate the environment, run source activate /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/conda/firstenv

If you later want to deactivate it, run conda deactivate

There are a few things to notice here. First, we didn't use -n, but --prefix instead. This allows you to install the environment in a predefined location instead of the default. This enables more control but you have to manage the location yourself. Usually you should use -n because then you can just use source activate <name> instead of specifying the full path.

We installed two packages to the environment, python and conda. These are not necessary but they are a good thing to install to increase independency from the outside environment. If you don't specify python for example, the same python that you have loaded will be linked into the environment. If something should happen to the underlying python, then your env will also break. Adding a separate conda installation to your environment makes managing the environment simpler. When using an outside conda, it might try to install packages in weird locations. Conda will also create the activate script that we will discuss later.

Note

Notice how we used source activate instead of conda activate as the program suggests. When you try to run the latter, conda will prompt you to run conda init, which makes some changes to your ~/.bashrc file. These can be convenient when you are working as a developer but they don't work too well in a typical cluster environment. Slurm won't always pick up those modifications so you might be left with a situation where you can't use the envs.

You can explore the firstenv directory further. You can also run a little script find . -type f | wc -l to list how many files are in that directory.

Managing and updating an environment¶

Technically, conda always has an environment that it's using. When you don't have one loaded it's called Base. Run conda env list to see what environments are available and loaded.

When in an environment, run conda list to see what packages are available.

Complete

With an environment activated, run conda install r-essentials to install R into your environment, technically creating one for that language.

Run the file count script again to see how many files this action created.

Note

You can activate multiple environments, technically stacking them. This should be done when you know exactly what you are doing. Thus try to run conda deactivate to disable one environment before activating another. Restarting your terminal session helps as well.

Exporting an environment¶

You can export environments with exact package specifications for others to use. This is achieved by the conda env export command.

Complete

With the firstenv environment activated, run conda env export >> /gpfs/space/projects/hpc-course/hpc_yourname/lab6/conda/firstenv-export.yaml

You can create environments from export files with the -f flag. Create one with: conda env create --prefix=/gpfs/space/projects/hpc-course/hpc_yourname/lab6/conda/secondenv -f=/gpfs/space/projects/hpc-course/hpc_yourname/lab6/conda/firstenv-export.yaml

Usually when a software package needs an environment, they provide one of these export files. Hopefully now you know how to better manage them with conda.

virtualenv environment¶

Start this part with creating the directory lab6/virtenv.

Run conda deactivate and module purge to clear your environment and then module load py-virtualenv.

The python virtualenv package is good to use when you need mainly pip to install your packages.

We won't go too much in depth here as virtualenv has less functionality than conda and the high level concepts already got introduced with it.

Creating an environment¶

Like export files, pip has something called requirements.txt . We will see how to use it in an environment by installing an open diffusion software.

Complete

Run module load python/3.8 . Don't worry about the warnings currently, they are a necessary evil for next steps. Run virtualenv -p python3.8 /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/virtenv/thirdenv to create the environment.

Run source /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/virtenv/thirdenv/bin/activate to activate the environment.

Run git clone https://github.com/bes-dev/stable_diffusion.openvino.git in your virtenv directory and navigate to it.

Use pip install -r requirements.txt to install all required packages to the environment.

You now have a working virtualenv environment. You can run the source .../bin/activate command again without having to load any modules. You can follow the guide in the GitHub link to generate your own AI images but those are not part of the course.

Spack environments¶

Create a directory lab6/spackenv to get started. We will be using the same Spack installation initialize your Spack that we used in lab2. Activate the setup-env.sh script if you haven't done so in your .bashrc file.

spack env --help will give you an overview of what you can do.

Complete

Run spack env create -d /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/spackenv/fourthenv

Navigate to the directory that you specified. In it is spack.yaml that is used to configure the environment. The magic happens in the .spack-env directory that is hidden in the same place.

Inside the fourthenv directory, run spacktivate -p .. The -p is optional, but makes it pretty by displaying what environment you are in. . denotes the path to the environment.

When you run spack find you will see that you have 0 installed packages. You can install packages to the environment by using spack install but we will be modifying the spack.yaml file instead.

In spack.yaml find the line starting with specs:, change it to:

  specs:
    - python@3.8
    - tcl

Now run spack find again. You will see that the root specs list has been updated but still zero installed packages. Fix that with spack install. The python@3.8 package will be installed separately but the tcl one won't . This is because Spack will install it to the default location and will link them to the view under .spack-env .

The Spack environments documentation has tons of more information on what you can achieve with environments.

Singularity containers¶

In this course we will be going over the basic functionality of Singularity. Using it gets easier with practice so we recommend thinking "can I run this in a container instead" for any workflow that you have.

Start off by creating the lab6/singularity directory and loading the module load singularity module.

You can see Singularity functionality with singularity help

A library of openly available images can be found at the sylabs library

Running containers¶

First you have to get a container. We will use lolcow in this lab. This is the same image that is used in the official Singularity user guides.

Run singularity pull library://lolcow in your lab6/singularity directory. This will download the lolcow_latest.sif image file.

Now you have an image to run, try entering singularity run lolcow_latest.sif and see what happens.

Singularity run vs Singularity exec¶

Singularity containers have programs and commands in them that you can execute. You can target them individually with singularity exec. In our case our command is cowsay. Run singularity exec lolcow_latest.sif cowsay "I love the hpc course" to see what happens.

The containers get built with a runscript section, we will cover that in a bit. That section defines what gets executed by default:

%runscript
    date | cowsay | lolcat

Usually the software documentation has the commands that you can run with the image.

Singularity shell¶

You are able to open a shell session inside a image. This can be used for debugging purposes and getting a better overview of what you are working with. Run singularity shell lolcow_latest.sif to open the terminal, after that you can run cowsay directly from the command line. If you want to write inside the container, you also need to pass the --writable flag.

File management with Singularity¶

Create a test file called hello.txt with a simple Hello written in it. Also run export SINGULARITY_BIND="/gpfs/space/projects/hpc-course/hpc_yourname/lab6/singularity".

Now when you run singularity exec lolcow_latest.sif cat hello.txt you should see the contents of the file. Singularity containers will have access to the same files that you do currently. It might happen that you have a container that is only able to read files from a certain path. For that you can use the --bind flag to mount a specific directory

singularity exec --bind /etc:/mnt lolcow_latest.sif cat /mnt/hosts

We used the SINGULARITY_BIND variable a bit ago to define the --bind options with a variable. Both can be used according to your workflow. In default configuration, only $HOME, /tmp, /proc, /sys, /dev and some more are bound automatically. $PWD used to be as well but it got changed due to bugs with the system not always finding the current working directory. You can reset the $PWD functionality by having export SINGULARITY_BIND=$PWD in your env or scripts.

We can also write to files with programs inside the container: singularity exec lolcow_latest.sif cowsay "hello from the container" >> hello.txt

Building Singularity images¶

The recipes to build Singularity containers are called Singularity definition files. Like Dockerfile, they describe how the image should be built. We will be looking at the lolcow definition file:

BootStrap: library
From: ubuntu:22.04
%post
    apt-get -y update
    apt-get -y install cowsay lolcat
%environment
    export LC_ALL=C
    export PATH=/usr/games:$PATH
%runscript
    date | cowsay | lolcat
%labels
    Author Sylabs

The file describes what should be the base image, what commands to run during build time, the environment and much more. The aforementioned %runscript is also here.

This file would be called lolcow.def or something similar and built with singularity build lolcow.sif lolcow.def.

We will not be building containers in this lab since it needs root access. You are free to install Singularity on your own machine(you can use Spack for that) to try out the building process.

Containers with Spack¶

Spack has a very handy tool called spack containerize that creates image definition files from environments. Move back to the spack directory containing the fourth environment and spack.yaml.

Running spack containerize in that directory will create a Dockerfile by default, creating definition files needs a small addition to the spack.yaml file:

...
  specs: 
    - python@3.8
    - sqlite
    - tcl
  container:
    format: singularity
...

Now run spack containerize >> spackenv.def and view the contents.

What spack does in a nutshell is:

uses a Ubuntu image that has spack inside of it
installs the required packages inside that environment
makes the necessary environment modifications

Cotainr¶

Next we will be looking into how to create containers that contain your conda environment

We will be using a tool called cotainr, and using their official documentation

Complete

Download the latest release with wget https://github.com/DeiC-HPC/cotainr/archive/refs/tags/2023.11.0.tar.gz

You can download it anywhere but you must be able to locate it later.

Unpack the tarball with tar xzvf 2023.11.0.tar.gz

For ease of use, create a directory called $HOME/.local/bin and link the cotainr executable there with ln -s /path/to/cotainr/bin/cotainr $HOME/.local/bin/cotainr . This makes it so that you can execute it directly from the command line

Now you should have cotainr installed, load the latest Singularity module with module load singularity and verify dependencies with cotainr info . Check uses with cotainr help

We will build a container from our first conda environment. To do that, execute:

cotainr build my_conda_env_container.sif --base-image=docker://ubuntu:22.04 --conda-env=firstenv-export.yaml

Try it out with singularity exec my_conda_env_container.sif conda list to see the packages from your env from inside the container.

SHPC¶

Next we will be using a tool called SHPC to install containers from a central registry and use them as modules.

Complete

The install process for shpc is similar to cotainr: wget https://github.com/singularityhub/singularity-hpc/archive/refs/tags/0.1.28.tar.gz

tar xzvf 0.1.28.tar.gz

This time, you cannot execute it directly but must install it via pip. Since it is a general use tool, you can use your base pip environment, aka you don't need to create a new one (although you can if you wish!)

Move to the unpacked directory and run pip install -e .[all]

After installation shpc --help should give you helpful output.

We will be doing some configuration edits to specify our container and module directories. For that create two directories in lab6 called containers and modules.

Set correct paths with shpc config set module_base /path/to/lab6/modules and shpc config set container_base /path/to/lab6/containers . Check all available configuration options with shpc config edit

We will be installing python as our first module. It is as easy as shpc install python . This will download the container and create the python/3.13-rc/module . Since our created module path is not available automatically, enable it with module use /path/to/lab6/modules . Now you should be able to load the aforementioned module.

Go check inside of the module file that was created. You will see a lot of magic done for you and how it uses a container inside. This knowledge is not needed for this course but it is good to know.

Load the module and try running python --version to ensure you got the correct one.

Extra notes¶

The last two programs, cotainr and shpc, are things that we are looking into incorporating into our HPC software stack for multiple reasons. It's best to start using these as soon as possible if you have the need as they should make life much easier for you. Try to look at the links to the official documentation of these, alongside of Singularity, Conda etc.