Introduction¶
Welcome to the sixth lab. We will cover environments and containers, which you can consider a somewhat more advanced approach to using software. In this lab we will go over the practical side of creating and using both.
Complete
Before we start with the lab, create a directory called lab6 in your course directory. All of the following will happen inside that directory.
Conda environments¶
In your lab6
directory create another one called conda
. We will be installing our environments there.
To work with conda environments, we first need conda itself. Run module load miniconda3. Do note that when using conda environments in jobs, you need to load the module before you can activate them.
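As a sketch, a Slurm batch script that uses a conda environment could look like this; the environment path, job name, and time limit are placeholders, not values prescribed by this lab:

```shell
#!/bin/bash
#SBATCH --job-name=conda-demo
#SBATCH --time=00:05:00

# Load the module first; without it, 'source activate' is not available.
module load miniconda3

# Activate the environment by its full path (placeholder path).
source activate /path/to/your/lab6/conda/firstenv

# Anything below runs with the environment's python on PATH.
python --version
```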
Creating an environment¶
The syntax to create an environment is quite simple: conda create -n <name> package1=version package2=version
. To get more options you can run conda create --help
.
Complete
Create your virtual environment with conda create --prefix=/gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/conda/firstenv python=3.6 conda
There will be a prompt listing every python package that will be installed in that environment; accept it.
To activate the environment, run source activate /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/conda/firstenv
If you later want to deactivate it, run conda deactivate
There are a few things to notice here. First, we didn't use -n
, but --prefix
instead. This allows you to install the environment in a predefined location instead of the default, which gives you more control but means you have to manage the location yourself. Usually you should use -n
, because then you can simply run source activate <name>
instead of specifying the full path.
We installed two packages into the environment: python and conda. These are not necessary, but they are good to install to increase independence from the outside environment. If you don't specify python, for example, the same python that you have loaded will be linked into the environment; if something happens to that underlying python, your environment will break as well. Adding a separate conda installation makes managing the environment simpler: an outside conda might try to install packages in unexpected locations. Conda will also create the activate
script that we will discuss later.
Note
Notice how we used source activate
instead of conda activate
as the program suggests. When you try to run the latter, conda will prompt you to run conda init
, which makes some changes to your ~/.bashrc
file. These can be convenient when you are working as a developer but they don't work too well in a typical cluster environment. Slurm
won't always pick up those modifications so you might be left with a situation where you can't use the envs.
You can explore the firstenv directory further. You can also run a little script, find . -type f | wc -l
, to count how many files are in that directory.
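That one-liner is plain shell, so you can try it anywhere. A small self-contained illustration using a throwaway directory:

```shell
# Build a scratch directory tree with three regular files.
mkdir -p /tmp/count-demo/sub
touch /tmp/count-demo/a.txt /tmp/count-demo/b.txt /tmp/count-demo/sub/c.txt

# find prints one line per regular file (recursively); wc -l counts the lines.
find /tmp/count-demo -type f | wc -l   # prints 3
```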
Managing and updating an environment¶
Technically, conda always has an environment that it's using; when you don't have one activated, it's called base
. Run conda env list
to see which environments are available and which one is active.
When in an environment, run conda list
to see what packages are available.
Complete
With an environment activated, run conda install r-essentials
to install R
into your environment, effectively creating an environment for that language as well.
Run the file count script again to see how many files this action created.
Note
You can activate multiple environments, technically stacking them. This should only be done when you know exactly what you are doing, so run conda deactivate
to leave one environment before activating another. Restarting your terminal session helps as well.
Exporting an environment¶
You can export environments with exact package specifications for others to use. This is achieved by the conda env export
command.
Complete
With the firstenv
environment activated, run conda env export > /gpfs/space/projects/hpc-course/hpc_yourname/lab6/conda/firstenv-export.yaml
You can create environments from export files with the -f
flag. Create one with: conda env create --prefix=/gpfs/space/projects/hpc-course/hpc_yourname/lab6/conda/secondenv -f=/gpfs/space/projects/hpc-course/hpc_yourname/lab6/conda/firstenv-export.yaml
Usually, when a software package needs a specific environment, its authors provide one of these export files. Hopefully you now know how to manage them better with conda.
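For orientation, an export file is plain YAML roughly shaped like the sketch below; the actual package list, versions, and prefix will come from your own environment, so treat every value here as illustrative:

```yaml
# Illustrative shape of a conda export file (values are examples only).
name: firstenv
channels:
  - defaults
dependencies:
  - python=3.6
  - conda
prefix: /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/conda/firstenv
```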
virtualenv environment¶
Start this part with creating the directory lab6/virtenv
.
Run conda deactivate
and module purge
to clear your environment and then module load py-virtualenv
.
The python virtualenv package is a good choice when you mainly need pip
to install your packages.
We won't go too much in depth here, as virtualenv
has less functionality than conda and the high-level concepts were already introduced there.
Creating an environment¶
Like conda's export files, pip has something called requirements.txt
. We will see how to use one in an environment by installing an open-source diffusion software package.
Complete
Run module load python/3.8
. Don't worry about the warnings for now; they are a necessary evil for the next steps. Run virtualenv -p python3.8 /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/virtenv/thirdenv
to create the environment.
Run source /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/virtenv/thirdenv/bin/activate
to activate the environment.
Run git clone https://github.com/bes-dev/stable_diffusion.openvino.git
in your virtenv directory and navigate to it.
Use pip install -r requirements.txt
to install all required packages to the environment.
You now have a working virtualenv environment. You can run the source .../bin/activate
command again without having to load any modules. You can follow the guide in the GitHub link to generate your own AI images but those are not part of the course.
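Under the hood, the activate script just prepends the environment's bin/ directory to PATH and defines a deactivate function. A minimal sketch of the same cycle using Python's built-in venv module, which behaves the same way for this purpose (scratch path used for illustration):

```shell
# Create a throwaway environment (--without-pip keeps it quick and offline).
python3 -m venv --without-pip /tmp/demo-venv

# Activation prepends /tmp/demo-venv/bin to PATH.
. /tmp/demo-venv/bin/activate
command -v python   # now resolves to /tmp/demo-venv/bin/python

# deactivate is a shell function defined by the activate script.
deactivate
```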
Spack environments¶
Create a directory lab6/spackenv
to get started. We will be using the same Spack installation that we used in lab 2. Source the setup-env.sh
script to initialize Spack if you haven't already done so in your .bashrc
file.
spack env --help
will give you an overview of what you can do.
Complete
Run spack env create -d /gpfs/space/projects/hpc-course/<your_hpc_username>/lab6/spackenv/fourthenv
Navigate to the directory that you specified. In it you will find spack.yaml
, which is used to configure the environment. The magic happens in the .spack-env
directory that is hidden in the same place.
Inside the fourthenv
directory, run spacktivate -p .
(the .
denotes the path to the environment). The -p
flag is optional, but it makes things pretty by displaying which environment you are in.
When you run spack find
you will see that you have 0 installed packages. You can install packages to the environment by using spack install
but we will be modifying the spack.yaml
file instead.
In spack.yaml
, find the line starting with specs:
and change it to:
specs:
- python@3.8
- tcl
Now run spack find
again. You will see that the root specs list has been updated, but there are still zero installed packages. Fix that with spack install
. The python@3.8
package will be installed separately, but the tcl
one won't be: Spack will install it to the default location and link it into the view under .spack-env
.
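For reference, after the edit the environment's spack.yaml has roughly the shape below; the view: true line is what makes Spack link installed packages into the view (your generated file may contain additional defaults):

```yaml
# Approximate shape of the environment's spack.yaml after adding the specs.
spack:
  specs:
  - python@3.8
  - tcl
  view: true
```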
The Spack environments documentation has much more information on what you can achieve with environments.
Singularity containers¶
In this course we will be going over the basic functionality of Singularity. Using it gets easier with practice so we recommend thinking "can I run this in a container instead" for any workflow that you have.
Start off by creating the lab6/singularity
directory and loading the singularity module with module load singularity
.
You can see Singularity functionality with singularity help
. A library of openly available images can be found at the sylabs library
.
Running containers¶
First you have to get a container. We will use lolcow
in this lab. This is the same image that is used in the official Singularity user guides.
Run singularity pull library://lolcow
in your lab6/singularity
directory. This will download the lolcow_latest.sif
image file.
Now that you have an image to run, try entering singularity run lolcow_latest.sif
and see what happens.
Singularity run vs Singularity exec¶
Singularity containers have programs and commands in them that you can execute. You can target them individually with singularity exec
. In our case our command is cowsay
. Run singularity exec lolcow_latest.sif cowsay "I love the hpc course"
to see what happens.
Containers get built with a %runscript
section; we will cover that in a bit. That section defines what gets executed by default:
%runscript
date | cowsay | lolcat
Usually the software documentation has the commands that you can run with the image.
Singularity shell¶
You are able to open a shell session inside an image. This can be used for debugging purposes and for getting a better overview of what you are working with. Run singularity shell lolcow_latest.sif
to open the shell; after that you can run cowsay
directly from the command line. If you want to write inside the container, you also need to pass the --writable
flag.
File management with Singularity¶
Create a test file called hello.txt
with a simple Hello
written in it. Also run export SINGULARITY_BIND="/gpfs/space/projects/hpc-course/hpc_yourname/lab6/singularity"
.
Now when you run singularity exec lolcow_latest.sif cat hello.txt
you should see the contents of the file. Singularity containers will have access to the same files that you do currently. It might happen that you have a container that is only able to read files from a certain path. For that you can use the --bind
flag to mount a specific directory:
singularity exec --bind /etc:/mnt lolcow_latest.sif cat /mnt/hosts
We used the SINGULARITY_BIND
variable a bit ago to define the --bind
options with a variable; both approaches can be used according to your workflow. In the default configuration, only $HOME
, /tmp
, /proc
, /sys
, /dev
and some more are bound automatically. $PWD
used to be as well but it got changed due to bugs with the system not always finding the current working directory. You can reset the $PWD
functionality by having export SINGULARITY_BIND=$PWD
in your env or scripts.
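Both the --bind flag and the SINGULARITY_BIND variable accept a comma-separated list of src[:dest[:options]] entries, so several paths can be bound in one go; the paths below are examples:

```shell
# Bind a project directory as-is, the current directory, and /etc read-only
# under /mnt, all in a single variable.
export SINGULARITY_BIND="/gpfs/space/projects/hpc-course,$PWD,/etc:/mnt:ro"
```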
We can also write to files with programs inside the container: singularity exec lolcow_latest.sif cowsay "hello from the container" >> hello.txt
Building Singularity images¶
The recipes to build Singularity containers are called Singularity definition files. Like Dockerfiles, they describe how the image should be built. We will be looking at the lolcow
definition file:
BootStrap: library
From: ubuntu:22.04
%post
apt-get -y update
apt-get -y install cowsay lolcat
%environment
export LC_ALL=C
export PATH=/usr/games:$PATH
%runscript
date | cowsay | lolcat
%labels
Author Sylabs
The file describes what the base image should be, what commands to run during build time, the environment, and much more. The aforementioned %runscript
section is also here.
This file would be called lolcow.def
or something similar and built with singularity build lolcow.sif lolcow.def
.
We will not be building containers in this lab since building needs root access. You are free to install Singularity on your own machine (you can use Spack for that) to try out the building process.
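If you still want to experiment with building on a machine where you lack root, two commonly used alternatives are sketched below; whether they are available depends on your Singularity version and site configuration, so treat them as options to investigate rather than guaranteed commands:

```shell
# Remote build through the Sylabs build service (needs a Sylabs account token).
singularity build --remote lolcow.sif lolcow.def

# Unprivileged local build, if user namespaces are enabled on the system.
singularity build --fakeroot lolcow.sif lolcow.def
```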
Containers with Spack¶
Spack has a very handy tool called spack containerize
that creates image definition files from environments. Move back to the spack directory containing the fourth environment and spack.yaml
.
Running spack containerize
in that directory will create a Dockerfile
by default; creating Singularity definition files needs a small addition to the spack.yaml
file:
...
specs:
- python@3.8
- sqlite
- tcl
container:
format: singularity
...
Now run spack containerize > spackenv.def
and view the contents.
What spack does in a nutshell is:
- uses an Ubuntu image that has spack inside of it
- installs the required packages inside that environment
- makes the necessary environment modifications
Cotainr¶
Next we will be looking into how to create containers that contain your conda environment. We will be using a tool called cotainr
, following their official documentation
.
Complete
Download the latest release with wget https://github.com/DeiC-HPC/cotainr/archive/refs/tags/2023.11.0.tar.gz
You can download it anywhere but you must be able to locate it later.
Unpack the tarball with tar xzvf 2023.11.0.tar.gz
For ease of use, create a directory called $HOME/.local/bin
and link the cotainr
executable there with ln -s /path/to/cotainr/bin/cotainr $HOME/.local/bin/cotainr
. This makes it so that you can execute it directly from the command line.
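Note that the symlink only helps if $HOME/.local/bin is on your PATH. The mechanism itself is plain shell and can be sketched with any executable; everything below uses throwaway paths for illustration:

```shell
# Make a tiny executable in one place and a personal bin directory elsewhere.
mkdir -p /tmp/demo-tool /tmp/demo-bin
printf '#!/bin/sh\necho hello\n' > /tmp/demo-tool/hello
chmod +x /tmp/demo-tool/hello

# Symlink the executable into the bin directory and put that on PATH.
ln -sf /tmp/demo-tool/hello /tmp/demo-bin/hello
export PATH="/tmp/demo-bin:$PATH"

# The command now resolves through the symlink.
hello   # prints: hello
```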
Now you should have cotainr
installed. Load the latest Singularity module with module load singularity
and verify dependencies with cotainr info
. Check usage with cotainr help
.
We will build a container from our first conda environment. To do that, execute:
cotainr build my_conda_env_container.sif --base-image=docker://ubuntu:22.04 --conda-env=firstenv-export.yaml
Try it out with singularity exec my_conda_env_container.sif conda list
to see the packages from your env from inside the container.
SHPC¶
Next we will be using a tool called SHPC to install containers from a central registry and use them as modules.
Complete
The install process for shpc
is similar to cotainr
: wget https://github.com/singularityhub/singularity-hpc/archive/refs/tags/0.1.28.tar.gz
tar xzvf 0.1.28.tar.gz
This time, you cannot execute it directly but must install it via pip. Since it is a general-use tool, you can use your base pip environment, i.e. you don't need to create a new one (although you can if you wish!).
Move to the unpacked directory and run pip install -e .[all]
After installation shpc --help
should give you helpful output.
We will be doing some configuration edits to specify our container and module directories. For that create two directories in lab6
called containers
and modules
.
Set correct paths with shpc config set module_base /path/to/lab6/modules
and shpc config set container_base /path/to/lab6/containers
. Check all available configuration options with shpc config edit
.
We will be installing python
as our first module. It is as easy as shpc install python
. This will download the container and create the python/3.13-rc/module
. Since our created module path is not available automatically, enable it with module use /path/to/lab6/modules
. Now you should be able to load the aforementioned module.
Take a look inside the module file that was created. You will see a lot of magic done for you, including how it uses a container under the hood. This knowledge is not needed for this course, but it is good to know.
Load the module and try running python --version
to ensure you got the correct one.
Extra notes¶
The last two programs, cotainr
and shpc
, are tools that we are looking into incorporating into our HPC software stack for multiple reasons. It's best to start using them as soon as possible if you have the need, as they should make life much easier for you. Have a look at their official documentation, alongside that of Singularity, conda, etc.