Overview: How to Use Software
To run jobs on the High Throughput Computing (HTC) system, researchers need to set up their software on the system. This guide introduces how to build software in a container (our recommended strategy), links to a repository with a selection of software installation “recipes”, and provides quick links to common software packages and their installation recommendations.
Start Here
For most CHTC users, we recommend using a container for your software environment. A container is a portable, self-contained operating system and can be easily executed on different computers. When building the container you can choose the operating system you want to use, and can install programs as if you were the owner of the computer.
In general, if you have not used containers before, we recommend using Apptainer containers.
- For more specific guidance about your software, see: Quickstart by software type
- For more guidance in whether jumping right into Apptainer is best for you: What approach should I use?
- To learn more about containers in general see: Containers
Quickstart by software type
Click the link in the table below to jump to the instructions for the language/program/software that you want to use, and then click on “More Information.”
Quickstart: Conda
Option A - Container (recommended)
Build a container with Conda packages installed inside. Start by reviewing the container build instructions for either Apptainer or Docker. Then, when following the build instructions, use the example recipes for either a .def or Dockerfile, respectively.
- How to build your own container: Apptainer / Docker
- Tips and tricks, container recipes: Example container recipes for Conda
- Use your container in your HTC jobs: Apptainer / Docker
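As a rough illustration of Option A, the shell snippet below writes a small Apptainer definition file that installs Conda packages when the container is built. The base image (`continuumio/miniconda3`), the file name `conda-env.def`, and the `numpy` package are placeholders chosen for this sketch, not CHTC requirements; the example recipes linked above remain the better starting point.

```shell
# Write a hypothetical definition file; the name and contents are placeholders.
cat > conda-env.def <<'EOF'
Bootstrap: docker
From: continuumio/miniconda3:latest

%post
    # Runs once, at build time: install the packages your jobs need.
    conda install -y -c conda-forge numpy
EOF

# Next step, on a machine where Apptainer is available:
#   apptainer build conda-env.sif conda-env.def
```

Listing packages explicitly in the build file keeps the environment reproducible, since the same file can be rebuilt later.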
Option B - “Packed” Conda Env
Create your own portable copy of your Conda packages:
More Information
This approach may be sensitive to the operating system of the execution point. We recommend building a container instead, but are keeping these instructions as a backup.
What approach should I use?
We recommend using containers for jobs in CHTC for a number of reasons, detailed below.
There are two container implementations we support, Docker and Apptainer.
We recommend using Apptainer as a first choice because you can build the container on our servers and it has some features that allow it to be used on a greater variety of computing nodes.
However, there are many good reasons to use Docker (or Apptainer) depending on your circumstances. Here is a list of things to consider when deciding which path to take.
- If your software has an existing Docker or Apptainer/Singularity implementation, use that.
- If you or your group have an existing Docker container, use Docker.
- If you already know how to use Docker or Apptainer/Singularity, use whichever you are familiar with.
- If you want to create a container that can run both on CHTC and your own computer, use Docker.
- If you want to create a container that you and collaborators can use on your own computers and CHTC, use Docker.
- If you want to keep your container more private, use Apptainer.
In certain cases, it is reasonable to use a non-container option for software installation, especially when using a package manager like conda.
If you are not sure what to choose, talk to the facilitation team! That’s why we’re here.
Container overview
In this section, we provide a brief introduction to what containers are, why we recommend them, and a big-picture view of how to use them on our High Throughput system.
What is a container?
“A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.” – Docker
A container is a portable, self-contained operating system and can be easily executed on different computers regardless of their operating systems or programs. When building the container you can choose the operating system you want to use, and can install programs as if you were the owner of the computer.
As an analogy, you could consider a container to be like a camping backpack. Every time you plan to use it, you will need a standard set of gear, which you could pre-pack. Other items, like maps, food, or fuel would depend on where you’re going, but you would still have access to the standard gear.
In the same way, you build a container image by installing your software and any additional dependencies. Jobs that use containers can differ in their tasks or data, but they still have access to the installed software and environment.
While there are some caveats, containers are useful for deploying software on shared computing systems like CHTC, where you do not have permission to install programs directly.
“You can build a container using Apptainer on your laptop, and then run it on many of the largest HPC clusters in the world, local university or company clusters, a single server, in the cloud, or on a workstation down the hall.” – Apptainer
A “container image” is the persistent, on-disk copy of the container. When we talk about building or moving or distributing a container, we’re actually talking about the file(s) that constitute the container. When a container is “running” or “executed”, the container image is used to create the runtime environment for executing the programs installed inside of it.
Why use containers?
We recognize that learning about containers is its own challenge! However, we have found that, in the long run, containers provide the best user experience when running jobs in CHTC, including:
- a consistent environment no matter what computer your job runs on
- a consistent environment with stable software versions (good for reproducibility)
- ability to fully customize the software installation
Our goal is to make finding or building your container as easy as possible, so you can have most of these benefits, without too much pain.
Using containers on CHTC
If you have a container, adding it to your job is easy! It’s just one more line in the submit file. The challenge is usually getting a container that has what you need. There are typically two ways of doing so:
- Find an existing container
- Build your own container image
If building your own container, the process will include:
- Writing a build file, called either a Dockerfile or definition (.def) file.
  - Consult your software’s documentation
  - Choose an existing base container
  - Add the details needed for your software
  - Leverage CHTC’s recipes repository
- (If using Apptainer: starting an interactive session on a CHTC build server)
- Running a command to “build” the container image, using your build file as the template for doing so
- Running a command to put the container in a location where you can use it in your jobs.
These guides talk about how to do the above steps for either Apptainer or Docker.
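For Apptainer, the build steps above might look roughly like the following. This is a sketch under assumptions: the file names, the `ubuntu:22.04` base image, and the `python3` package are illustrative, and both the interactive build session and the final storage location are left out because they depend on your CHTC setup.

```shell
# 1. Write the build file (a hypothetical definition file).
cat > my-container.def <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Install software as if you owned the machine; runs only at build time.
    apt-get update -y
    apt-get install -y python3
EOF

# 2. Build the image from the definition file, if Apptainer is installed here.
if command -v apptainer >/dev/null 2>&1; then
    apptainer build my-container.sif my-container.def
else
    echo "apptainer not found; skipping build"
fi

# 3. Copy my-container.sif to wherever your jobs will read it from
#    (site-specific, so not shown here).
```

The `Bootstrap: docker` and `From:` lines are how a definition file names an existing base container to start from.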
A common question is whether the software installation process is repeated each time a container is used. The answer is “no”. The software installation process only occurs when the container is actually being built. Once the container has been built, it cannot be changed while it is being used (on CHTC systems).
Container build recipes
If you need to create your own build file, CHTC provides many specific examples in our “Recipes” repository on GitHub: https://github.com/CHTC/recipes.
Links to specific recipes are provided in the Software section for certain software packages and programming languages.
Container technologies
There are two container technologies supported by CHTC: Docker and Apptainer. Here we briefly discuss the advantages of each.
Docker
Docker is a commercial container technology for building and distributing containers. Docker provides a platform for distributing containers, called Docker Hub. Docker Hub can make it easy to share containers with colleagues without having to worry about the minutiae of moving files around.
On the HTC system, you can provide the name of your Docker Hub container in your submit file, and HTCondor will automatically pull (download) the container and use it to create the software environment for executing your job. However, you cannot build a Docker container and upload it to Docker Hub from CHTC servers, so your container must already exist on Docker Hub in a public repository. This requires that you have Docker installed on your computer so that you can build the container and upload it to Docker Hub.
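The Docker workflow on your own computer could be sketched as below. Everything named here is an assumption for illustration: the `DOCKERHUB_USER` environment variable is a hypothetical convention of this sketch, and the image name, base image (`python:3.11-slim`), and `numpy` package are placeholders.

```shell
# Placeholder image name; set DOCKERHUB_USER (hypothetical variable) to your account.
IMAGE="${DOCKERHUB_USER:-username}/my-software:v1"

# Write a minimal Dockerfile in the current directory.
cat > Dockerfile <<'EOF'
FROM python:3.11-slim
RUN pip install --no-cache-dir numpy
EOF

# Build and upload only if Docker is installed and an account name was given.
if command -v docker >/dev/null 2>&1 && [ -n "${DOCKERHUB_USER:-}" ]; then
    docker build -t "$IMAGE" .
    docker push "$IMAGE"   # requires a prior `docker login`
else
    echo "Skipping build; would have built and pushed $IMAGE"
fi

# Line to place in the HTCondor submit file afterwards
# (assuming an HTCondor version that accepts docker:// image names):
echo "container_image = docker://$IMAGE"
```

Tagging the image with an explicit version rather than relying on `latest` makes it easier to know which build a job actually used.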
Apptainer
Apptainer is an open-source container technology for building containers. Apptainer creates a single, stand-alone file that is the container image. As long as you have the container image file, you can use Apptainer to run your container.
On the HTC system, you can provide the name of your Apptainer file in your submit file, and HTCondor will use a copy of it to create the software environment for executing your job. You can use Apptainer to build the container image file on CHTC servers, so there is no need to install the container software on your own computer.
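A submit file that uses an Apptainer image might look like the sketch below, written out here with a shell heredoc. It assumes an HTCondor version recent enough to support the container universe; the file names (`my-container.sif`, `run.sh`) and the resource requests are placeholders, not recommended values.

```shell
# Write a hypothetical submit file; names and resource requests are placeholders.
cat > apptainer-job.sub <<'EOF'
# Container-specific lines: tell HTCondor which image file to use.
universe = container
container_image = my-container.sif

executable = run.sh
log = job.log
output = job.out
error = job.err

request_cpus = 1
request_memory = 1GB
request_disk = 2GB

queue
EOF
```

The job would then be submitted as usual with `condor_submit apptainer-job.sub`.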