Conda and Reproducible Installations

Conda is a popular package manager that installs pre-built binaries for a wide variety of packages. Although these binaries do not deliver optimal performance on HPC systems, they can be an acceptable solution when workflow reproducibility and portability are crucial, much as containers are. Mamba is a faster alternative to Conda, and we recommend using it either alongside Conda or as a drop-in replacement for it: if you know how to use Conda, you know how to use Mamba. Mamba is much faster at downloading packages and creates fewer files during the installation process.


This page documents how to improve reproducibility of your environment by saving the details of Conda-installed packages as a YAML file.

Any command you would run with conda can also be run with mamba. The two are interchangeable and can be mixed freely. For example:

Terminal 1. Conda and Mamba are interchangeable
# Example mix and match 
$ conda create -n test_env && conda activate test_env && mamba install -y astropy
 
# The example above is equivalent to the one below
$ mamba create -n test_env && mamba activate test_env && conda install -y astropy
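
Because the two tools accept the same commands, a script can choose whichever one is available at run time. The following is a minimal sketch, not a recommended site configuration; the variable name PKG_MGR is a hypothetical name used only for illustration.

```shell
# Prefer mamba when it is on the PATH, otherwise fall back to conda.
# PKG_MGR is a hypothetical variable name used only in this sketch.
if command -v mamba >/dev/null 2>&1; then
    PKG_MGR=mamba
else
    PKG_MGR=conda
fi
echo "Using ${PKG_MGR} for package installation"

# Example use (commented out so the snippet has no side effects):
# "${PKG_MGR}" install -y astropy
```

The fallback means the same script works on systems where only conda is installed.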

Reproducible installations with conda and mamba

Suppose we've used conda to install the package astropy in a newly created environment:


Terminal 2. conda and mamba install
(base) $ conda create -y -n astropy
[..]
(base) $ conda activate astropy
[..]
(astropy) $ conda install -y astropy=4.2.1

[..]

(astropy) $ 

# Equivalent process with mamba

(base) $ mamba create -y -n astropy
[..]
(base) $ mamba activate astropy
[..]
(astropy) $ mamba install -y astropy=4.2.1

[..]

(astropy) $


We can list the packages installed in the active conda environment, together with their exact versions and build strings, using conda env export:

Terminal 3. conda/mamba env export
(astropy) $ conda env export >environment.yaml
(astropy) $ 

(astropy) $ cat environment.yaml
name: astropy
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - astropy=4.2.1=py39h6323ea4_1
  - blas=1.0=mkl
  - ca-certificates=2021.5.25=h06a4308_1
  - certifi=2021.5.30=py39h06a4308_0
  - intel-openmp=2021.2.0=h06a4308_610
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - mkl=2021.2.0=h06a4308_296
  - mkl-service=2.3.0=py39h27cfd23_1
  - mkl_fft=1.3.0=py39h42c9631_2
  - mkl_random=1.2.1=py39ha9443f7_2
  - ncurses=6.2=he6710b0_1
  - numpy=1.20.2=py39h2d18471_0
  - numpy-base=1.20.2=py39hfae3a4d_0
  - openssl=1.1.1k=h27cfd23_0
  - pip=21.1.1=py39h06a4308_0
  - pyerfa=2.0.0=py39h27cfd23_0
  - python=3.9.5=hdb3f193_3
  - readline=8.1=h27cfd23_0
  - setuptools=52.0.0=py39h06a4308_0
  - six=1.15.0=py39h06a4308_0
  - sqlite=3.35.4=hdfb4753_0
  - tk=8.6.10=hbc83047_0
  - tzdata=2020f=h52ac0ba_0
  - wheel=0.36.2=pyhd3eb1b0_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
prefix: /group/pawsey0001/mdelapierre/PSS/conda-test/miniconda3/envs/astropy

# Equivalent with mamba

(astropy) $ mamba env export >environment.yaml
(astropy) $ 

(astropy) $ cat environment.yaml
name: astropy
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - astropy=4.2.1=py39h6323ea4_1
  - blas=1.0=mkl
  - ca-certificates=2021.5.25=h06a4308_1
  - certifi=2021.5.30=py39h06a4308_0
  - intel-openmp=2021.2.0=h06a4308_610
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - mkl=2021.2.0=h06a4308_296
  - mkl-service=2.3.0=py39h27cfd23_1
  - mkl_fft=1.3.0=py39h42c9631_2
  - mkl_random=1.2.1=py39ha9443f7_2
  - ncurses=6.2=he6710b0_1
  - numpy=1.20.2=py39h2d18471_0
  - numpy-base=1.20.2=py39hfae3a4d_0
  - openssl=1.1.1k=h27cfd23_0
  - pip=21.1.1=py39h06a4308_0
  - pyerfa=2.0.0=py39h27cfd23_0
  - python=3.9.5=hdb3f193_3
  - readline=8.1=h27cfd23_0
  - setuptools=52.0.0=py39h06a4308_0
  - six=1.15.0=py39h06a4308_0
  - sqlite=3.35.4=hdfb4753_0
  - tk=8.6.10=hbc83047_0
  - tzdata=2020f=h52ac0ba_0
  - wheel=0.36.2=pyhd3eb1b0_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
prefix: /group/pawsey0001/mdelapierre/PSS/mamba-test/mamba/envs/astropy
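
The two exports above differ only in their final prefix: line, which records where the environment lives on disk. One way to confirm this is to drop that line from each file and compare what remains. The sketch below uses two shortened stand-in files; the file names, the placeholder prefix paths and the two-package lists are hypothetical and are there only to keep the example self-contained.

```shell
# Stand-in exports: shortened, hypothetical copies of the files shown above.
workdir=$(mktemp -d)
cat > "$workdir/conda_env.yaml" <<'EOF'
name: astropy
channels:
  - defaults
dependencies:
  - astropy=4.2.1=py39h6323ea4_1
  - numpy=1.20.2=py39h2d18471_0
prefix: /path/to/conda/envs/astropy
EOF
cat > "$workdir/mamba_env.yaml" <<'EOF'
name: astropy
channels:
  - defaults
dependencies:
  - astropy=4.2.1=py39h6323ea4_1
  - numpy=1.20.2=py39h2d18471_0
prefix: /path/to/mamba/envs/astropy
EOF

# Remove the machine-specific prefix line from each file.
grep -v '^prefix:' "$workdir/conda_env.yaml" > "$workdir/conda_clean.yaml"
grep -v '^prefix:' "$workdir/mamba_env.yaml" > "$workdir/mamba_clean.yaml"

# diff exits with status 0 only when the two files are identical.
if diff "$workdir/conda_clean.yaml" "$workdir/mamba_clean.yaml" >/dev/null; then
    result="identical"
else
    result="different"
fi
echo "Package lists are ${result}"
```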


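With a few text substitutions, the exported file can be reduced to a plain list of package specifications, one per line, which conda accepts as an input file for installing packages into an environment. Note that the result is no longer YAML, even though this example keeps the .yaml extension: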

Terminal 4. Editing the YAML file
(astropy) $ cp environment.yaml requirements.yaml

(astropy) $ sed -i -n '/dependencies/,/prefix/p' requirements.yaml 
(astropy) $ sed -i -e '/dependencies:/d' -e '/prefix:/d' requirements.yaml 
(astropy) $ sed -i 's/ *- //g' requirements.yaml 

(astropy) $ cat requirements.yaml
_libgcc_mutex=0.1=main
astropy=4.2.1=py39h6323ea4_1
blas=1.0=mkl
ca-certificates=2021.5.25=h06a4308_1
certifi=2021.5.30=py39h06a4308_0
intel-openmp=2021.2.0=h06a4308_610
ld_impl_linux-64=2.33.1=h53a641e_7
libffi=3.3=he6710b0_2
libgcc-ng=9.1.0=hdf63c60_0
libstdcxx-ng=9.1.0=hdf63c60_0
mkl=2021.2.0=h06a4308_296
mkl-service=2.3.0=py39h27cfd23_1
mkl_fft=1.3.0=py39h42c9631_2
mkl_random=1.2.1=py39ha9443f7_2
ncurses=6.2=he6710b0_1
numpy=1.20.2=py39h2d18471_0
numpy-base=1.20.2=py39hfae3a4d_0
openssl=1.1.1k=h27cfd23_0
pip=21.1.1=py39h06a4308_0
pyerfa=2.0.0=py39h27cfd23_0
python=3.9.5=hdb3f193_3
readline=8.1=h27cfd23_0
setuptools=52.0.0=py39h06a4308_0
six=1.15.0=py39h06a4308_0
sqlite=3.35.4=hdfb4753_0
tk=8.6.10=hbc83047_0
tzdata=2020f=h52ac0ba_0
wheel=0.36.2=pyhd3eb1b0_0
xz=5.2.5=h7b6447c_0
zlib=1.2.11=h7b6447c_3
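
The three sed commands above can also be expressed as a single pass over the file. The following is a sketch of one possible alternative using awk, not the procedure used on this page. It runs on a shortened stand-in environment.yaml with a placeholder prefix path, and writes to a hypothetical output name, requirements.txt, to reflect that the result is plain text.

```shell
# Shortened, hypothetical stand-in for the exported environment.yaml.
workdir=$(mktemp -d)
cat > "$workdir/environment.yaml" <<'EOF'
name: astropy
channels:
  - defaults
dependencies:
  - astropy=4.2.1=py39h6323ea4_1
  - numpy=1.20.2=py39h2d18471_0
  - zlib=1.2.11=h7b6447c_3
prefix: /path/to/envs/astropy
EOF

# Keep only the entries listed under "dependencies:" and strip the
# leading "  - " list marker from each of them.
awk '
    /^dependencies:/   { in_deps = 1; next }   # start of the package list
    /^[^ ]/            { in_deps = 0 }         # any other top-level key ends it
    in_deps && /^ *- / { sub(/^ *- /, ""); print }
' "$workdir/environment.yaml" > "$workdir/requirements.txt"

cat "$workdir/requirements.txt"
```

Like the sed commands, this sketch assumes the export contains only conda packages. An export that includes a nested pip: section would need additional handling.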


If we need to re-install exactly the same environment later on, we can make use of this requirements file:

Terminal 5. conda/mamba install from requirements
(base) $ conda create -y -n astropy-bis
[..]
(base) $ conda activate astropy-bis
[..]
(astropy-bis) $ conda install -y --no-deps --file requirements.yaml

[..]

(astropy-bis) $ 


# Equivalent with mamba

(base) $ mamba create -y -n astropy-bis
[..]
(base) $ mamba activate astropy-bis
[..]
(astropy-bis) $ mamba install -y --no-deps --file requirements.yaml

[..]

(astropy-bis) $

Note that we now run conda (or mamba) with the option --no-deps, which instructs it to skip dependency resolution and install only the packages named in the requirements list. In general this is dangerous and can lead to broken environments, but it is safe here because the list was exported from a real, working environment and therefore already contains every dependency.
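
Since --no-deps bypasses conda's own consistency checks, it can be worth confirming afterwards that the new environment contains exactly the packages in the requirements list. The sketch below compares two specification lists while ignoring order, blank lines and comment lines. The helper name same_specs and both sample files are hypothetical; in practice the second list could be produced by running conda list --export inside the new environment, assuming its name=version=build output format.

```shell
# same_specs is a hypothetical helper: it succeeds when two spec lists hold
# the same entries, ignoring order, blank lines and "#" comment lines.
same_specs() {
    a=$(grep -v -e '^#' -e '^$' "$1" | sort)
    b=$(grep -v -e '^#' -e '^$' "$2" | sort)
    [ "$a" = "$b" ]
}

# Stand-in files: the requirements list and a listing of the new environment.
workdir=$(mktemp -d)
printf '%s\n' 'astropy=4.2.1=py39h6323ea4_1' \
              'numpy=1.20.2=py39h2d18471_0' > "$workdir/requirements.txt"
printf '%s\n' '# platform: linux-64' \
              'numpy=1.20.2=py39h2d18471_0' \
              'astropy=4.2.1=py39h6323ea4_1' > "$workdir/installed.txt"

if same_specs "$workdir/requirements.txt" "$workdir/installed.txt"; then
    status="match"
else
    status="mismatch"
fi
echo "Environment check: ${status}"
```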
