Installing Python Packages
Pip and Setuptools (setup.py) are the most popular tools for installing Python packages, and also the easiest ways to benefit from the Python performance libraries that come preinstalled on Pawsey systems.
Versions installed in Pawsey systems
To check the currently installed versions, use the module avail command (the versions on the system may differ from those shown here):
$ module avail python
--------------------- /software/setonix/2024.05/modules/zen3/gcc/12.2.0/programming-languages ---------------------
   python/3.9.15    python/3.11.6 (D)

--------------------------------------- /opt/cray/pe/lmod/modulefiles/core ----------------------------------------
   cray-python/3.9.13.1    cray-python/3.10.10 (D)
We recommend the use of python installed by Pawsey instead of cray-python.
Many scientific and management tools are also installed in our software stack. Their module names carry the prefix py-, and sometimes their flavour is also identified by a suffix -py<VERSION> placed after the version of the tool itself (the versions on the system may differ from those shown here):
$ module avail py-
---------------------- /software/setonix/2024.05/modules/zen3/gcc/12.2.0/astro-applications -----------------------
   py-astropy/4.2.1    py-astropy/5.1 (D)    py-emcee/3.1.1    py-funcsigs/1.0.2

--------------------------- /software/setonix/2024.05/modules/zen3/gcc/12.2.0/utilities ---------------------------
   py-boto3/1.26.26              py-setuptools/59.4.0-py3.9.15     py-setuptools/68.0.0-py3.11.6 (D)
   py-pip/23.1.2-py3.9.15        py-setuptools/59.4.0-py3.11.6
   py-pip/23.1.2-py3.11.6 (D)    py-setuptools/68.0.0-py3.9.15

------------------------ /software/setonix/2024.05/modules/zen3/gcc/12.2.0/python-packages ------------------------
   py-cython/0.29.36      py-ipython/8.14.0           py-numpy/1.24.4        py-scikit-learn/1.3.2
   py-cython/3.0.4 (D)    py-matplotlib/3.8.1         py-numpy/1.26.1 (D)    py-scipy/1.11.3
   py-dask/2023.4.1       py-mpi4py/3.1.5-py3.11.6    py-pandas/1.5.3
   py-h5netcdf/0.10.0     py-netcdf4/1.6.2            py-pandas/2.1.2 (D)
   py-h5py/3.8.0          py-numba/0.57.0             py-plotly/5.14.1

------------------------ /software/setonix/2024.05/modules/zen3/gcc/12.2.0/developer-tools ------------------------
   py-hatchet/1.3.1
Before you begin
Basic information on pip and setuptools can be found on their respective websites (external links).
To install Python packages on top of the bare Python interpreter, pip and setuptools are available on Pawsey systems as modules (note the py- prefix for Python packages):
$ module load python/3.11.6 py-pip/23.1.2-py3.11.6
$ module load python/3.11.6 py-setuptools/68.0.0-py3.11.6
Once you choose one of these two options to install Python packages, you should keep using it, to avoid conflicts and errors arising from packages installed in different locations.
Pawsey Python modules also preconfigure the shell environment by default to provide a meaningful location for user-specific package installations:
export PYTHONUSERBASE=/software/projects/<project-id>/<user-name>/setonix/python
export PATH=$PATH:$PYTHONUSERBASE/bin
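To confirm what these settings resolve to inside the interpreter, a short check along the following lines may help. It is only a sketch: the function name describe_user_install is hypothetical, and it relies on the standard-library site module, which reports the user base (taken from PYTHONUSERBASE when that variable is set) and the matching user site-packages directory. The bin subdirectory is assumed to follow the Linux layout shown in the export lines above.

```python
import os
import site


def describe_user_install() -> dict:
    """Report the directories Python uses for user-level installations."""
    user_base = site.getuserbase()          # honours PYTHONUSERBASE if it is set
    user_site = site.getusersitepackages()  # site-packages directory under the user base
    bin_dir = os.path.join(user_base, "bin")  # assumed Linux layout for scripts
    path_entries = os.environ.get("PATH", "").split(os.pathsep)
    return {
        "user_base": user_base,
        "user_site": user_site,
        "bin_dir": bin_dir,
        "bin_on_path": bin_dir in path_entries,
    }


if __name__ == "__main__":
    for key, value in describe_user_install().items():
        print(f"{key}: {value}")
```

If bin_on_path is reported as False, command-line tools installed with --user may not be found by the shell until PATH is updated.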
Install a package using pip
Let's assume you've found a pip-installable package, for instance by browsing the Python Package Index (external link). In this example, we're installing astropy, a popular package for astronomy.
First, we need to load the relevant system modules, including python, py-pip and any required dependency packages (in this case, py-numpy):
$ module load python/<VERSION_OF_PYTHON>
$ module load py-pip/<VERSION>-py<VERSION_OF_PYTHON>
$ module load py-numpy/<VERSION>
Now let's proceed with the installation. We're going to use pip with the --user flag to install in the Python user directory. (Users cannot install software in system directories on shared Pawsey supercomputer systems.)
$ pip install --user astropy==4.1
Collecting astropy==4.1
  Downloading https://files.pythonhosted.org/packages/74/9c/a1e51955d4a2af497a507c323409ebe55c122a91c438d2884d918360efc1/astropy-4.1-cp36-cp36m-manylinux1_x86_64.whl (10.3MB)
     |████████████████████████████████| 10.3MB 14.7MB/s
Requirement already satisfied: numpy>=1.16 in /pawsey/sles12sp3/python/3.6.3/numpy/1.19.0/lib/python3.6/site-packages/numpy-1.19.0-py3.6-linux-x86_64.egg (from astropy==4.1) (1.19.0)
Installing collected packages: astropy
Successfully installed astropy-4.1
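After the installation, we can check from within Python which version was actually picked up. The following is a minimal sketch using the standard-library importlib.metadata module (available from Python 3.8); the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The package names below are the ones used in the example above.
    for name in ("astropy", "numpy"):
        print(name, installed_version(name) or "not installed")
```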
Reproducible installations with pip
Let's go through a simple way of making the installation above more reproducible.
After the installation, we can use pip freeze to save the list of installed packages and their versions:
$ pip freeze > requirements.txt
$ cat requirements.txt
astropy==4.1
numpy==1.19.0
If we need to reinstall exactly the same Python environment later on, we can make use of the list we have just created:
$ module load python/<VERSION_OF_PYTHON>
$ module load py-pip/<VERSION>-py<VERSION_OF_PYTHON>
$ module load py-numpy/<VERSION>
$ sg <projectcode> -c 'pip install --user --no-deps -r requirements.txt'
Collecting astropy==4.1 (from -r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/74/9c/a1e51955d4a2af497a507c323409ebe55c122a91c438d2884d918360efc1/astropy-4.1-cp36-cp36m-manylinux1_x86_64.whl (10.3MB)
     |████████████████████████████████| 10.3MB 16.0MB/s
Requirement already satisfied: numpy==1.19.0 in /pawsey/sles12sp3/python/3.6.3/numpy/1.19.0/lib/python3.6/site-packages/numpy-1.19.0-py3.6-linux-x86_64.egg (from -r requirements.txt (line 2)) (1.19.0)
Installing collected packages: astropy
Successfully installed astropy-4.1
Note how we use the flag --no-deps to make sure that pip installs only the packages listed in the requirements file. The list is already complete, because it was generated from a working installation.
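To check later that an environment still matches the saved list, the pins can be compared against what is currently installed. The sketch below is an illustration only: parse_pins and find_mismatches are hypothetical helper names, and the parser assumes simple name==version lines such as those in the example above, ignoring anything else.

```python
from importlib import metadata
from typing import Dict, Optional


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Extract `name==version` pins, skipping blank lines and comments."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue  # not a simple pin (blank, comment, URL, option, ...)
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each pinned name to its installed version when it differs from the pin.

    A value of None means the distribution is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    with open("requirements.txt") as handle:
        print(find_mismatches(parse_pins(handle.read())))
```

An empty dictionary indicates that every pinned package is installed at the requested version.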
Installing a package with setuptools
Sometimes you need to install a package without the support of pip, for example when installing a development branch to obtain a bug fix that has not been released yet. In this case, you can still rely on a software-aided procedure.
For example, to install the development branch of the Python package tqdm, first download the source code:
$ git clone https://github.com/tqdm/tqdm
Change to the tqdm source code directory and then load the required modules, namely python and py-setuptools:
$ module load python/<VERSION_OF_PYTHON> py-setuptools/<VERSION>-py<VERSION_OF_PYTHON>
Finally, execute the build and installation process:
$ sg <projectcode> -c 'python setup.py build'
running build
running build_py
creating build
[...]
$ sg <projectcode> -c 'python setup.py install --user'
running install
running bdist_egg
running egg_info
[...]
Finished processing dependencies for tqdm==4.61.1
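When a development build sits alongside other copies of the same package, it can be useful to confirm which files Python will actually import. The following sketch uses the standard-library importlib.util.find_spec function; the helper name module_origin is hypothetical, and it is intended for top-level module names such as tqdm.

```python
import importlib.util
from typing import Optional


def module_origin(module_name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # A path under the Python user directory suggests the --user build is in use.
    print(module_origin("tqdm"))
```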
Using virtual environments
Python packages can also be installed in a virtual environment, where all packages are stored in the virtual environment directory. The virtual environment acts as a local copy of the Python installation that can be modified without affecting the original one. This approach is similar to installing packages locally with --user or to using the Conda package manager.
$ python -m venv $MYSOFTWARE/manual/pythonEnvironments/python-venv         # create a virtual environment
$ source $MYSOFTWARE/manual/pythonEnvironments/python-venv/bin/activate    # activate the environment, updating Python paths
$ sg <projectcode> -c 'pip install astropy==4.1'                           # install astropy in the environment
$ deactivate                                                               # deactivate the environment; packages installed in the venv can no longer be loaded
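Inside a script, it is sometimes handy to verify that the virtual environment is really active before importing packages from it. A minimal sketch, assuming an environment created with the standard venv module as above (the function names in_virtualenv and package_directory are hypothetical):

```python
import sys
import sysconfig


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the original Python installation.
    return sys.prefix != sys.base_prefix


def package_directory() -> str:
    """Default site-packages directory that sysconfig reports for this interpreter."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("packages directory:", package_directory())
```

With the environment activated, the packages directory is expected to sit under the virtual environment path; after deactivate it should point back to the original installation.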
Related pages
- To assess the various options to use Python at Pawsey, see the page Using Python.
- To use containers for packaging and deploying Python workflows, see the Containers page.
- To use the Conda package manager, see the page Conda and Reproducible Installations.