Using Python

Python is a very popular programming language and has successfully found its way into many application areas including data science, machine learning, and scientific computing.

Python can be used in many different ways in scientific computing, depending on the needs of each research workflow. Pawsey therefore provides several options for using and extending Python on the Setonix supercomputer and its other systems.

How to use Python at Pawsey

Python is currently installed on Pawsey systems, in the form of software modules.

Do NOT use system Python

Upon login, you may notice that the commands python, python2 and python3 are already available. These are Python interpreters installed by the operating system for its own purposes. They are old versions with limited functionality.

Do NOT use them for your scientific workflows.

Cray Python is not recommended

The Setonix supercomputer also comes with a vendor-provided cray-python module. Some of the packages included in this module are quite old, so at this stage we recommend using the Pawsey-provided modules instead (see below).

Pawsey Software Stack Python is the recommended option

Pawsey provides Python as part of the Pawsey Software Stack. This software stack contains a couple of Python installations that can be used by loading the specific module, as described in the following sections.

Bare Python module

A module with the basic Python interpreter is available on all Pawsey systems. Available versions can be listed with the module avail python command, and a version is selected by loading the corresponding module. For example:

$ module load python/3.9.7

Once loaded, the executables python and python3 are available for use, for instance:

$ python --version

Python 3.9.7
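To confirm from within Python that the active interpreter comes from the loaded module rather than from the operating system, a short check along the following lines may help. This is only a sketch: the list of system directories (/usr/bin and /bin) is an assumption about where the operating system keeps its interpreters, and the function names are illustrative rather than part of any Pawsey tooling.

```python
import platform
import sys
from pathlib import PurePosixPath

# Assumption: interpreters installed by the operating system live in one of
# these directories. Adjust the tuple if the system uses other locations.
SYSTEM_BIN_DIRS = ("/usr/bin", "/bin")


def is_system_python(executable: str) -> bool:
    """Return True if the interpreter path sits directly in a system bin directory."""
    return str(PurePosixPath(executable).parent) in SYSTEM_BIN_DIRS


def describe_interpreter() -> dict:
    """Collect the path and version of the interpreter running this script."""
    executable = sys.executable or ""
    return {
        "executable": executable,
        "version": platform.python_version(),
        "system_python": is_system_python(executable),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key, value in info.items():
        print(f"{key}: {value}")
    if info["system_python"]:
        print("Warning: this looks like the system Python; load a python module first.")
```

Running the script before and after loading a Python module should show the executable path and the version change accordingly.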

Modules for scientific Python packages

On the Setonix supercomputer, Pawsey curates a set of popular Python packages for scientific computing; where relevant, these packages are configured to leverage Cray MPI for distributed computing. The set can be inspected with module avail:

Terminal 1. Available scientific Python packages
$ module avail py-

..

----------------- /software/setonix/current/modules/zen3/gcc/11.2.0/python-packages ------------------
   py-cython/0.29.24     py-ipython/7.28.0      py-numba/0.54.0    py-scikit-learn/1.0.1
   py-dask/2021.6.2      py-matplotlib/3.4.3    py-numpy/1.20.3    py-scipy/1.7.1
   py-h5netcdf/0.10.0    py-mpi4py/3.1.2        py-pandas/1.3.4
   py-h5py/3.4.0         py-netcdf4/1.5.3       py-plotly/5.2.2

..

When any of these modules is loaded, its dependencies, including the Python interpreter, are loaded as well. Once launched, the interpreter can import the loaded scientific packages.
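After loading one or more of these modules, it can be useful to check from inside the interpreter which packages are importable and in which version. The following is a minimal sketch using only the standard library. The selection of packages in PACKAGES is an illustrative subset of the listing above, and the helper name package_report is hypothetical.

```python
from importlib import metadata, util

# Illustrative subset of the packages listed above, as
# import name -> distribution name. Extend as needed.
PACKAGES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "mpi4py": "mpi4py",
    "h5py": "h5py",
}


def package_report(packages: dict) -> dict:
    """Map each import name to its installed version.

    The value is None if the package cannot be imported, and "unknown"
    if it can be imported but carries no distribution metadata.
    """
    report = {}
    for import_name, dist_name in packages.items():
        if util.find_spec(import_name) is None:
            report[import_name] = None
            continue
        try:
            report[import_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            report[import_name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in package_report(PACKAGES).items():
        status = version if version is not None else "not available"
        print(f"{name:12s} {status}")
```

A package reported as not available usually means that the corresponding py- module has not been loaded in the current shell session.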

Containers with a set of scientific Python packages

On the Setonix supercomputer, Pawsey also provides some curated Python containers. This is the best option for workflows where reproducibility and portability of the job are crucial requirements.

These containers include the same Python packages listed above, though not necessarily in the same versions, and are configured to leverage Cray MPI from the host supercomputer for optimal parallel performance. They are installed as container modules using SHPC, so they can be used by loading the corresponding module and launching the interpreter:

Terminal 3. Python container modules
$ module avail hpc-python-container  

---------------------------- /software/setonix/current/containers/modules ----------------------------
   hpc-python-container/2022.03-hdf5mpi    hpc-python-container/2022.03 (D)

$ module load hpc-python-container/2022.03

$ python --version
Python 3.9.11

More information on these Python containers can be found on the corresponding page at the Pawsey container repository: quay.io/pawsey/hpc-python.
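Since both the package modules and the containers are configured for Cray MPI, a small mpi4py script is a simple way to see whether a job is actually running across several processes. The sketch below assumes that mpi4py is importable in the loaded environment and falls back to a single process otherwise; the helper names get_rank_and_size and local_share are illustrative, and the command used to launch the script in parallel is not covered here.

```python
def get_rank_and_size():
    """Return (rank, size) from MPI, or (0, 1) if mpi4py cannot be imported."""
    try:
        from mpi4py import MPI
    except ImportError:
        # Fallback so the script still runs as a single process without MPI.
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def local_share(items, rank, size):
    """Round-robin split: each rank takes every size-th item, starting at its rank."""
    return items[rank::size]


if __name__ == "__main__":
    rank, size = get_rank_and_size()
    work = list(range(10))  # placeholder work items
    mine = local_share(work, rank, size)
    print(f"rank {rank} of {size} handles items {mine}")
```

When run as a single process, the script reports rank 0 of 1 and keeps all items; under an MPI launcher, each rank should print a different share.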
