
Connecting to a Supercomputer


This section explains how to connect to and interact with Pawsey supercomputing systems.

Prerequisites

To access a supercomputer, you must have a Pawsey account, comprising a username and a password. The account must also be a member of an active project allocation on the selected supercomputer. New users are sent an account creation email with instructions when they receive an allocation or are added to a project.

Introduction

Pawsey supercomputers are accessed remotely through the SSH protocol. Most of the time users employ the ssh command-line tool installed on their computers, which allows executing commands through a terminal window; other programs implementing the SSH protocol may be used. To execute programs that display a graphical interface you can use X forwarding over SSH.

When a connection is established with any of our systems, it is to a login node. Login nodes are the "front desk" of the system, and they allow users to manage their workflows, edit files, and submit jobs to the scheduler to be executed on the compute nodes. The compute nodes are where the main computations are processed and they can be accessed through the different queues or partitions managed by the scheduler. These concepts are illustrated in figure 1.


 

Figure 1. An abstract overview of a supercomputer architecture

Use SSH to connect to a supercomputer

The login nodes of a Pawsey supercomputer are reachable through the Internet at an assigned public hostname. The login hostname of each Pawsey supercomputer is listed in table 1. Use the ssh command to connect to a login node, as shown in the following line.

$ ssh [options] <username>@<hostname>

Replace <username> with the username of your Pawsey account, and <hostname> with one of the hostnames listed in table 1.

Table 1. List of Pawsey's supercomputing systems' hostnames.

System        Hostname
Setonix       setonix.pawsey.org.au
Topaz         topaz.pawsey.org.au
Garrawarla    garrawarla.pawsey.org.au
Data Mover    data-mover.pawsey.org.au
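
As a small illustration of how table 1 is used, the following shell sketch maps a system name to its login hostname and prints the matching ssh command. The function name pawsey_host and the lowercase system keys are hypothetical conveniences, not part of any Pawsey tooling. The -X flag is the standard OpenSSH option that requests X forwarding.

```shell
# Hypothetical helper: map a system name from table 1 to its login hostname.
pawsey_host() {
  case "$1" in
    setonix)    echo "setonix.pawsey.org.au" ;;
    topaz)      echo "topaz.pawsey.org.au" ;;
    garrawarla) echo "garrawarla.pawsey.org.au" ;;
    data-mover) echo "data-mover.pawsey.org.au" ;;
    *)          echo "unknown system: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the commands; replace <username> with your Pawsey username.
echo "ssh <username>@$(pawsey_host setonix)"
# Same idea with X forwarding requested for graphical programs.
echo "ssh -X <username>@$(pawsey_host topaz)"
```

The commands are only printed so that the placeholder can be replaced before anything is executed.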

Linux distributions and macOS come with a terminal application installed that can be used for SSH access to the login nodes.

  • Linux users have different terminals available depending on which distribution and window manager they use (for example: GNOME Terminal in GNOME; Konsole in KDE).

Consult your Linux distribution's documentation for details on how to load a terminal.

  • On macOS you can use the Terminal application, which is located in the Utilities folder within the Applications folder.

Another popular terminal application for macOS is iTerm2, which needs to be installed separately.

The Microsoft Windows operating system now has in-built SSH client support. It may first need to be enabled as an optional feature in the settings. When using the client at the Windows command prompt or PowerShell, the correct MAC option must also be provided:

$ ssh -m hmac-sha2-512 [options] <username>@<hostname>

Alternatively, the line MACs hmac-sha2-512 can be added to a file called config that can be created in the C:\Users\<username>\.ssh directory in Windows to avoid providing this option every time.

The built-in Windows SSH client does not currently support X forwarding of graphical interfaces; for that purpose, the MobaXterm client is recommended.
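
For reference, the config file mentioned above could contain a stanza like the one printed by this sketch. Scoping the option with a Host *.pawsey.org.au pattern and adding a User line are assumptions made here for illustration; the only documented requirement is the MACs hmac-sha2-512 line. The function name pawsey_ssh_stanza is hypothetical.

```shell
# Hypothetical helper: print an ssh_config stanza for Pawsey hosts.
# $1 is your Pawsey username.
pawsey_ssh_stanza() {
  cat <<EOF
Host *.pawsey.org.au
    User $1
    MACs hmac-sha2-512
EOF
}

# Review the output, then copy it into the config file in your .ssh directory.
pawsey_ssh_stanza "<username>"
```

Printing the stanza rather than writing it directly avoids overwriting any settings already present in an existing config file.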

To authenticate the connection, a user can either enter the password for each connection or use an SSH key. SSH keys provide more security and eliminate the need to enter the password each time the ssh command is executed.

An SSH key is an access credential in the SSH protocol. Its function is similar to that of a username and password, but it also enables automated processes and single sign-on for users.
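
A typical way to set up key-based login with OpenSSH is sketched below. The key file name, the ed25519 key type, and the comment string are assumptions chosen for illustration; check current Pawsey guidance for any required key type or registration procedure. By default the sketch only prints the commands, because they prompt for input (a passphrase and your Pawsey password); set DRY_RUN=0 to run them.

```shell
key_file="$HOME/.ssh/pawsey_ed25519"        # hypothetical key file name
login="<username>@setonix.pawsey.org.au"    # replace <username>

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# 1. Create a key pair; choose a passphrase when prompted.
run ssh-keygen -t ed25519 -f "$key_file" -C "pawsey-login"
# 2. Install the public key on the remote account (asks for your password once).
run ssh-copy-id -i "$key_file.pub" "$login"
# 3. Later connections can select the key explicitly.
run ssh -i "$key_file" "$login"
```

Protecting the private key with a passphrase keeps it useful as a credential even if the file is copied; an SSH agent can cache the passphrase for the session.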

Login nodes

The hostname of a supercomputing system is often an alias for the several login nodes that the system may have. When you connect remotely, a round-robin DNS technique places your connection on one of these login nodes, from which you interact with the rest of the system. The actual hostname of the login node ("setonix-1" in the example below) can be obtained by executing the hostname Linux command after an SSH connection has been established. Terminal 1 illustrates an SSH connection to a login node followed by printing its hostname.

Terminal 1. Connection via the SSH command
$ ssh username@setonix.pawsey.org.au
Password:
Last login: Mon Jan 10 11:07:13 2022 from 130.116.145.55
##############################################################################
#                    Pawsey Supercomputing Centre                            #
#        Empowering cutting-edge research for Australia's future             #
#                                                                            #
#     This service is for authorised clients only.                           #
#     It is a criminal offence to:                                           #
#          - Obtain access to data without permission                        #
#          - Damage, delete, alter or insert data without permission         #
#                                                                            #
##############################################################################
.
.
.
===============================================================================
 By using Pawsey facilities you agree to the Conditions of use available at
 https://support.pawsey.org.au/documentation/display/US/Conditions+of+Use
 
===============================================================================
username@setonix-1:~> hostname
setonix-1
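
The idea behind round-robin placement can be pictured with the toy model below. It is only a sketch: apart from setonix-1, the node names are hypothetical, and real DNS round-robin is affected by resolver caching, so consecutive connections are not guaranteed to rotate this strictly.

```shell
# Toy model of round-robin placement across login nodes.
# Only setonix-1 appears in the documentation; the other names are made up.
nodes="setonix-1 setonix-2 setonix-3"

# pick_node N: print the node that the N-th connection (counting from 0) lands on.
pick_node() {
  count=0
  for n in $nodes; do count=$((count + 1)); done
  index=$(( $1 % count ))
  i=0
  for n in $nodes; do
    if [ "$i" -eq "$index" ]; then
      printf '%s\n' "$n"
      return 0
    fi
    i=$((i + 1))
  done
}

for c in 0 1 2 3; do
  echo "connection $c -> $(pick_node "$c")"
done
```

The fourth connection wraps around to the first node, which is why two logins to the same alias can land on different machines.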

Remote development using Visual Studio Code

Visual Studio Code is a popular free and open-source code editing application that can be deployed on Linux, macOS and Windows. It has an integrated terminal within its user interface that removes the need to switch between command-line tasks and code editing. The functionality of VS Code can easily be extended by installing extensions. These extensions allow for almost arbitrary language support, debugging or remote development.

Please refer to the Visual Studio Code home page (external site) to learn more about Visual Studio Code and for download and installation instructions. The default terminal shell is bash on Linux and macOS, and PowerShell on Windows.

Windows users must install Git for Windows (external site) before they can configure bash as the default terminal shell.

Although the integrated terminal in Visual Studio Code can be used directly to log in to Pawsey systems over SSH, the Remote Development extension pack adds the ability to open remote directories and text files in Visual Studio Code for in-app code editing and building. The Remote Development extension pack is installed from the Marketplace within the application. There are instructions for downloading the Remote Development extension here. See also the instructions on the official web page Remote Development using SSH (external site).

After the standard installation procedure, you may also need to check the box for Remote.SSH: Lockfiles in Tmp, under Settings, in order to connect to Pawsey systems.

Preventing unexpected behaviour from Visual Studio Code

If you want to end your remote session, click the green box in the lower left corner. In the input box that opens, select the "Close Remote Connection" option. If you simply close your VS Code window, some server-side components of VS Code will continue to run remotely.


Figure 2. Example of Visual Studio Code screen feature that allows clean disconnection from SSH session.


If Visual Studio Code has left some related processes running on the login nodes, these may use CPU and prevent you from logging in to Setonix via Visual Studio Code. If you are unable to log in to Setonix with Visual Studio Code, use a different command-line interface such as Terminal to ssh into Setonix. From there you can identify any leftover Visual Studio Code processes using the 'ps' command.

Leftover Visual Studio Code processes take the form shown below, where the 40-character hex string varies with the installed version of the VS Code server.

Terminal 2. Finding and killing unexpected VS Code processes
$ ps
PID TTY          TIME CMD
162952 pts/150  00:00:00 /home/<username>/.vscode-server/bin/695af097c7bd098fbf017ce3ac85e09bbc5dda06/node

# To kill the process, use
$ kill -9 <PID>
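
When many processes are listed, a small filter can pick out the relevant PIDs. The sketch below is an illustration under stated assumptions: the helper name vscode_pids is hypothetical, and it expects input lines of the form "PID COMMAND".

```shell
# Hypothetical helper: read "PID COMMAND" lines on standard input and print
# the PIDs whose command lives under a .vscode-server directory.
vscode_pids() {
  awk '$2 ~ /\/\.vscode-server\// { print $1 }'
}

# Demonstration on sample lines (the second entry is made up for contrast).
printf '%s\n' \
  "162952 /home/<username>/.vscode-server/bin/695af097c7bd098fbf017ce3ac85e09bbc5dda06/node" \
  "170001 -bash" | vscode_pids
```

On a login node, the same filter can be fed with ps -u "$(id -un)" -o pid=,args=, which lists all of your processes rather than only those attached to the current terminal. Sending the default signal with kill <PID> first, and resorting to kill -9 only if the process survives, gives the process a chance to exit cleanly.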



We suggest that our users regularly check what processes they have running, and clean up any leftover processes that they know are no longer in use. If you find this still doesn't resolve the issue, you may need to purge the Visual Studio Code directory on Setonix using the following:

Terminal 3. Purging the VS Code files
$ rm -rf ~/.vscode-server/

Preventing Visual Studio Code overloading the login nodes

The Visual Studio Code file watcher and file searcher (rg) index all the files you have access to in your workspace. If you have a large dataset (for example, for machine learning), this can consume a lot of resources on the login nodes. Changing some settings in your settings.json file on Setonix can prevent this issue.

Terminal 4. Updating the settings.json file
# Create the settings.json file (and its directory, if it does not exist yet)

$ mkdir -p ~/.vscode-server/data/Machine
$ touch ~/.vscode-server/data/Machine/settings.json

# Add the following to settings.json with your favourite text editor.
# If the file already contains settings, merge these entries into the existing { } object.

{
  "files.watcherExclude": {
    "**/.git/objects/**": true,
    "**/.git/subtree-cache/**": true,
    "**/node_modules/*/**": true,
    "/usr/local/**": true,
    "/scratch/**": true
  },

  "search.followSymlinks": false,

  "search.exclude": {
    "**/.git/objects/**": true,
    "**/.git/subtree-cache/**": true,
    "**/node_modules/*/**": true,
    "/usr/local/**": true,
    "/scratch/**": true
  }
}

