IET Cluster manual

This manual describes the usage of the IET cluster. Issues and suggestions can be submitted through the issue tracker.

Getting started

DISCLAIMER: no backup of your data is provided; backing up your data is the responsibility of the user!

Support

If you need support regarding the cluster, make sure to consult the support page first!

Account

To get an account for the cluster, please get in touch with the IET support team via iet-support@ost.ch.

Cluster layout

  • Login node: shpc0003.ost.ch (Rocky Linux 8.6). Users connect to this node via ssh/sftp/scp.
  • Master node: shpc0002.ost.ch (Rocky Linux 8.6).
  • Backup (old master): Rocky Linux 8.6.
  • Compute nodes: node01 (RHEL 8.6), node02 (RHEL 8.6), node03 to node24 (Rocky Linux 8.6).

Connect to the Cluster

To connect to the cluster, it is recommended to use ssh, as it is available on both Linux and Windows (PowerShell). For further instructions on how to connect to the cluster with ssh, please refer to the ssh page.
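
For example, to open a connection to the login node (replace <username> with your cluster user name):

ssh <username>@shpc0003.ost.ch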

If you prefer a graphical interface check out the vnc page.

Relevant folders

The folders most relevant for general usage of the cluster are:

  • Your home directory: shpc0002.ost.ch:/home/OST/<username>.
  • The data and project folders, found in shpc0002.ost.ch:/data.
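
As a minimal sketch, a local file can be copied into your home directory with scp via the login node. This assumes your home directory is also reachable from shpc0003.ost.ch, as in the sshfs example further down; results.csv is just a placeholder file name.

scp results.csv <username>@shpc0003.ost.ch:~/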

For more details about the folder layout refer to the chapter Folder layout.

Software

Proprietary software is installed via Ansible scripts; FOSS software is installed via Guix.

The list of installed proprietary software can be found in software.yaml.

The list of installed FOSS software can be found in guix-modules.scm.

Modules

To use any specific software on the cluster, you first need to load its corresponding module. To do so:

  • List the available modules with module avail
  • Load a module via module load <module_name>
  • Unload a module via module unload <module_name>
  • Purge all loaded modules via module purge
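
A typical sequence might look like this (<module_name> is a placeholder, pick one from the output of module avail):

module avail                  # list all available modules
module load <module_name>     # load the software you need
module list                   # show the currently loaded modules
module purge                  # unload everything when you are done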

More information can be found in the chapter Environment modules.

Submitting a job

Job management and resource allocation on the cluster is handled by SLURM. To submit a job to SLURM, use the slurm-submit script ssubmit. It facilitates general job submission and also provides helper commands for

  • starccm
  • comsol
  • matlab
  • moldex3d
  • openfoam

Detailed instructions can be found in the README.

More information about SLURM on the cluster can be found in the SLURM chapter.
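
If you want to bypass ssubmit and talk to SLURM directly, a job can also be submitted with a plain batch script via sbatch. The following is a minimal sketch, not a site-specific template: job name, resource values, module name, and the executable are placeholders.

#!/bin/bash
#SBATCH --job-name=example        # placeholder job name
#SBATCH --ntasks=1                # number of tasks
#SBATCH --cpus-per-task=4         # CPU cores per task
#SBATCH --time=01:00:00           # wall-clock time limit

module load <module_name>         # load the software the job needs
srun ./my_program                 # placeholder executable

Save the script as, for example, job.sh and submit it with sbatch job.sh.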

Job status

To get an overview of SLURM's queue status and the current load on each node, open http://shpc0002.ost.ch in your browser.

To check any job's status, you can:

  • Use squeue to check the job ID, time consumed, and queuing status of any job.
  • Check the standard output and standard error, which are directed to a file named slurm-%j.out, where %j is replaced with the job allocation number (found here).
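
For example, to list only your own jobs and follow a job's output file as it is written (<job-ID> is a placeholder):

squeue -u <username>              # show only your own jobs
tail -f slurm-<job-ID>.out        # follow the job's output live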

To cancel a running job, use scancel <job-ID>.

Accessing your data locally

To access your data on the cluster locally, you can, for example, use sshfs, a filesystem client based on SSH. To mount a filesystem, you type, for example:

sshfs <username>@shpc0003.ost.ch:~/run ~/shpc0003.ost.ch/run
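
Note that the local mount point must exist before mounting; create it first if necessary:

mkdir -p ~/shpc0003.ost.ch/run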

To unmount it, type one of the following:

fusermount3 -u mountpoint   # Linux
umount mountpoint           # OS X, FreeBSD

More information can be found in the chapter Data access.