SSH and SLURM basics, and how to send a notification email upon job completion

How to run – basic

To run a .sh file with a specific environment using SLURM, you can follow these steps:

  1. Load the required modules: If you need to load any specific modules (e.g., Python, Anaconda), you can do so using the module load command.
  2. Activate the environment: Use the appropriate command to activate your environment. For example, if you’re using Conda, you can use conda activate my_env.
  3. Run your script: Finally, run your script within the activated environment.

Here’s an example of a SLURM script (job.sh) that demonstrates these steps:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --output=output.txt
#SBATCH --error=error.txt
#SBATCH --time=01:00:00
#SBATCH --partition=standard
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Load the required module
module load anaconda

# Activate the environment
source activate my_env

# Run your script
bash my_script.sh

Note that on some servers you need to give the exact module name and version, and initialize Conda explicitly, for example:

module load miniconda3/py310/23.1.0-1
eval "$(conda shell.bash hook)" #initializes Conda in the current shell session
conda activate my_environment
bash my_script.sh # run interactively 
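
When an environment does not seem to activate, it can help to print what the shell actually sees before running the script. Here is a minimal sketch; report_env is a hypothetical helper name, and it assumes Conda sets the CONDA_DEFAULT_ENV variable on activation:

```shell
# Hypothetical helper: show which Conda environment and Python are active
report_env() {
    # CONDA_DEFAULT_ENV is set by "conda activate"; empty means nothing is active
    echo "Conda env: ${CONDA_DEFAULT_ENV:-<none active>}"
    # command -v prints the path of the python that would be run, if any
    echo "Python:    $(command -v python || echo '<not on PATH>')"
}

report_env
```

Placing a call to report_env right after the activation line in job.sh writes this information to the job's output file.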

Run your script with queueing

Now that you have your script ready, you can submit your job to SLURM using the sbatch command. Here are the steps:

  1. Open your terminal.
  2. Navigate to the directory where your job.sh script is located. You can use the cd command to change directories.
  3. Submit your job using the sbatch command followed by the name of your script:
sbatch job.sh

(Instead of navigating to the directory, you can pass the path to the script. For example, if job.sh is in a folder named scripts, use sbatch scripts/job.sh.)

Once you run this command, SLURM will schedule your job and you should see an output similar to this:

Submitted batch job 12345

The number 12345 is the Job ID assigned to your job by SLURM.
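
If you want to reuse the Job ID in later commands, you can capture it instead of copying it by hand. This is only a sketch: extract_job_id is a hypothetical helper, and it assumes the confirmation line has exactly the form shown above.

```shell
# Hypothetical helper: pull the numeric ID out of sbatch's confirmation line
extract_job_id() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Demonstrated on the sample line; on a cluster you would pipe sbatch into it:
#   job_id=$(sbatch job.sh | extract_job_id)
job_id=$(echo "Submitted batch job 12345" | extract_job_id)
echo "$job_id"
```

sbatch also has a --parsable option that prints only the Job ID (followed by a semicolon and the cluster name on multi-cluster setups), which avoids the parsing step.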

You can monitor the status of your job using the squeue command:

squeue -u your_username
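
Rather than re-running squeue by hand, you could poll it in a loop until the job is no longer active. The sketch below is one possible approach, not the only one: wait_for_job is a hypothetical helper, the 30-second default interval is arbitrary, and it assumes a single (non-array) job.

```shell
# Hypothetical helper: block until a job leaves the active states
wait_for_job() {
    local job_id="$1" interval="${2:-30}" state
    while true; do
        # -h drops the header, -j selects the job, -o %T prints only its state
        state=$(squeue -h -j "$job_id" -o %T 2>/dev/null)
        case "$state" in
            PENDING|CONFIGURING|RUNNING|COMPLETING|SUSPENDED) sleep "$interval" ;;
            *) break ;;  # finished, or no longer known to squeue
        esac
    done
    echo "Job $job_id is no longer active (last state: ${state:-unknown})"
}
```

For example, wait_for_job 12345 60 checks once a minute. Keep the interval generous, since frequent polling puts load on the scheduler.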

To see more details about the job, you can use:

scontrol show job JOBID

Replace JOBID with the actual Job ID you received when submitting the job.

And if you need to cancel a job, you can use the scancel command followed by the Job ID:

scancel 12345

How to send a notification email upon job completion

You can configure Slurm to send you an email notification upon job completion by using the --mail-type and --mail-user options when submitting a job. Here’s how:

  1. Add these options to your Slurm job script:
     #SBATCH --mail-type=END
     #SBATCH --mail-user=your_email@example.com
    • --mail-type=END: Sends an email when the job ends.
    • --mail-user=your_email@example.com: Replace this with your actual email address.
  2. Submit your job: sbatch your_script.sh

Alternatively, you can pass these options directly on the sbatch command line instead of editing the script:

sbatch --mail-type=END --mail-user=your_email@example.com your_script.sh

If you’d like notifications for different job events (start, failure, etc.), you can change END to:

  • BEGIN (job starts)
  • FAIL (job fails)
  • ALL (any state change)
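
Event types can also be combined in a comma-separated list. As a sketch, the snippet below writes a small job script that asks for mail on both completion and failure; the file name job_with_mail.sh is just an illustration, and the email address is a placeholder.

```shell
# Write an example job script; the quoted 'EOF' keeps the content literal
cat > job_with_mail.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --output=output.txt
#SBATCH --error=error.txt
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=your_email@example.com

bash my_script.sh
EOF

# Show the directives that were written
grep '^#SBATCH --mail' job_with_mail.sh
```

You would then submit it as usual with sbatch job_with_mail.sh.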

Make sure that your Slurm configuration allows email notifications and that your system has sendmail or another mail service properly configured.
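
One way to see which mail program the cluster uses is to look at the MailProg setting in the Slurm configuration. This is a rough sketch (check_mail_prog is a hypothetical helper); it only reports what is configured and cannot tell you whether mail actually gets delivered.

```shell
# Hypothetical helper: print the mail program Slurm is configured to use
check_mail_prog() {
    if command -v scontrol >/dev/null 2>&1; then
        # MailProg is the slurm.conf parameter naming the mail executable
        scontrol show config | grep -i 'MailProg'
    else
        echo "scontrol not found; run this on a cluster login node"
    fi
}

check_mail_prog
```

If no email arrives even though a mail program is configured, your cluster administrators are the best people to ask.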

