Frequently asked questions

Frequent problems

“Illegal instruction” (`SIGILL`)

Your job causes an “illegal instruction” or SIGILL error, for instance:

[alice@ip09 ~]$ cat slurm-3304643.out
/var/spool/slurmd/job3304643/slurm_script: line 11: 25047 Illegal instruction   (core dumped) python my_script.py

Such errors happen when a program was optimised for a specific class of processors but is being run on an incompatible processor.

If you compile your code on ip09 with GCC, use the -march=core-avx2 optimisation option. If you use the Intel compilers, the corresponding option is -xCORE-AVX2. Do not use -march=native or -xHost: these options optimise for the Intel processors on ip09, and the resulting program can be incompatible with the AMD processors of some compute nodes.
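To check whether a given node supports the AVX2 instructions targeted by the options above, one possible diagnostic is to look at the CPU flags reported by the Linux kernel. The sketch below assumes an x86 Linux node where /proc/cpuinfo contains a "flags" line; cpu_has_flag is a hypothetical helper name, not a tool provided by the platform.

```shell
# Hypothetical helper: succeeds if the given CPU flag is listed in a
# cpuinfo-style file (defaults to /proc/cpuinfo on Linux).
cpu_has_flag() {
    local flag="$1"
    local cpuinfo="${2:-/proc/cpuinfo}"
    # Take the first "flags" line and look for the flag as a whole word.
    grep -m1 '^flags' "$cpuinfo" | grep -qw -- "$flag"
}

if cpu_has_flag avx2; then
    echo "This node reports AVX2 support"
else
    echo "This node does not report AVX2 support"
fi
```

Running this inside a job on the node where the error occurred shows whether that node lacks an instruction set the program was compiled for.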

My computation is much slower than on my laptop

You tested a parallel program on your laptop (or another computer). The program also works in a job on the IQ HPC Platform but is much slower there. What is going on?

First, make sure to use the same number of CPU cores on your laptop and in your job to ensure the comparison is valid. For instance, if your laptop has 8 cores and your program uses them all, you should request 8 CPU cores in your job. If you ask for fewer, or, worse, a single core, performance will of course be lower!

If the problem persists, check whether your job uses its allocated resources correctly. A common problem is a parallel program that tries to use all the CPU cores on a compute node even though they are not all allocated to the job. For instance, a program could start 96 execution threads to try to use the 96 CPU cores on a compute node from the c-iq partition, even though that job only has access to a single core. The operating system will then alternate between the 96 threads so that each one gets a share of CPU time. This context switching significantly degrades performance. In addition, thread synchronisation problems can make performance even worse.

In most cases, the number of CPU cores allocated to a job should exactly match the number of processes or threads started by the job. If a job starts too many threads or processes, htop will show more execution threads than expected, and each thread will use less than the expected 100% of CPU time.
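As a complement to htop, the thread count of a running process can be compared with the number of allocated cores from the command line. This is only a sketch: it assumes a Linux node where /proc/PID/status has a "Threads:" line, and check_threads is a hypothetical helper name. When no core count is given, it falls back to SLURM_CPUS_PER_TASK, or to 1 if that variable is unset.

```shell
# Hypothetical helper: compare the thread count of a process with the
# number of allocated cores.
# Usage: check_threads PID [CORES]
check_threads() {
    local pid="$1"
    local cores="${2:-${SLURM_CPUS_PER_TASK:-1}}"
    local threads
    # The Linux kernel reports the thread count in /proc/PID/status.
    threads=$(awk '/^Threads:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    if [ -z "$threads" ]; then
        echo "cannot read thread count for PID $pid" >&2
        return 1
    fi
    if [ "$threads" -gt "$cores" ]; then
        echo "oversubscribed: $threads thread(s) for $cores allocated core(s)"
    else
        echo "ok: $threads thread(s) for $cores allocated core(s)"
    fi
}

# Example: inspect the current shell itself.
check_threads $$
```

An "oversubscribed" result for your program's PID suggests that it starts more threads than the job has cores.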

For parallel programs that use OpenMP execution threads, the problem can be solved by setting the number of threads to match the number of allocated cores. Add this instruction to your job script before your actual calculation:

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
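The ${SLURM_CPUS_PER_TASK:-1} expansion falls back to 1 when the variable is unset or empty. Slurm only sets it when --cpus-per-task is requested. The following sketch, meant for an interactive shell rather than a job script, simulates both cases; the value 8 is just an illustrative core count.

```shell
# Case 1: the variable is unset, as when --cpus-per-task was not requested.
unset SLURM_CPUS_PER_TASK
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "without --cpus-per-task: OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Case 2: simulate a job submitted with --cpus-per-task=8.
SLURM_CPUS_PER_TASK=8
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "with --cpus-per-task=8: OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

With this default, a job that forgets to request cores per task runs single-threaded instead of starting one thread per core on the node.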

Some Intel MKL functions are automatically parallelised. In HPC, this implicit parallelism is usually undesirable. You can deactivate it with:

export MKL_NUM_THREADS=1

If you would rather use these parallel algorithms, set the number of threads with:

export MKL_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
