Fig. 5. Slurm Job Flow for Composable Nodes

3.7 Slurm Job Results

In order to test the queues, three simple scripts were created:
  1. slurm-test.sh - request 0 GPUs from the normal queue
  2. slurm-test-gpu2.sh - request 2 GPUs from the 2gpu queue
  3. slurm-test-gpu4.sh - request 4 GPUs from the 4gpu queue
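
Submitting the three scripts can be sketched as a small loop around sbatch. This is a minimal illustration, not the author's procedure: the function name submit_tests and the SBATCH override variable are hypothetical, added only so the loop can be dry-run on a machine without Slurm.

```shell
#!/bin/bash
# Submit the three test scripts, one per partition.
# SBATCH is a hypothetical override (not a Slurm feature): setting
# SBATCH=echo prints the script names instead of submitting them.
submit_tests() {
    local submit="${SBATCH:-sbatch}"
    local script
    for script in slurm-test.sh slurm-test-gpu2.sh slurm-test-gpu4.sh; do
        # Stop at the first failed submission.
        "$submit" "$script" || return 1
    done
}
```

On a cluster, calling submit_tests with no override hands each script to sbatch; each script selects its own queue through its #SBATCH --partition line.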

Each script counts the number of GPUs available (using lspci) and waits 30 seconds before completing. The pertinent part of slurm-test-gpu4.sh is shown below.

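#SBATCH --partition=4gpu
# Seconds to hold the allocation so the queue state can be inspected
SLEEPTIME=30
ME=$(hostname)
# Count PCI devices whose lspci description mentions NVIDIA
GPUS=$(lspci | grep -ci nvidia)
echo "My name is $ME and I have $GPUS GPUs"
echo "Sleeping for $SLEEPTIME"
sleep "$SLEEPTIME"
echo done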
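
The script above only reports the GPU count. A job could also compare that count with what its partition is supposed to provide. The sketch below assumes the partition-to-GPU mapping of the three queues in this section. The function names expected_gpus, count_nvidia, and check_gpus are hypothetical. Counting lines that mention NVIDIA is a rough measure, since it would also count any non-GPU NVIDIA PCI function that lspci lists.

```shell
#!/bin/bash
# Map a partition name to the GPU count it is expected to provide.
# The mapping mirrors the three queues used in this section.
expected_gpus() {
    case "$1" in
        normal) echo 0 ;;
        2gpu)   echo 2 ;;
        4gpu)   echo 4 ;;
        *)      return 1 ;;
    esac
}

# Count lines mentioning NVIDIA in lspci-style text read from stdin.
# grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
count_nvidia() {
    grep -ci nvidia || true
}

# check_gpus PARTITION  (lspci output on stdin)
# Prints OK or MISMATCH; returns 0, 1 (mismatch), or 2 (unknown partition).
check_gpus() {
    local want have
    want=$(expected_gpus "$1") || { echo "unknown partition: $1" >&2; return 2; }
    have=$(count_nvidia)
    if [ "$have" -eq "$want" ]; then
        echo "OK: $have GPUs"
    else
        echo "MISMATCH: expected $want, found $have"
        return 1
    fi
}
```

Inside a running job this might be called as lspci | check_gpus "$SLURM_JOB_PARTITION", since Slurm sets SLURM_JOB_PARTITION to the partition the job is running in.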

The correct number of GPUs was reported for each script, as indicated in Table 1 above. While the tests were running, an sinfo command was run to show the state of the queues (output compressed and abbreviated):

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal* up inf 2 drain~ kraken-a,leviathan-a
2gpu    up inf 2 drain~ kraken-a-2gpu,leviathan-a-2gpu
4gpu    up inf 1 alloc# kraken-a-4gpu
4gpu    up inf 1 drain~ leviathan-a-4gpu

Notice that all the other nodes are in the drain state (not available); the "~" suffix on those nodes marks them as powered down. The "#" next to "alloc" indicates that the node is allocated and is powering up.
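
The state suffixes in the sinfo output above can be decoded with a small helper. This is a sketch only: the function name describe_state is hypothetical, and it handles just three of the suffixes that sinfo documents ("~", "#", and "*"); other flags pass through unchanged.

```shell
#!/bin/bash
# Describe the one-character flag on a compact sinfo node state,
# for example "alloc#" or "drain~".
describe_state() {
    local state="$1" base note
    case "$state" in
        *'~') base="${state%?}"; note="powered down (power saving)" ;;
        *'#') base="${state%?}"; note="powering up or being configured" ;;
        *'*') base="${state%?}"; note="not responding" ;;
        *)    base="$state";     note="no flag" ;;
    esac
    echo "$base: $note"
}
```

On a live cluster, one way to feed it would be sinfo -h -o "%P %t %N" | while read -r part state nodes; do echo "$part $nodes $(describe_state "$state")"; done, where %P, %t, and %N are the sinfo format fields for partition, compact state, and node list.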


Creative Commons License
©2005-2019 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.