SLURM how to qsub a task when another task is finished?


I am currently using a Linux-based HPC cluster that only accepts jobs through SLURM, and it limits each job to 12 hours of run time. However, I may need to run 24 jobs back to back over a week to get good results.
Is there a way to run a job again automatically when it finishes?
Kind regards
Edit:
When a job finishes, a .out file is created; in other words, the number of .out files increases by one.
Is it possible to requeue the job when the number of .out files increases?
#!/bin/bash
#!
#! Example SLURM job script for Darwin (Sandy Bridge, ConnectX3)
#! Last updated: Sat Apr 18 13:05:53 BST 2015
#!
#!#############################################################
#!#### Modify the options in this section as appropriate ######
#!#############################################################
#! sbatch directives begin here ###############################
#! Name of the job:
#SBATCH -J Validation
#! Which project should be charged:
#SBATCH -A SOGA
#! How many whole nodes should be allocated?
#SBATCH --nodes=1
#! How many (MPI) tasks will there be in total? (<= nodes*16)
#SBATCH --ntasks=1
#!SBATCH --mem=200
#! How much wallclock time will be required?
#SBATCH --time=12:00:00
#SBATCH --mail-user=zl352
#SBATCH --mail-type=ALL
#! Uncomment this to prevent the job from being requeued (e.g. if
#! interrupted by node failure or system downtime):
##SBATCH --no-requeue
#! Do not change:
#SBATCH -p sandybridge
#! sbatch directives end here (put any additional directives above this line)
#! Notes:
#! Charging is determined by core number*walltime.
#! The --ntasks value refers to the number of tasks to be launched by SLURM only. This
#! usually equates to the number of MPI tasks launched. Reduce this from nodes*16 if
#! demanded by memory requirements, or if OMP_NUM_THREADS>1.
#! Each task is allocated 1 core by default, and each core is allocated 3994MB. If this
#! is insufficient, also specify --cpus-per-task and/or --mem (the latter specifies
#! MB per node).
#! Number of nodes and tasks per node allocated by SLURM (do not change):
numnodes=$SLURM_JOB_NUM_NODES
numtasks=$SLURM_NTASKS
mpi_tasks_per_node=$(echo "$SLURM_TASKS_PER_NODE" | sed -e 's/^\([0-9][0-9]*\).*$/\1/')
#! ############################################################
#! Modify the settings below to specify the application's environment, location
#! and launch method:
#! Optionally modify the environment seen by the application
#! (note that SLURM reproduces the environment at submission irrespective of ~/.bashrc):
. /etc/profile.d/modules.sh # Leave this line (enables the module command)
module purge # Removes all modules still loaded
module load default-impi # REQUIRED - loads the basic environment
#! Insert additional module load commands after this line if needed:
#! Full path to application executable:
application="$HOME/scratch/code7/viv" # NB: ~ does not expand inside quotes, so use $HOME
#! Run options for the application:
options=" > test.e"
#! Work directory (i.e. where the job will run):
workdir="$SLURM_SUBMIT_DIR" # The value of SLURM_SUBMIT_DIR sets workdir to the directory
# in which sbatch is run.
#! Are you using OpenMP (NB this is unrelated to OpenMPI)? If so increase this
#! safe value to no more than 16:
export OMP_NUM_THREADS=1
#! Number of MPI tasks to be started by the application per node and in total (do not change):
np=$[${numnodes}*${mpi_tasks_per_node}]
#! The following variables define a sensible pinning strategy for Intel MPI tasks -
#! this should be suitable for both pure MPI and hybrid MPI/OpenMP jobs:
export I_MPI_PIN_DOMAIN=omp:compact # Domains are $OMP_NUM_THREADS cores in size
export I_MPI_PIN_ORDER=scatter # Adjacent domains have minimal sharing of caches/sockets
#! Notes:
#! 1. These variables influence Intel MPI only.
#! 2. Domains are non-overlapping sets of cores which map 1-1 to MPI tasks.
#! 3. I_MPI_PIN_PROCESSOR_LIST is ignored if I_MPI_PIN_DOMAIN is set.
#! 4. If MPI tasks perform better when sharing caches/sockets, try I_MPI_PIN_ORDER=compact.
#! Uncomment one choice for CMD below (add mpirun/mpiexec options if necessary):
#! Choose this for a MPI code (possibly using OpenMP) using Intel MPI.
#!CMD="mpirun -ppn $mpi_tasks_per_node -np $np $application $options"
#! Choose this for a pure shared-memory OpenMP parallel program on a single node:
#! (OMP_NUM_THREADS threads will be created):
CMD="$application $options"
#! Choose this for a MPI code (possibly using OpenMP) using OpenMPI:
#!CMD="mpirun -npernode $mpi_tasks_per_node -np $np $application $options"
###############################################################
### You should not have to change anything below this line ####
###############################################################
cd $workdir
echo -e "Changed directory to `pwd`.\n"
JOBID=$SLURM_JOB_ID
echo -e "JobID: $JOBID\n======"
echo "Time: `date`"
echo "Running on master node: `hostname`"
echo "Current directory: `pwd`"
if [ "$SLURM_JOB_NODELIST" ]; then
#! Create a machine file:
export NODEFILE=`generate_pbs_nodefile`
cat $NODEFILE | uniq > machine.file.$JOBID
echo -e "\nNodes allocated:\n================"
echo `cat machine.file.$JOBID | sed -e 's/\..*$//g'`
fi
echo -e "\nnumtasks=$numtasks, numnodes=$numnodes, mpi_tasks_per_node=$mpi_tasks_per_node (OMP_NUM_THREADS=$OMP_NUM_THREADS)"
echo -e "\nExecuting command:\n==================\n$CMD\n"
eval $CMD
If your job is intrinsically restartable, all you need to do is call sbatch at the end of your submission script. Assuming the script is called submit.sh:
if ! job_is_done;
then
sbatch submit.sh
fi
The job_is_done part should be replaced by a command that returns 0 when the job is done (i.e. the computation finished, the process converged, etc.), for instance by grepping the log file for certain clues.
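For the script above, a minimal sketch of such a check. The marker string "Converged" and the log file name test.e are assumptions for illustration; substitute whatever your application actually writes when it has fully finished:

```shell
# Hypothetical completion check: "Converged" and test.e are assumed
# names, not part of the original script.
job_is_done() {
    grep -q "Converged" test.e 2>/dev/null
}

# At the very end of submit.sh you would then write:
#   job_is_done || sbatch submit.sh
```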
You can also re-queue the job:
job_is_done || scontrol requeue $SLURM_JOB_ID
If your program is not intrinsically restartable, you could use a checkpointing wrapper such as DMTCP to make it restartable.
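Alternatively, if you know up front how many 12-hour chunks you need, you can submit the whole chain at once using SLURM job dependencies: `afterany` makes each link start only after the previous job has ended, whether it completed normally or hit the wall-time limit. A sketch, assuming the job script is called submit.sh:

```shell
# Submit n dependent copies of submit.sh; each starts only after the
# previous one ends (afterany also fires when the 12-hour limit is hit).
chain_jobs() {
    local n=$1 jobid
    jobid=$(sbatch --parsable submit.sh)    # first link; --parsable prints only the job id
    for _ in $(seq 2 "$n"); do
        jobid=$(sbatch --parsable --dependency=afterany:"$jobid" submit.sh)
    done
    echo "$jobid"                           # id of the last link in the chain
}

# e.g. 14 x 12 h covers a full week of runtime:
# chain_jobs 14
```

Note that with `afterany` each job must itself detect whether the work is already done (e.g. via the job_is_done check above) and exit immediately if so, since the later links run unconditionally.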
