
How do I schedule a job on multiple nodes with qsub Univa 8.1.7?


I want to schedule a job on multiple nodes, with one process per node. I also want each process to run threads on all of the cores available on its node. I know that "ppn" is used for scheduling PBS jobs, so I tried it with the Univa scheduler. The colon delimiter doesn't work, so I used two '-l' flags. I tried:
qsub -cwd -j y -l nodes=4 -l ppn=1 -N hellonodes mpirunscript.sh
This gives
Unable to run job: unknown resource "ppn".
Exiting.
In the man page of qsub it states
complex(5) describes how a list of available resources and their
associated valid value specifiers can be obtained.
Unfortunately, no such documentation exists on the cluster I am using. However, I found a copy online. Eventually I discovered that to get the list of settable resource values, I needed to run
qconf -sc
This printed the following (abbreviated):
#name          shortcut  type    relop  requestable  consumable  default  urgency
#---------------------------------------------------------------------------------
...
cpu            cpu       DOUBLE  >=     YES          NO          0        0
...
m_numa_nodes   nodes     INT     <=     YES          NO          0        0
m_socket       socket    INT     <=     YES          NO          0        0
m_thread       thread    INT     <=     YES          NO          0        0
...
num_proc       p         INT     ==     YES          NO          0        0
...
slots          s         INT     <=     YES          YES         1        1000
...
"ppn" (processes per node for PBS) was not listed, nor was anything similar that I could find. Can anyone tell me if this is possible, and if so, how?
Since this is a parallel job, you need to request a parallel environment (PE) with -pe.
An administrator first has to create a parallel environment that fulfills your requirements. Once created, it is persistent and can be reused for this type of parallel job. See: http://www.gridengine.eu/mangridengine/htmlman5/sge_pe.html
To create a parallel environment: qconf -ap mype
To list all PEs: qconf -spl
Then attach the PE to your queue: qconf -mq all.q (in the case of all.q)
--> add "pe_list mype" to the queue configuration
The important setting is allocation_rule.
Set it to 1, which means one process (slot) per compute host.
Set slots to a high value (such as the number of cores in your cluster). It is the limit on the total number of slots used by all jobs running in this parallel environment.
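As a rough sketch of what such a PE definition might look like, the snippet below writes one to a file that an administrator could load with qconf -Ap. Only allocation_rule and slots come from the steps above. The file name mype.conf, the slots value 999, and every other field are assumptions based on the sge_pe(5) layout. Univa 8.1.7 may expect additional or different fields, so compare against the output of qconf -sp for an existing PE on your cluster before using it.

```shell
#!/bin/sh
# Hypothetical PE definition. Only allocation_rule and slots are taken from
# the answer; the remaining fields follow the sge_pe(5) layout and are
# assumptions that may need adjusting for your Grid Engine version.
cat > mype.conf <<'EOF'
pe_name            mype
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    1
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
EOF

# An administrator would then load and inspect it (not run here):
#   qconf -Ap mype.conf
#   qconf -sp mype
```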
Then you or your users can submit the job: qsub -pe mype 8 myscript.sh
The job then gets 8 compute nodes with 1 slot each. qstat -g t shows where they are.
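Inside the job script, one possible way to let the single process on each node use all of that node's cores is to set the thread count from the local core count. The sketch below is not a tested recipe and rests on several assumptions: threading is controlled through OMP_NUM_THREADS (OpenMP), all nodes have the same number of cores as the node running the script, the launcher is Open MPI's mpirun, and ./my_hybrid_app is a hypothetical program name. PE_HOSTFILE and NSLOTS are set by Grid Engine for parallel jobs.

```shell
#!/bin/sh
# Sketch of a job script body for: qsub -pe mype 8 myscript.sh

# Count distinct hosts in a Grid Engine PE host file
# (one line per host: "hostname slots queue processor-range").
pe_host_count() {
    awk '{ print $1 }' "$1" | sort -u | wc -l | tr -d ' '
}

# PE_HOSTFILE and NSLOTS only exist when the job runs under a PE.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "hosts granted: $(pe_host_count "$PE_HOSTFILE"), slots: ${NSLOTS:-unknown}"
fi

# Assumption: every node has as many cores as the node running this script.
OMP_NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
export OMP_NUM_THREADS
echo "threads per process: $OMP_NUM_THREADS"

# Hypothetical launch line (Open MPI syntax, not run here):
#   mpirun -np "$NSLOTS" --map-by ppr:1:node -x OMP_NUM_THREADS ./my_hybrid_app
```

Keep in mind that the scheduler only accounts for the one slot requested per host, so other jobs may still be placed on the same nodes unless your site configures exclusive access.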
Does this help?
Daniel
