Computing/Linux/OpenPBS

OpenPBS is a GPL'd batch queuing system. Both the [Beowulf] and the [Roadrunner] clusters use OpenPBS to distribute compute-jobs across their nodes.

Common OpenPBS Commands

 * qsub -- submit a compute-job for execution
 * qstat -- query status of the queues
 * qdel -- remove a (possibly running) compute-job
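
The three commands above are usually used together: qsub prints a job identifier, and qstat and qdel accept that identifier. The following is a minimal sketch of that cycle, not a site-specific recipe. The function names short_job_id and submit_and_show and the identifier 1234.headnode are made up for illustration, and submit_and_show is only defined, not called, because it needs a working PBS installation.

```shell
#!/bin/sh
# Strip the server suffix from a full job identifier, e.g. 1234.headnode -> 1234.
short_job_id() {
    printf '%s\n' "${1%%.*}"
}

# Submit a job script, show its status, and print its short ID.
# Defined but not called here: it needs qsub and qstat on the PATH.
submit_and_show() {
    jobid=$(qsub "$1") || return 1   # qsub prints the new job identifier on stdout
    qstat "$jobid"                   # status of just this job
    short_job_id "$jobid"            # pass "$jobid" to qdel to remove the job
}

# Hypothetical identifier, used only to show the suffix stripping.
short_job_id 1234.headnode
```

On a cluster, something like submit_and_show myjob.sh would submit the (hypothetical) script myjob.sh; the job can later be removed with qdel and the identifier that qsub printed.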

Uncommon OpenPBS Commands

 * offline-node -- a simple shell wrapper to [pbsnodes]
 * pbsnodes -- maintain which compute-nodes participate in the queuing system
 * for i in `qstat | grep USERNAME | awk -F. '{print $1}'` ; do qdel $i ; done -- delete all submitted jobs owned by USERNAME
 * checknode NODENAME -- check the node named NODENAME
 * diagnose -n | grep -v nonspeedy -- look at all speedy nodes
 * tracejob JOBID -- look at the history of the job with ID JOBID
 * diagnose -p -- show priorities
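
The delete-all-my-jobs loop in the list above matches a user name anywhere on a qstat line, so it can also catch another user's job whose name happens to contain that string. A slightly stricter sketch compares the user column exactly. It assumes the default qstat column layout (job id, name, user, time, state, queue); the function name job_ids_for_user and the sample text are invented for illustration and are not real cluster output.

```shell
#!/bin/sh
# Print the numeric IDs of jobs owned by one user, reading qstat-style text on stdin.
# Assumes the user name is the third whitespace-separated column.
job_ids_for_user() {
    awk -v user="$1" '$3 == user { split($1, parts, "."); print parts[1] }'
}

# Hypothetical sample standing in for real qstat output.
sample='Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
101.headnode     sim_a            alice            00:10:02 R batch
102.headnode     alice_data       bob              00:00:00 Q batch
103.headnode     sim_b            alice            00:00:00 Q batch'

# Only jobs 101 and 103 belong to alice; job 102 merely has "alice" in its name.
printf '%s\n' "$sample" | job_ids_for_user alice
```

Against a live queue the same function would be fed from qstat, for example: for id in $(qstat | job_ids_for_user USERNAME) ; do qdel "$id" ; done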

Other Important Queuing Bits

 * qstat -q -- What queues exist?
 * showstart -- When will my job run?
 * fairshare -- How much of the cluster is my group using?
 * showq -- In what order will jobs start?

How to manage job output

 * email
 * stdout and stderr
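
Both kinds of output can be requested from inside the job script with #PBS directive lines, which qsub reads and the shell treats as ordinary comments. The following is a minimal sketch, assuming a plain sh job; the job name demo_job, the file names, the address user@example.com and the function report are placeholders, not values from either cluster.

```shell
#!/bin/sh
#PBS -N demo_job
#PBS -o demo_job.out
#PBS -e demo_job.err
#PBS -m abe
#PBS -M user@example.com
# -o / -e name the files that receive stdout and stderr.
# -m abe mails on abort, begin and end; -M sets the recipient.
# (Alternative to separate files: the option -j oe merges stderr into stdout.)

# PBS sets PBS_O_WORKDIR to the directory qsub was run from;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-.}" || exit 1

report() {
    echo "job started on $(uname -n)"                 # ends up in demo_job.out
    echo "warning: this line goes to stderr" >&2      # ends up in demo_job.err
}

report
```

Saved under a name such as demo_job.sh (hypothetical), the script would be submitted with qsub demo_job.sh. Run directly with sh, it prints both lines to the terminal, which is a quick way to check the script before queuing it.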