From OpenWetWare

OpenPBS is a GPL'd batch queuing system. In brief, both the [Beowulf] and the [Roadrunner] clusters use OpenPBS to distribute compute-jobs across their several nodes.

Common OpenPBS Commands

  • qsub -- submit a compute-job for execution
  • qstat -- query status of the queues
  • qdel -- remove a (possibly running) compute-job
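
As a rough sketch of how these three commands fit together, the snippet below writes a tiny job script and submits it. The file name hello.pbs, the job name, and the resource requests are illustrative assumptions, not site defaults; check your cluster's limits before reusing them. The submission step is skipped when qsub is not on the PATH.

```shell
#!/bin/sh
# Write a minimal job script. File name, job name, and resource
# requests are illustrative assumptions.
cat > hello.pbs <<'EOF'
#!/bin/sh
#PBS -N hello
#PBS -l nodes=1
#PBS -l walltime=00:05:00
# PBS sets PBS_O_WORKDIR to the directory qsub was run from.
cd "$PBS_O_WORKDIR"
echo "Running on $(hostname)"
EOF

# Submit only when qsub is actually installed on this machine.
if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub hello.pbs)      # qsub prints the new job's ID
    echo "submitted $jobid"
    qstat "$jobid"               # show the job's current state
    # qdel "$jobid"              # uncomment to cancel the job
else
    echo "qsub not found; wrote hello.pbs only"
fi
```

The job ID printed by qsub is the handle that qstat and qdel expect, so it is worth capturing in a variable as shown.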

Uncommon OpenPBS Commands

  • offline-node -- a simple shell wrapper to [pbsnodes]
  • pbsnodes -- maintain which compute-nodes participate in the queuing system
  • for i in `qstat | grep <username> | awk -F. '{print $1}'` ; do qdel $i ; done -- delete all jobs submitted by <username>
  • checknode <nodename> -- check node with name <nodename>
  • diagnose -n | grep -v nonspeedy -- look at all speedy nodes
  • tracejob <jobid> -- look at the history of a particular job with ID <jobid> that was run
  • diagnose -p -- show priorities
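
The delete-all-jobs loop above matches <username> anywhere on a qstat line, so it can also catch another user's job whose name happens to contain that string. A slightly stricter sketch is shown below. It assumes the default qstat layout, in which the third column is the job owner, and the helper name jobs_for_user is hypothetical. Where qselect is installed, qselect -u <username> may be a simpler way to list the same job IDs.

```shell
#!/bin/sh
# jobs_for_user: read qstat-style output on stdin and print the numeric
# part of each job ID whose owner column equals the given user exactly.
# Assumes the default qstat layout (owner in the third column).
jobs_for_user() {
    awk -v user="$1" '$3 == user { split($1, id, "."); print id[1] }'
}

# Dry run against the live queue, only when qstat is available.
if command -v qstat >/dev/null 2>&1; then
    for id in $(qstat | jobs_for_user "${USER:-}"); do
        echo "would delete job $id"   # replace echo with: qdel "$id"
    done
fi
```

The dry-run echo makes it easy to check the list of job IDs before swapping in qdel.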

Other Important Queuing Bits

  • qstat -q -- What queues exist?
  • showstart -- When will my job run?
  • fairshare -- How much of the cluster has my group used?
  • showq -- show the order in which jobs will start

How to manage job output

  • email
  • stdout and stderr
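
Both kinds of output can be controlled with #PBS directives at the top of a job script. The sketch below writes such a script. The file names, the job name, and the address user@example.com are placeholders, and whether mail is actually delivered depends on how the cluster is configured.

```shell
#!/bin/sh
# Write a job script that sets mail notification and output files.
# All names and the e-mail address are placeholders.
cat > output-demo.pbs <<'EOF'
#!/bin/sh
#PBS -N output-demo
# Send mail when the job aborts (a), begins (b), and ends (e).
#PBS -m abe
#PBS -M user@example.com
# Write stdout and stderr to these files.
#PBS -o output-demo.out
#PBS -e output-demo.err
cd "$PBS_O_WORKDIR"
echo "this line goes to stdout"
echo "this line goes to stderr" >&2
EOF

echo "wrote output-demo.pbs; submit it with: qsub output-demo.pbs"
```

To collect both streams in a single file, the -o and -e lines can be replaced with "#PBS -j oe", which merges stderr into stdout.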