Submitting jobs

When analysing data on a single machine, such as a laptop, we run commands or scripts in the terminal and the results are returned to that same terminal. For example, the sleep command pauses execution for a defined period of time (in seconds) before returning control to the terminal.

Let's tell our machine to pause for 60 seconds.


In [ ]:
sleep 60

When using a cluster, these commands or scripts need to be submitted as jobs. To submit a job with LSF, we use the bsub command.

Now, let's submit the previous command as a job using bsub.


In [ ]:
bsub "sleep 60"
Returning output by mail is not supported on this cluster.
Please use the -o option to write output to disk.
Job <4015755> is submitted to default queue <normal>.

When you submit a job, it is given a unique identifier (e.g. 4015755), which we can use to get updates on what's happening with our job. To find out which jobs are scheduled and running, we can use another command, bjobs.
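In a script, it can be handy to capture that identifier rather than copy it by hand. The following is a minimal sketch, assuming bsub prints the "Job <...> is submitted" message in the form shown above; extract_job_id is a hypothetical helper name, not an LSF command.

```shell
# extract_job_id (hypothetical helper): read bsub's message on stdin and
# print only the numeric job ID found between "Job <" and ">".
extract_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Demonstration using the submission message from this tutorial.
echo "Job <4015755> is submitted to default queue <normal>." | extract_job_id

# On a cluster with LSF available, the helper could read bsub's output directly:
# job_id=$(bsub "sleep 60" 2>&1 | extract_job_id)
```

Lines that do not match the submission message, such as the warning about mail output, are ignored, so only the job ID is printed.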

We can see how our job is getting on by running bjobs.


In [ ]:
bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
4015755 abc     PEND  normal     pcs5b                   sleep 60   Jan 15 14:06

This will give us the following information:

  • JOBID - unique numerical job identifier used to keep track of the job
  • USER - username of the person who submitted the job
  • STAT - job state
  • QUEUE - which queue the job was submitted to
  • FROM_HOST - which host the job was submitted from
  • EXEC_HOST - which host the job is running on (blank if job is pending)
  • JOB_NAME - name of the job
  • SUBMIT_TIME - when the job was submitted

Most jobs will have one of three states (STAT):

  • PEND - the job is waiting in the queue to be scheduled and dispatched
  • RUN - the job has been dispatched to a host (node) and is running
  • DONE - the job finished normally (has an exit value of 0)

Occasionally, you may also see suspended job states:

  • PSUSP - job was suspended by the owner or administrator while pending
  • USUSP - job was suspended by the owner or administrator after being dispatched
  • SSUSP - job was suspended by LSF after being dispatched

In this example, we can see that our job was submitted to the normal queue by default and that it hasn't started yet (PEND).
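The STAT column can also be read by a script instead of by eye. This is a rough sketch, assuming the default bjobs column layout shown above, where the job ID is the first field and the state is the third; job_state is a hypothetical helper name.

```shell
# job_state (hypothetical helper): read bjobs-style output on stdin and
# print the STAT column (third field) for the row whose JOBID matches $1.
job_state() {
    awk -v id="$1" '$1 == id { print $3 }'
}

# Demonstration on sample output in the style shown above.
job_state 4015755 <<'EOF'
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
4015755 abc     PEND  normal     pcs5b                   sleep 60   Jan 15 14:06
EOF

# On a cluster with LSF available:
# bjobs | job_state 4015755
```

If the job ID is not in the listing (for example, because the job has already finished), the helper prints nothing.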

Let's check again.


In [ ]:
bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
4015755 abc     RUN   normal     pcs5b       pcs5c       sleep 60   Jan 15 14:06

Here we can see that our job has now started running (RUN) and is being executed on pcs5c (EXEC_HOST).

Let's wait a little longer and check one more time.


In [ ]:
bjobs
No unfinished job found

Now we can't see any jobs. Why? By default, bjobs only lists unfinished jobs (pending, running or suspended). Our job has finished running and we have no more jobs scheduled, so there is nothing left to show.
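Rather than re-running bjobs by hand, the check can be repeated in a loop until the job drops out of the list. This is only a sketch: wait_for_job is a hypothetical helper, and it assumes the job ID appears in the first column of the bjobs output for as long as the job is unfinished.

```shell
# wait_for_job (hypothetical helper): poll bjobs until the given job ID
# no longer appears in the first column of its output.
#   $1 = job ID, $2 = seconds between checks (defaults to 10)
wait_for_job() {
    job_id=$1
    interval=${2:-10}
    # awk exits with status 0 while the job is still listed, 1 otherwise.
    while bjobs 2>/dev/null |
        awk -v id="$job_id" '$1 == id { found = 1 } END { exit !found }'
    do
        sleep "$interval"
    done
}

# Example use on a cluster with LSF available:
# wait_for_job 4015755 10 && echo "Job 4015755 is no longer pending or running"
```

Note that this only tells us the job is no longer unfinished; it does not say whether the job ended successfully.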

Did you notice the message that was returned when you submitted your job?

Returning output by mail is not supported on this cluster.
Please use the -o option to write output to disk.

This is because we used the default options and didn't specify an output file or an error file. We'll be looking at why printing the job outputs to files is good practice (and very useful!) in the next section.
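As a brief preview, the sketch below wraps bsub so that every submission gets an output file (-o) and an error file (-e). The helper name submit_with_logs and the file naming pattern are assumptions made for illustration; %J is replaced by LSF with the job ID.

```shell
# submit_with_logs (hypothetical helper): submit a command with bsub and
# write its standard output and standard error to separate files.
#   $1 = prefix for the log file names, remaining arguments = the command
submit_with_logs() {
    name=$1
    shift
    # %J is expanded by LSF to the job ID, so each job gets its own files.
    bsub -o "${name}.%J.o" -e "${name}.%J.e" "$@"
}

# Example use on a cluster with LSF available:
# submit_with_logs sleep60 "sleep 60"
```

For job 4015755 this would be expected to produce files named sleep60.4015755.o and sleep60.4015755.e in the directory the job was submitted from.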

You can find more information on job submission in general by looking through the job submission and job information sections of the LSF user manual.


What's next?

For another look at queues, you can go back to queues. Otherwise, let's take a closer look at managing jobs.