To take a look at which queues are available, you can use the bqueues command.
In [ ]:
bqueues
QUEUE_NAME      PRIO STATUS        MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
system          1000 Open:Active    -    -    -    -     0     0     0     0
yesterday        500 Open:Active   20    8    -    -     0     0     0     0
small             31 Open:Active    -    -    -    -     0     0     0     0
normal            30 Open:Active    -    -    -    -    35    13     1     0
long               3 Open:Active   50    -    -    - 31686 31636    46     0
basement           1 Open:Active   20   10    -    -   180   170    10     0
This returns information about the queues which are available and how busy they are. Here, we can see six queues into which jobs can be submitted on the cluster.
By default, bqueues gives you a summary line for each queue, including its name, priority (PRIO), status, job slot limits and the number of jobs that are pending, running or suspended.
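If you only want some of these columns, a minimal sketch (assuming the default column layout shown above) is to pick them out with awk:
# print the queue name, priority and the number of pending and running jobs
bqueues | awk '{print $1, $2, $9, $10}'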
You may have some jobs which are more urgent than others and that you would like to be run sooner. In these instances, the priority of the queue is important.
Jobs submitted to higher priority queues are run first. You can check a queue's priority by looking at the PRIO column: the larger the value, the higher the priority of the queue. In this example, we can see that the yesterday queue has a much higher priority than the normal queue, so a job submitted to the yesterday queue will often be run before a job on the normal queue, provided the resources requested for that job are available.
For more information on priority and how this works, please see priority and fairshare.
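If you would like to see the queues ordered by priority, one rough approach (a sketch based on the default output shown above, not a dedicated LSF option) is to sort on the PRIO column:
# skip the header line, then sort numerically on column 2 (PRIO), highest priority first
bqueues | tail -n +2 | sort -k2,2nr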
Sometimes a queue might not be available. You can check the status of the queue by looking at the STATUS column.
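As a quick check (again just a sketch based on the default columns), you could list any queue that is not currently open and active:
# print the header plus any queue whose STATUS is not Open:Active
bqueues | awk 'NR==1 || $3 != "Open:Active"'
To get more detail than this default summary, you can use the -l option.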
In [ ]:
bqueues -l
This will give you the requirements and limits for all of the queues on the cluster. You can also get this information for a specific queue by specifying the name of the queue.
bqueues -l <queue_name>
In the example command below, we are asking for detailed information about a queue called yesterday.
bqueues -l yesterday
The -l option will give us a lot more information, such as the resource limits for the yesterday queue (e.g. maximum memory usage or run time).
QUEUE: yesterday
-- As in I needed it yesterday highest priority (all nodes)
PARAMETERS/STATISTICS
PRIO NICE STATUS       MAX JL/U JL/P JL/H NJOBS  PEND   RUN SSUSP USUSP   RSV
 500   20 Open:Active   20    8    -    -     0     0     0     0     0     0
Interval for a host to accept two jobs is 0 seconds
DEFAULT LIMITS:
 MEMLIMIT
    100 M

MAXIMUM LIMITS:
 RUNLIMIT
 2880.0 min of BL465c_G8

 CORELIMIT MEMLIMIT
       0 M    250 G
SCHEDULING PARAMETERS
           r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
loadSched     -     -     -     -     -     -     -     -     -     -     -
loadStop      -     -     -     -     -     -     -     -     -     -     -

           poe  nrt_windows  adapter_windows  ntbl_windows  uptime
loadSched    -            -                -             -       -
loadStop     -            -                -             -       -
SCHEDULING POLICIES: FAIRSHARE
USER_SHARES: [default, 1]
SHARE_INFO_FOR: yesterday/
USER/GROUP    SHARES  PRIORITY  STARTED  RESERVED  CPU_TIME  RUN_TIME  ADJUST
user1              1     0.302        0         0      47.0      1590   0.000
user2              1     0.301        0         0     590.3      1634   0.000
USERS: all
HOSTS: pcs5a pcs5b+1 others+2
RES_REQ: select[type==any]
Maximum slot reservation time: 14400 seconds
Below is an example for three queues which have different resource limits. Here, jobs in the normal queue will automatically be terminated or killed by LSF if they try to run for more than 12 hours (RUNLIMIT = 720.0 min), in the long queue after 2 days (RUNLIMIT = 2880.0 min) and in the hugemem queue after 15 days (RUNLIMIT = 21600.0 min). The hugemem queue also has a much larger memory limit (727.5 G) than the normal or long queues (250 G). A short sketch for comparing these limits across queues follows the examples.
normal:

 RUNLIMIT
 720.0 min of BL465c_G8

 CORELIMIT MEMLIMIT
       0 M    250 G

long:

 RUNLIMIT
 2880.0 min of BL465c_G8

 CORELIMIT MEMLIMIT
       0 M    250 G

hugemem:

 RUNLIMIT
 21600.0 min of HS21_E5450_8

 CORELIMIT MEMLIMIT
       0 M  727.5 G
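To compare these limits across several queues at once, one rough approach (a sketch; the queue names are just the examples above, and grep -A1 simply prints each matching line plus the line after it) is to loop over the queues and pull the limit lines out of the detailed output:
# print the run limit and memory limit lines for each queue of interest
for queue in normal long hugemem
do
    echo "== $queue =="
    bqueues -l "$queue" | grep -A1 -E 'RUNLIMIT|MEMLIMIT'
done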
For more information, please see the working with queues section of the LSF user guide.
For an overview of the key concepts, you can go back to the introduction. Otherwise, let's take a look at submitting jobs.