Interacting With Daemons

In this module, we'll look at how the HTCondor Python bindings can be used to interact with running daemons.

As usual, we start by importing the relevant modules:


In [1]:
import htcondor

Configuration

The HTCondor configuration is exposed to Python in two ways:

  • The local process's configuration is available in the module-level param object.
  • A remote daemon's configuration may be queried using a RemoteParam object.

The param object emulates a Python dictionary:


In [2]:
print(htcondor.param["SCHEDD_LOG"])   # prints the schedd's current log file
print(htcondor.param.get("TOOL_LOG")) # prints None, since TOOL_LOG isn't set by default


/home/jovyan/.condor/local/log/SchedLog
None

In [3]:
htcondor.param["TOOL_LOG"] = "/tmp/log" # sets TOOL_LOG to /tmp/log
print(htcondor.param["TOOL_LOG"])       # prints /tmp/log, as set above


/tmp/log

Note that assignments to param will persist only in memory; if we use reload_config to re-read the configuration files from disk, our change to TOOL_LOG disappears:


In [4]:
print(htcondor.param.get("TOOL_LOG"))
htcondor.reload_config()
print(htcondor.param.get("TOOL_LOG"))


/tmp/log
None

In HTCondor, a subsystem prefix on a configuration name (for example, SCHEDD.) indicates that the setting applies only to that daemon. By default, the Python bindings identify as the TOOL subsystem. To read the configuration as a different daemon would see it, use the set_subsystem function:


In [5]:
htcondor.param["TEST_FOO"] = "foo"         # sets the default value of TEST_FOO to foo
htcondor.param["SCHEDD.TEST_FOO"] = "bar"  # the schedd has a special setting for TEST_FOO

In [6]:
print(htcondor.param['TEST_FOO'])        # default access; should be 'foo'


foo

In [7]:
htcondor.set_subsystem('SCHEDD')         # changes the running process to identify as a schedd.
print(htcondor.param['TEST_FOO'])        # since we now identify as a schedd, should use the special setting of 'bar'


bar

Between param, reload_config, and set_subsystem, we can explore the configuration of the local host.
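
The prefix rule used in the cells above can be modeled in a few lines of plain Python. This is a simplified sketch, not HTCondor's actual lookup code: the lookup helper and the plain dict are hypothetical stand-ins, and the only rule modeled is that a SUBSYSTEM.NAME entry takes precedence over a bare NAME entry.

```python
def lookup(config, name, subsystem="TOOL"):
    """Return the value of `name` as seen by `subsystem`.

    Simplified model: a "SUBSYSTEM.NAME" entry wins over a plain "NAME" entry.
    """
    prefixed = f"{subsystem}.{name}"
    if prefixed in config:
        return config[prefixed]
    return config.get(name)


# Plain dict standing in for the values assigned to htcondor.param above.
config = {"TEST_FOO": "foo", "SCHEDD.TEST_FOO": "bar"}

print(lookup(config, "TEST_FOO"))            # view of the default TOOL subsystem
print(lookup(config, "TEST_FOO", "SCHEDD"))  # view of the schedd
```

The first call falls back to the unprefixed entry because no TOOL.TEST_FOO exists; the second finds the schedd-specific entry.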

Remote Configuration

What happens if we want to test the configuration of a remote daemon? For that, we can use the RemoteParam class.

The object is first initialized from the output of the Collector.locate method:


In [8]:
master_ad = htcondor.Collector().locate(htcondor.DaemonTypes.Master)
print(master_ad['MyAddress'])
master_param = htcondor.RemoteParam(master_ad)


<172.17.0.3:9618?addrs=172.17.0.3-9618&alias=70b2e14c06bc&noUDP&sock=master_16_de02>

Once we have the master_param object, we can treat it like a local dictionary to access the remote daemon's configuration.

Note that the htcondor.param object attempts to infer the type of each configuration value from compile-time metadata, whereas the RemoteParam object does not and returns values as strings:


In [9]:
print(repr(master_param['UPDATE_INTERVAL']))      # returns a string
print(repr(htcondor.param['UPDATE_INTERVAL']))    # returns an integer


'5'
5
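
Because values read through RemoteParam arrive as strings, code that compares them with local values may want to convert them first. The following is a minimal sketch of one way to do that. The coerce helper is hypothetical (it is not part of the bindings), HYPOTHETICAL_FLAG is a made-up setting name, and the conversion rules are a best-effort guess rather than HTCondor's own typing logic.

```python
def coerce(value):
    """Best-effort conversion of a configuration string to int, float, or bool."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    return value  # leave anything else (paths, expressions) untouched


# Plain dict standing in for a RemoteParam object; all values are strings.
remote_like = {
    "UPDATE_INTERVAL": "5",
    "HYPOTHETICAL_FLAG": "True",  # made-up name, for illustration only
    "TOOL_LOG": "/tmp/log",
}
typed = {name: coerce(value) for name, value in remote_like.items()}
print(typed)
```

With this helper, coerce(master_param['UPDATE_INTERVAL']) would be expected to compare equal to htcondor.param['UPDATE_INTERVAL'] in the cell above.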

In fact, we can even set the daemon's configuration using the RemoteParam object... if we have permission. By default, this is disabled for security reasons:


In [10]:
master_param['UPDATE_INTERVAL'] = '500'


---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-10-90fdfdb9037d> in <module>
----> 1 master_param['UPDATE_INTERVAL'] = '500'

/opt/conda/lib/python3.7/site-packages/htcondor/_lock.py in wrapper(*args, **kwargs)
     67             acquired = LOCK.acquire()
     68 
---> 69             rv = func(*args, **kwargs)
     70 
     71             # if the function returned a context manager,

RuntimeError: Failed to set remote daemon parameter.
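
A script that attempts remote changes will usually want to handle this failure rather than crash. Below is a minimal sketch that catches the RuntimeError shown in the traceback above. The try_set helper and the ReadOnlyParams class are hypothetical; ReadOnlyParams only imitates a daemon that rejects writes so that the example runs without a pool.

```python
def try_set(params, name, value):
    """Attempt params[name] = value; return True on success, False on RuntimeError."""
    try:
        params[name] = value
    except RuntimeError as error:
        print(f"could not set {name}: {error}")
        return False
    return True


class ReadOnlyParams(dict):
    """Hypothetical stand-in that rejects writes, as the master did above."""

    def __setitem__(self, key, value):
        raise RuntimeError("Failed to set remote daemon parameter.")


rejected = try_set(ReadOnlyParams(), "UPDATE_INTERVAL", "500")
accepted = try_set({}, "UPDATE_INTERVAL", "500")
print(rejected, accepted)
```

In the notebook, the same helper could be called as try_set(master_param, 'UPDATE_INTERVAL', '500').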

Logging Subsystem

The logging subsystem is available to the Python bindings; this is often useful for debugging network connection issues between the client and server.

NOTE: Jupyter notebooks discard output from library code, so you will not see the results of enable_debug below.


In [11]:
htcondor.set_subsystem("TOOL")
htcondor.param['TOOL_DEBUG'] = 'D_FULLDEBUG'
htcondor.param['TOOL_LOG'] = '/tmp/log'
htcondor.enable_log()    # Send logs to the log file (/tmp/log)
htcondor.enable_debug()  # Send logs to stderr; this is ignored by the web notebook.
with open("/tmp/log") as log_file:  # Print the log's contents.
    print(log_file.read())


07/06/20 14:13:51 Result of reading /etc/issue:  Ubuntu 18.04.4 LTS \n \l
 
07/06/20 14:13:51 Using IDs: 16 processors, 8 CPUs, 8 HTs
07/06/20 14:13:51 Reading condor configuration from '/etc/condor/condor_config'
07/06/20 14:13:51 Enumerating interfaces: lo 127.0.0.1 up
07/06/20 14:13:51 Enumerating interfaces: eth0 172.17.0.3 up

Sending Daemon Commands

An administrator can send administrative commands directly to the remote daemon. This is useful if you'd like a certain daemon restarted, drained, or reconfigured.

Because we have a personal HTCondor instance, we are the administrator, so we can test this out!

To send a command, call the top-level send_command function with a daemon location (such as the ad returned by Collector.locate) and a command from the DaemonCommands enumeration. For example, we can ask the master to reconfigure:


In [12]:
print(master_ad['MyAddress'])

htcondor.send_command(master_ad, htcondor.DaemonCommands.Reconfig)


<172.17.0.3:9618?addrs=172.17.0.3-9618&alias=70b2e14c06bc&noUDP&sock=master_16_de02>

In [13]:
import time

time.sleep(1)

with open(htcondor.param['MASTER_LOG']) as log_file:
    log_lines = log_file.readlines()
print(log_lines[-4:])


['07/06/20 14:13:51 Sent SIGHUP to NEGOTIATOR (pid 23)\n', '07/06/20 14:13:51 Sent SIGHUP to SCHEDD (pid 24)\n', '07/06/20 14:13:51 Sent SIGHUP to SHARED_PORT (pid 21)\n', '07/06/20 14:13:51 Sent SIGHUP to STARTD (pid 27)\n']
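
The fixed one-second sleep above assumes the master writes its log entries within that time. A more tolerant approach is to poll the log until the expected text appears. The sketch below shows the idea on a temporary file; the wait_for_line helper is hypothetical and not part of the bindings.

```python
import os
import tempfile
import time


def wait_for_line(path, needle, timeout=5.0, interval=0.1):
    """Poll the file at `path` until a line containing `needle` appears.

    Returns the first matching line, or None if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        with open(path) as log_file:
            for line in log_file:
                if needle in line:
                    return line
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Demonstration on a temporary file rather than a real MasterLog.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as demo:
    demo.write("07/06/20 14:13:51 Sent SIGHUP to SCHEDD (pid 24)\n")

print(wait_for_line(demo.name, "Sent SIGHUP to SCHEDD"))
print(wait_for_line(demo.name, "no such text", timeout=0.3))
os.remove(demo.name)
```

Note that this version scans the whole file on each pass, so it would also match an entry left by an earlier reconfiguration. A real script might record the file size before sending the command and only examine lines written after that point.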

We can also instruct the master to shut down a specific daemon:


In [14]:
htcondor.send_command(master_ad, htcondor.DaemonCommands.DaemonOff, "SCHEDD")

time.sleep(1)

with open(htcondor.param['MASTER_LOG']) as log_file:
    log_lines = log_file.readlines()
print(log_lines[-1])


07/06/20 14:13:52 The SCHEDD (pid 24) exited with status 0
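
These cells load every line of the log into a list only to print the last one or the last few. For a long-running daemon the log can be large, so a bounded read may be preferable. Here is a small sketch using collections.deque; the tail helper is hypothetical, and the demonstration uses a temporary file in place of the MasterLog.

```python
import os
import tempfile
from collections import deque


def tail(path, count=1):
    """Return the last `count` lines of the file at `path` as a list."""
    with open(path) as log_file:
        # A deque with maxlen keeps only the most recent `count` lines in memory.
        return list(deque(log_file, maxlen=count))


# Demonstration on a temporary file rather than a real MasterLog.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as demo:
    demo.write("first\nsecond\nthird\n")

last_two = tail(demo.name, 2)
print(last_two)
os.remove(demo.name)
```

In the notebook, tail(htcondor.param['MASTER_LOG'], 4) would play the role of log_lines[-4:].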

Or even turn off the whole HTCondor instance:


In [15]:
htcondor.send_command(master_ad, htcondor.DaemonCommands.OffFast)

time.sleep(1)

with open(htcondor.param['MASTER_LOG']) as log_file:
    log_lines = log_file.readlines()
print(log_lines[-1])


07/06/20 14:13:53 **** condor_master (condor_MASTER) pid 16 EXITING WITH STATUS 0

Let's turn HTCondor back on for future tutorials:


In [16]:
import os
os.system("condor_master")
time.sleep(10)  # give condor a few seconds to get started