This is a documentation for QuotaWatcher utility, a small cron job developed to monitor disk usage on GSC servers In this notebook we will explain every part of the utility in order to have other people maintain the code easily

All the code is heavily pep8'd :)

Importing needed Libraries


In [31]:
from __future__ import division

__author__ = "Rad <aradwen@gmail.com>"
__license__ = "GNU General Public License version 3"
__date__ = "06/30/2015"
__version__ = "0.2"

try:
    import os
    from quota_logger import init_log
    import subprocess
    from prettytable import PrettyTable
    from smtplib import SMTP
    from smtplib import SMTPException
    from email.mime.text import MIMEText
    from argparse import ArgumentParser
except ImportError:
    # Checks the installation of the necessary python modules
    import os
    import sys

    print((os.linesep * 2).join(
        ["An error found importing one module:", str(sys.exc_info()[1]), "You need to install it Stopping..."]))
    sys.exit(-2)

I like this way of importing libraries, if some libraries are not already installed, the system will exit. There is another room for improvement here, if a library does not exist, it is possile to install it automatically if we run the code as admin or with enough permission

The Notifier Class


In [32]:
class Notifier(object):

    suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']

    def __init__(self, **kwargs):

        self.threshold = None
        self.path = None
        self.list = None
        self.email_sender = None
        self.email_password = None
        self.gmail_smtp = None
        self.gmail_smtp_port = None
        self.text_subtype = None
        self.cap_reached = False
        self.email_subject = None

        for (key, value) in kwargs.iteritems():
            if hasattr(self, key):
                setattr(self, key, value)

        self._log = init_log()

We init the class as an object containing some features, this object will have a threshold upon which there will be an email triggered to a recipient list. This obect is looking ath the size of each subdirectory in path. You need to create an email addresse and add some variables to your PATH ( will be discussed later)


In [33]:
@property
    def loggy(self):
        return self._log

We need to inherhit logging capabilities from the logging class we imported (see later the code of this class). This will allow us to log from within the class itself


In [34]:
@staticmethod
    def load_recipients_emails(emails_file):
        recipients = [line.rstrip('\n') for line in open(emails_file) if not line[0].isspace()]
        return recipients

We need to lad the emails from a file created by the user. Usually I create 2 files, development_list containing only email adresses I will use for testing and production_list containing adresses I want to notify in production


In [35]:
@staticmethod
    def load_message_content(message_template_file, table):
        template_file = open(message_template_file, 'rb')
        template_file_content = template_file.read().replace(
            "{{table}}", table.get_string())
        template_file.close()
        return template_file_content

Inspired by MVC apps, we load message body from a template, this template will contain a placeholder called {{table}} that will contain the table of subdirectories and their respective sizes


In [36]:
def notify_user(self, email_receivers, table, template):
        """This method sends an email
        :rtype : email sent to specified members
        """
        # Create the message
        input_file = os.path.join(
            os.path.dirname(__file__), "templates/" + template + ".txt")
        content = self.load_message_content(input_file, table)

        msg = MIMEText(content, self.text_subtype)

        msg["Subject"] = self.email_subject
        msg["From"] = self.email_sender
        msg["To"] = ','.join(email_receivers)

        try:
            smtpObj = SMTP(self.gmail_smtp, self.gmail_smtp_port)
            # Identify yourself to GMAIL ESMTP server.
            smtpObj.ehlo()
            # Put SMTP connection in TLS mode and call ehlo again.
            smtpObj.starttls()
            smtpObj.ehlo()
            # Login to service
            smtpObj.login(user=self.email_sender, password=self.email_password)
            # Send email
            smtpObj.sendmail(self.email_sender, email_receivers, msg.as_string())
            # close connection and session.
            smtpObj.quit()
        except SMTPException as error:
            print "Error: unable to send email :  {err}".format(err=error)

notify_user is the function that will send an email to the users upon request. It loads the message body template and injects the table in it.


In [37]:
@staticmethod
    def du(path):
        """disk usage in kilobytes"""
        # return subprocess.check_output(['du', '-s',
        # path]).split()[0].decode('utf-8')
        try:
            p1 = subprocess.Popen(('ls', '-d', path), stdout=subprocess.PIPE)
            p2 = subprocess.Popen((os.environ["GNU_PARALLEL"], '--no-notice', 'du', '-s', '2>&1'), stdin=p1.stdout,
                                  stdout=subprocess.PIPE)
            p3 = subprocess.Popen(
                ('grep', '-v', '"Permission denied"'), stdin=p2.stdout, stdout=subprocess.PIPE)
            output = p3.communicate()[0]
        except subprocess.CalledProcessError as e:
            raise RuntimeError("command '{0}' return with error (code {1}): {2}".format(
                e.cmd, e.returncode, e.output))
        # return ''.join([' '.join(hit.split('\t')) for hit in output.split('\n')
        # if len(hit) > 0 and not "Permission" in hit and output[0].isdigit()])
        result = [' '.join(hit.split('\t')) for hit in output.split('\n')]
        for line in result:
            if line and len(line.split('\n')) > 0 and "Permission" not in line and line[0].isdigit():
                return line.split(" ")[0]

This is a wrapper of the famous du command. I use GNU_PARALLEL in case we have a lot of subdirectories and in case we don't want to wait for sequential processing. Note that we could have done this in multithreading as well


In [38]:
def du_h(self, nbytes):
        if nbytes == 0:
            return '0 B'
        i = 0
        while nbytes >= 1024 and i < len(self.suffixes) - 1:
            nbytes /= 1024.
            i += 1
        f = ('%.2f'.format(nbytes)).rstrip('0').rstrip('.')
        return '%s %s'.format(f, self.suffixes[i])

I didn't want to use the -h flag because we may want to sum up subdirectories sizes or doing other postprocessing, we'd rather keep them in a unified format (unit). For a more human readable format, we can use du_h() method


In [39]:
@staticmethod
    def list_folders(given_path):
        user_list = []
        for path in os.listdir(given_path):
            if not os.path.isfile(os.path.join(given_path, path)) and not path.startswith(".") and not path.startswith(
                    "archive"):
                user_list.append(path)
        return user_list

we need at some point to return a list of subdirectories, each will be passed through the same function (du)


In [40]:
def notify(self):
        global cap_reached
        self._log.info("Loading recipient emails...")
        list_of_recievers = self.load_recipients_emails(self.list)
        paths = self.list_folders(self.path)
        paths = [self.path + user for user in paths]
        sizes = []
        for size in paths:
            try:
                self._log.info("calculating disk usage for " + size + " ...")
                sizes.append(int(self.du(size)))
            except Exception, e:
                self._log.exception(e)
                sizes.append(0)
        # sizes = [int(du(size).split(' ')[0]) for size in paths]
        # convert kilobytes to bytes
        sizes = [int(element) * 1000 for element in sizes]
        table = PrettyTable(["Directory", "Size"])
        table.align["Directory"] = "l"
        table.align["Size"] = "r"
        table.padding_width = 5
        table.border = False
        for account, size_of_account in zip(paths, sizes):
            if int(size_of_account) > int(self.threshold):
                table.add_row(
                    ["*" + os.path.basename(account) + "*", "*" + self.du_h(size_of_account) + "*"])
                self.cap_reached = True
            else:
                table.add_row([os.path.basename(account), self.du_h(size_of_account)])
        # notify Admins
        table.add_row(["TOTAL", self.du_h(sum(sizes))])
        table.add_row(["Usage", str(sum(sizes) / 70000000000000)])
        self.notify_user(list_of_recievers, table, "karey")
        if self.cap_reached:
            self.notify_user(list_of_recievers, table, "default_size_limit")

    def run(self):
        self.notify()

Finally we create the function that will bring all this protocol together :

  • Read the list of recievers
  • load the path we want to look into
  • for each subdirectory calculate the size of it and append it to a list
  • create a Table to be populated row by row
  • add subdirectories and their sizes
  • Calculate the total of sizes in subdirectories
  • If one of the subdirectories has a size higher than the threshold specified, trigger the email
  • Report the usage as a percentage

In [41]:
def arguments():
    """Defines the command line arguments for the script."""
    main_desc = """Monitors changes in the size of dirs for a given path"""

    parser = ArgumentParser(description=main_desc)
    parser.add_argument("path", default=os.path.expanduser('~'), nargs='?',
                        help="The path to monitor. If none is given, takes the  home directory")
    parser.add_argument("list", help="text file containing the list of persons to be notified, one per line")
    parser.add_argument("-s", "--notification_subject", default=None, help="Email subject of the notification")
    parser.add_argument("-t", "--threshold", default=2500000000000,
                        help="The threshold that will trigger the notification")
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s {0}".format(__version__),
                        help="show program's version number and exit")
    return parser

The program takes in account : the path to examine, the list of emails in a file, the subject of the alert, the thresold that will trigger the email (here by defailt 2.5T)


In [42]:
def main():

    args = arguments().parse_args()
    notifier = Notifier()
    loggy = notifier.loggy
    # Set parameters
    loggy.info("Starting QuotaWatcher session...")
    loggy.info("Setting parameters ...")
    notifier.list = args.list
    notifier.threshold = args.threshold
    notifier.path = args.path

    # Configure the app
    try:
        loggy.info("Loading environment variables ...")
        notifier.email_sender = os.environ["NOTIFIER_SENDER"]
        notifier.email_password = os.environ["NOTIFIER_PASSWD"]
        notifier.gmail_smtp = os.environ["NOTIFIER_SMTP"]
        notifier.gmail_smtp_port = os.environ["NOTIFIER_SMTP_PORT"]
        notifier.text_subtype = os.environ["NOTIFIER_SUBTYPE"]
        notifier.email_subject = args.notification_subject
        notifier.cap_reached = False
    except Exception, e:
        loggy.exception(e)

    notifier.run()
    loggy.info("End of QuotaWatcher session")

Note that in the main we load some environment variable that you should specify in advance. This is up to the user to fill these out, It is always preferable to declare these as environment variable, most of the time these are confidential so we better not show them here, it is always safe to set environment variable for these

That's it

this is an example of the LOG output.

2015-07-03 10:40:46,968 - quota_logger - INFO - Starting QuotaWatcher session...
2015-07-03 10:40:46,969 - quota_logger - INFO - Setting parameters ...
2015-07-03 10:40:46,969 - quota_logger - INFO - Loading environment variables ...
2015-07-03 10:40:46,969 - quota_logger - INFO - Loading recipient emails...
2015-07-03 10:40:47,011 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/amcpherson ..
.
2015-07-03 11:21:09,442 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/andrewjlroth
...
2015-07-03 15:31:41,500 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/asteif ...
2015-07-03 15:40:34,268 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/clefebvre ...
2015-07-03 15:42:47,483 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/dgrewal ...
2015-07-03 16:01:30,588 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/fdorri ...
2015-07-03 16:03:43,850 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/fong ...
2015-07-03 16:16:13,781 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/gha ...
2015-07-03 16:16:38,673 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/jding ...
2015-07-03 16:16:50,820 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/cdesouza ...
2015-07-03 16:16:52,585 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/jrosner ...
2015-07-03 16:27:30,684 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/jtaghiyar ...
2015-07-03 16:28:16,982 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/kareys ...
2015-07-03 19:21:07,607 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/hfarahani ...
2015-07-03 19:22:07,618 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/jzhou ...
2015-07-03 19:38:28,147 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/pipelines ...
2015-07-03 19:53:20,771 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/projects ...
2015-07-03 20:52:45,001 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/raniba ...
2015-07-03 20:59:50,543 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/tfunnell ...
2015-07-03 21:00:47,216 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/ykwang ...
2015-07-03 21:03:30,277 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/azhang ...
2015-07-03 21:03:30,820 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/softwares ...
2015-07-03 21:03:42,679 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/sjewell ...
2015-07-03 21:03:51,711 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/kastonl ...
2015-07-03 21:04:52,536 - quota_logger - INFO - calculating disk usage for /genesis/extscratch/shahlab/amazloomian .
..
2015-07-03 21:07:43,501 - quota_logger - INFO - End of QuotaWatcher session

And as of the email triggered, it will look like

** THIS IS AN ALERT MESSAGE : DISK USAGE SPIKE **

This is a warning message about the disk usage relative to the Shahlab group at GSC

We detected a spike > 2.5 T for some accounts and here is a list of the space usage per account reported today


    Directory                   Size     
    amcpherson               1.96 TB     
    andrewjlroth           390.19 GB     
    asteif                   2.05 TB     
    clefebvre               16.07 GB     
    dgrewal                  1.61 TB     
    fdorri                 486.49 GB     
    *fong*                 *9.67 TB*     
    gha                      50.7 GB     
    jding                  638.72 GB     
    cdesouza                56.15 GB     
    jrosner                  1.82 TB     
    jtaghiyar              253.84 GB     
    *kareys*              *11.26 TB*     
    hfarahani                1.09 TB     
    jzhou                    1.19 TB     
    pipelines                 2.1 TB     
    *projects*             *4.09 TB*     
    raniba                   2.03 TB     
    tfunnell                 1.02 TB     
    ykwang                   1.71 TB     
    azhang                  108.4 MB     
    softwares               34.67 GB     
    sjewell                 24.53 GB     
    kastonl                118.51 GB     
    amazloomian              1.71 TB     
    TOTAL                   45.34 TB     
    Usage                    71.218%     


Please do the necessary to remove temporary files and take the time to clean up your working directories

Thank you for your cooperation

(am a cron job, don't reply to this message, if you have questions ask Ali)



PS : This is a very close estimation, some directories may have strict permissions, for an accurate disk usage please make sure that you set your files permissions so that anyone can see them.

The logger


In [43]:
import logging
import datetime

def init_log():
    current_time = datetime.datetime.now()
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(current_time.isoformat()+'_quotawatcher.log')
    handler.setLevel(logging.INFO)
    # create a logging format
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    return logger

Before you start

export NOTIFIER_SENDER="your_email@gmail.com"
export NOTIFIER_PASSWD="passwordhere"
export NOTIFIER_SMTP="smtp.gmail.com"
export NOTIFIER_SMTP_PORT=587
export NOTIFIER_SUBTYPE="plain"
export GNU_PARALLEL="/path/to/your/gnu/parallel"

How to run the program

python quotawatcher.py /genesis/extscratch/shahlab/ dev_list -s "Hey Test" -t 2500000000000