DB2 Jupyter Notebook Extensions

Version: 2020-04-05

This code is imported as a Jupyter notebook extension in any notebooks you create with DB2 code in it. Place the following line of code in any notebook that you want to use these commands with:

%run db2.ipynb

This code defines a Jupyter/Python magic command called %sql which allows you to execute DB2 specific calls to the database. There are other packages available for manipulating databases, but this one has been specifically designed for demonstrating a number of the SQL features available in DB2.

There are two ways of executing the %sql command. A single line SQL statement would use the line format of the magic command:

%sql SELECT * FROM EMPLOYEE
If you have a large block of sql then you would place the %%sql command at the beginning of the block and then place the SQL statements into the remainder of the block. Using this form of the `%%sql` statement means that the notebook cell can only contain SQL and no other statements.
%%sql
SELECT * FROM EMPLOYEE
ORDER BY LASTNAME
You can have multiple lines in the SQL block (`%%sql`). The default SQL delimiter is the semi-column (`;`). If you have scripts (triggers, procedures, functions) that use the semi-colon as part of the script, you will need to use the `-d` option to change the delimiter to an at "`@`" sign.
%%sql -d
SELECT * FROM EMPLOYEE
@
CREATE PROCEDURE ...
@

The %sql command allows most DB2 commands to execute and has a special version of the CONNECT statement. A CONNECT by itself will attempt to reconnect to the database using previously used settings. If it cannot connect, it will prompt the user for additional information.

The CONNECT command has the following format:

%sql CONNECT TO <database> USER <userid> USING <password | ?> HOST <ip address> PORT <port number>
If you use a "?" for the password field, the system will prompt you for a password. This avoids typing the password as clear text on the screen. If a connection is not successful, the system will print the error message associated with the connect request.

If the connection is successful, the parameters are saved on your system and will be used the next time you run a SQL statement, or when you issue the %sql CONNECT command with no parameters.

In addition to the -d option, there are a number different options that you can specify at the beginning of the SQL:

  • -d, -delim - Change SQL delimiter to "@" from ";"
  • -q, -quiet - Quiet results - no messages returned from the function
  • -r, -array - Return the result set as an array of values instead of a dataframe
  • -t, -time - Time the following SQL statement and return the number of times it executes in 1 second
  • -j - Format the first character column of the result set as a JSON record
  • -json - Return result set as an array of json records
  • -a, -all - Return all rows in answer set and do not limit display
  • -grid - Display the results in a scrollable grid
  • -pb, -bar - Plot the results as a bar chart
  • -pl, -line - Plot the results as a line chart
  • -pp, -pie - Plot the results as a pie chart
  • -e, -echo - Any macro expansions are displayed in an output box
  • -sampledata - Create and load the EMPLOYEE and DEPARTMENT tables

You can pass python variables to the %sql command by using the {} braces with the name of the variable inbetween. Note that you will need to place proper punctuation around the variable in the event the SQL command requires it. For instance, the following example will find employee '000010' in the EMPLOYEE table.

empno = '000010'
%sql SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO='{empno}'

The other option is to use parameter markers. What you would need to do is use the name of the variable with a colon in front of it and the program will prepare the statement and then pass the variable to Db2 when the statement is executed. This allows you to create complex strings that might contain quote characters and other special characters and not have to worry about enclosing the string with the correct quotes. Note that you do not place the quotes around the variable even though it is a string.

empno = '000020'
%sql SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO=:empno

Development SQL

The previous set of %sql and %%sql commands deals with SQL statements and commands that are run in an interactive manner. There is a class of SQL commands that are more suited to a development environment where code is iterated or requires changing input. The commands that are associated with this form of SQL are:

  • AUTOCOMMIT
  • COMMIT/ROLLBACK
  • PREPARE
  • EXECUTE

Autocommit is the default manner in which SQL statements are executed. At the end of the successful completion of a statement, the results are commited to the database. There is no concept of a transaction where multiple DML/DDL statements are considered one transaction. The AUTOCOMMIT command allows you to turn autocommit OFF or ON. This means that the set of SQL commands run after the AUTOCOMMIT OFF command are executed are not commited to the database until a COMMIT or ROLLBACK command is issued.

COMMIT (WORK) will finalize all of the transactions (COMMIT) to the database and ROLLBACK will undo all of the changes. If you issue a SELECT statement during the execution of your block, the results will reflect all of your changes. If you ROLLBACK the transaction, the changes will be lost.

PREPARE is typically used in a situation where you want to repeatidly execute a SQL statement with different variables without incurring the SQL compilation overhead. For instance:

x = %sql PREPARE SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO=?
for y in ['000010','000020','000030']:
    %sql execute :x using :y

EXECUTE is used to execute a previously compiled statement.

To retrieve the error codes that might be associated with any SQL call, the following variables are updated after every call:

  • SQLCODE
  • SQLSTATE
  • SQLERROR - Full error message retrieved from Db2

Install Db2 Python Driver

If the ibm_db driver is not installed on your system, the subsequent Db2 commands will fail. In order to install the Db2 driver, issue the following command from a Jupyter notebook cell:

!pip install --user ibm_db

Db2 Jupyter Extensions

This section of code has the import statements and global variables defined for the remainder of the functions.


In [ ]:
#
# Set up Jupyter MAGIC commands "sql". 
# %sql will return results from a DB2 select statement or execute a DB2 command
#
# IBM 2019: George Baklarz
# Version 2019-10-03
#

from __future__ import print_function
from IPython.display import HTML as pHTML, Image as pImage, display as pdisplay, Javascript as Javascript
from IPython.core.magic import (Magics, magics_class, line_magic,
                                cell_magic, line_cell_magic, needs_local_scope)
import ibm_db
import pandas
import ibm_db_dbi
import json
import matplotlib
import matplotlib.pyplot as plt
import getpass
import os
import pickle
import time
import sys
import re
import warnings

warnings.filterwarnings("ignore")

# Python Hack for Input between 2 and 3

try: 
    input = raw_input 
except NameError: 
    pass 

_settings = {
     "maxrows"  : 10,
     "maxgrid"  : 5,
     "runtime"  : 1,
     "display"  : "PANDAS",
     "database" : "",
     "hostname" : "localhost",
     "port"     : "50000",
     "protocol" : "TCPIP",    
     "uid"      : "DB2INST1",
     "pwd"      : "password",
     "ssl"      : ""
}

_environment = {
     "jupyter"  : True,
     "qgrid"    : True
}

_display = {
    'fullWidthRows': True,
    'syncColumnCellResize': True,
    'forceFitColumns': False,
    'defaultColumnWidth': 150,
    'rowHeight': 28,
    'enableColumnReorder': False,
    'enableTextSelectionOnCells': True,
    'editable': False,
    'autoEdit': False,
    'explicitInitialization': True,
    'maxVisibleRows': 5,
    'minVisibleRows': 5,
    'sortable': True,
    'filterable': False,
    'highlightSelectedCell': False,
    'highlightSelectedRow': True
}

# Connection settings for statements 

_connected = False
_hdbc = None
_hdbi = None
_stmt = []
_stmtID = []
_stmtSQL = []
_vars = {}
_macros = {}
_flags = []
_debug = False

# Db2 Error Messages and Codes
sqlcode = 0
sqlstate = "0"
sqlerror = ""
sqlelapsed = 0

# Check to see if QGrid is installed

try:
    import qgrid
    qgrid.set_defaults(grid_options=_display)
except:
    _environment['qgrid'] = False
    
# Check if we are running in iPython or Jupyter

try:
    if (get_ipython().config == {}): 
        _environment['jupyter'] = False
        _environment['qgrid'] = False
    else:
        _environment['jupyter'] = True
except:
    _environment['jupyter'] = False
    _environment['qgrid'] = False

Options

There are four options that can be set with the %sql command. These options are shown below with the default value shown in parenthesis.

  • MAXROWS n (10) - The maximum number of rows that will be displayed before summary information is shown. If the answer set is less than this number of rows, it will be completely shown on the screen. If the answer set is larger than this amount, only the first 5 rows and last 5 rows of the answer set will be displayed. If you want to display a very large answer set, you may want to consider using the grid option -g to display the results in a scrollable table. If you really want to show all results then setting MAXROWS to -1 will return all output.

  • MAXGRID n (5) - The maximum size of a grid display. When displaying a result set in a grid -g, the default size of the display window is 5 rows. You can set this to a larger size so that more rows are shown on the screen. Note that the minimum size always remains at 5 which means that if the system is unable to display your maximum row size it will reduce the table display until it fits.

  • DISPLAY PANDAS | GRID (PANDAS) - Display the results as a PANDAS dataframe (default) or as a scrollable GRID

  • RUNTIME n (1) - When using the timer option on a SQL statement, the statement will execute for n number of seconds. The result that is returned is the number of times the SQL statement executed rather than the execution time of the statement. The default value for runtime is one second, so if the SQL is very complex you will need to increase the run time.

  • LIST - Display the current settings

To set an option use the following syntax:

%sql option option_name value option_name value ....

The following example sets all options:

%sql option maxrows 100 runtime 2 display grid maxgrid 10

The values will not be saved between Jupyter notebooks sessions. If you need to retrieve the current options values, use the LIST command as the only argument:

%sql option list

In [ ]:
def setOptions(inSQL):

    global _settings, _display

    cParms = inSQL.split()
    cnt = 0

    while cnt < len(cParms):
        if cParms[cnt].upper() == 'MAXROWS':
            
            if cnt+1 < len(cParms):
                try:
                    _settings["maxrows"] = int(cParms[cnt+1])
                except Exception as err:
                    errormsg("Invalid MAXROWS value provided.")
                    pass
                cnt = cnt + 1
            else:
                errormsg("No maximum rows specified for the MAXROWS option.")
                return
            
        elif cParms[cnt].upper() == 'MAXGRID':
            
            if cnt+1 < len(cParms):
                try:
                    maxgrid = int(cParms[cnt+1])
                    if (maxgrid <= 5):                      # Minimum window size is 5
                        maxgrid = 5
                    _display["maxVisibleRows"] =  int(cParms[cnt+1])
                    try:
                        import qgrid
                        qgrid.set_defaults(grid_options=_display)
                    except:
                        _environment['qgrid'] = False
                        
                except Exception as err:
                    errormsg("Invalid MAXGRID value provided.")
                    pass
                cnt = cnt + 1
            else:
                errormsg("No maximum rows specified for the MAXROWS option.")
                return            
            
        elif cParms[cnt].upper() == 'RUNTIME':
            if cnt+1 < len(cParms):
                try:
                    _settings["runtime"] = int(cParms[cnt+1])
                except Exception as err:
                    errormsg("Invalid RUNTIME value provided.")
                    pass
                cnt = cnt + 1
            else:
                errormsg("No value provided for the RUNTIME option.")
                return 
            
        elif cParms[cnt].upper() == 'DISPLAY':
            if cnt+1 < len(cParms):
                if (cParms[cnt+1].upper() == 'GRID'):
                    _settings["display"] = 'GRID'
                elif (cParms[cnt+1].upper()  == 'PANDAS'):
                    _settings["display"] = 'PANDAS'
                else:
                    errormsg("Invalid DISPLAY value provided.")
                cnt = cnt + 1
            else:
                errormsg("No value provided for the DISPLAY option.")
                return  
        elif (cParms[cnt].upper() == 'LIST'):
            print("(MAXROWS) Maximum number of rows displayed: " + str(_settings["maxrows"]))
            print("(MAXGRID) Maximum grid display size: " + str(_settings["maxgrid"]))
            print("(RUNTIME) How many seconds to a run a statement for performance testing: " + str(_settings["runtime"]))
            print("(DISPLAY) Use PANDAS or GRID display format for output: " + _settings["display"]) 
            return
        else:
            cnt = cnt + 1
            
    save_settings()

SQL Help

The calling format of this routine is:

sqlhelp()

This code displays help related to the %sql magic command. This help is displayed when you issue a %sql or %%sql command by itself, or use the %sql -h flag.


In [ ]:
def sqlhelp():
    
    global _environment
    
    if (_environment["jupyter"] == True):
        sd  = '<td style="text-align:left;">'
        ed1 = '</td>'
        ed2 = '</td>'
        sh  = '<th style="text-align:left;">'
        eh1 = '</th>'
        eh2 = '</th>'
        sr  = '<tr>'
        er  = '</tr>'
        helpSQL = """
        <h3>SQL Options</h3> 
        <p>The following options are available as part of a SQL statement. The options are always preceded with a
        minus sign (i.e. -q).
        <table>
         {sr}
            {sh}Option{eh1}{sh}Description{eh2}
         {er}
         {sr}
            {sd}a, all{ed1}{sd}Return all rows in answer set and do not limit display{ed2}
         {er}       
         {sr}
           {sd}d{ed1}{sd}Change SQL delimiter to "@" from ";"{ed2}
         {er}
         {sr}
           {sd}e, echo{ed1}{sd}Echo the SQL command that was generated after macro and variable substituion.{ed2}
         {er}
         {sr}
           {sd}h, help{ed1}{sd}Display %sql help information.{ed2}
         {er}        
         {sr}
           {sd}j{ed1}{sd}Create a pretty JSON representation. Only the first column is formatted{ed2}
         {er}
         {sr}
           {sd}json{ed1}{sd}Retrieve the result set as a JSON record{ed2}
         {er} 
         {sr}
           {sd}pb, bar{ed1}{sd}Plot the results as a bar chart{ed2}
         {er}
         {sr}
           {sd}pl, line{ed1}{sd}Plot the results as a line chart{ed2}
         {er}
         {sr}
           {sd}pp, pie{ed1}{sd}Plot Pie: Plot the results as a pie chart{ed2}
         {er}        
         {sr}
           {sd}q, quiet{ed1}{sd}Quiet results - no answer set or messages returned from the function{ed2}
         {er}
         {sr}  
           {sd}r, array{ed1}{sd}Return the result set as an array of values{ed2}
         {er}
         {sr}
           {sd}sampledata{ed1}{sd}Create and load the EMPLOYEE and DEPARTMENT tables{ed2}
         {er}        
         {sr}
           {sd}t,time{ed1}{sd}Time the following SQL statement and return the number of times it executes in 1 second{ed2}
         {er}
         {sr}
           {sd}grid{ed1}{sd}Display the results in a scrollable grid{ed2}
         {er}       
        
        </table>
       """        
    else:
        helpSQL = """
SQL Options

The following options are available as part of a SQL statement. Options are always 
preceded with a minus sign (i.e. -q).

Option     Description
a, all     Return all rows in answer set and do not limit display 
d          Change SQL delimiter to "@" from ";" 
e, echo    Echo the SQL command that was generated after substitution 
h, help    Display %sql help information
j          Create a pretty JSON representation. Only the first column is formatted 
json       Retrieve the result set as a JSON record 
pb, bar    Plot the results as a bar chart 
pl, line   Plot the results as a line chart 
pp, pie    Plot Pie: Plot the results as a pie chart 
q, quiet   Quiet results - no answer set or messages returned from the function 
r, array   Return the result set as an array of values 
sampledata Create and load the EMPLOYEE and DEPARTMENT tables 
t,time     Time the SQL statement and return the execution count per second
grid       Display the results in a scrollable grid 
       """        
    helpSQL = helpSQL.format(**locals())
    
    if (_environment["jupyter"] == True):
        pdisplay(pHTML(helpSQL))
    else:
        print(helpSQL)

Connection Help

The calling format of this routine is:

connected_help()

This code displays help related to the CONNECT command. This code is displayed when you issue a %sql CONNECT command with no arguments or you are running a SQL statement and there isn't any connection to a database yet.


In [ ]:
def connected_help():
    
   
    sd = '<td style="text-align:left;">'
    ed = '</td>'
    sh = '<th style="text-align:left;">'
    eh = '</th>'
    sr = '<tr>'
    er = '</tr>'
    
    if (_environment['jupyter'] == True):
        
        helpConnect = """
       <h3>Connecting to Db2</h3> 
       <p>The CONNECT command has the following format:
       <p>
       <pre>
       %sql CONNECT TO &lt;database&gt; USER &lt;userid&gt; USING &lt;password|?&gt; HOST &lt;ip address&gt; PORT &lt;port number&gt; &lt;SSL&gt;
       %sql CONNECT CREDENTIALS &lt;varname&gt;
       %sql CONNECT CLOSE
       %sql CONNECT RESET
       %sql CONNECT PROMPT - use this to be prompted for values
       </pre>
       <p>
       If you use a "?" for the password field, the system will prompt you for a password. This avoids typing the 
       password as clear text on the screen. If a connection is not successful, the system will print the error
       message associated with the connect request.
       <p>
       The <b>CREDENTIALS</b> option allows you to use credentials that are supplied by Db2 on Cloud instances.
       The credentials can be supplied as a variable and if successful, the variable will be saved to disk 
       for future use. If you create another notebook and use the identical syntax, if the variable 
       is not defined, the contents on disk will be used as the credentials. You should assign the 
       credentials to a variable that represents the database (or schema) that you are communicating with. 
       Using familiar names makes it easier to remember the credentials when connecting. 
       <p>
       <b>CONNECT CLOSE</b> will close the current connection, but will not reset the database parameters. This means that
       if you issue the CONNECT command again, the system should be able to reconnect you to the database.
       <p>
       <b>CONNECT RESET</b> will close the current connection and remove any information on the connection. You will need 
       to issue a new CONNECT statement with all of the connection information.
       <p>
       If the connection is successful, the parameters are saved on your system and will be used the next time you
       run an SQL statement, or when you issue the %sql CONNECT command with no parameters.
       <p>If you issue CONNECT RESET, all of the current values will be deleted and you will need to 
       issue a new CONNECT statement. 
       <p>A CONNECT command without any parameters will attempt to re-connect to the previous database you 
       were using. If the connection could not be established, the program to prompt you for
       the values. To cancel the connection attempt, enter a blank value for any of the values. The connection 
       panel will request the following values in order to connect to Db2: 
       <table>
       {sr}
         {sh}Setting{eh}
         {sh}Description{eh}
       {er}
       {sr}
         {sd}Database{ed}{sd}Database name you want to connect to.{ed}
       {er}
       {sr}
         {sd}Hostname{ed}
         {sd}Use localhost if Db2 is running on your own machine, but this can be an IP address or host name. 
       {er}
       {sr}
         {sd}PORT{ed}
         {sd}The port to use for connecting to Db2. This is usually 50000.{ed}
       {er}
       {sr}
         {sd}SSL{ed}
         {sd}If you are connecting to a secure port (50001) with SSL then you must include this keyword in the connect string.{ed}
       {sr}         
         {sd}Userid{ed}
         {sd}The userid to use when connecting (usually DB2INST1){ed} 
       {er}
       {sr}                  
         {sd}Password{ed}
         {sd}No password is provided so you have to enter a value{ed}
       {er}
        </table>
       """
    else:
        helpConnect = """\
Connecting to Db2

The CONNECT command has the following format:

%sql CONNECT TO database USER userid USING password | ? 
                HOST ip address PORT port number SSL
%sql CONNECT CREDENTIALS varname
%sql CONNECT CLOSE
%sql CONNECT RESET

If you use a "?" for the password field, the system will prompt you for a password.
This avoids typing the password as clear text on the screen. If a connection is 
not successful, the system will print the error message associated with the connect
request.

The CREDENTIALS option allows you to use credentials that are supplied by Db2 on 
Cloud instances. The credentials can be supplied as a variable and if successful, 
the variable will be saved to disk for future use. If you create another notebook
and use the identical syntax, if the variable is not defined, the contents on disk
will be used as the credentials. You should assign the credentials to a variable 
that represents the database (or schema) that you are communicating with. Using 
familiar names makes it easier to remember the credentials when connecting. 

CONNECT CLOSE will close the current connection, but will not reset the database 
parameters. This means that if you issue the CONNECT command again, the system 
should be able to reconnect you to the database.

CONNECT RESET will close the current connection and remove any information on the
connection. You will need to issue a new CONNECT statement with all of the connection
information.

If the connection is successful, the parameters are saved on your system and will be
used the next time you run an SQL statement, or when you issue the %sql CONNECT 
command with no parameters. If you issue CONNECT RESET, all of the current values 
will be deleted and you will need to issue a new CONNECT statement. 

A CONNECT command without any parameters will attempt to re-connect to the previous 
database you were using. If the connection could not be established, the program to
prompt you for the values. To cancel the connection attempt, enter a blank value for
any of the values. The connection panel will request the following values in order 
to connect to Db2: 

  Setting    Description
  Database   Database name you want to connect to
  Hostname   Use localhost if Db2 is running on your own machine, but this can 
             be an IP address or host name. 
  PORT       The port to use for connecting to Db2. This is usually 50000. 
  Userid     The userid to use when connecting (usually DB2INST1) 
  Password   No password is provided so you have to enter a value
  SSL        Include this keyword to indicate you are connecting via SSL (usually port 50001)
"""
    
    helpConnect = helpConnect.format(**locals())
    
    if (_environment['jupyter'] == True):
        pdisplay(pHTML(helpConnect))
    else:
        print(helpConnect)

Prompt for Connection Information

If you are running an SQL statement and have not yet connected to a database, the %sql command will prompt you for connection information. In order to connect to a database, you must supply:

  • Database name
  • Host name (IP address or name)
  • Port number
  • Userid
  • Password
  • Secure socket

The routine is called without any parameters:

connected_prompt()

In [ ]:
# Prompt for Connection information

def connected_prompt():
    
    global _settings
    
    _database = ''
    _hostname = ''
    _port = ''
    _uid = ''
    _pwd = ''
    _ssl = ''
    
    print("Enter the database connection details (Any empty value will cancel the connection)")
    _database = input("Enter the database name: ");
    if (_database.strip() == ""): return False
    _hostname = input("Enter the HOST IP address or symbolic name: ");
    if (_hostname.strip() == ""): return False 
    _port = input("Enter the PORT number: ");
    if (_port.strip() == ""): return False    
    _ssl = input("Is this a secure (SSL) port (y or n)");
    if (_ssl.strip() == ""): return False
    if (_ssl == "n"):
        _ssl = ""
    else:
        _ssl = "Security=SSL;" 
    _uid = input("Enter Userid on the DB2 system: ").upper();
    if (_uid.strip() == ""): return False
    _pwd = getpass.getpass("Password [password]: ");
    if (_pwd.strip() == ""): return False
        
    _settings["database"] = _database.strip()
    _settings["hostname"] = _hostname.strip()
    _settings["port"] = _port.strip()
    _settings["uid"] = _uid.strip()
    _settings["pwd"] = _pwd.strip()
    _settings["ssl"] = _ssl.strip()
    _settings["maxrows"] = 10
    _settings["maxgrid"] = 5
    _settings["runtime"] = 1
    
    return True
    
# Split port and IP addresses

def split_string(in_port,splitter=":"):
 
    # Split input into an IP address and Port number
    
    global _settings

    checkports = in_port.split(splitter)
    ip = checkports[0]
    if (len(checkports) > 1):
        port = checkports[1]
    else:
        port = None

    return ip, port

Connect Syntax Parser

The parseConnect routine is used to parse the CONNECT command that the user issued within the %sql command. The format of the command is:

parseConnect(inSQL)

The inSQL string contains the CONNECT keyword with some additional parameters. The format of the CONNECT command is one of:

CONNECT RESET
CONNECT CLOSE
CONNECT CREDENTIALS <variable>
CONNECT TO database USER userid USING password HOST hostname PORT portnumber <SSL>

If you have credentials available from Db2 on Cloud, place the contents of the credentials into a variable and then use the CONNECT CREDENTIALS <var> syntax to connect to the database.

In addition, supplying a question mark (?) for password will result in the program prompting you for the password rather than having it as clear text in your scripts.

When all of the information is checked in the command, the db2_doConnect function is called to actually do the connection to the database.


In [ ]:
# Parse the CONNECT statement and execute if possible 

def parseConnect(inSQL,local_ns):
    
    global _settings, _connected

    _connected = False
    
    cParms = inSQL.split()
    cnt = 0
    
    _settings["ssl"] = ""
    
    while cnt < len(cParms):
        if cParms[cnt].upper() == 'TO':
            if cnt+1 < len(cParms):
                _settings["database"] = cParms[cnt+1].upper()
                cnt = cnt + 1
            else:
                errormsg("No database specified in the CONNECT statement")
                return
        elif cParms[cnt].upper() == "SSL":
            _settings["ssl"] = "Security=SSL;"  
            cnt = cnt + 1
        elif cParms[cnt].upper() == 'CREDENTIALS':
            if cnt+1 < len(cParms):
                credentials = cParms[cnt+1]
                tempid = eval(credentials,local_ns)
                if (isinstance(tempid,dict) == False): 
                    errormsg("The CREDENTIALS variable (" + credentials + ") does not contain a valid Python dictionary (JSON object)")
                    return
                if (tempid == None):
                    fname = credentials + ".pickle"
                    try:
                        with open(fname,'rb') as f: 
                            _id = pickle.load(f) 
                    except:
                        errormsg("Unable to find credential variable or file.")
                        return
                else:
                    _id = tempid
                    
                try:
                    _settings["database"] = _id["db"]
                    _settings["hostname"] = _id["hostname"]
                    _settings["port"] = _id["port"]
                    _settings["uid"] = _id["username"]
                    _settings["pwd"] = _id["password"]
                    try:
                        fname = credentials + ".pickle"
                        with open(fname,'wb') as f:
                            pickle.dump(_id,f)
            
                    except:
                        errormsg("Failed trying to write Db2 Credentials.")
                        return
                except:
                    errormsg("Credentials file is missing information. db/hostname/port/username/password required.")
                    return
                     
            else:
                errormsg("No Credentials name supplied")
                return
            
            cnt = cnt + 1
              
        elif cParms[cnt].upper() == 'USER':
            if cnt+1 < len(cParms):
                _settings["uid"] = cParms[cnt+1].upper()
                cnt = cnt + 1
            else:
                errormsg("No userid specified in the CONNECT statement")
                return
        elif cParms[cnt].upper() == 'USING':
            if cnt+1 < len(cParms):
                _settings["pwd"] = cParms[cnt+1]   
                if (_settings["pwd"] == '?'):
                    _settings["pwd"] = getpass.getpass("Password [password]: ") or "password"
                cnt = cnt + 1
            else:
                errormsg("No password specified in the CONNECT statement")
                return
        elif cParms[cnt].upper() == 'HOST':
            if cnt+1 < len(cParms):
                hostport = cParms[cnt+1].upper()
                ip, port = split_string(hostport)
                if (port == None): _settings["port"] = "50000"
                _settings["hostname"] = ip
                cnt = cnt + 1
            else:
                errormsg("No hostname specified in the CONNECT statement")
                return
        elif cParms[cnt].upper() == 'PORT':                           
            if cnt+1 < len(cParms):
                _settings["port"] = cParms[cnt+1].upper()
                cnt = cnt + 1
            else:
                errormsg("No port specified in the CONNECT statement")
                return
        elif cParms[cnt].upper() == 'PROMPT':
            if (connected_prompt() == False): 
                print("Connection canceled.")
                return 
            else:
                cnt = cnt + 1
        elif cParms[cnt].upper() in ('CLOSE','RESET') :
            try:
                result = ibm_db.close(_hdbc)
                _hdbi.close()
            except:
                pass
            success("Connection closed.")          
            if cParms[cnt].upper() == 'RESET': 
                _settings["database"] = ''
            return
        else:
            cnt = cnt + 1
                     
    _ = db2_doConnect()

Connect to Db2

The db2_doConnect routine is called when a connection needs to be established to a Db2 database. The command does not require any parameters since it relies on the settings variable which contains all of the information it needs to connect to a Db2 database.

db2_doConnect()

There are 4 additional variables that are used throughout the routines to stay connected with the Db2 database. These variables are:

  • hdbc - The connection handle to the database
  • hstmt - A statement handle used for executing SQL statements
  • connected - A flag that tells the program whether or not we are currently connected to a database
  • runtime - Used to tell %sql the length of time (default 1 second) to run a statement when timing it

The only database driver that is used in this program is the IBM DB2 ODBC DRIVER. This driver needs to be loaded on the system that is connecting to Db2. The Jupyter notebook that is built by this system installs the driver for you so you shouldn't have to do anything other than build the container.

If the connection is successful, the connected flag is set to True. Any subsequent %sql call will check to see if you are connected and initiate another prompted connection if you do not have a connection to a database.


In [ ]:
def db2_doConnect():
    
    global _hdbc, _hdbi, _connected, _runtime
    global _settings  

    if _connected == False: 
        
        if len(_settings["database"]) == 0:
            return False

    dsn = (
           "DRIVER={{IBM DB2 ODBC DRIVER}};"
           "DATABASE={0};"
           "HOSTNAME={1};"
           "PORT={2};"
           "PROTOCOL=TCPIP;"
           "UID={3};"
           "PWD={4};{5}").format(_settings["database"], 
                                 _settings["hostname"], 
                                 _settings["port"], 
                                 _settings["uid"], 
                                 _settings["pwd"],
                                 _settings["ssl"])

    # Get a database handle (hdbc) and a statement handle (hstmt) for subsequent access to DB2

    try:
        _hdbc  = ibm_db.connect(dsn, "", "")
    except Exception as err:
        db2_error(False,True) # errormsg(str(err))
        _connected = False
        _settings["database"] = ''
        return False
    
    try:
        _hdbi = ibm_db_dbi.Connection(_hdbc)
    except Exception as err:
        db2_error(False,True) # errormsg(str(err))
        _connected = False
        _settings["database"] = ''
        return False  
    
    _connected = True
    
    # Save the values for future use
    
    save_settings()
    
    success("Connection successful.")
    return True

Load/Save Settings

There are two routines that load and save settings between Jupyter notebooks. These routines are called without any parameters.

load_settings() save_settings()

There is a global structure called settings which contains the following fields:

_settings = {
     "maxrows"  : 10,   
     "maxgrid"  : 5,
     "runtime"  : 1,
     "display"  : "TEXT",
     "database" : "",
     "hostname" : "localhost",
     "port"     : "50000",
     "protocol" : "TCPIP",    
     "uid"      : "DB2INST1",
     "pwd"      : "password"
}

The information in the settings structure is used for re-connecting to a database when you start up a Jupyter notebook. When the session is established for the first time, the load_settings() function is called to get the contents of the pickle file (db2connect.pickle, a Jupyter session file) that will be used for the first connection to the database. Whenever a new connection is made, the file is updated with the save_settings() function.


In [ ]:
def load_settings():

    # This routine will load the settings from the previous session if they exist
    
    global _settings
    
    fname = "db2connect.pickle"

    try:
        with open(fname,'rb') as f: 
            _settings = pickle.load(f) 
                
        # Reset runtime to 1 since it would be unexpected to keep the same value between connections         
        _settings["runtime"] = 1
        _settings["maxgrid"] = 5
        
    except: 
        pass
    
    return

def save_settings():

    # This routine will save the current settings if they exist
    
    global _settings
    
    fname = "db2connect.pickle"
    
    try:
        with open(fname,'wb') as f:
            pickle.dump(_settings,f)
            
    except:
        errormsg("Failed trying to write Db2 Configuration Information.")
 
    return

Error and Message Functions

There are three types of messages that are thrown by the %db2 magic command. The first routine will print out a success message with no special formatting:

success(message)

The second message is used for displaying an error message that is not associated with a SQL error. This type of error message is surrounded with a red box to highlight the problem. Note that the success message has code that has been commented out that could also show a successful return code with a green box.

errormsg(message)

The final error message is based on an error occuring in the SQL code that was executed. This code will parse the message returned from the ibm_db interface and parse it to return only the error message portion (and not all of the wrapper code from the driver).

db2_error(quiet,connect=False)

The quiet flag is passed to the db2_error routine so that messages can be suppressed if the user wishes to ignore them with the -q flag. A good example of this is dropping a table that does not exist. We know that an error will be thrown so we can ignore it. The information that the db2_error routine gets is from the stmt_errormsg() function from within the ibm_db driver. The db2_error function should only be called after a SQL failure otherwise there will be no diagnostic information returned from stmt_errormsg().

If the connect flag is True, the routine will get the SQLSTATE and SQLCODE from the connection error message rather than a statement error message.


In [ ]:
def db2_error(quiet,connect=False):
    
    global sqlerror, sqlcode, sqlstate, _environment
    
    
    try:
        if (connect == False):
            errmsg = ibm_db.stmt_errormsg().replace('\r',' ')
            errmsg = errmsg[errmsg.rfind("]")+1:].strip()
        else:
            errmsg = ibm_db.conn_errormsg().replace('\r',' ')
            errmsg = errmsg[errmsg.rfind("]")+1:].strip()
            
        sqlerror = errmsg
 
        msg_start = errmsg.find("SQLSTATE=")
        if (msg_start != -1):
            msg_end = errmsg.find(" ",msg_start)
            if (msg_end == -1):
                msg_end = len(errmsg)
            sqlstate = errmsg[msg_start+9:msg_end]
        else:
            sqlstate = "0"
    
        msg_start = errmsg.find("SQLCODE=")
        if (msg_start != -1):
            msg_end = errmsg.find(" ",msg_start)
            if (msg_end == -1):
                msg_end = len(errmsg)
            sqlcode = errmsg[msg_start+8:msg_end]
            try:
                sqlcode = int(sqlcode)
            except:
                pass
        else:        
            sqlcode = 0
            
    except:
        errmsg = "Unknown error."
        sqlcode = -99999
        sqlstate = "-99999"
        sqlerror = errmsg
        return
        
    
    msg_start = errmsg.find("SQLSTATE=")
    if (msg_start != -1):
        msg_end = errmsg.find(" ",msg_start)
        if (msg_end == -1):
            msg_end = len(errmsg)
        sqlstate = errmsg[msg_start+9:msg_end]
    else:
        sqlstate = "0"
        
    
    msg_start = errmsg.find("SQLCODE=")
    if (msg_start != -1):
        msg_end = errmsg.find(" ",msg_start)
        if (msg_end == -1):
            msg_end = len(errmsg)
        sqlcode = errmsg[msg_start+8:msg_end]
        try:
            sqlcode = int(sqlcode)
        except:
            pass
    else:
        sqlcode = 0
    
    if quiet == True: return
    
    if (errmsg == ""): return

    html = '<p><p style="border:2px; border-style:solid; border-color:#FF0000; background-color:#ffe6e6; padding: 1em;">'
    
    if (_environment["jupyter"] == True):
        pdisplay(pHTML(html+errmsg+"</p>"))
    else:
        print(errmsg)
    
# Print out an error message

def errormsg(message):
    
    global _environment
    
    if (message != ""):
        html = '<p><p style="border:2px; border-style:solid; border-color:#FF0000; background-color:#ffe6e6; padding: 1em;">'
        if (_environment["jupyter"] == True):
            pdisplay(pHTML(html + message + "</p>"))     
        else:
            print(message)
    
def success(message):
    
    if (message != ""):
        print(message)
    return   

def debug(message,error=False):
    
    global _environment
    
    if (_environment["jupyter"] == True):
        spacer = "<br>" + "&nbsp;"
    else:
        spacer = "\n "
    
    if (message != ""):

        lines = message.split('\n')
        msg = ""
        indent = 0
        for line in lines:
            delta = line.count("(") - line.count(")")
            if (msg == ""):
                msg = line
                indent = indent + delta
            else:
                if (delta < 0): indent = indent + delta
                msg = msg + spacer * (indent*2) + line
                if (delta > 0): indent = indent + delta    

            if (indent < 0): indent = 0
        if (error == True):        
            html = '<p><pre style="font-family: monospace; border:2px; border-style:solid; border-color:#FF0000; background-color:#ffe6e6; padding: 1em;">'                  
        else:
            html = '<p><pre style="font-family: monospace; border:2px; border-style:solid; border-color:#008000; background-color:#e6ffe6; padding: 1em;">'
        
        if (_environment["jupyter"] == True):
            pdisplay(pHTML(html + msg + "</pre></p>"))
        else:
            print(msg)
        
    return

Macro Processor

A macro is used to generate SQL to be executed by overriding or creating a new keyword. For instance, the base %sql command does not understand the LIST TABLES command which is usually used in conjunction with the CLP processor. Rather than specifically code this in the base db2.ipynb file, we can create a macro that can execute this code for us.

There are three routines that deal with macros.

  • checkMacro is used to find the macro calls in a string. All macros are sent to parseMacro for checking.
  • runMacro will evaluate the macro and return the string to the parse
  • subvars is used to track the variables used as part of a macro call.
  • setMacro is used to catalog a macro

Set Macro

This code will catalog a macro call.


In [ ]:
def setMacro(inSQL,parms):
      
    global _macros
    
    names = parms.split()
    if (len(names) < 2):
        errormsg("No command name supplied.")
        return None
    
    macroName = names[1].upper()
    _macros[macroName] = inSQL

    return

Check Macro

This code will check to see if there is a macro command in the SQL. It will take the SQL that is supplied and strip out three values: the first and second keywords, and the remainder of the parameters.

For instance, consider the following statement:

CREATE DATABASE GEORGE options....

The name of the macro that we want to run is called CREATE. We know that there is a SQL command called CREATE but this code will call the macro first to see if needs to run any special code. For instance, CREATE DATABASE is not part of the db2.ipynb syntax, but we can add it in by using a macro.

The check macro logic will strip out the subcommand (DATABASE) and place the remainder of the string after DATABASE in options.


In [ ]:
def checkMacro(in_sql):
       
    global _macros
    
    if (len(in_sql) == 0): return(in_sql)          # Nothing to do 
    
    tokens = parseArgs(in_sql,None)                # Take the string and reduce into tokens
    
    macro_name = tokens[0].upper()                 # Uppercase the name of the token
 
    if (macro_name not in _macros): 
        return(in_sql) # No macro by this name so just return the string

    result = runMacro(_macros[macro_name],in_sql,tokens)  # Execute the macro using the tokens we found

    return(result)                                 # Runmacro will either return the original SQL or the new one

Split Assignment

This routine will return the name of a variable and it's value when the format is x=y. If y is enclosed in quotes, the quotes are removed.


In [ ]:
def splitassign(arg):
    
    var_name = "null"
    var_value = "null"
    
    arg = arg.strip()
    eq = arg.find("=")
    if (eq != -1):
        var_name = arg[:eq].strip()
        temp_value = arg[eq+1:].strip()
        if (temp_value != ""):
            ch = temp_value[0]
            if (ch in ["'",'"']):
                if (temp_value[-1:] == ch):
                    var_value = temp_value[1:-1]
                else:
                    var_value = temp_value
            else:
                var_value = temp_value
    else:
        var_value = arg

    return var_name, var_value

Parse Args

The commands that are used in the macros need to be parsed into their separate tokens. The tokens are separated by blanks and strings that enclosed in quotes are kept together.


In [ ]:
def parseArgs(argin,_vars):

    quoteChar = ""
    inQuote = False
    inArg = True
    args = []
    arg = ''
    
    for ch in argin.lstrip():
        if (inQuote == True):
            if (ch == quoteChar):
                inQuote = False   
                arg = arg + ch #z
            else:
                arg = arg + ch
        elif (ch == "\"" or ch == "\'"): # Do we have a quote
            quoteChar = ch
            arg = arg + ch #z
            inQuote = True
        elif (ch == " "):
            if (arg != ""):
                arg = subvars(arg,_vars)
                args.append(arg)
            else:
                args.append("null")
            arg = ""
        else:
            arg = arg + ch
                
    if (arg != ""):
        arg = subvars(arg,_vars)
        args.append(arg)   
               
    return(args)

Run Macro

This code will execute the body of the macro and return the results for that macro call.


In [ ]:
def runMacro(script,in_sql,tokens):
    
    result = ""
    runIT = True 
    code = script.split("\n")
    level = 0
    runlevel = [True,False,False,False,False,False,False,False,False,False]
    ifcount = 0
    _vars = {}
    
    for i in range(0,len(tokens)):
        vstr = str(i)
        _vars[vstr] = tokens[i]
        
    if (len(tokens) == 0):
        _vars["argc"] = "0"
    else:
        _vars["argc"] = str(len(tokens)-1)
          
    for line in code:
        line = line.strip()
        if (line == "" or line == "\n"): continue
        if (line[0] == "#"): continue    # A comment line starts with a # in the first position of the line
        args = parseArgs(line,_vars)     # Get all of the arguments
        if (args[0] == "if"):
            ifcount = ifcount + 1
            if (runlevel[level] == False): # You can't execute this statement
                continue
            level = level + 1    
            if (len(args) < 4):
                print("Macro: Incorrect number of arguments for the if clause.")
                return insql
            arg1 = args[1]
            arg2 = args[3]
            if (len(arg2) > 2):
                ch1 = arg2[0]
                ch2 = arg2[-1:]
                if (ch1 in ['"',"'"] and ch1 == ch2):
                    arg2 = arg2[1:-1].strip()
               
            op   = args[2]
            if (op in ["=","=="]):
                if (arg1 == arg2):
                    runlevel[level] = True
                else:
                    runlevel[level] = False                
            elif (op in ["<=","=<"]):
                if (arg1 <= arg2):
                    runlevel[level] = True
                else:
                    runlevel[level] = False                
            elif (op in [">=","=>"]):                    
                if (arg1 >= arg2):
                    runlevel[level] = True
                else:
                    runlevel[level] = False                                       
            elif (op in ["<>","!="]):                    
                if (arg1 != arg2):
                    runlevel[level] = True
                else:
                    runlevel[level] = False  
            elif (op in ["<"]):
                if (arg1 < arg2):
                    runlevel[level] = True
                else:
                    runlevel[level] = False                
            elif (op in [">"]):
                if (arg1 > arg2):
                    runlevel[level] = True
                else:
                    runlevel[level] = False                
            else:
                print("Macro: Unknown comparison operator in the if statement:" + op)

                continue

        elif (args[0] in ["exit","echo"] and runlevel[level] == True):
            msg = ""
            for msgline in args[1:]:
                if (msg == ""):
                    msg = subvars(msgline,_vars)
                else:
                    msg = msg + " " + subvars(msgline,_vars)
            if (msg != ""): 
                if (args[0] == "echo"):
                    debug(msg,error=False)
                else:
                    debug(msg,error=True)
            if (args[0] == "exit"): return ''
       
        elif (args[0] == "pass" and runlevel[level] == True):
            pass

        elif (args[0] == "var" and runlevel[level] == True):
            value = ""
            for val in args[2:]:
                if (value == ""):
                    value = subvars(val,_vars)
                else:
                    value = value + " " + subvars(val,_vars)
            value.strip()
            _vars[args[1]] = value 

        elif (args[0] == 'else'):

            if (ifcount == level):
                runlevel[level] = not runlevel[level]
                
        elif (args[0] == 'return' and runlevel[level] == True):
            return(result)

        elif (args[0] == "endif"):
            ifcount = ifcount - 1
            if (ifcount < level):
                level = level - 1
                if (level < 0):
                    print("Macro: Unmatched if/endif pairs.")
                    return ''
                
        else:
            if (runlevel[level] == True):
                if (result == ""):
                    result = subvars(line,_vars)
                else:
                    result = result + "\n" + subvars(line,_vars)
                    
    return(result)

Substitute Vars

This routine is used by the runMacro program to track variables that are used within Macros. These are kept separate from the rest of the code.


In [ ]:
def subvars(script,_vars):
    
    if (_vars == None): return script
    
    remainder = script
    result = ""
    done = False
    
    while done == False:
        bv = remainder.find("{")
        if (bv == -1):
            done = True
            continue
        ev = remainder.find("}")
        if (ev == -1):
            done = True
            continue
        result = result + remainder[:bv]
        vvar = remainder[bv+1:ev]
        remainder = remainder[ev+1:]
        
        upper = False
        allvars = False
        if (vvar[0] == "^"):
            upper = True
            vvar = vvar[1:]
        elif (vvar[0] == "*"):
            vvar = vvar[1:]
            allvars = True
        else:
            pass
        
        if (vvar in _vars):
            if (upper == True):
                items = _vars[vvar].upper()
            elif (allvars == True):
                try:
                    iVar = int(vvar)
                except:
                    return(script)
                items = ""
                sVar = str(iVar)
                while sVar in _vars:
                    if (items == ""):
                        items = _vars[sVar]
                    else:
                        items = items + " " + _vars[sVar]
                    iVar = iVar + 1
                    sVar = str(iVar)
            else:
                items = _vars[vvar]
        else:
            if (allvars == True):
                items = ""
            else:
                items = "null"                
                 
        result = result + items
                
    if (remainder != ""):
        result = result + remainder
        
    return(result)

SQL Timer

The calling format of this routine is:

count = sqlTimer(hdbc, runtime, inSQL)

This code runs the SQL string multiple times for one second (by default). The accuracy of the clock is not that great when you are running just one statement, so instead this routine will run the code multiple times for a second to give you an execution count. If you need to run the code for more than one second, the runtime value needs to be set to the number of seconds you want the code to run.

The return result is always the number of times that the code executed. Note, that the program will skip reading the data if it is a SELECT statement so it doesn't included fetch time for the answer set.


In [ ]:
def sqlTimer(hdbc, runtime, inSQL):
    
    count = 0
    t_end = time.time() + runtime
    
    while time.time() < t_end:
        
        try:
            stmt = ibm_db.exec_immediate(hdbc,inSQL) 
            if (stmt == False):
                db2_error(flag(["-q","-quiet"]))
                return(-1)
            ibm_db.free_result(stmt)
            
        except Exception as err:
            db2_error(False)
            return(-1)
        
        count = count + 1
                    
    return(count)

Split Args

This routine takes as an argument a string and then splits the arguments according to the following logic:

  • If the string starts with a ( character, it will check the last character in the string and see if it is a ) and then remove those characters
  • Every parameter is separated by a comma , and commas within quotes are ignored
  • Each parameter returned will have three values returned - one for the value itself, an indicator which will be either True if it was quoted, or False if not, and True or False if it is numeric.

Example:

   "abcdef",abcdef,456,"856"

Three values would be returned:

[abcdef,True,False],[abcdef,False,False],[456,False,True],[856,True,False]

Any quoted string will be False for numeric. The way that the parameters are handled are up to the calling program. However, in the case of Db2, the quoted strings must be in single quotes so any quoted parameter using the double quotes " must be wrapped with single quotes. There is always a possibility that a string contains single quotes (i.e. O'Connor) so any substituted text should use '' so that Db2 can properly interpret the string. This routine does not adjust the strings with quotes, and depends on the variable subtitution routine to do that.


In [ ]:
def splitargs(arguments):
    
    import types
    
    # String the string and remove the ( and ) characters if they at the beginning and end of the string
    
    results = []
    
    step1 = arguments.strip()
    if (len(step1) == 0): return(results)       # Not much to do here - no args found
    
    if (step1[0] == '('):
        if (step1[-1:] == ')'):
            step2 = step1[1:-1]
            step2 = step2.strip()
        else:
            step2 = step1
    else:
        step2 = step1
            
    # Now we have a string without brackets. Start scanning for commas
            
    quoteCH = ""
    pos = 0
    arg = ""
    args = []
            
    while pos < len(step2):
        ch = step2[pos]
        if (quoteCH == ""):                     # Are we in a quote?
            if (ch in ('"',"'")):               # Check to see if we are starting a quote
                quoteCH = ch
                arg = arg + ch
                pos += 1
            elif (ch == ","):                   # Are we at the end of a parameter?
                arg = arg.strip()
                args.append(arg)
                arg = ""
                inarg = False 
                pos += 1
            else:                               # Continue collecting the string
                arg = arg + ch
                pos += 1
        else:
            if (ch == quoteCH):                 # Are we at the end of a quote?
                arg = arg + ch                  # Add the quote to the string
                pos += 1                        # Increment past the quote
                quoteCH = ""                    # Stop quote checking (maybe!)
            else:
                pos += 1
                arg = arg + ch

    if (quoteCH != ""):                         # So we didn't end our string
        arg = arg.strip()
        args.append(arg)
    elif (arg != ""):                           # Something left over as an argument
        arg = arg.strip()
        args.append(arg)
    else:
        pass
    
    results = []
    
    for arg in args:
        result = []
        if (len(arg) > 0):
            if (arg[0] in ('"',"'")):
                value = arg[1:-1]
                isString = True
                isNumber = False
            else:
                isString = False 
                isNumber = False 
                try:
                    value = eval(arg)
                    if (type(value) == int):
                        isNumber = True
                    elif (isinstance(value,float) == True):
                        isNumber = True
                    else:
                        value = arg
                except:
                    value = arg

        else:
            value = ""
            isString = False
            isNumber = False
            
        result = [value,isString,isNumber]
        results.append(result)
        
    return results

SQL Parser

The calling format of this routine is:

sql_cmd, parameter_list, encoded_sql = sqlParser(sql_input)

This code will look at the SQL string that has been passed to it and parse it into four values:

  • sql_cmd: First command in the list (so this may not be the actual SQL command)
  • parameter_list: the values of the parameters that need to passed to the execute/pandas code
  • encoded_sql: SQL with the parameters removed if there are any (replaced with ? markers)

In [ ]:
def sqlParser(sqlin,local_ns):
       
    sql_cmd = ""
    encoded_sql = sqlin
    
    firstCommand = "(?:^\s*)([a-zA-Z]+)(?:\s+.*|$)"
    
    findFirst = re.match(firstCommand,sqlin)
    
    if (findFirst == None): # We did not find a match so we just return the empty string
        return sql_cmd, encoded_sql
    
    cmd = findFirst.group(1)
    sql_cmd = cmd.upper()

    #
    # Scan the input string looking for variables in the format :var. If no : is found just return.
    # Var must be alpha+number+_ to be valid
    #
    
    if (':' not in sqlin): # A quick check to see if parameters are in here, but not fool-proof!         
        return sql_cmd, encoded_sql    
    
    inVar = False 
    inQuote = "" 
    varName = ""
    encoded_sql = ""
    
    STRING = 0
    NUMBER = 1
    LIST = 2
    RAW = 3
    
    for ch in sqlin:
        if (inVar == True): # We are collecting the name of a variable
            if (ch.upper() in "@_ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789[]"):
                varName = varName + ch
                continue
            else:
                if (varName == ""):
                    encode_sql = encoded_sql + ":"
                elif (varName[0] in ('[',']')):
                    encoded_sql = encoded_sql + ":" + varName
                else:
                    if (ch == '.'): # If the variable name is stopped by a period, assume no quotes are used
                        flag_quotes = False
                    else:
                        flag_quotes = True
                    varValue, varType = getContents(varName,flag_quotes,local_ns)
                    if (varValue == None):                 
                        encoded_sql = encoded_sql + ":" + varName
                    else:
                        if (varType == STRING):
                            encoded_sql = encoded_sql + varValue
                        elif (varType == NUMBER):
                            encoded_sql = encoded_sql + str(varValue)
                        elif (varType == RAW):
                            encoded_sql = encoded_sql + varValue
                        elif (varType == LIST):
                            start = True
                            for v in varValue:
                                if (start == False):
                                    encoded_sql = encoded_sql + ","
                                if (isinstance(v,int) == True):         # Integer value 
                                    encoded_sql = encoded_sql + str(v)
                                elif (isinstance(v,float) == True):
                                    encoded_sql = encoded_sql + str(v)
                                else:
                                    flag_quotes = True
                                    try:
                                        if (v.find('0x') == 0):               # Just guessing this is a hex value at beginning
                                            encoded_sql = encoded_sql + v
                                        else:
                                            encoded_sql = encoded_sql + addquotes(v,flag_quotes)      # String
                                    except:
                                        encoded_sql = encoded_sql + addquotes(str(v),flag_quotes)                                   
                                start = False

                encoded_sql = encoded_sql + ch
                varName = ""
                inVar = False  
        elif (inQuote != ""):
            encoded_sql = encoded_sql + ch
            if (ch == inQuote): inQuote = ""
        elif (ch in ("'",'"')):
            encoded_sql = encoded_sql + ch
            inQuote = ch
        elif (ch == ":"): # This might be a variable
            varName = ""
            inVar = True
        else:
            encoded_sql = encoded_sql + ch
    
    if (inVar == True):
        varValue, varType = getContents(varName,True,local_ns) # We assume the end of a line is quoted
        if (varValue == None):                 
            encoded_sql = encoded_sql + ":" + varName  
        else:
            if (varType == STRING):
                encoded_sql = encoded_sql + varValue
            elif (varType == NUMBER):
                encoded_sql = encoded_sql + str(varValue)
            elif (varType == LIST):
                flag_quotes = True
                start = True
                for v in varValue:
                    if (start == False):
                        encoded_sql = encoded_sql + ","
                    if (isinstance(v,int) == True):         # Integer value 
                        encoded_sql = encoded_sql + str(v)
                    elif (isinstance(v,float) == True):
                        encoded_sql = encoded_sql + str(v)
                    else:
                        try:
                            if (v.find('0x') == 0):               # Just guessing this is a hex value
                                encoded_sql = encoded_sql + v
                            else:
                                encoded_sql = encoded_sql + addquotes(v,flag_quotes)              # String
                        except:
                            encoded_sql = encoded_sql + addquotes(str(v),flag_quotes)                                 
                    start = False

    return sql_cmd, encoded_sql

Variable Contents Function

The calling format of this routine is:

value = getContents(varName,quote,name_space)

This code will take the name of a variable as input and return the contents of that variable. If the variable is not found then the program will return None which is the equivalent to empty or null. Note that this function looks at the global variable pool for Python so it is possible that the wrong version of variable is returned if it is used in different functions. For this reason, any variables used in SQL statements should use a unique namimg convention if possible.

The other thing that this function does is replace single quotes with two quotes. The reason for doing this is that Db2 will convert two single quotes into one quote when dealing with strings. This avoids problems when dealing with text that contains multiple quotes within the string. Note that this substitution is done only for single quote characters since the double quote character is used by Db2 for naming columns that are case sensitive or contain special characters.

If the quote value is True, the field will have quotes around it. The name_space is the variables currently that are registered in Python.


In [ ]:
def getContents(varName,flag_quotes,local_ns):
    
    #
    # Get the contents of the variable name that is passed to the routine. Only simple
    # variables are checked, i.e. arrays and lists are not parsed
    #
    
    STRING = 0
    NUMBER = 1
    LIST = 2
    RAW = 3
    DICT = 4
    
    try:
        value = eval(varName,None,local_ns) # globals()[varName] # eval(varName)
    except:
        return(None,STRING)
    
    if (isinstance(value,dict) == True):          # Check to see if this is JSON dictionary
        return(addquotes(value,flag_quotes),STRING)

    elif(isinstance(value,list) == True):         # List - tricky 
        return(value,LIST)

    elif (isinstance(value,int) == True):         # Integer value 
        return(value,NUMBER)

    elif (isinstance(value,float) == True):       # Float value
        return(value,NUMBER)

    else:
        try:
            # The pattern needs to be in the first position (0 in Python terms)
            if (value.find('0x') == 0):               # Just guessing this is a hex value
                return(value,RAW)
            else:
                return(addquotes(value,flag_quotes),STRING)                     # String
        except:
            return(addquotes(str(value),flag_quotes),RAW)

Add Quotes

Quotes are a challenge when dealing with dictionaries and Db2. Db2 wants strings delimited with single quotes, while Dictionaries use double quotes. That wouldn't be a problems except imbedded single quotes within these dictionaries will cause things to fail. This routine attempts to double-quote the single quotes within the dicitonary.


In [ ]:
def addquotes(inString,flag_quotes):
    
    if (isinstance(inString,dict) == True):          # Check to see if this is JSON dictionary
        serialized = json.dumps(inString) 
    else:
        serialized = inString

    # Replace single quotes with '' (two quotes) and wrap everything in single quotes
    if (flag_quotes == False):
        return(serialized)
    else:
        return("'"+serialized.replace("'","''")+"'")    # Convert single quotes to two single quotes

Create the SAMPLE Database Tables

The calling format of this routine is:

db2_create_sample(quiet)

There are a lot of examples that depend on the data within the SAMPLE database. If you are running these examples and the connection is not to the SAMPLE database, then this code will create the two (EMPLOYEE, DEPARTMENT) tables that are used by most examples. If the function finds that these tables already exist, then nothing is done. If the tables are missing then they will be created with the same data as in the SAMPLE database.

The quiet flag tells the program not to print any messages when the creation of the tables is complete.


In [ ]:
def db2_create_sample(quiet):
    
    create_department = """
      BEGIN
        DECLARE FOUND INTEGER; 
        SET FOUND = (SELECT COUNT(*) FROM SYSIBM.SYSTABLES WHERE NAME='DEPARTMENT' AND CREATOR=CURRENT USER); 
        IF FOUND = 0 THEN 
           EXECUTE IMMEDIATE('CREATE TABLE DEPARTMENT(DEPTNO CHAR(3) NOT NULL, DEPTNAME VARCHAR(36) NOT NULL, 
                              MGRNO CHAR(6),ADMRDEPT CHAR(3) NOT NULL)'); 
           EXECUTE IMMEDIATE('INSERT INTO DEPARTMENT VALUES 
             (''A00'',''SPIFFY COMPUTER SERVICE DIV.'',''000010'',''A00''), 
             (''B01'',''PLANNING'',''000020'',''A00''), 
             (''C01'',''INFORMATION CENTER'',''000030'',''A00''), 
             (''D01'',''DEVELOPMENT CENTER'',NULL,''A00''), 
             (''D11'',''MANUFACTURING SYSTEMS'',''000060'',''D01''), 
             (''D21'',''ADMINISTRATION SYSTEMS'',''000070'',''D01''), 
             (''E01'',''SUPPORT SERVICES'',''000050'',''A00''), 
             (''E11'',''OPERATIONS'',''000090'',''E01''), 
             (''E21'',''SOFTWARE SUPPORT'',''000100'',''E01''), 
             (''F22'',''BRANCH OFFICE F2'',NULL,''E01''), 
             (''G22'',''BRANCH OFFICE G2'',NULL,''E01''), 
             (''H22'',''BRANCH OFFICE H2'',NULL,''E01''), 
             (''I22'',''BRANCH OFFICE I2'',NULL,''E01''), 
             (''J22'',''BRANCH OFFICE J2'',NULL,''E01'')');      
           END IF;
      END"""
  
    %sql -d -q {create_department} 
    
    create_employee = """
     BEGIN
        DECLARE FOUND INTEGER; 
        SET FOUND = (SELECT COUNT(*) FROM SYSIBM.SYSTABLES WHERE NAME='EMPLOYEE' AND CREATOR=CURRENT USER); 
        IF FOUND = 0 THEN 
          EXECUTE IMMEDIATE('CREATE TABLE EMPLOYEE(
                             EMPNO CHAR(6) NOT NULL,
                             FIRSTNME VARCHAR(12) NOT NULL,
                             MIDINIT CHAR(1),
                             LASTNAME VARCHAR(15) NOT NULL,
                             WORKDEPT CHAR(3),
                             PHONENO CHAR(4),
                             HIREDATE DATE,
                             JOB CHAR(8),
                             EDLEVEL SMALLINT NOT NULL,
                             SEX CHAR(1),
                             BIRTHDATE DATE,
                             SALARY DECIMAL(9,2),
                             BONUS DECIMAL(9,2),
                             COMM DECIMAL(9,2)
                             )');
          EXECUTE IMMEDIATE('INSERT INTO EMPLOYEE VALUES
             (''000010'',''CHRISTINE'',''I'',''HAAS''      ,''A00'',''3978'',''1995-01-01'',''PRES    '',18,''F'',''1963-08-24'',152750.00,1000.00,4220.00),
             (''000020'',''MICHAEL''  ,''L'',''THOMPSON''  ,''B01'',''3476'',''2003-10-10'',''MANAGER '',18,''M'',''1978-02-02'',94250.00,800.00,3300.00),
             (''000030'',''SALLY''    ,''A'',''KWAN''      ,''C01'',''4738'',''2005-04-05'',''MANAGER '',20,''F'',''1971-05-11'',98250.00,800.00,3060.00),
             (''000050'',''JOHN''     ,''B'',''GEYER''     ,''E01'',''6789'',''1979-08-17'',''MANAGER '',16,''M'',''1955-09-15'',80175.00,800.00,3214.00),
             (''000060'',''IRVING''   ,''F'',''STERN''     ,''D11'',''6423'',''2003-09-14'',''MANAGER '',16,''M'',''1975-07-07'',72250.00,500.00,2580.00),
             (''000070'',''EVA''      ,''D'',''PULASKI''   ,''D21'',''7831'',''2005-09-30'',''MANAGER '',16,''F'',''2003-05-26'',96170.00,700.00,2893.00),
             (''000090'',''EILEEN''   ,''W'',''HENDERSON'' ,''E11'',''5498'',''2000-08-15'',''MANAGER '',16,''F'',''1971-05-15'',89750.00,600.00,2380.00),
             (''000100'',''THEODORE'' ,''Q'',''SPENSER''   ,''E21'',''0972'',''2000-06-19'',''MANAGER '',14,''M'',''1980-12-18'',86150.00,500.00,2092.00),
             (''000110'',''VINCENZO'' ,''G'',''LUCCHESSI'' ,''A00'',''3490'',''1988-05-16'',''SALESREP'',19,''M'',''1959-11-05'',66500.00,900.00,3720.00),
             (''000120'',''SEAN''     ,'' '',''O`CONNELL'' ,''A00'',''2167'',''1993-12-05'',''CLERK   '',14,''M'',''1972-10-18'',49250.00,600.00,2340.00),
             (''000130'',''DELORES''  ,''M'',''QUINTANA''  ,''C01'',''4578'',''2001-07-28'',''ANALYST '',16,''F'',''1955-09-15'',73800.00,500.00,1904.00),
             (''000140'',''HEATHER''  ,''A'',''NICHOLLS''  ,''C01'',''1793'',''2006-12-15'',''ANALYST '',18,''F'',''1976-01-19'',68420.00,600.00,2274.00),
             (''000150'',''BRUCE''    ,'' '',''ADAMSON''   ,''D11'',''4510'',''2002-02-12'',''DESIGNER'',16,''M'',''1977-05-17'',55280.00,500.00,2022.00),
             (''000160'',''ELIZABETH'',''R'',''PIANKA''    ,''D11'',''3782'',''2006-10-11'',''DESIGNER'',17,''F'',''1980-04-12'',62250.00,400.00,1780.00),
             (''000170'',''MASATOSHI'',''J'',''YOSHIMURA'' ,''D11'',''2890'',''1999-09-15'',''DESIGNER'',16,''M'',''1981-01-05'',44680.00,500.00,1974.00),
             (''000180'',''MARILYN''  ,''S'',''SCOUTTEN''  ,''D11'',''1682'',''2003-07-07'',''DESIGNER'',17,''F'',''1979-02-21'',51340.00,500.00,1707.00),
             (''000190'',''JAMES''    ,''H'',''WALKER''    ,''D11'',''2986'',''2004-07-26'',''DESIGNER'',16,''M'',''1982-06-25'',50450.00,400.00,1636.00),
             (''000200'',''DAVID''    ,'' '',''BROWN''     ,''D11'',''4501'',''2002-03-03'',''DESIGNER'',16,''M'',''1971-05-29'',57740.00,600.00,2217.00),
             (''000210'',''WILLIAM''  ,''T'',''JONES''     ,''D11'',''0942'',''1998-04-11'',''DESIGNER'',17,''M'',''2003-02-23'',68270.00,400.00,1462.00),
             (''000220'',''JENNIFER'' ,''K'',''LUTZ''      ,''D11'',''0672'',''1998-08-29'',''DESIGNER'',18,''F'',''1978-03-19'',49840.00,600.00,2387.00),
             (''000230'',''JAMES''    ,''J'',''JEFFERSON'' ,''D21'',''2094'',''1996-11-21'',''CLERK   '',14,''M'',''1980-05-30'',42180.00,400.00,1774.00),
             (''000240'',''SALVATORE'',''M'',''MARINO''    ,''D21'',''3780'',''2004-12-05'',''CLERK   '',17,''M'',''2002-03-31'',48760.00,600.00,2301.00),
             (''000250'',''DANIEL''   ,''S'',''SMITH''     ,''D21'',''0961'',''1999-10-30'',''CLERK   '',15,''M'',''1969-11-12'',49180.00,400.00,1534.00),
             (''000260'',''SYBIL''    ,''P'',''JOHNSON''   ,''D21'',''8953'',''2005-09-11'',''CLERK   '',16,''F'',''1976-10-05'',47250.00,300.00,1380.00),
             (''000270'',''MARIA''    ,''L'',''PEREZ''     ,''D21'',''9001'',''2006-09-30'',''CLERK   '',15,''F'',''2003-05-26'',37380.00,500.00,2190.00),
             (''000280'',''ETHEL''    ,''R'',''SCHNEIDER'' ,''E11'',''8997'',''1997-03-24'',''OPERATOR'',17,''F'',''1976-03-28'',36250.00,500.00,2100.00),
             (''000290'',''JOHN''     ,''R'',''PARKER''    ,''E11'',''4502'',''2006-05-30'',''OPERATOR'',12,''M'',''1985-07-09'',35340.00,300.00,1227.00),
             (''000300'',''PHILIP''   ,''X'',''SMITH''     ,''E11'',''2095'',''2002-06-19'',''OPERATOR'',14,''M'',''1976-10-27'',37750.00,400.00,1420.00),
             (''000310'',''MAUDE''    ,''F'',''SETRIGHT''  ,''E11'',''3332'',''1994-09-12'',''OPERATOR'',12,''F'',''1961-04-21'',35900.00,300.00,1272.00),
             (''000320'',''RAMLAL''   ,''V'',''MEHTA''     ,''E21'',''9990'',''1995-07-07'',''FIELDREP'',16,''M'',''1962-08-11'',39950.00,400.00,1596.00),
             (''000330'',''WING''     ,'' '',''LEE''       ,''E21'',''2103'',''2006-02-23'',''FIELDREP'',14,''M'',''1971-07-18'',45370.00,500.00,2030.00),
             (''000340'',''JASON''    ,''R'',''GOUNOT''    ,''E21'',''5698'',''1977-05-05'',''FIELDREP'',16,''M'',''1956-05-17'',43840.00,500.00,1907.00),
             (''200010'',''DIAN''     ,''J'',''HEMMINGER'' ,''A00'',''3978'',''1995-01-01'',''SALESREP'',18,''F'',''1973-08-14'',46500.00,1000.00,4220.00),
             (''200120'',''GREG''     ,'' '',''ORLANDO''   ,''A00'',''2167'',''2002-05-05'',''CLERK   '',14,''M'',''1972-10-18'',39250.00,600.00,2340.00),
             (''200140'',''KIM''      ,''N'',''NATZ''      ,''C01'',''1793'',''2006-12-15'',''ANALYST '',18,''F'',''1976-01-19'',68420.00,600.00,2274.00),
             (''200170'',''KIYOSHI''  ,'' '',''YAMAMOTO''  ,''D11'',''2890'',''2005-09-15'',''DESIGNER'',16,''M'',''1981-01-05'',64680.00,500.00,1974.00),
             (''200220'',''REBA''     ,''K'',''JOHN''      ,''D11'',''0672'',''2005-08-29'',''DESIGNER'',18,''F'',''1978-03-19'',69840.00,600.00,2387.00),
             (''200240'',''ROBERT''   ,''M'',''MONTEVERDE'',''D21'',''3780'',''2004-12-05'',''CLERK   '',17,''M'',''1984-03-31'',37760.00,600.00,2301.00),
             (''200280'',''EILEEN''   ,''R'',''SCHWARTZ''  ,''E11'',''8997'',''1997-03-24'',''OPERATOR'',17,''F'',''1966-03-28'',46250.00,500.00,2100.00),
             (''200310'',''MICHELLE'' ,''F'',''SPRINGER''  ,''E11'',''3332'',''1994-09-12'',''OPERATOR'',12,''F'',''1961-04-21'',35900.00,300.00,1272.00),
             (''200330'',''HELENA''   ,'' '',''WONG''      ,''E21'',''2103'',''2006-02-23'',''FIELDREP'',14,''F'',''1971-07-18'',35370.00,500.00,2030.00),
             (''200340'',''ROY''      ,''R'',''ALONZO''    ,''E21'',''5698'',''1997-07-05'',''FIELDREP'',16,''M'',''1956-05-17'',31840.00,500.00,1907.00)');                             
        END IF;
     END"""
    
    %sql -d -q {create_employee}    
    
    if (quiet == False): success("Sample tables [EMPLOYEE, DEPARTMENT] created.")

Check option

This function will return the original string with the option removed, and a flag or true or false of the value is found.

args, flag = checkOption(option_string, option, false_value, true_value)

Options are specified with a -x where x is the character that we are searching for. It may actually be more than one character long like -pb/-pi/etc... The false and true values are optional. By default these are the boolean values of T/F but for some options it could be a character string like ';' versus '@' for delimiters.


In [ ]:
def checkOption(args_in, option, vFalse=False, vTrue=True):
    
    args_out = args_in.strip()
    found = vFalse
    
    if (args_out != ""):
        if (args_out.find(option) >= 0):
            args_out = args_out.replace(option," ")
            args_out = args_out.strip()
            found = vTrue

    return args_out, found

Plot Data

This function will plot the data that is returned from the answer set. The plot value determines how we display the data. 1=Bar, 2=Pie, 3=Line, 4=Interactive.

plotData(flag_plot, hdbi, sql, parms)

The hdbi is the ibm_db_sa handle that is used by pandas dataframes to run the sql. The parms contains any of the parameters required to run the query.


In [ ]:
def plotData(hdbi, sql):
    
    try:
        df = pandas.read_sql(sql,hdbi)
          
    except Exception as err:
        db2_error(False)
        return
                
        
    if df.empty:
        errormsg("No results returned")
        return
    
    col_count = len(df.columns)

    if flag(["-pb","-bar"]):                                    # Plot 1 = bar chart
    
        if (col_count in (1,2,3)):
            
            if (col_count == 1):
 
                df.index = df.index + 1
                _ = df.plot(kind='bar');
                _ = plt.plot();
                
            elif (col_count == 2):
 
                xlabel = df.columns.values[0]
                ylabel = df.columns.values[1]
                df.plot(kind='bar',x=xlabel,y=ylabel);
                _ = plt.plot();
                
            else:
 
                values = df.columns.values[2]
                columns = df.columns.values[0]
                index = df.columns.values[1]
                pivoted = pandas.pivot_table(df, values=values, columns=columns, index=index) 
                _ = pivoted.plot.bar(); 
            
        else:
            errormsg("Can't determine what columns to plot")
            return
                    
    elif flag(["-pp","-pie"]):                                  # Plot 2 = pie chart
        
        if (col_count in (1,2)):  
                
            if (col_count == 1):
                df.index = df.index + 1
                yname = df.columns.values[0]
                _ = df.plot(kind='pie',y=yname);                
            else:          
                xlabel = df.columns.values[0]
                xname = df[xlabel].tolist()
                yname = df.columns.values[1]
                _ = df.plot(kind='pie',y=yname,labels=xname);
                
            plt.show();
            
        else:
            errormsg("Can't determine what columns to plot")
            return
                    
    elif flag(["-pl","-line"]):                                  # Plot 3 = line chart
            
        if (col_count in (1,2,3)): 
            
            if (col_count == 1):
                df.index = df.index + 1  
                _ = df.plot(kind='line');          
            elif (col_count == 2):            
                xlabel = df.columns.values[0]
                ylabel = df.columns.values[1]
                _ = df.plot(kind='line',x=xlabel,y=ylabel) ; 
            else:         
                values = df.columns.values[2]
                columns = df.columns.values[0]
                index = df.columns.values[1]
                pivoted = pandas.pivot_table(df, values=values, columns=columns, index=index)
                _ = pivoted.plot();
                
            plt.show();
                
        else:
            errormsg("Can't determine what columns to plot")
            return
    else:
        return

Find a Procedure

This routine will check to see if a procedure exists with the SCHEMA/NAME (or just NAME if no schema is supplied) and returns the number of answer sets returned. Possible values are 0, 1 (or greater) or None. If None is returned then we can't find the procedure anywhere.


In [ ]:
def findProc(procname):
    
    global _hdbc, _hdbi, _connected, _runtime
    
    # Split the procedure name into schema.procname if appropriate
    upper_procname = procname.upper()
    schema, proc = split_string(upper_procname,".") # Expect schema.procname
    if (proc == None):
        proc = schema

    # Call ibm_db.procedures to see if the procedure does exist
    schema = "%"

    try:
        stmt = ibm_db.procedures(_hdbc, None, schema, proc) 
        if (stmt == False):                         # Error executing the code
            errormsg("Procedure " + procname + " not found in the system catalog.")
            return None

        result = ibm_db.fetch_tuple(stmt)
        resultsets = result[5]
        if (resultsets >= 1): resultsets = 1
        return resultsets
            
    except Exception as err:
        errormsg("Procedure " + procname + " not found in the system catalog.")
        return None

Parse Call Arguments

This code will parse a SQL call #name(parm1,...) and return the name and the parameters in the call.


In [ ]:
def parseCallArgs(macro):
    
    quoteChar = ""
    inQuote = False
    inParm = False
    ignore = False
    name = ""
    parms = []
    parm = ''
    
    sqlin = macro.replace("\n","")
    sqlin.lstrip()
    
    for ch in sqlin:
        if (inParm == False):
            # We hit a blank in the name, so ignore everything after the procedure name until a ( is found
            if (ch == " "): 
                ignore == True
            elif (ch ==  "("): # Now we have parameters to send to the stored procedure
                inParm = True
            else:
                if (ignore == False): name = name + ch # The name of the procedure (and no blanks)
        else:
            if (inQuote == True):
                if (ch == quoteChar):
                    inQuote = False  
                else:
                    parm = parm + ch
            elif (ch in ("\"","\'","[")): # Do we have a quote
                if (ch == "["):
                    quoteChar = "]"
                else:
                    quoteChar = ch
                inQuote = True
            elif (ch == ")"):
                if (parm != ""):
                    parms.append(parm)
                parm = ""
                break
            elif (ch == ","):
                if (parm != ""):
                    parms.append(parm)                  
                else:
                    parms.append("null")
                    
                parm = ""

            else:
                parm = parm + ch
                
    if (inParm == True):
        if (parm != ""):
            parms.append(parm_value)      

    return(name,parms)

Get Columns

Given a statement handle, determine what the column names are or the data types.


In [ ]:
def getColumns(stmt):
    
    columns = []
    types = []
    colcount = 0
    try:
        colname = ibm_db.field_name(stmt,colcount)
        coltype = ibm_db.field_type(stmt,colcount)
        while (colname != False):
            columns.append(colname)
            types.append(coltype)
            colcount += 1
            colname = ibm_db.field_name(stmt,colcount)
            coltype = ibm_db.field_type(stmt,colcount)            
        return columns,types   
                
    except Exception as err:
        db2_error(False)
        return None

Call a Procedure

The CALL statement is used for execution of a stored procedure. The format of the CALL statement is:

CALL PROC_NAME(x,y,z,...)

Procedures allow for the return of answer sets (cursors) as well as changing the contents of the parameters being passed to the procedure. In this implementation, the CALL function is limited to returning one answer set (or nothing). If you want to use more complex stored procedures then you will have to use the native python libraries.


In [ ]:
def parseCall(hdbc, inSQL, local_ns):
    
    global _hdbc, _hdbi, _connected, _runtime, _environment
    
    # Check to see if we are connected first
    if (_connected == False):                                      # Check if you are connected 
        db2_doConnect()
        if _connected == False: return None
    
    remainder = inSQL.strip()
    procName, procArgs = parseCallArgs(remainder[5:]) # Assume that CALL ... is the format
    
    resultsets = findProc(procName)
    if (resultsets == None): return None
    
    argvalues = []
 
    if (len(procArgs) > 0): # We have arguments to consider
        for arg in procArgs:
            varname = arg
            if (len(varname) > 0):
                if (varname[0] == ":"):
                    checkvar = varname[1:]
                    varvalue = getContents(checkvar,True,local_ns)
                    if (varvalue == None):
                        errormsg("Variable " + checkvar + " is not defined.")
                        return None
                    argvalues.append(varvalue)
                else:
                    if (varname.upper() == "NULL"):
                        argvalues.append(None)
                    else:
                        argvalues.append(varname)
            else:
                argvalues.append(None)

    
    try:

        if (len(procArgs) > 0):
            argtuple = tuple(argvalues)
            result = ibm_db.callproc(_hdbc,procName,argtuple)
            stmt = result[0]
        else:
            result = ibm_db.callproc(_hdbc,procName)
            stmt = result
        
        if (resultsets == 1 and stmt != None):

            columns, types = getColumns(stmt)
            if (columns == None): return None
            
            rows = []
            rowlist = ibm_db.fetch_tuple(stmt)
            while ( rowlist ) :
                row = []
                colcount = 0
                for col in rowlist:
                    try:
                        if (types[colcount] in ["int","bigint"]):
                            row.append(int(col))
                        elif (types[colcount] in ["decimal","real"]):
                            row.append(float(col))
                        elif (types[colcount] in ["date","time","timestamp"]):
                            row.append(str(col))
                        else:
                            row.append(col)
                    except:
                        row.append(col)
                    colcount += 1
                rows.append(row)
                rowlist = ibm_db.fetch_tuple(stmt)
            
            if flag(["-r","-array"]):
                rows.insert(0,columns)
                if len(procArgs) > 0:
                    allresults = []
                    allresults.append(rows)
                    for x in result[1:]:
                        allresults.append(x)
                    return allresults # rows,returned_results
                else:
                    return rows
            else:
                df = pandas.DataFrame.from_records(rows,columns=columns)
                if flag("-grid") or _settings['display'] == 'GRID':
                    if (_environment['qgrid'] == False):
                        with pandas.option_context('display.max_rows', None, 'display.max_columns', None):  
                            pdisplay(df)
                    else:
                        try:
                            pdisplay(qgrid.show_grid(df))
                        except:
                            errormsg("Grid cannot be used to display data with duplicate column names. Use option -a or %sql OPTION DISPLAY PANDAS instead.")
                            
                    return                             
                else:
                    if flag(["-a","-all"]) or _settings["maxrows"] == -1 : # All of the rows
                        with pandas.option_context('display.max_rows', None, 'display.max_columns', None): 
                            pdisplay(df)
                    else:
                        return df
            
        else:
            if len(procArgs) > 0:
                allresults = []
                for x in result[1:]:
                    allresults.append(x)
                return allresults # rows,returned_results
            else:
                return None
            
    except Exception as err:
        db2_error(False)
        return None

Parse Prepare/Execute

The PREPARE statement is used for repeated execution of a SQL statement. The PREPARE statement has the format:

stmt = PREPARE SELECT EMPNO FROM EMPLOYEE WHERE WORKDEPT=? AND SALARY<?

The SQL statement that you want executed is placed after the PREPARE statement with the location of variables marked with ? (parameter) markers. The variable stmt contains the prepared statement that need to be passed to the EXECUTE statement. The EXECUTE statement has the format:

EXECUTE :x USING z, y, s

The first variable (:x) is the name of the variable that you assigned the results of the prepare statement. The values after the USING clause are substituted into the prepare statement where the ? markers are found.

If the values in USING clause are variable names (z, y, s), a link is created to these variables as part of the execute statement. If you use the variable subsitution form of variable name (:z, :y, :s), the contents of the variable are placed into the USING clause. Normally this would not make much of a difference except when you are dealing with binary strings or JSON strings where the quote characters may cause some problems when subsituted into the statement.


In [ ]:
def parsePExec(hdbc, inSQL):
     
    import ibm_db    
    global _stmt, _stmtID, _stmtSQL, sqlcode
    
    cParms = inSQL.split()
    parmCount = len(cParms)
    if (parmCount == 0): return(None)                          # Nothing to do but this shouldn't happen
    
    keyword = cParms[0].upper()                                  # Upper case the keyword
    
    if (keyword == "PREPARE"):                                   # Prepare the following SQL
        uSQL = inSQL.upper()
        found = uSQL.find("PREPARE")
        sql = inSQL[found+7:].strip()

        try:
            pattern = "\?\*[0-9]+"
            findparm = re.search(pattern,sql)
            while findparm != None:
                found = findparm.group(0)
                count = int(found[2:])
                markers = ('?,' * count)[:-1]
                sql = sql.replace(found,markers)
                findparm = re.search(pattern,sql)
            
            stmt = ibm_db.prepare(hdbc,sql) # Check error code here
            if (stmt == False): 
                db2_error(False)
                return(False)
            
            stmttext = str(stmt).strip()
            stmtID = stmttext[33:48].strip()
            
            if (stmtID in _stmtID) == False:
                _stmt.append(stmt)              # Prepare and return STMT to caller
                _stmtID.append(stmtID)
            else:
                stmtIX = _stmtID.index(stmtID)
                _stmt[stmtiX] = stmt
                 
            return(stmtID)
        
        except Exception as err:
            print(err)
            db2_error(False)
            return(False)

    if (keyword == "EXECUTE"):                                  # Execute the prepare statement
        if (parmCount < 2): return(False)                    # No stmtID available
        
        stmtID = cParms[1].strip()
        if (stmtID in _stmtID) == False:
            errormsg("Prepared statement not found or invalid.")
            return(False)

        stmtIX = _stmtID.index(stmtID)
        stmt = _stmt[stmtIX]

        try:        

            if (parmCount == 2):                           # Only the statement handle available
                result = ibm_db.execute(stmt)               # Run it
            elif (parmCount == 3):                          # Not quite enough arguments
                errormsg("Missing or invalid USING clause on EXECUTE statement.")
                sqlcode = -99999
                return(False)
            else:
                using = cParms[2].upper()
                if (using != "USING"):                     # Bad syntax again
                    errormsg("Missing USING clause on EXECUTE statement.")
                    sqlcode = -99999
                    return(False)
                
                uSQL = inSQL.upper()
                found = uSQL.find("USING")
                parmString = inSQL[found+5:].strip()
                parmset = splitargs(parmString)
 
                if (len(parmset) == 0):
                    errormsg("Missing parameters after the USING clause.")
                    sqlcode = -99999
                    return(False)
                    
                parms = []

                parm_count = 0
                
                CONSTANT = 0
                VARIABLE = 1
                const = [0]
                const_cnt = 0
                
                for v in parmset:
                    
                    parm_count = parm_count + 1
                    
                    if (v[1] == True or v[2] == True): # v[1] true if string, v[2] true if num
                        
                        parm_type = CONSTANT                        
                        const_cnt = const_cnt + 1
                        if (v[2] == True):
                            if (isinstance(v[0],int) == True):         # Integer value 
                                sql_type = ibm_db.SQL_INTEGER
                            elif (isinstance(v[0],float) == True):       # Float value
                                sql_type = ibm_db.SQL_DOUBLE
                            else:
                                sql_type = ibm_db.SQL_INTEGER
                        else:
                            sql_type = ibm_db.SQL_CHAR
                        
                        const.append(v[0])

                        
                    else:
                    
                        parm_type = VARIABLE
                    
                        # See if the variable has a type associated with it varname@type
                    
                        varset = v[0].split("@")
                        parm_name = varset[0]
                        
                        parm_datatype = "char"

                        # Does the variable exist?
                        if (parm_name not in globals()):
                            errormsg("SQL Execute parameter " + parm_name + " not found")
                            sqlcode = -99999
                            return(false)                        
        
                        if (len(varset) > 1):                # Type provided
                            parm_datatype = varset[1]

                        if (parm_datatype == "dec" or parm_datatype == "decimal"):
                            sql_type = ibm_db.SQL_DOUBLE
                        elif (parm_datatype == "bin" or parm_datatype == "binary"):
                            sql_type = ibm_db.SQL_BINARY
                        elif (parm_datatype == "int" or parm_datatype == "integer"):
                            sql_type = ibm_db.SQL_INTEGER
                        else:
                            sql_type = ibm_db.SQL_CHAR
                    
                    try:
                        if (parm_type == VARIABLE):
                            result = ibm_db.bind_param(stmt, parm_count, globals()[parm_name], ibm_db.SQL_PARAM_INPUT, sql_type)
                        else:
                            result = ibm_db.bind_param(stmt, parm_count, const[const_cnt], ibm_db.SQL_PARAM_INPUT, sql_type)
                            
                    except:
                        result = False
                        
                    if (result == False):
                        errormsg("SQL Bind on variable " + parm_name + " failed.")
                        sqlcode = -99999
                        return(false) 
                    
                result = ibm_db.execute(stmt) # ,tuple(parms))
                
            if (result == False): 
                errormsg("SQL Execute failed.")      
                return(False)
            
            if (ibm_db.num_fields(stmt) == 0): return(True) # Command successfully completed
                          
            return(fetchResults(stmt))
                        
        except Exception as err:
            db2_error(False)
            return(False)
        
        return(False)
  
    return(False)

Fetch Result Set

This code will take the stmt handle and then produce a result set of rows as either an array (-r,-array) or as an array of json records (-json).


In [ ]:
def fetchResults(stmt):
     
    global sqlcode
    
    rows = []
    columns, types = getColumns(stmt)
    
    # By default we assume that the data will be an array
    is_array = True
    
    # Check what type of data we want returned - array or json
    if (flag(["-r","-array"]) == False):
        # See if we want it in JSON format, if not it remains as an array
        if (flag("-json") == True):
            is_array = False
    
    # Set column names to lowercase for JSON records
    if (is_array == False):
        columns = [col.lower() for col in columns] # Convert to lowercase for each of access
    
    # First row of an array has the column names in it
    if (is_array == True):
        rows.append(columns)
        
    result = ibm_db.fetch_tuple(stmt)
    rowcount = 0
    while (result):
        
        rowcount += 1
        
        if (is_array == True):
            row = []
        else:
            row = {}
            
        colcount = 0
        for col in result:
            try:
                if (types[colcount] in ["int","bigint"]):
                    if (is_array == True):
                        row.append(int(col))
                    else:
                        row[columns[colcount]] = int(col)
                elif (types[colcount] in ["decimal","real"]):
                    if (is_array == True):
                        row.append(float(col))
                    else:
                        row[columns[colcount]] = float(col)
                elif (types[colcount] in ["date","time","timestamp"]):
                    if (is_array == True):
                        row.append(str(col))
                    else:
                        row[columns[colcount]] = str(col)
                else:
                    if (is_array == True):
                        row.append(col)
                    else:
                        row[columns[colcount]] = col
                        
            except:
                if (is_array == True):
                    row.append(col)
                else:
                    row[columns[colcount]] = col
                    
            colcount += 1
        
        rows.append(row)
        result = ibm_db.fetch_tuple(stmt)
        
    if (rowcount == 0): 
        sqlcode = 100        
    else:
        sqlcode = 0
        
    return rows

Parse Commit

There are three possible COMMIT verbs that can bs used:

  • COMMIT [WORK] - Commit the work in progress - The WORK keyword is not checked for
  • ROLLBACK - Roll back the unit of work
  • AUTOCOMMIT ON/OFF - Are statements committed on or off?

The statement is passed to this routine and then checked.


In [ ]:
def parseCommit(sql):
    
    global _hdbc, _hdbi, _connected, _runtime, _stmt, _stmtID, _stmtSQL

    if (_connected == False): return                        # Nothing to do if we are not connected
    
    cParms = sql.split()
    if (len(cParms) == 0): return                           # Nothing to do but this shouldn't happen
    
    keyword = cParms[0].upper()                             # Upper case the keyword
    
    if (keyword == "COMMIT"):                               # Commit the work that was done
        try:
            result = ibm_db.commit (_hdbc)                  # Commit the connection
            if (len(cParms) > 1):
                keyword = cParms[1].upper()
                if (keyword == "HOLD"):
                    return
            
            del _stmt[:]
            del _stmtID[:]

        except Exception as err:
            db2_error(False)
        
        return
        
    if (keyword == "ROLLBACK"):                             # Rollback the work that was done
        try:
            result = ibm_db.rollback(_hdbc)                  # Rollback the connection
            del _stmt[:]
            del _stmtID[:]            

        except Exception as err:
            db2_error(False)
        
        return
    
    if (keyword == "AUTOCOMMIT"):                           # Is autocommit on or off
        if (len(cParms) > 1): 
            op = cParms[1].upper()                          # Need ON or OFF value
        else:
            return
        
        try:
            if (op == "OFF"):
                ibm_db.autocommit(_hdbc, False)
            elif (op == "ON"):
                ibm_db.autocommit (_hdbc, True)
            return    
        
        except Exception as err:
            db2_error(False)
            return 
        
    return

Set Flags

This code will take the input SQL block and update the global flag list. The global flag list is just a list of options that are set at the beginning of a code block. The absence of a flag means it is false. If it exists it is true.


In [ ]:
def setFlags(inSQL):
    
    global _flags
    
    _flags = [] # Delete all of the current flag settings
    
    pos = 0
    end = len(inSQL)-1
    inFlag = False
    ignore = False
    outSQL = ""
    flag = ""
    
    while (pos <= end):
        ch = inSQL[pos]
        if (ignore == True):   
            outSQL = outSQL + ch
        else:
            if (inFlag == True):
                if (ch != " "):
                    flag = flag + ch
                else:
                    _flags.append(flag)
                    inFlag = False
            else:
                if (ch == "-"):
                    flag = "-"
                    inFlag = True
                elif (ch == ' '):
                    outSQL = outSQL + ch
                else:
                    outSQL = outSQL + ch
                    ignore = True
        pos += 1
        
    if (inFlag == True):
        _flags.append(flag)
        
    return outSQL

Check to see if flag Exists

This function determines whether or not a flag exists in the global flag array. Absence of a value means it is false. The parameter can be a single value, or an array of values.


In [ ]:
def flag(inflag):
    
    global _flags

    if isinstance(inflag,list):
        for x in inflag:
            if (x in _flags):
                return True
        return False
    else:
        if (inflag in _flags):
            return True
        else:
            return False

Generate a list of SQL lines based on a delimiter

Note that this function will make sure that quotes are properly maintained so that delimiters inside of quoted strings do not cause errors.


In [ ]:
def splitSQL(inputString, delimiter):
     
    pos = 0
    arg = ""
    results = []
    quoteCH = ""
    
    inSQL = inputString.strip()
    if (len(inSQL) == 0): return(results)       # Not much to do here - no args found
            
    while pos < len(inSQL):
        ch = inSQL[pos]
        pos += 1
        if (ch in ('"',"'")):                   # Is this a quote characters?
            arg = arg + ch                      # Keep appending the characters to the current arg
            if (ch == quoteCH):                 # Is this quote character we are in
                quoteCH = ""
            elif (quoteCH == ""):               # Create the quote
                quoteCH = ch
            else:
                None
        elif (quoteCH != ""):                   # Still in a quote
            arg = arg + ch
        elif (ch == delimiter):                 # Is there a delimiter?
            results.append(arg)
            arg = ""
        else:
            arg = arg + ch
            
    if (arg != ""):
        results.append(arg)
        
    return(results)

Main %sql Magic Definition

The main %sql Magic logic is found in this section of code. This code will register the Magic command and allow Jupyter notebooks to interact with Db2 by using this extension.


In [ ]:
@magics_class
class DB2(Magics):
   
    @needs_local_scope    
    @line_cell_magic
    def sql(self, line, cell=None, local_ns=None):
            
        # Before we event get started, check to see if you have connected yet. Without a connection we 
        # can't do anything. You may have a connection request in the code, so if that is true, we run those,
        # otherwise we connect immediately
        
        # If your statement is not a connect, and you haven't connected, we need to do it for you
    
        global _settings, _environment
        global _hdbc, _hdbi, _connected, _runtime, sqlstate, sqlerror, sqlcode, sqlelapsed
             
        # If you use %sql (line) we just run the SQL. If you use %%SQL the entire cell is run.
        
        flag_cell = False
        flag_output = False
        sqlstate = "0"
        sqlerror = ""
        sqlcode = 0
        sqlelapsed = 0
        
        start_time = time.time()
        end_time = time.time()
              
        # Macros gets expanded before anything is done
                
        SQL1 = setFlags(line.strip())  
        SQL1 = checkMacro(SQL1)                                   # Update the SQL if any macros are in there
        SQL2 = cell    
        
        if flag("-sampledata"):                                   # Check if you only want sample data loaded
            if (_connected == False):
                if (db2_doConnect() == False):
                    errormsg('A CONNECT statement must be issued before issuing SQL statements.')
                    return                              
                
            db2_create_sample(flag(["-q","-quiet"]))
            return  
        
        if SQL1 == "?" or flag(["-h","-help"]):                   # Are you asking for help
            sqlhelp()
            return
        
        if len(SQL1) == 0 and SQL2 == None: return                # Nothing to do here
                
        # Check for help

        if SQL1.upper() == "? CONNECT":                           # Are you asking for help on CONNECT
            connected_help()
            return        
        
        sqlType,remainder = sqlParser(SQL1,local_ns)              # What type of command do you have?
                
        if (sqlType == "CONNECT"):                                # A connect request 
            parseConnect(SQL1,local_ns)
            return
        elif (sqlType == "DEFINE"):                               # Create a macro from the body
            result = setMacro(SQL2,remainder)
            return
        elif (sqlType == "OPTION"):
            setOptions(SQL1)
            return 
        elif (sqlType == 'COMMIT' or sqlType == 'ROLLBACK' or sqlType == 'AUTOCOMMIT'):
            parseCommit(remainder)
            return
        elif (sqlType == "PREPARE"):
            pstmt = parsePExec(_hdbc, remainder)
            return(pstmt)
        elif (sqlType == "EXECUTE"):
            result = parsePExec(_hdbc, remainder)
            return(result)    
        elif (sqlType == "CALL"):
            result = parseCall(_hdbc, remainder, local_ns)
            return(result)
        else:
            pass        
 
        sql = SQL1
    
        if (sql == ""): sql = SQL2
        
        if (sql == ""): return                                   # Nothing to do here
    
        if (_connected == False):
            if (db2_doConnect() == False):
                errormsg('A CONNECT statement must be issued before issuing SQL statements.')
                return      
        
        if _settings["maxrows"] == -1:                                 # Set the return result size
            pandas.reset_option('display.max_rows')
        else:
            pandas.options.display.max_rows = _settings["maxrows"]
      
        runSQL = re.sub('.*?--.*$',"",sql,flags=re.M)
        remainder = runSQL.replace("\n"," ") 
        if flag(["-d","-delim"]):
            sqlLines = splitSQL(remainder,"@")
        else:
            sqlLines = splitSQL(remainder,";")
        flag_cell = True
                      
        # For each line figure out if you run it as a command (db2) or select (sql)

        for sqlin in sqlLines:          # Run each command
            
            sqlin = checkMacro(sqlin)                                 # Update based on any macros

            sqlType, sql = sqlParser(sqlin,local_ns)                           # Parse the SQL  
            if (sql.strip() == ""): continue
            if flag(["-e","-echo"]): debug(sql,False)
                
            if flag("-t"):
                cnt = sqlTimer(_hdbc, _settings["runtime"], sql)            # Given the sql and parameters, clock the time
                if (cnt >= 0): print("Total iterations in %s second(s): %s" % (_settings["runtime"],cnt))                
                return(cnt)
            
            elif flag(["-pb","-bar","-pp","-pie","-pl","-line"]):                       # We are plotting some results 
                
                plotData(_hdbi, sql)                            # Plot the data and return
                return
 
            else:
        
                try:                                                  # See if we have an answer set
                    stmt = ibm_db.prepare(_hdbc,sql)
                    if (ibm_db.num_fields(stmt) == 0):                # No, so we just execute the code
                        result = ibm_db.execute(stmt)                 # Run it                            
                        if (result == False):                         # Error executing the code
                            db2_error(flag(["-q","-quiet"])) 
                            continue
                            
                        rowcount = ibm_db.num_rows(stmt)    
                    
                        if (rowcount == 0 and flag(["-q","-quiet"]) == False):
                            errormsg("No rows found.")     
                            
                        continue                                      # Continue running
                    
                    elif flag(["-r","-array","-j","-json"]):                     # raw, json, format json
                        row_count = 0
                        resultSet = []
                        try:
                            result = ibm_db.execute(stmt)             # Run it
                            if (result == False):                         # Error executing the code
                                db2_error(flag(["-q","-quiet"]))  
                                return
                                
                            if flag("-j"):                          # JSON single output
                                row_count = 0
                                json_results = []
                                while( ibm_db.fetch_row(stmt) ):
                                    row_count = row_count + 1
                                    jsonVal = ibm_db.result(stmt,0)
                                    jsonDict = json.loads(jsonVal)
                                    json_results.append(jsonDict)
                                    flag_output = True                                    
                                
                                if (row_count == 0): sqlcode = 100
                                return(json_results)
                            
                            else:
                                return(fetchResults(stmt))
                                  
                        except Exception as err:
                            db2_error(flag(["-q","-quiet"]))
                            return
                            
                    else:
                        
                        try:
                            df = pandas.read_sql(sql,_hdbi)
          
                        except Exception as err:
                            db2_error(False)
                            return
                    
                        if (len(df) == 0):
                            sqlcode = 100
                            if (flag(["-q","-quiet"]) == False): 
                                errormsg("No rows found")
                            continue                    
                    
                        flag_output = True
                        if flag("-grid") or _settings['display'] == 'GRID':   # Check to see if we can display the results
                            if (_environment['qgrid'] == False):
                                with pandas.option_context('display.max_rows', None, 'display.max_columns', None):  
                                    print(df.to_string())
                            else:
                                try:
                                    pdisplay(qgrid.show_grid(df))
                                except:
                                    errormsg("Grid cannot be used to display data with duplicate column names. Use option -a or %sql OPTION DISPLAY PANDAS instead.")
                                    return 
                        else:
                            if flag(["-a","-all"]) or _settings["maxrows"] == -1 : # All of the rows
                                pandas.options.display.max_rows = None
                                pandas.options.display.max_columns = None
                                return df # print(df.to_string())
                            else:
                                pandas.options.display.max_rows = _settings["maxrows"]
                                pandas.options.display.max_columns = None
                                return df # pdisplay(df) # print(df.to_string())
 
                except:
                    db2_error(flag(["-q","-quiet"]))
                    continue # return
                
        end_time = time.time()
        sqlelapsed = end_time - start_time
        if (flag_output == False and flag(["-q","-quiet"]) == False): print("Command completed.")
            
# Register the Magic extension in Jupyter    
ip = get_ipython()          
ip.register_magics(DB2)
load_settings()
   
success("Db2 Extensions Loaded.")

Pre-defined Macros

These macros are used to simulate the LIST TABLES and DESCRIBE commands that are available from within the Db2 command line.


In [ ]:
%%sql define LIST
#
# The LIST macro is used to list all of the tables in the current schema or for all schemas
#
var syntax Syntax: LIST TABLES [FOR ALL | FOR SCHEMA name]
# 
# Only LIST TABLES is supported by this macro
#
if {^1} <> 'TABLES'
    exit {syntax}
endif

#
# This SQL is a temporary table that contains the description of the different table types
#
WITH TYPES(TYPE,DESCRIPTION) AS (
  VALUES
    ('A','Alias'),
    ('G','Created temporary table'),
    ('H','Hierarchy table'),
    ('L','Detached table'),
    ('N','Nickname'),
    ('S','Materialized query table'),
    ('T','Table'),
    ('U','Typed table'),
    ('V','View'),
    ('W','Typed view')
)
SELECT TABNAME, TABSCHEMA, T.DESCRIPTION FROM SYSCAT.TABLES S, TYPES T
       WHERE T.TYPE = S.TYPE 

#
# Case 1: No arguments - LIST TABLES
#
if {argc} == 1
   AND OWNER = CURRENT USER
   ORDER BY TABNAME, TABSCHEMA
   return
endif 

#
# Case 2: Need 3 arguments - LIST TABLES FOR ALL
#
if {argc} == 3
    if {^2}&{^3} == 'FOR&ALL'
        ORDER BY TABNAME, TABSCHEMA
        return
    endif
    exit {syntax}
endif

#
# Case 3: Need FOR SCHEMA something here
#
if {argc} == 4
    if {^2}&{^3} == 'FOR&SCHEMA'
        AND TABSCHEMA = '{^4}'
        ORDER BY TABNAME, TABSCHEMA
        return
    else
        exit {syntax}
    endif
endif

#
# Nothing matched - Error
#
exit {syntax}

In [ ]:
%%sql define describe
#
# The DESCRIBE command can either use the syntax DESCRIBE TABLE <name> or DESCRIBE TABLE SELECT ...
#
var syntax Syntax: DESCRIBE [TABLE name | SELECT statement] 
#
# Check to see what count of variables is... Must be at least 2 items DESCRIBE TABLE x or SELECT x
#
if {argc} < 2
   exit {syntax}
endif

CALL ADMIN_CMD('{*0}');

Set the table formatting to left align a table in a cell. By default, tables are centered in a cell. Remove this cell if you don't want to change Jupyter notebook formatting for tables. In addition, we skip this code if you are running in a shell environment rather than a Jupyter notebook


In [ ]:
#%%html
#<style>
#  table {margin-left: 0 !important; text-align: left;}
#</style>

Credits: IBM 202, George Baklarz [baklarz@ca.ibm.com]