This code is imported as a Jupyter notebook extension in any notebooks you create with DB2 code in it. Place the following line of code in any notebook that you want to use these commands with:
%run db2.ipynb
This code defines a Jupyter/Python magic command called %sql
which allows you to execute DB2 specific calls to
the database. There are other packages available for manipulating databases, but this one has been specifically
designed for demonstrating a number of the SQL features available in DB2.
There are two ways of executing the %sql
command. A single line SQL statement would use the
line format of the magic command:
%sql SELECT * FROM EMPLOYEEIf you have a large block of sql then you would place the %%sql command at the beginning of the block and then place the SQL statements into the remainder of the block. Using this form of the `%%sql` statement means that the notebook cell can only contain SQL and no other statements.
%%sql SELECT * FROM EMPLOYEE ORDER BY LASTNAMEYou can have multiple lines in the SQL block (`%%sql`). The default SQL delimiter is the semi-column (`;`). If you have scripts (triggers, procedures, functions) that use the semi-colon as part of the script, you will need to use the `-d` option to change the delimiter to an at "`@`" sign.
%%sql -d SELECT * FROM EMPLOYEE @ CREATE PROCEDURE ... @
The %sql
command allows most DB2 commands to execute and has a special version of the CONNECT statement.
A CONNECT by itself will attempt to reconnect to the database using previously used settings. If it cannot
connect, it will prompt the user for additional information.
The CONNECT command has the following format:
%sql CONNECT TO <database> USER <userid> USING <password | ?> HOST <ip address> PORT <port number>If you use a "
?
" for the password field, the system will prompt you for a password. This avoids typing the
password as clear text on the screen. If a connection is not successful, the system will print the error
message associated with the connect request.
If the connection is successful, the parameters are saved on your system and will be used the next time you run a SQL statement, or when you issue the %sql CONNECT command with no parameters.
In addition to the -d option, there are a number different options that you can specify at the beginning of the SQL:
-d, -delim
- Change SQL delimiter to "@
" from ";
"-q, -quiet
- Quiet results - no messages returned from the function-r, -array
- Return the result set as an array of values instead of a dataframe-t, -time
- Time the following SQL statement and return the number of times it executes in 1 second-j
- Format the first character column of the result set as a JSON record-json
- Return result set as an array of json records-a, -all
- Return all rows in answer set and do not limit display-grid
- Display the results in a scrollable grid-pb, -bar
- Plot the results as a bar chart-pl, -line
- Plot the results as a line chart-pp, -pie
- Plot the results as a pie chart-e, -echo
- Any macro expansions are displayed in an output box -sampledata
- Create and load the EMPLOYEE and DEPARTMENT tables
You can pass python variables to the %sql
command by using the {}
braces with the name of the
variable inbetween. Note that you will need to place proper punctuation around the variable in the event the
SQL command requires it. For instance, the following example will find employee '000010' in the EMPLOYEE table.
empno = '000010' %sql SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO='{empno}'
The other option is to use parameter markers. What you would need to do is use the name of the variable with a colon in front of it and the program will prepare the statement and then pass the variable to Db2 when the statement is executed. This allows you to create complex strings that might contain quote characters and other special characters and not have to worry about enclosing the string with the correct quotes. Note that you do not place the quotes around the variable even though it is a string.
empno = '000020' %sql SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO=:empno
The previous set of %sql
and %%sql
commands deals with SQL statements and commands that are run in an interactive manner. There is a class of SQL commands that are more suited to a development environment where code is iterated or requires changing input. The commands that are associated with this form of SQL are:
Autocommit is the default manner in which SQL statements are executed. At the end of the successful completion of a statement, the results are commited to the database. There is no concept of a transaction where multiple DML/DDL statements are considered one transaction. The AUTOCOMMIT
command allows you to turn autocommit OFF
or ON
. This means that the set of SQL commands run after the AUTOCOMMIT OFF
command are executed are not commited to the database until a COMMIT
or ROLLBACK
command is issued.
COMMIT
(WORK
) will finalize all of the transactions (COMMIT
) to the database and ROLLBACK
will undo all of the changes. If you issue a SELECT
statement during the execution of your block, the results will reflect all of your changes. If you ROLLBACK
the transaction, the changes will be lost.
PREPARE
is typically used in a situation where you want to repeatidly execute a SQL statement with different variables without incurring the SQL compilation overhead. For instance:
x = %sql PREPARE SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO=?
for y in ['000010','000020','000030']:
%sql execute :x using :y
EXECUTE
is used to execute a previously compiled statement.
To retrieve the error codes that might be associated with any SQL call, the following variables are updated after every call:
In [ ]:
#
# Set up Jupyter MAGIC commands "sql".
# %sql will return results from a DB2 select statement or execute a DB2 command
#
# IBM 2019: George Baklarz
# Version 2019-10-03
#
from __future__ import print_function
from IPython.display import HTML as pHTML, Image as pImage, display as pdisplay, Javascript as Javascript
from IPython.core.magic import (Magics, magics_class, line_magic,
cell_magic, line_cell_magic, needs_local_scope)
import ibm_db
import pandas
import ibm_db_dbi
import json
import matplotlib
import matplotlib.pyplot as plt
import getpass
import os
import pickle
import time
import sys
import re
import warnings
warnings.filterwarnings("ignore")
# Python Hack for Input between 2 and 3
try:
input = raw_input
except NameError:
pass
_settings = {
"maxrows" : 10,
"maxgrid" : 5,
"runtime" : 1,
"display" : "PANDAS",
"database" : "",
"hostname" : "localhost",
"port" : "50000",
"protocol" : "TCPIP",
"uid" : "DB2INST1",
"pwd" : "password",
"ssl" : ""
}
_environment = {
"jupyter" : True,
"qgrid" : True
}
_display = {
'fullWidthRows': True,
'syncColumnCellResize': True,
'forceFitColumns': False,
'defaultColumnWidth': 150,
'rowHeight': 28,
'enableColumnReorder': False,
'enableTextSelectionOnCells': True,
'editable': False,
'autoEdit': False,
'explicitInitialization': True,
'maxVisibleRows': 5,
'minVisibleRows': 5,
'sortable': True,
'filterable': False,
'highlightSelectedCell': False,
'highlightSelectedRow': True
}
# Connection settings for statements
_connected = False
_hdbc = None
_hdbi = None
_stmt = []
_stmtID = []
_stmtSQL = []
_vars = {}
_macros = {}
_flags = []
_debug = False
# Db2 Error Messages and Codes
sqlcode = 0
sqlstate = "0"
sqlerror = ""
sqlelapsed = 0
# Check to see if QGrid is installed
try:
import qgrid
qgrid.set_defaults(grid_options=_display)
except:
_environment['qgrid'] = False
# Check if we are running in iPython or Jupyter
try:
if (get_ipython().config == {}):
_environment['jupyter'] = False
_environment['qgrid'] = False
else:
_environment['jupyter'] = True
except:
_environment['jupyter'] = False
_environment['qgrid'] = False
There are four options that can be set with the %sql
command. These options are shown below with the default value shown in parenthesis.
MAXROWS n (10)
- The maximum number of rows that will be displayed before summary information is shown. If the answer set is less than this number of rows, it will be completely shown on the screen. If the answer set is larger than this amount, only the first 5 rows and last 5 rows of the answer set will be displayed. If you want to display a very large answer set, you may want to consider using the grid option -g
to display the results in a scrollable table. If you really want to show all results then setting MAXROWS to -1 will return all output.
MAXGRID n (5)
- The maximum size of a grid display. When displaying a result set in a grid -g
, the default size of the display window is 5 rows. You can set this to a larger size so that more rows are shown on the screen. Note that the minimum size always remains at 5 which means that if the system is unable to display your maximum row size it will reduce the table display until it fits.
DISPLAY PANDAS | GRID (PANDAS)
- Display the results as a PANDAS dataframe (default) or as a scrollable GRID
RUNTIME n (1)
- When using the timer option on a SQL statement, the statement will execute for n
number of seconds. The result that is returned is the number of times the SQL statement executed rather than the execution time of the statement. The default value for runtime is one second, so if the SQL is very complex you will need to increase the run time.
LIST
- Display the current settings
To set an option use the following syntax:
%sql option option_name value option_name value ....
The following example sets all options:
%sql option maxrows 100 runtime 2 display grid maxgrid 10
The values will not be saved between Jupyter notebooks sessions. If you need to retrieve the current options values, use the LIST command as the only argument:
%sql option list
In [ ]:
def setOptions(inSQL):
global _settings, _display
cParms = inSQL.split()
cnt = 0
while cnt < len(cParms):
if cParms[cnt].upper() == 'MAXROWS':
if cnt+1 < len(cParms):
try:
_settings["maxrows"] = int(cParms[cnt+1])
except Exception as err:
errormsg("Invalid MAXROWS value provided.")
pass
cnt = cnt + 1
else:
errormsg("No maximum rows specified for the MAXROWS option.")
return
elif cParms[cnt].upper() == 'MAXGRID':
if cnt+1 < len(cParms):
try:
maxgrid = int(cParms[cnt+1])
if (maxgrid <= 5): # Minimum window size is 5
maxgrid = 5
_display["maxVisibleRows"] = int(cParms[cnt+1])
try:
import qgrid
qgrid.set_defaults(grid_options=_display)
except:
_environment['qgrid'] = False
except Exception as err:
errormsg("Invalid MAXGRID value provided.")
pass
cnt = cnt + 1
else:
errormsg("No maximum rows specified for the MAXROWS option.")
return
elif cParms[cnt].upper() == 'RUNTIME':
if cnt+1 < len(cParms):
try:
_settings["runtime"] = int(cParms[cnt+1])
except Exception as err:
errormsg("Invalid RUNTIME value provided.")
pass
cnt = cnt + 1
else:
errormsg("No value provided for the RUNTIME option.")
return
elif cParms[cnt].upper() == 'DISPLAY':
if cnt+1 < len(cParms):
if (cParms[cnt+1].upper() == 'GRID'):
_settings["display"] = 'GRID'
elif (cParms[cnt+1].upper() == 'PANDAS'):
_settings["display"] = 'PANDAS'
else:
errormsg("Invalid DISPLAY value provided.")
cnt = cnt + 1
else:
errormsg("No value provided for the DISPLAY option.")
return
elif (cParms[cnt].upper() == 'LIST'):
print("(MAXROWS) Maximum number of rows displayed: " + str(_settings["maxrows"]))
print("(MAXGRID) Maximum grid display size: " + str(_settings["maxgrid"]))
print("(RUNTIME) How many seconds to a run a statement for performance testing: " + str(_settings["runtime"]))
print("(DISPLAY) Use PANDAS or GRID display format for output: " + _settings["display"])
return
else:
cnt = cnt + 1
save_settings()
In [ ]:
def sqlhelp():
global _environment
if (_environment["jupyter"] == True):
sd = '<td style="text-align:left;">'
ed1 = '</td>'
ed2 = '</td>'
sh = '<th style="text-align:left;">'
eh1 = '</th>'
eh2 = '</th>'
sr = '<tr>'
er = '</tr>'
helpSQL = """
<h3>SQL Options</h3>
<p>The following options are available as part of a SQL statement. The options are always preceded with a
minus sign (i.e. -q).
<table>
{sr}
{sh}Option{eh1}{sh}Description{eh2}
{er}
{sr}
{sd}a, all{ed1}{sd}Return all rows in answer set and do not limit display{ed2}
{er}
{sr}
{sd}d{ed1}{sd}Change SQL delimiter to "@" from ";"{ed2}
{er}
{sr}
{sd}e, echo{ed1}{sd}Echo the SQL command that was generated after macro and variable substituion.{ed2}
{er}
{sr}
{sd}h, help{ed1}{sd}Display %sql help information.{ed2}
{er}
{sr}
{sd}j{ed1}{sd}Create a pretty JSON representation. Only the first column is formatted{ed2}
{er}
{sr}
{sd}json{ed1}{sd}Retrieve the result set as a JSON record{ed2}
{er}
{sr}
{sd}pb, bar{ed1}{sd}Plot the results as a bar chart{ed2}
{er}
{sr}
{sd}pl, line{ed1}{sd}Plot the results as a line chart{ed2}
{er}
{sr}
{sd}pp, pie{ed1}{sd}Plot Pie: Plot the results as a pie chart{ed2}
{er}
{sr}
{sd}q, quiet{ed1}{sd}Quiet results - no answer set or messages returned from the function{ed2}
{er}
{sr}
{sd}r, array{ed1}{sd}Return the result set as an array of values{ed2}
{er}
{sr}
{sd}sampledata{ed1}{sd}Create and load the EMPLOYEE and DEPARTMENT tables{ed2}
{er}
{sr}
{sd}t,time{ed1}{sd}Time the following SQL statement and return the number of times it executes in 1 second{ed2}
{er}
{sr}
{sd}grid{ed1}{sd}Display the results in a scrollable grid{ed2}
{er}
</table>
"""
else:
helpSQL = """
SQL Options
The following options are available as part of a SQL statement. Options are always
preceded with a minus sign (i.e. -q).
Option Description
a, all Return all rows in answer set and do not limit display
d Change SQL delimiter to "@" from ";"
e, echo Echo the SQL command that was generated after substitution
h, help Display %sql help information
j Create a pretty JSON representation. Only the first column is formatted
json Retrieve the result set as a JSON record
pb, bar Plot the results as a bar chart
pl, line Plot the results as a line chart
pp, pie Plot Pie: Plot the results as a pie chart
q, quiet Quiet results - no answer set or messages returned from the function
r, array Return the result set as an array of values
sampledata Create and load the EMPLOYEE and DEPARTMENT tables
t,time Time the SQL statement and return the execution count per second
grid Display the results in a scrollable grid
"""
helpSQL = helpSQL.format(**locals())
if (_environment["jupyter"] == True):
pdisplay(pHTML(helpSQL))
else:
print(helpSQL)
In [ ]:
def connected_help():
sd = '<td style="text-align:left;">'
ed = '</td>'
sh = '<th style="text-align:left;">'
eh = '</th>'
sr = '<tr>'
er = '</tr>'
if (_environment['jupyter'] == True):
helpConnect = """
<h3>Connecting to Db2</h3>
<p>The CONNECT command has the following format:
<p>
<pre>
%sql CONNECT TO <database> USER <userid> USING <password|?> HOST <ip address> PORT <port number> <SSL>
%sql CONNECT CREDENTIALS <varname>
%sql CONNECT CLOSE
%sql CONNECT RESET
%sql CONNECT PROMPT - use this to be prompted for values
</pre>
<p>
If you use a "?" for the password field, the system will prompt you for a password. This avoids typing the
password as clear text on the screen. If a connection is not successful, the system will print the error
message associated with the connect request.
<p>
The <b>CREDENTIALS</b> option allows you to use credentials that are supplied by Db2 on Cloud instances.
The credentials can be supplied as a variable and if successful, the variable will be saved to disk
for future use. If you create another notebook and use the identical syntax, if the variable
is not defined, the contents on disk will be used as the credentials. You should assign the
credentials to a variable that represents the database (or schema) that you are communicating with.
Using familiar names makes it easier to remember the credentials when connecting.
<p>
<b>CONNECT CLOSE</b> will close the current connection, but will not reset the database parameters. This means that
if you issue the CONNECT command again, the system should be able to reconnect you to the database.
<p>
<b>CONNECT RESET</b> will close the current connection and remove any information on the connection. You will need
to issue a new CONNECT statement with all of the connection information.
<p>
If the connection is successful, the parameters are saved on your system and will be used the next time you
run an SQL statement, or when you issue the %sql CONNECT command with no parameters.
<p>If you issue CONNECT RESET, all of the current values will be deleted and you will need to
issue a new CONNECT statement.
<p>A CONNECT command without any parameters will attempt to re-connect to the previous database you
were using. If the connection could not be established, the program to prompt you for
the values. To cancel the connection attempt, enter a blank value for any of the values. The connection
panel will request the following values in order to connect to Db2:
<table>
{sr}
{sh}Setting{eh}
{sh}Description{eh}
{er}
{sr}
{sd}Database{ed}{sd}Database name you want to connect to.{ed}
{er}
{sr}
{sd}Hostname{ed}
{sd}Use localhost if Db2 is running on your own machine, but this can be an IP address or host name.
{er}
{sr}
{sd}PORT{ed}
{sd}The port to use for connecting to Db2. This is usually 50000.{ed}
{er}
{sr}
{sd}SSL{ed}
{sd}If you are connecting to a secure port (50001) with SSL then you must include this keyword in the connect string.{ed}
{sr}
{sd}Userid{ed}
{sd}The userid to use when connecting (usually DB2INST1){ed}
{er}
{sr}
{sd}Password{ed}
{sd}No password is provided so you have to enter a value{ed}
{er}
</table>
"""
else:
helpConnect = """\
Connecting to Db2
The CONNECT command has the following format:
%sql CONNECT TO database USER userid USING password | ?
HOST ip address PORT port number SSL
%sql CONNECT CREDENTIALS varname
%sql CONNECT CLOSE
%sql CONNECT RESET
If you use a "?" for the password field, the system will prompt you for a password.
This avoids typing the password as clear text on the screen. If a connection is
not successful, the system will print the error message associated with the connect
request.
The CREDENTIALS option allows you to use credentials that are supplied by Db2 on
Cloud instances. The credentials can be supplied as a variable and if successful,
the variable will be saved to disk for future use. If you create another notebook
and use the identical syntax, if the variable is not defined, the contents on disk
will be used as the credentials. You should assign the credentials to a variable
that represents the database (or schema) that you are communicating with. Using
familiar names makes it easier to remember the credentials when connecting.
CONNECT CLOSE will close the current connection, but will not reset the database
parameters. This means that if you issue the CONNECT command again, the system
should be able to reconnect you to the database.
CONNECT RESET will close the current connection and remove any information on the
connection. You will need to issue a new CONNECT statement with all of the connection
information.
If the connection is successful, the parameters are saved on your system and will be
used the next time you run an SQL statement, or when you issue the %sql CONNECT
command with no parameters. If you issue CONNECT RESET, all of the current values
will be deleted and you will need to issue a new CONNECT statement.
A CONNECT command without any parameters will attempt to re-connect to the previous
database you were using. If the connection could not be established, the program to
prompt you for the values. To cancel the connection attempt, enter a blank value for
any of the values. The connection panel will request the following values in order
to connect to Db2:
Setting Description
Database Database name you want to connect to
Hostname Use localhost if Db2 is running on your own machine, but this can
be an IP address or host name.
PORT The port to use for connecting to Db2. This is usually 50000.
Userid The userid to use when connecting (usually DB2INST1)
Password No password is provided so you have to enter a value
SSL Include this keyword to indicate you are connecting via SSL (usually port 50001)
"""
helpConnect = helpConnect.format(**locals())
if (_environment['jupyter'] == True):
pdisplay(pHTML(helpConnect))
else:
print(helpConnect)
If you are running an SQL statement and have not yet connected to a database, the %sql command will prompt you for connection information. In order to connect to a database, you must supply:
The routine is called without any parameters:
connected_prompt()
In [ ]:
# Prompt for Connection information
def connected_prompt():
global _settings
_database = ''
_hostname = ''
_port = ''
_uid = ''
_pwd = ''
_ssl = ''
print("Enter the database connection details (Any empty value will cancel the connection)")
_database = input("Enter the database name: ");
if (_database.strip() == ""): return False
_hostname = input("Enter the HOST IP address or symbolic name: ");
if (_hostname.strip() == ""): return False
_port = input("Enter the PORT number: ");
if (_port.strip() == ""): return False
_ssl = input("Is this a secure (SSL) port (y or n)");
if (_ssl.strip() == ""): return False
if (_ssl == "n"):
_ssl = ""
else:
_ssl = "Security=SSL;"
_uid = input("Enter Userid on the DB2 system: ").upper();
if (_uid.strip() == ""): return False
_pwd = getpass.getpass("Password [password]: ");
if (_pwd.strip() == ""): return False
_settings["database"] = _database.strip()
_settings["hostname"] = _hostname.strip()
_settings["port"] = _port.strip()
_settings["uid"] = _uid.strip()
_settings["pwd"] = _pwd.strip()
_settings["ssl"] = _ssl.strip()
_settings["maxrows"] = 10
_settings["maxgrid"] = 5
_settings["runtime"] = 1
return True
# Split port and IP addresses
def split_string(in_port,splitter=":"):
# Split input into an IP address and Port number
global _settings
checkports = in_port.split(splitter)
ip = checkports[0]
if (len(checkports) > 1):
port = checkports[1]
else:
port = None
return ip, port
The parseConnect routine is used to parse the CONNECT command that the user issued within the %sql command. The format of the command is:
parseConnect(inSQL)
The inSQL string contains the CONNECT keyword with some additional parameters. The format of the CONNECT command is one of:
CONNECT RESET
CONNECT CLOSE
CONNECT CREDENTIALS <variable>
CONNECT TO database USER userid USING password HOST hostname PORT portnumber <SSL>
If you have credentials available from Db2 on Cloud, place the contents of the credentials into a variable and then use the CONNECT CREDENTIALS <var>
syntax to connect to the database.
In addition, supplying a question mark (?) for password will result in the program prompting you for the password rather than having it as clear text in your scripts.
When all of the information is checked in the command, the db2_doConnect function is called to actually do the connection to the database.
In [ ]:
# Parse the CONNECT statement and execute if possible
def parseConnect(inSQL,local_ns):
global _settings, _connected
_connected = False
cParms = inSQL.split()
cnt = 0
_settings["ssl"] = ""
while cnt < len(cParms):
if cParms[cnt].upper() == 'TO':
if cnt+1 < len(cParms):
_settings["database"] = cParms[cnt+1].upper()
cnt = cnt + 1
else:
errormsg("No database specified in the CONNECT statement")
return
elif cParms[cnt].upper() == "SSL":
_settings["ssl"] = "Security=SSL;"
cnt = cnt + 1
elif cParms[cnt].upper() == 'CREDENTIALS':
if cnt+1 < len(cParms):
credentials = cParms[cnt+1]
tempid = eval(credentials,local_ns)
if (isinstance(tempid,dict) == False):
errormsg("The CREDENTIALS variable (" + credentials + ") does not contain a valid Python dictionary (JSON object)")
return
if (tempid == None):
fname = credentials + ".pickle"
try:
with open(fname,'rb') as f:
_id = pickle.load(f)
except:
errormsg("Unable to find credential variable or file.")
return
else:
_id = tempid
try:
_settings["database"] = _id["db"]
_settings["hostname"] = _id["hostname"]
_settings["port"] = _id["port"]
_settings["uid"] = _id["username"]
_settings["pwd"] = _id["password"]
try:
fname = credentials + ".pickle"
with open(fname,'wb') as f:
pickle.dump(_id,f)
except:
errormsg("Failed trying to write Db2 Credentials.")
return
except:
errormsg("Credentials file is missing information. db/hostname/port/username/password required.")
return
else:
errormsg("No Credentials name supplied")
return
cnt = cnt + 1
elif cParms[cnt].upper() == 'USER':
if cnt+1 < len(cParms):
_settings["uid"] = cParms[cnt+1].upper()
cnt = cnt + 1
else:
errormsg("No userid specified in the CONNECT statement")
return
elif cParms[cnt].upper() == 'USING':
if cnt+1 < len(cParms):
_settings["pwd"] = cParms[cnt+1]
if (_settings["pwd"] == '?'):
_settings["pwd"] = getpass.getpass("Password [password]: ") or "password"
cnt = cnt + 1
else:
errormsg("No password specified in the CONNECT statement")
return
elif cParms[cnt].upper() == 'HOST':
if cnt+1 < len(cParms):
hostport = cParms[cnt+1].upper()
ip, port = split_string(hostport)
if (port == None): _settings["port"] = "50000"
_settings["hostname"] = ip
cnt = cnt + 1
else:
errormsg("No hostname specified in the CONNECT statement")
return
elif cParms[cnt].upper() == 'PORT':
if cnt+1 < len(cParms):
_settings["port"] = cParms[cnt+1].upper()
cnt = cnt + 1
else:
errormsg("No port specified in the CONNECT statement")
return
elif cParms[cnt].upper() == 'PROMPT':
if (connected_prompt() == False):
print("Connection canceled.")
return
else:
cnt = cnt + 1
elif cParms[cnt].upper() in ('CLOSE','RESET') :
try:
result = ibm_db.close(_hdbc)
_hdbi.close()
except:
pass
success("Connection closed.")
if cParms[cnt].upper() == 'RESET':
_settings["database"] = ''
return
else:
cnt = cnt + 1
_ = db2_doConnect()
The db2_doConnect routine is called when a connection needs to be established to a Db2 database. The command does not require any parameters since it relies on the settings variable which contains all of the information it needs to connect to a Db2 database.
db2_doConnect()
There are 4 additional variables that are used throughout the routines to stay connected with the Db2 database. These variables are:
The only database driver that is used in this program is the IBM DB2 ODBC DRIVER. This driver needs to be loaded on the system that is connecting to Db2. The Jupyter notebook that is built by this system installs the driver for you so you shouldn't have to do anything other than build the container.
If the connection is successful, the connected flag is set to True. Any subsequent %sql call will check to see if you are connected and initiate another prompted connection if you do not have a connection to a database.
In [ ]:
def db2_doConnect():
global _hdbc, _hdbi, _connected, _runtime
global _settings
if _connected == False:
if len(_settings["database"]) == 0:
return False
dsn = (
"DRIVER={{IBM DB2 ODBC DRIVER}};"
"DATABASE={0};"
"HOSTNAME={1};"
"PORT={2};"
"PROTOCOL=TCPIP;"
"UID={3};"
"PWD={4};{5}").format(_settings["database"],
_settings["hostname"],
_settings["port"],
_settings["uid"],
_settings["pwd"],
_settings["ssl"])
# Get a database handle (hdbc) and a statement handle (hstmt) for subsequent access to DB2
try:
_hdbc = ibm_db.connect(dsn, "", "")
except Exception as err:
db2_error(False,True) # errormsg(str(err))
_connected = False
_settings["database"] = ''
return False
try:
_hdbi = ibm_db_dbi.Connection(_hdbc)
except Exception as err:
db2_error(False,True) # errormsg(str(err))
_connected = False
_settings["database"] = ''
return False
_connected = True
# Save the values for future use
save_settings()
success("Connection successful.")
return True
There are two routines that load and save settings between Jupyter notebooks. These routines are called without any parameters.
load_settings() save_settings()
There is a global structure called settings which contains the following fields:
_settings = {
"maxrows" : 10,
"maxgrid" : 5,
"runtime" : 1,
"display" : "TEXT",
"database" : "",
"hostname" : "localhost",
"port" : "50000",
"protocol" : "TCPIP",
"uid" : "DB2INST1",
"pwd" : "password"
}
The information in the settings structure is used for re-connecting to a database when you start up a Jupyter notebook. When the session is established for the first time, the load_settings() function is called to get the contents of the pickle file (db2connect.pickle, a Jupyter session file) that will be used for the first connection to the database. Whenever a new connection is made, the file is updated with the save_settings() function.
In [ ]:
def load_settings():
# This routine will load the settings from the previous session if they exist
global _settings
fname = "db2connect.pickle"
try:
with open(fname,'rb') as f:
_settings = pickle.load(f)
# Reset runtime to 1 since it would be unexpected to keep the same value between connections
_settings["runtime"] = 1
_settings["maxgrid"] = 5
except:
pass
return
def save_settings():
# This routine will save the current settings if they exist
global _settings
fname = "db2connect.pickle"
try:
with open(fname,'wb') as f:
pickle.dump(_settings,f)
except:
errormsg("Failed trying to write Db2 Configuration Information.")
return
There are three types of messages that are thrown by the %db2 magic command. The first routine will print out a success message with no special formatting:
success(message)
The second message is used for displaying an error message that is not associated with a SQL error. This type of error message is surrounded with a red box to highlight the problem. Note that the success message has code that has been commented out that could also show a successful return code with a green box.
errormsg(message)
The final error message is based on an error occuring in the SQL code that was executed. This code will parse the message returned from the ibm_db interface and parse it to return only the error message portion (and not all of the wrapper code from the driver).
db2_error(quiet,connect=False)
The quiet flag is passed to the db2_error routine so that messages can be suppressed if the user wishes to ignore them with the -q flag. A good example of this is dropping a table that does not exist. We know that an error will be thrown so we can ignore it. The information that the db2_error routine gets is from the stmt_errormsg() function from within the ibm_db driver. The db2_error function should only be called after a SQL failure otherwise there will be no diagnostic information returned from stmt_errormsg().
If the connect flag is True, the routine will get the SQLSTATE and SQLCODE from the connection error message rather than a statement error message.
In [ ]:
def db2_error(quiet,connect=False):
global sqlerror, sqlcode, sqlstate, _environment
try:
if (connect == False):
errmsg = ibm_db.stmt_errormsg().replace('\r',' ')
errmsg = errmsg[errmsg.rfind("]")+1:].strip()
else:
errmsg = ibm_db.conn_errormsg().replace('\r',' ')
errmsg = errmsg[errmsg.rfind("]")+1:].strip()
sqlerror = errmsg
msg_start = errmsg.find("SQLSTATE=")
if (msg_start != -1):
msg_end = errmsg.find(" ",msg_start)
if (msg_end == -1):
msg_end = len(errmsg)
sqlstate = errmsg[msg_start+9:msg_end]
else:
sqlstate = "0"
msg_start = errmsg.find("SQLCODE=")
if (msg_start != -1):
msg_end = errmsg.find(" ",msg_start)
if (msg_end == -1):
msg_end = len(errmsg)
sqlcode = errmsg[msg_start+8:msg_end]
try:
sqlcode = int(sqlcode)
except:
pass
else:
sqlcode = 0
except:
errmsg = "Unknown error."
sqlcode = -99999
sqlstate = "-99999"
sqlerror = errmsg
return
msg_start = errmsg.find("SQLSTATE=")
if (msg_start != -1):
msg_end = errmsg.find(" ",msg_start)
if (msg_end == -1):
msg_end = len(errmsg)
sqlstate = errmsg[msg_start+9:msg_end]
else:
sqlstate = "0"
msg_start = errmsg.find("SQLCODE=")
if (msg_start != -1):
msg_end = errmsg.find(" ",msg_start)
if (msg_end == -1):
msg_end = len(errmsg)
sqlcode = errmsg[msg_start+8:msg_end]
try:
sqlcode = int(sqlcode)
except:
pass
else:
sqlcode = 0
if quiet == True: return
if (errmsg == ""): return
html = '<p><p style="border:2px; border-style:solid; border-color:#FF0000; background-color:#ffe6e6; padding: 1em;">'
if (_environment["jupyter"] == True):
pdisplay(pHTML(html+errmsg+"</p>"))
else:
print(errmsg)
# Print out an error message
def errormsg(message):
global _environment
if (message != ""):
html = '<p><p style="border:2px; border-style:solid; border-color:#FF0000; background-color:#ffe6e6; padding: 1em;">'
if (_environment["jupyter"] == True):
pdisplay(pHTML(html + message + "</p>"))
else:
print(message)
def success(message):
if (message != ""):
print(message)
return
def debug(message,error=False):
global _environment
if (_environment["jupyter"] == True):
spacer = "<br>" + " "
else:
spacer = "\n "
if (message != ""):
lines = message.split('\n')
msg = ""
indent = 0
for line in lines:
delta = line.count("(") - line.count(")")
if (msg == ""):
msg = line
indent = indent + delta
else:
if (delta < 0): indent = indent + delta
msg = msg + spacer * (indent*2) + line
if (delta > 0): indent = indent + delta
if (indent < 0): indent = 0
if (error == True):
html = '<p><pre style="font-family: monospace; border:2px; border-style:solid; border-color:#FF0000; background-color:#ffe6e6; padding: 1em;">'
else:
html = '<p><pre style="font-family: monospace; border:2px; border-style:solid; border-color:#008000; background-color:#e6ffe6; padding: 1em;">'
if (_environment["jupyter"] == True):
pdisplay(pHTML(html + msg + "</pre></p>"))
else:
print(msg)
return
A macro is used to generate SQL to be executed by overriding or creating a new keyword. For instance, the base %sql
command does not understand the LIST TABLES
command which is usually used in conjunction with the CLP
processor. Rather than specifically code this in the base db2.ipynb
file, we can create a macro that can execute this code for us.
There are three routines that deal with macros.
In [ ]:
def setMacro(inSQL,parms):
global _macros
names = parms.split()
if (len(names) < 2):
errormsg("No command name supplied.")
return None
macroName = names[1].upper()
_macros[macroName] = inSQL
return
This code will check to see if there is a macro command in the SQL. It will take the SQL that is supplied and strip out three values: the first and second keywords, and the remainder of the parameters.
For instance, consider the following statement:
CREATE DATABASE GEORGE options....
The name of the macro that we want to run is called CREATE
. We know that there is a SQL command called CREATE
but this code will call the macro first to see if needs to run any special code. For instance, CREATE DATABASE
is not part of the db2.ipynb
syntax, but we can add it in by using a macro.
The check macro logic will strip out the subcommand (DATABASE
) and place the remainder of the string after DATABASE
in options.
In [ ]:
def checkMacro(in_sql):
global _macros
if (len(in_sql) == 0): return(in_sql) # Nothing to do
tokens = parseArgs(in_sql,None) # Take the string and reduce into tokens
macro_name = tokens[0].upper() # Uppercase the name of the token
if (macro_name not in _macros):
return(in_sql) # No macro by this name so just return the string
result = runMacro(_macros[macro_name],in_sql,tokens) # Execute the macro using the tokens we found
return(result) # Runmacro will either return the original SQL or the new one
In [ ]:
def splitassign(arg):
var_name = "null"
var_value = "null"
arg = arg.strip()
eq = arg.find("=")
if (eq != -1):
var_name = arg[:eq].strip()
temp_value = arg[eq+1:].strip()
if (temp_value != ""):
ch = temp_value[0]
if (ch in ["'",'"']):
if (temp_value[-1:] == ch):
var_value = temp_value[1:-1]
else:
var_value = temp_value
else:
var_value = temp_value
else:
var_value = arg
return var_name, var_value
In [ ]:
def parseArgs(argin,_vars):
quoteChar = ""
inQuote = False
inArg = True
args = []
arg = ''
for ch in argin.lstrip():
if (inQuote == True):
if (ch == quoteChar):
inQuote = False
arg = arg + ch #z
else:
arg = arg + ch
elif (ch == "\"" or ch == "\'"): # Do we have a quote
quoteChar = ch
arg = arg + ch #z
inQuote = True
elif (ch == " "):
if (arg != ""):
arg = subvars(arg,_vars)
args.append(arg)
else:
args.append("null")
arg = ""
else:
arg = arg + ch
if (arg != ""):
arg = subvars(arg,_vars)
args.append(arg)
return(args)
In [ ]:
def runMacro(script,in_sql,tokens):
result = ""
runIT = True
code = script.split("\n")
level = 0
runlevel = [True,False,False,False,False,False,False,False,False,False]
ifcount = 0
_vars = {}
for i in range(0,len(tokens)):
vstr = str(i)
_vars[vstr] = tokens[i]
if (len(tokens) == 0):
_vars["argc"] = "0"
else:
_vars["argc"] = str(len(tokens)-1)
for line in code:
line = line.strip()
if (line == "" or line == "\n"): continue
if (line[0] == "#"): continue # A comment line starts with a # in the first position of the line
args = parseArgs(line,_vars) # Get all of the arguments
if (args[0] == "if"):
ifcount = ifcount + 1
if (runlevel[level] == False): # You can't execute this statement
continue
level = level + 1
if (len(args) < 4):
print("Macro: Incorrect number of arguments for the if clause.")
return insql
arg1 = args[1]
arg2 = args[3]
if (len(arg2) > 2):
ch1 = arg2[0]
ch2 = arg2[-1:]
if (ch1 in ['"',"'"] and ch1 == ch2):
arg2 = arg2[1:-1].strip()
op = args[2]
if (op in ["=","=="]):
if (arg1 == arg2):
runlevel[level] = True
else:
runlevel[level] = False
elif (op in ["<=","=<"]):
if (arg1 <= arg2):
runlevel[level] = True
else:
runlevel[level] = False
elif (op in [">=","=>"]):
if (arg1 >= arg2):
runlevel[level] = True
else:
runlevel[level] = False
elif (op in ["<>","!="]):
if (arg1 != arg2):
runlevel[level] = True
else:
runlevel[level] = False
elif (op in ["<"]):
if (arg1 < arg2):
runlevel[level] = True
else:
runlevel[level] = False
elif (op in [">"]):
if (arg1 > arg2):
runlevel[level] = True
else:
runlevel[level] = False
else:
print("Macro: Unknown comparison operator in the if statement:" + op)
continue
elif (args[0] in ["exit","echo"] and runlevel[level] == True):
msg = ""
for msgline in args[1:]:
if (msg == ""):
msg = subvars(msgline,_vars)
else:
msg = msg + " " + subvars(msgline,_vars)
if (msg != ""):
if (args[0] == "echo"):
debug(msg,error=False)
else:
debug(msg,error=True)
if (args[0] == "exit"): return ''
elif (args[0] == "pass" and runlevel[level] == True):
pass
elif (args[0] == "var" and runlevel[level] == True):
value = ""
for val in args[2:]:
if (value == ""):
value = subvars(val,_vars)
else:
value = value + " " + subvars(val,_vars)
value.strip()
_vars[args[1]] = value
elif (args[0] == 'else'):
if (ifcount == level):
runlevel[level] = not runlevel[level]
elif (args[0] == 'return' and runlevel[level] == True):
return(result)
elif (args[0] == "endif"):
ifcount = ifcount - 1
if (ifcount < level):
level = level - 1
if (level < 0):
print("Macro: Unmatched if/endif pairs.")
return ''
else:
if (runlevel[level] == True):
if (result == ""):
result = subvars(line,_vars)
else:
result = result + "\n" + subvars(line,_vars)
return(result)
In [ ]:
def subvars(script,_vars):
if (_vars == None): return script
remainder = script
result = ""
done = False
while done == False:
bv = remainder.find("{")
if (bv == -1):
done = True
continue
ev = remainder.find("}")
if (ev == -1):
done = True
continue
result = result + remainder[:bv]
vvar = remainder[bv+1:ev]
remainder = remainder[ev+1:]
upper = False
allvars = False
if (vvar[0] == "^"):
upper = True
vvar = vvar[1:]
elif (vvar[0] == "*"):
vvar = vvar[1:]
allvars = True
else:
pass
if (vvar in _vars):
if (upper == True):
items = _vars[vvar].upper()
elif (allvars == True):
try:
iVar = int(vvar)
except:
return(script)
items = ""
sVar = str(iVar)
while sVar in _vars:
if (items == ""):
items = _vars[sVar]
else:
items = items + " " + _vars[sVar]
iVar = iVar + 1
sVar = str(iVar)
else:
items = _vars[vvar]
else:
if (allvars == True):
items = ""
else:
items = "null"
result = result + items
if (remainder != ""):
result = result + remainder
return(result)
The calling format of this routine is:
count = sqlTimer(hdbc, runtime, inSQL)
This code runs the SQL string multiple times for one second (by default). The accuracy of the clock is not that great when you are running just one statement, so instead this routine will run the code multiple times for a second to give you an execution count. If you need to run the code for more than one second, the runtime value needs to be set to the number of seconds you want the code to run.
The return result is always the number of times that the code executed. Note, that the program will skip reading the data if it is a SELECT statement so it doesn't included fetch time for the answer set.
In [ ]:
def sqlTimer(hdbc, runtime, inSQL):
count = 0
t_end = time.time() + runtime
while time.time() < t_end:
try:
stmt = ibm_db.exec_immediate(hdbc,inSQL)
if (stmt == False):
db2_error(flag(["-q","-quiet"]))
return(-1)
ibm_db.free_result(stmt)
except Exception as err:
db2_error(False)
return(-1)
count = count + 1
return(count)
This routine takes as an argument a string and then splits the arguments according to the following logic:
(
character, it will check the last character in the string and see if it is a )
and then remove those characters,
and commas within quotes are ignoredExample:
"abcdef",abcdef,456,"856"
Three values would be returned:
[abcdef,True,False],[abcdef,False,False],[456,False,True],[856,True,False]
Any quoted string will be False for numeric. The way that the parameters are handled are up to the calling program. However, in the case of Db2, the quoted strings must be in single quotes so any quoted parameter using the double quotes "
must be wrapped with single quotes. There is always a possibility that a string contains single quotes (i.e. O'Connor) so any substituted text should use ''
so that Db2 can properly interpret the string. This routine does not adjust the strings with quotes, and depends on the variable subtitution routine to do that.
In [ ]:
def splitargs(arguments):
import types
# String the string and remove the ( and ) characters if they at the beginning and end of the string
results = []
step1 = arguments.strip()
if (len(step1) == 0): return(results) # Not much to do here - no args found
if (step1[0] == '('):
if (step1[-1:] == ')'):
step2 = step1[1:-1]
step2 = step2.strip()
else:
step2 = step1
else:
step2 = step1
# Now we have a string without brackets. Start scanning for commas
quoteCH = ""
pos = 0
arg = ""
args = []
while pos < len(step2):
ch = step2[pos]
if (quoteCH == ""): # Are we in a quote?
if (ch in ('"',"'")): # Check to see if we are starting a quote
quoteCH = ch
arg = arg + ch
pos += 1
elif (ch == ","): # Are we at the end of a parameter?
arg = arg.strip()
args.append(arg)
arg = ""
inarg = False
pos += 1
else: # Continue collecting the string
arg = arg + ch
pos += 1
else:
if (ch == quoteCH): # Are we at the end of a quote?
arg = arg + ch # Add the quote to the string
pos += 1 # Increment past the quote
quoteCH = "" # Stop quote checking (maybe!)
else:
pos += 1
arg = arg + ch
if (quoteCH != ""): # So we didn't end our string
arg = arg.strip()
args.append(arg)
elif (arg != ""): # Something left over as an argument
arg = arg.strip()
args.append(arg)
else:
pass
results = []
for arg in args:
result = []
if (len(arg) > 0):
if (arg[0] in ('"',"'")):
value = arg[1:-1]
isString = True
isNumber = False
else:
isString = False
isNumber = False
try:
value = eval(arg)
if (type(value) == int):
isNumber = True
elif (isinstance(value,float) == True):
isNumber = True
else:
value = arg
except:
value = arg
else:
value = ""
isString = False
isNumber = False
result = [value,isString,isNumber]
results.append(result)
return results
The calling format of this routine is:
sql_cmd, parameter_list, encoded_sql = sqlParser(sql_input)
This code will look at the SQL string that has been passed to it and parse it into four values:
In [ ]:
def sqlParser(sqlin,local_ns):
sql_cmd = ""
encoded_sql = sqlin
firstCommand = "(?:^\s*)([a-zA-Z]+)(?:\s+.*|$)"
findFirst = re.match(firstCommand,sqlin)
if (findFirst == None): # We did not find a match so we just return the empty string
return sql_cmd, encoded_sql
cmd = findFirst.group(1)
sql_cmd = cmd.upper()
#
# Scan the input string looking for variables in the format :var. If no : is found just return.
# Var must be alpha+number+_ to be valid
#
if (':' not in sqlin): # A quick check to see if parameters are in here, but not fool-proof!
return sql_cmd, encoded_sql
inVar = False
inQuote = ""
varName = ""
encoded_sql = ""
STRING = 0
NUMBER = 1
LIST = 2
RAW = 3
for ch in sqlin:
if (inVar == True): # We are collecting the name of a variable
if (ch.upper() in "@_ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789[]"):
varName = varName + ch
continue
else:
if (varName == ""):
encode_sql = encoded_sql + ":"
elif (varName[0] in ('[',']')):
encoded_sql = encoded_sql + ":" + varName
else:
if (ch == '.'): # If the variable name is stopped by a period, assume no quotes are used
flag_quotes = False
else:
flag_quotes = True
varValue, varType = getContents(varName,flag_quotes,local_ns)
if (varValue == None):
encoded_sql = encoded_sql + ":" + varName
else:
if (varType == STRING):
encoded_sql = encoded_sql + varValue
elif (varType == NUMBER):
encoded_sql = encoded_sql + str(varValue)
elif (varType == RAW):
encoded_sql = encoded_sql + varValue
elif (varType == LIST):
start = True
for v in varValue:
if (start == False):
encoded_sql = encoded_sql + ","
if (isinstance(v,int) == True): # Integer value
encoded_sql = encoded_sql + str(v)
elif (isinstance(v,float) == True):
encoded_sql = encoded_sql + str(v)
else:
flag_quotes = True
try:
if (v.find('0x') == 0): # Just guessing this is a hex value at beginning
encoded_sql = encoded_sql + v
else:
encoded_sql = encoded_sql + addquotes(v,flag_quotes) # String
except:
encoded_sql = encoded_sql + addquotes(str(v),flag_quotes)
start = False
encoded_sql = encoded_sql + ch
varName = ""
inVar = False
elif (inQuote != ""):
encoded_sql = encoded_sql + ch
if (ch == inQuote): inQuote = ""
elif (ch in ("'",'"')):
encoded_sql = encoded_sql + ch
inQuote = ch
elif (ch == ":"): # This might be a variable
varName = ""
inVar = True
else:
encoded_sql = encoded_sql + ch
if (inVar == True):
varValue, varType = getContents(varName,True,local_ns) # We assume the end of a line is quoted
if (varValue == None):
encoded_sql = encoded_sql + ":" + varName
else:
if (varType == STRING):
encoded_sql = encoded_sql + varValue
elif (varType == NUMBER):
encoded_sql = encoded_sql + str(varValue)
elif (varType == LIST):
flag_quotes = True
start = True
for v in varValue:
if (start == False):
encoded_sql = encoded_sql + ","
if (isinstance(v,int) == True): # Integer value
encoded_sql = encoded_sql + str(v)
elif (isinstance(v,float) == True):
encoded_sql = encoded_sql + str(v)
else:
try:
if (v.find('0x') == 0): # Just guessing this is a hex value
encoded_sql = encoded_sql + v
else:
encoded_sql = encoded_sql + addquotes(v,flag_quotes) # String
except:
encoded_sql = encoded_sql + addquotes(str(v),flag_quotes)
start = False
return sql_cmd, encoded_sql
The calling format of this routine is:
value = getContents(varName,quote,name_space)
This code will take the name of a variable as input and return the contents of that variable. If the variable is not found then the program will return None which is the equivalent to empty or null. Note that this function looks at the global variable pool for Python so it is possible that the wrong version of variable is returned if it is used in different functions. For this reason, any variables used in SQL statements should use a unique namimg convention if possible.
The other thing that this function does is replace single quotes with two quotes. The reason for doing this is that Db2 will convert two single quotes into one quote when dealing with strings. This avoids problems when dealing with text that contains multiple quotes within the string. Note that this substitution is done only for single quote characters since the double quote character is used by Db2 for naming columns that are case sensitive or contain special characters.
If the quote value is True, the field will have quotes around it. The name_space is the variables currently that are registered in Python.
In [ ]:
def getContents(varName,flag_quotes,local_ns):
#
# Get the contents of the variable name that is passed to the routine. Only simple
# variables are checked, i.e. arrays and lists are not parsed
#
STRING = 0
NUMBER = 1
LIST = 2
RAW = 3
DICT = 4
try:
value = eval(varName,None,local_ns) # globals()[varName] # eval(varName)
except:
return(None,STRING)
if (isinstance(value,dict) == True): # Check to see if this is JSON dictionary
return(addquotes(value,flag_quotes),STRING)
elif(isinstance(value,list) == True): # List - tricky
return(value,LIST)
elif (isinstance(value,int) == True): # Integer value
return(value,NUMBER)
elif (isinstance(value,float) == True): # Float value
return(value,NUMBER)
else:
try:
# The pattern needs to be in the first position (0 in Python terms)
if (value.find('0x') == 0): # Just guessing this is a hex value
return(value,RAW)
else:
return(addquotes(value,flag_quotes),STRING) # String
except:
return(addquotes(str(value),flag_quotes),RAW)
Quotes are a challenge when dealing with dictionaries and Db2. Db2 wants strings delimited with single quotes, while Dictionaries use double quotes. That wouldn't be a problems except imbedded single quotes within these dictionaries will cause things to fail. This routine attempts to double-quote the single quotes within the dicitonary.
In [ ]:
def addquotes(inString,flag_quotes):
if (isinstance(inString,dict) == True): # Check to see if this is JSON dictionary
serialized = json.dumps(inString)
else:
serialized = inString
# Replace single quotes with '' (two quotes) and wrap everything in single quotes
if (flag_quotes == False):
return(serialized)
else:
return("'"+serialized.replace("'","''")+"'") # Convert single quotes to two single quotes
The calling format of this routine is:
db2_create_sample(quiet)
There are a lot of examples that depend on the data within the SAMPLE database. If you are running these examples and the connection is not to the SAMPLE database, then this code will create the two (EMPLOYEE, DEPARTMENT) tables that are used by most examples. If the function finds that these tables already exist, then nothing is done. If the tables are missing then they will be created with the same data as in the SAMPLE database.
The quiet flag tells the program not to print any messages when the creation of the tables is complete.
In [ ]:
def db2_create_sample(quiet):
create_department = """
BEGIN
DECLARE FOUND INTEGER;
SET FOUND = (SELECT COUNT(*) FROM SYSIBM.SYSTABLES WHERE NAME='DEPARTMENT' AND CREATOR=CURRENT USER);
IF FOUND = 0 THEN
EXECUTE IMMEDIATE('CREATE TABLE DEPARTMENT(DEPTNO CHAR(3) NOT NULL, DEPTNAME VARCHAR(36) NOT NULL,
MGRNO CHAR(6),ADMRDEPT CHAR(3) NOT NULL)');
EXECUTE IMMEDIATE('INSERT INTO DEPARTMENT VALUES
(''A00'',''SPIFFY COMPUTER SERVICE DIV.'',''000010'',''A00''),
(''B01'',''PLANNING'',''000020'',''A00''),
(''C01'',''INFORMATION CENTER'',''000030'',''A00''),
(''D01'',''DEVELOPMENT CENTER'',NULL,''A00''),
(''D11'',''MANUFACTURING SYSTEMS'',''000060'',''D01''),
(''D21'',''ADMINISTRATION SYSTEMS'',''000070'',''D01''),
(''E01'',''SUPPORT SERVICES'',''000050'',''A00''),
(''E11'',''OPERATIONS'',''000090'',''E01''),
(''E21'',''SOFTWARE SUPPORT'',''000100'',''E01''),
(''F22'',''BRANCH OFFICE F2'',NULL,''E01''),
(''G22'',''BRANCH OFFICE G2'',NULL,''E01''),
(''H22'',''BRANCH OFFICE H2'',NULL,''E01''),
(''I22'',''BRANCH OFFICE I2'',NULL,''E01''),
(''J22'',''BRANCH OFFICE J2'',NULL,''E01'')');
END IF;
END"""
%sql -d -q {create_department}
create_employee = """
BEGIN
DECLARE FOUND INTEGER;
SET FOUND = (SELECT COUNT(*) FROM SYSIBM.SYSTABLES WHERE NAME='EMPLOYEE' AND CREATOR=CURRENT USER);
IF FOUND = 0 THEN
EXECUTE IMMEDIATE('CREATE TABLE EMPLOYEE(
EMPNO CHAR(6) NOT NULL,
FIRSTNME VARCHAR(12) NOT NULL,
MIDINIT CHAR(1),
LASTNAME VARCHAR(15) NOT NULL,
WORKDEPT CHAR(3),
PHONENO CHAR(4),
HIREDATE DATE,
JOB CHAR(8),
EDLEVEL SMALLINT NOT NULL,
SEX CHAR(1),
BIRTHDATE DATE,
SALARY DECIMAL(9,2),
BONUS DECIMAL(9,2),
COMM DECIMAL(9,2)
)');
EXECUTE IMMEDIATE('INSERT INTO EMPLOYEE VALUES
(''000010'',''CHRISTINE'',''I'',''HAAS'' ,''A00'',''3978'',''1995-01-01'',''PRES '',18,''F'',''1963-08-24'',152750.00,1000.00,4220.00),
(''000020'',''MICHAEL'' ,''L'',''THOMPSON'' ,''B01'',''3476'',''2003-10-10'',''MANAGER '',18,''M'',''1978-02-02'',94250.00,800.00,3300.00),
(''000030'',''SALLY'' ,''A'',''KWAN'' ,''C01'',''4738'',''2005-04-05'',''MANAGER '',20,''F'',''1971-05-11'',98250.00,800.00,3060.00),
(''000050'',''JOHN'' ,''B'',''GEYER'' ,''E01'',''6789'',''1979-08-17'',''MANAGER '',16,''M'',''1955-09-15'',80175.00,800.00,3214.00),
(''000060'',''IRVING'' ,''F'',''STERN'' ,''D11'',''6423'',''2003-09-14'',''MANAGER '',16,''M'',''1975-07-07'',72250.00,500.00,2580.00),
(''000070'',''EVA'' ,''D'',''PULASKI'' ,''D21'',''7831'',''2005-09-30'',''MANAGER '',16,''F'',''2003-05-26'',96170.00,700.00,2893.00),
(''000090'',''EILEEN'' ,''W'',''HENDERSON'' ,''E11'',''5498'',''2000-08-15'',''MANAGER '',16,''F'',''1971-05-15'',89750.00,600.00,2380.00),
(''000100'',''THEODORE'' ,''Q'',''SPENSER'' ,''E21'',''0972'',''2000-06-19'',''MANAGER '',14,''M'',''1980-12-18'',86150.00,500.00,2092.00),
(''000110'',''VINCENZO'' ,''G'',''LUCCHESSI'' ,''A00'',''3490'',''1988-05-16'',''SALESREP'',19,''M'',''1959-11-05'',66500.00,900.00,3720.00),
(''000120'',''SEAN'' ,'' '',''O`CONNELL'' ,''A00'',''2167'',''1993-12-05'',''CLERK '',14,''M'',''1972-10-18'',49250.00,600.00,2340.00),
(''000130'',''DELORES'' ,''M'',''QUINTANA'' ,''C01'',''4578'',''2001-07-28'',''ANALYST '',16,''F'',''1955-09-15'',73800.00,500.00,1904.00),
(''000140'',''HEATHER'' ,''A'',''NICHOLLS'' ,''C01'',''1793'',''2006-12-15'',''ANALYST '',18,''F'',''1976-01-19'',68420.00,600.00,2274.00),
(''000150'',''BRUCE'' ,'' '',''ADAMSON'' ,''D11'',''4510'',''2002-02-12'',''DESIGNER'',16,''M'',''1977-05-17'',55280.00,500.00,2022.00),
(''000160'',''ELIZABETH'',''R'',''PIANKA'' ,''D11'',''3782'',''2006-10-11'',''DESIGNER'',17,''F'',''1980-04-12'',62250.00,400.00,1780.00),
(''000170'',''MASATOSHI'',''J'',''YOSHIMURA'' ,''D11'',''2890'',''1999-09-15'',''DESIGNER'',16,''M'',''1981-01-05'',44680.00,500.00,1974.00),
(''000180'',''MARILYN'' ,''S'',''SCOUTTEN'' ,''D11'',''1682'',''2003-07-07'',''DESIGNER'',17,''F'',''1979-02-21'',51340.00,500.00,1707.00),
(''000190'',''JAMES'' ,''H'',''WALKER'' ,''D11'',''2986'',''2004-07-26'',''DESIGNER'',16,''M'',''1982-06-25'',50450.00,400.00,1636.00),
(''000200'',''DAVID'' ,'' '',''BROWN'' ,''D11'',''4501'',''2002-03-03'',''DESIGNER'',16,''M'',''1971-05-29'',57740.00,600.00,2217.00),
(''000210'',''WILLIAM'' ,''T'',''JONES'' ,''D11'',''0942'',''1998-04-11'',''DESIGNER'',17,''M'',''2003-02-23'',68270.00,400.00,1462.00),
(''000220'',''JENNIFER'' ,''K'',''LUTZ'' ,''D11'',''0672'',''1998-08-29'',''DESIGNER'',18,''F'',''1978-03-19'',49840.00,600.00,2387.00),
(''000230'',''JAMES'' ,''J'',''JEFFERSON'' ,''D21'',''2094'',''1996-11-21'',''CLERK '',14,''M'',''1980-05-30'',42180.00,400.00,1774.00),
(''000240'',''SALVATORE'',''M'',''MARINO'' ,''D21'',''3780'',''2004-12-05'',''CLERK '',17,''M'',''2002-03-31'',48760.00,600.00,2301.00),
(''000250'',''DANIEL'' ,''S'',''SMITH'' ,''D21'',''0961'',''1999-10-30'',''CLERK '',15,''M'',''1969-11-12'',49180.00,400.00,1534.00),
(''000260'',''SYBIL'' ,''P'',''JOHNSON'' ,''D21'',''8953'',''2005-09-11'',''CLERK '',16,''F'',''1976-10-05'',47250.00,300.00,1380.00),
(''000270'',''MARIA'' ,''L'',''PEREZ'' ,''D21'',''9001'',''2006-09-30'',''CLERK '',15,''F'',''2003-05-26'',37380.00,500.00,2190.00),
(''000280'',''ETHEL'' ,''R'',''SCHNEIDER'' ,''E11'',''8997'',''1997-03-24'',''OPERATOR'',17,''F'',''1976-03-28'',36250.00,500.00,2100.00),
(''000290'',''JOHN'' ,''R'',''PARKER'' ,''E11'',''4502'',''2006-05-30'',''OPERATOR'',12,''M'',''1985-07-09'',35340.00,300.00,1227.00),
(''000300'',''PHILIP'' ,''X'',''SMITH'' ,''E11'',''2095'',''2002-06-19'',''OPERATOR'',14,''M'',''1976-10-27'',37750.00,400.00,1420.00),
(''000310'',''MAUDE'' ,''F'',''SETRIGHT'' ,''E11'',''3332'',''1994-09-12'',''OPERATOR'',12,''F'',''1961-04-21'',35900.00,300.00,1272.00),
(''000320'',''RAMLAL'' ,''V'',''MEHTA'' ,''E21'',''9990'',''1995-07-07'',''FIELDREP'',16,''M'',''1962-08-11'',39950.00,400.00,1596.00),
(''000330'',''WING'' ,'' '',''LEE'' ,''E21'',''2103'',''2006-02-23'',''FIELDREP'',14,''M'',''1971-07-18'',45370.00,500.00,2030.00),
(''000340'',''JASON'' ,''R'',''GOUNOT'' ,''E21'',''5698'',''1977-05-05'',''FIELDREP'',16,''M'',''1956-05-17'',43840.00,500.00,1907.00),
(''200010'',''DIAN'' ,''J'',''HEMMINGER'' ,''A00'',''3978'',''1995-01-01'',''SALESREP'',18,''F'',''1973-08-14'',46500.00,1000.00,4220.00),
(''200120'',''GREG'' ,'' '',''ORLANDO'' ,''A00'',''2167'',''2002-05-05'',''CLERK '',14,''M'',''1972-10-18'',39250.00,600.00,2340.00),
(''200140'',''KIM'' ,''N'',''NATZ'' ,''C01'',''1793'',''2006-12-15'',''ANALYST '',18,''F'',''1976-01-19'',68420.00,600.00,2274.00),
(''200170'',''KIYOSHI'' ,'' '',''YAMAMOTO'' ,''D11'',''2890'',''2005-09-15'',''DESIGNER'',16,''M'',''1981-01-05'',64680.00,500.00,1974.00),
(''200220'',''REBA'' ,''K'',''JOHN'' ,''D11'',''0672'',''2005-08-29'',''DESIGNER'',18,''F'',''1978-03-19'',69840.00,600.00,2387.00),
(''200240'',''ROBERT'' ,''M'',''MONTEVERDE'',''D21'',''3780'',''2004-12-05'',''CLERK '',17,''M'',''1984-03-31'',37760.00,600.00,2301.00),
(''200280'',''EILEEN'' ,''R'',''SCHWARTZ'' ,''E11'',''8997'',''1997-03-24'',''OPERATOR'',17,''F'',''1966-03-28'',46250.00,500.00,2100.00),
(''200310'',''MICHELLE'' ,''F'',''SPRINGER'' ,''E11'',''3332'',''1994-09-12'',''OPERATOR'',12,''F'',''1961-04-21'',35900.00,300.00,1272.00),
(''200330'',''HELENA'' ,'' '',''WONG'' ,''E21'',''2103'',''2006-02-23'',''FIELDREP'',14,''F'',''1971-07-18'',35370.00,500.00,2030.00),
(''200340'',''ROY'' ,''R'',''ALONZO'' ,''E21'',''5698'',''1997-07-05'',''FIELDREP'',16,''M'',''1956-05-17'',31840.00,500.00,1907.00)');
END IF;
END"""
%sql -d -q {create_employee}
if (quiet == False): success("Sample tables [EMPLOYEE, DEPARTMENT] created.")
This function will return the original string with the option removed, and a flag or true or false of the value is found.
args, flag = checkOption(option_string, option, false_value, true_value)
Options are specified with a -x where x is the character that we are searching for. It may actually be more than one character long like -pb/-pi/etc... The false and true values are optional. By default these are the boolean values of T/F but for some options it could be a character string like ';' versus '@' for delimiters.
In [ ]:
def checkOption(args_in, option, vFalse=False, vTrue=True):
args_out = args_in.strip()
found = vFalse
if (args_out != ""):
if (args_out.find(option) >= 0):
args_out = args_out.replace(option," ")
args_out = args_out.strip()
found = vTrue
return args_out, found
This function will plot the data that is returned from the answer set. The plot value determines how we display the data. 1=Bar, 2=Pie, 3=Line, 4=Interactive.
plotData(flag_plot, hdbi, sql, parms)
The hdbi is the ibm_db_sa handle that is used by pandas dataframes to run the sql. The parms contains any of the parameters required to run the query.
In [ ]:
def plotData(hdbi, sql):
try:
df = pandas.read_sql(sql,hdbi)
except Exception as err:
db2_error(False)
return
if df.empty:
errormsg("No results returned")
return
col_count = len(df.columns)
if flag(["-pb","-bar"]): # Plot 1 = bar chart
if (col_count in (1,2,3)):
if (col_count == 1):
df.index = df.index + 1
_ = df.plot(kind='bar');
_ = plt.plot();
elif (col_count == 2):
xlabel = df.columns.values[0]
ylabel = df.columns.values[1]
df.plot(kind='bar',x=xlabel,y=ylabel);
_ = plt.plot();
else:
values = df.columns.values[2]
columns = df.columns.values[0]
index = df.columns.values[1]
pivoted = pandas.pivot_table(df, values=values, columns=columns, index=index)
_ = pivoted.plot.bar();
else:
errormsg("Can't determine what columns to plot")
return
elif flag(["-pp","-pie"]): # Plot 2 = pie chart
if (col_count in (1,2)):
if (col_count == 1):
df.index = df.index + 1
yname = df.columns.values[0]
_ = df.plot(kind='pie',y=yname);
else:
xlabel = df.columns.values[0]
xname = df[xlabel].tolist()
yname = df.columns.values[1]
_ = df.plot(kind='pie',y=yname,labels=xname);
plt.show();
else:
errormsg("Can't determine what columns to plot")
return
elif flag(["-pl","-line"]): # Plot 3 = line chart
if (col_count in (1,2,3)):
if (col_count == 1):
df.index = df.index + 1
_ = df.plot(kind='line');
elif (col_count == 2):
xlabel = df.columns.values[0]
ylabel = df.columns.values[1]
_ = df.plot(kind='line',x=xlabel,y=ylabel) ;
else:
values = df.columns.values[2]
columns = df.columns.values[0]
index = df.columns.values[1]
pivoted = pandas.pivot_table(df, values=values, columns=columns, index=index)
_ = pivoted.plot();
plt.show();
else:
errormsg("Can't determine what columns to plot")
return
else:
return
In [ ]:
def findProc(procname):
global _hdbc, _hdbi, _connected, _runtime
# Split the procedure name into schema.procname if appropriate
upper_procname = procname.upper()
schema, proc = split_string(upper_procname,".") # Expect schema.procname
if (proc == None):
proc = schema
# Call ibm_db.procedures to see if the procedure does exist
schema = "%"
try:
stmt = ibm_db.procedures(_hdbc, None, schema, proc)
if (stmt == False): # Error executing the code
errormsg("Procedure " + procname + " not found in the system catalog.")
return None
result = ibm_db.fetch_tuple(stmt)
resultsets = result[5]
if (resultsets >= 1): resultsets = 1
return resultsets
except Exception as err:
errormsg("Procedure " + procname + " not found in the system catalog.")
return None
In [ ]:
def parseCallArgs(macro):
quoteChar = ""
inQuote = False
inParm = False
ignore = False
name = ""
parms = []
parm = ''
sqlin = macro.replace("\n","")
sqlin.lstrip()
for ch in sqlin:
if (inParm == False):
# We hit a blank in the name, so ignore everything after the procedure name until a ( is found
if (ch == " "):
ignore == True
elif (ch == "("): # Now we have parameters to send to the stored procedure
inParm = True
else:
if (ignore == False): name = name + ch # The name of the procedure (and no blanks)
else:
if (inQuote == True):
if (ch == quoteChar):
inQuote = False
else:
parm = parm + ch
elif (ch in ("\"","\'","[")): # Do we have a quote
if (ch == "["):
quoteChar = "]"
else:
quoteChar = ch
inQuote = True
elif (ch == ")"):
if (parm != ""):
parms.append(parm)
parm = ""
break
elif (ch == ","):
if (parm != ""):
parms.append(parm)
else:
parms.append("null")
parm = ""
else:
parm = parm + ch
if (inParm == True):
if (parm != ""):
parms.append(parm_value)
return(name,parms)
In [ ]:
def getColumns(stmt):
columns = []
types = []
colcount = 0
try:
colname = ibm_db.field_name(stmt,colcount)
coltype = ibm_db.field_type(stmt,colcount)
while (colname != False):
columns.append(colname)
types.append(coltype)
colcount += 1
colname = ibm_db.field_name(stmt,colcount)
coltype = ibm_db.field_type(stmt,colcount)
return columns,types
except Exception as err:
db2_error(False)
return None
The CALL statement is used for execution of a stored procedure. The format of the CALL statement is:
CALL PROC_NAME(x,y,z,...)
Procedures allow for the return of answer sets (cursors) as well as changing the contents of the parameters being passed to the procedure. In this implementation, the CALL function is limited to returning one answer set (or nothing). If you want to use more complex stored procedures then you will have to use the native python libraries.
In [ ]:
def parseCall(hdbc, inSQL, local_ns):
global _hdbc, _hdbi, _connected, _runtime, _environment
# Check to see if we are connected first
if (_connected == False): # Check if you are connected
db2_doConnect()
if _connected == False: return None
remainder = inSQL.strip()
procName, procArgs = parseCallArgs(remainder[5:]) # Assume that CALL ... is the format
resultsets = findProc(procName)
if (resultsets == None): return None
argvalues = []
if (len(procArgs) > 0): # We have arguments to consider
for arg in procArgs:
varname = arg
if (len(varname) > 0):
if (varname[0] == ":"):
checkvar = varname[1:]
varvalue = getContents(checkvar,True,local_ns)
if (varvalue == None):
errormsg("Variable " + checkvar + " is not defined.")
return None
argvalues.append(varvalue)
else:
if (varname.upper() == "NULL"):
argvalues.append(None)
else:
argvalues.append(varname)
else:
argvalues.append(None)
try:
if (len(procArgs) > 0):
argtuple = tuple(argvalues)
result = ibm_db.callproc(_hdbc,procName,argtuple)
stmt = result[0]
else:
result = ibm_db.callproc(_hdbc,procName)
stmt = result
if (resultsets == 1 and stmt != None):
columns, types = getColumns(stmt)
if (columns == None): return None
rows = []
rowlist = ibm_db.fetch_tuple(stmt)
while ( rowlist ) :
row = []
colcount = 0
for col in rowlist:
try:
if (types[colcount] in ["int","bigint"]):
row.append(int(col))
elif (types[colcount] in ["decimal","real"]):
row.append(float(col))
elif (types[colcount] in ["date","time","timestamp"]):
row.append(str(col))
else:
row.append(col)
except:
row.append(col)
colcount += 1
rows.append(row)
rowlist = ibm_db.fetch_tuple(stmt)
if flag(["-r","-array"]):
rows.insert(0,columns)
if len(procArgs) > 0:
allresults = []
allresults.append(rows)
for x in result[1:]:
allresults.append(x)
return allresults # rows,returned_results
else:
return rows
else:
df = pandas.DataFrame.from_records(rows,columns=columns)
if flag("-grid") or _settings['display'] == 'GRID':
if (_environment['qgrid'] == False):
with pandas.option_context('display.max_rows', None, 'display.max_columns', None):
pdisplay(df)
else:
try:
pdisplay(qgrid.show_grid(df))
except:
errormsg("Grid cannot be used to display data with duplicate column names. Use option -a or %sql OPTION DISPLAY PANDAS instead.")
return
else:
if flag(["-a","-all"]) or _settings["maxrows"] == -1 : # All of the rows
with pandas.option_context('display.max_rows', None, 'display.max_columns', None):
pdisplay(df)
else:
return df
else:
if len(procArgs) > 0:
allresults = []
for x in result[1:]:
allresults.append(x)
return allresults # rows,returned_results
else:
return None
except Exception as err:
db2_error(False)
return None
The PREPARE statement is used for repeated execution of a SQL statement. The PREPARE statement has the format:
stmt = PREPARE SELECT EMPNO FROM EMPLOYEE WHERE WORKDEPT=? AND SALARY<?
The SQL statement that you want executed is placed after the PREPARE statement with the location of variables marked with ? (parameter) markers. The variable stmt contains the prepared statement that need to be passed to the EXECUTE statement. The EXECUTE statement has the format:
EXECUTE :x USING z, y, s
The first variable (:x) is the name of the variable that you assigned the results of the prepare statement. The values after the USING clause are substituted into the prepare statement where the ? markers are found.
If the values in USING clause are variable names (z, y, s), a link is created to these variables as part of the execute statement. If you use the variable subsitution form of variable name (:z, :y, :s), the contents of the variable are placed into the USING clause. Normally this would not make much of a difference except when you are dealing with binary strings or JSON strings where the quote characters may cause some problems when subsituted into the statement.
In [ ]:
def parsePExec(hdbc, inSQL):
import ibm_db
global _stmt, _stmtID, _stmtSQL, sqlcode
cParms = inSQL.split()
parmCount = len(cParms)
if (parmCount == 0): return(None) # Nothing to do but this shouldn't happen
keyword = cParms[0].upper() # Upper case the keyword
if (keyword == "PREPARE"): # Prepare the following SQL
uSQL = inSQL.upper()
found = uSQL.find("PREPARE")
sql = inSQL[found+7:].strip()
try:
pattern = "\?\*[0-9]+"
findparm = re.search(pattern,sql)
while findparm != None:
found = findparm.group(0)
count = int(found[2:])
markers = ('?,' * count)[:-1]
sql = sql.replace(found,markers)
findparm = re.search(pattern,sql)
stmt = ibm_db.prepare(hdbc,sql) # Check error code here
if (stmt == False):
db2_error(False)
return(False)
stmttext = str(stmt).strip()
stmtID = stmttext[33:48].strip()
if (stmtID in _stmtID) == False:
_stmt.append(stmt) # Prepare and return STMT to caller
_stmtID.append(stmtID)
else:
stmtIX = _stmtID.index(stmtID)
_stmt[stmtiX] = stmt
return(stmtID)
except Exception as err:
print(err)
db2_error(False)
return(False)
if (keyword == "EXECUTE"): # Execute the prepare statement
if (parmCount < 2): return(False) # No stmtID available
stmtID = cParms[1].strip()
if (stmtID in _stmtID) == False:
errormsg("Prepared statement not found or invalid.")
return(False)
stmtIX = _stmtID.index(stmtID)
stmt = _stmt[stmtIX]
try:
if (parmCount == 2): # Only the statement handle available
result = ibm_db.execute(stmt) # Run it
elif (parmCount == 3): # Not quite enough arguments
errormsg("Missing or invalid USING clause on EXECUTE statement.")
sqlcode = -99999
return(False)
else:
using = cParms[2].upper()
if (using != "USING"): # Bad syntax again
errormsg("Missing USING clause on EXECUTE statement.")
sqlcode = -99999
return(False)
uSQL = inSQL.upper()
found = uSQL.find("USING")
parmString = inSQL[found+5:].strip()
parmset = splitargs(parmString)
if (len(parmset) == 0):
errormsg("Missing parameters after the USING clause.")
sqlcode = -99999
return(False)
parms = []
parm_count = 0
CONSTANT = 0
VARIABLE = 1
const = [0]
const_cnt = 0
for v in parmset:
parm_count = parm_count + 1
if (v[1] == True or v[2] == True): # v[1] true if string, v[2] true if num
parm_type = CONSTANT
const_cnt = const_cnt + 1
if (v[2] == True):
if (isinstance(v[0],int) == True): # Integer value
sql_type = ibm_db.SQL_INTEGER
elif (isinstance(v[0],float) == True): # Float value
sql_type = ibm_db.SQL_DOUBLE
else:
sql_type = ibm_db.SQL_INTEGER
else:
sql_type = ibm_db.SQL_CHAR
const.append(v[0])
else:
parm_type = VARIABLE
# See if the variable has a type associated with it varname@type
varset = v[0].split("@")
parm_name = varset[0]
parm_datatype = "char"
# Does the variable exist?
if (parm_name not in globals()):
errormsg("SQL Execute parameter " + parm_name + " not found")
sqlcode = -99999
return(false)
if (len(varset) > 1): # Type provided
parm_datatype = varset[1]
if (parm_datatype == "dec" or parm_datatype == "decimal"):
sql_type = ibm_db.SQL_DOUBLE
elif (parm_datatype == "bin" or parm_datatype == "binary"):
sql_type = ibm_db.SQL_BINARY
elif (parm_datatype == "int" or parm_datatype == "integer"):
sql_type = ibm_db.SQL_INTEGER
else:
sql_type = ibm_db.SQL_CHAR
try:
if (parm_type == VARIABLE):
result = ibm_db.bind_param(stmt, parm_count, globals()[parm_name], ibm_db.SQL_PARAM_INPUT, sql_type)
else:
result = ibm_db.bind_param(stmt, parm_count, const[const_cnt], ibm_db.SQL_PARAM_INPUT, sql_type)
except:
result = False
if (result == False):
errormsg("SQL Bind on variable " + parm_name + " failed.")
sqlcode = -99999
return(false)
result = ibm_db.execute(stmt) # ,tuple(parms))
if (result == False):
errormsg("SQL Execute failed.")
return(False)
if (ibm_db.num_fields(stmt) == 0): return(True) # Command successfully completed
return(fetchResults(stmt))
except Exception as err:
db2_error(False)
return(False)
return(False)
return(False)
In [ ]:
def fetchResults(stmt):
global sqlcode
rows = []
columns, types = getColumns(stmt)
# By default we assume that the data will be an array
is_array = True
# Check what type of data we want returned - array or json
if (flag(["-r","-array"]) == False):
# See if we want it in JSON format, if not it remains as an array
if (flag("-json") == True):
is_array = False
# Set column names to lowercase for JSON records
if (is_array == False):
columns = [col.lower() for col in columns] # Convert to lowercase for each of access
# First row of an array has the column names in it
if (is_array == True):
rows.append(columns)
result = ibm_db.fetch_tuple(stmt)
rowcount = 0
while (result):
rowcount += 1
if (is_array == True):
row = []
else:
row = {}
colcount = 0
for col in result:
try:
if (types[colcount] in ["int","bigint"]):
if (is_array == True):
row.append(int(col))
else:
row[columns[colcount]] = int(col)
elif (types[colcount] in ["decimal","real"]):
if (is_array == True):
row.append(float(col))
else:
row[columns[colcount]] = float(col)
elif (types[colcount] in ["date","time","timestamp"]):
if (is_array == True):
row.append(str(col))
else:
row[columns[colcount]] = str(col)
else:
if (is_array == True):
row.append(col)
else:
row[columns[colcount]] = col
except:
if (is_array == True):
row.append(col)
else:
row[columns[colcount]] = col
colcount += 1
rows.append(row)
result = ibm_db.fetch_tuple(stmt)
if (rowcount == 0):
sqlcode = 100
else:
sqlcode = 0
return rows
There are three possible COMMIT verbs that can bs used:
The statement is passed to this routine and then checked.
In [ ]:
def parseCommit(sql):
global _hdbc, _hdbi, _connected, _runtime, _stmt, _stmtID, _stmtSQL
if (_connected == False): return # Nothing to do if we are not connected
cParms = sql.split()
if (len(cParms) == 0): return # Nothing to do but this shouldn't happen
keyword = cParms[0].upper() # Upper case the keyword
if (keyword == "COMMIT"): # Commit the work that was done
try:
result = ibm_db.commit (_hdbc) # Commit the connection
if (len(cParms) > 1):
keyword = cParms[1].upper()
if (keyword == "HOLD"):
return
del _stmt[:]
del _stmtID[:]
except Exception as err:
db2_error(False)
return
if (keyword == "ROLLBACK"): # Rollback the work that was done
try:
result = ibm_db.rollback(_hdbc) # Rollback the connection
del _stmt[:]
del _stmtID[:]
except Exception as err:
db2_error(False)
return
if (keyword == "AUTOCOMMIT"): # Is autocommit on or off
if (len(cParms) > 1):
op = cParms[1].upper() # Need ON or OFF value
else:
return
try:
if (op == "OFF"):
ibm_db.autocommit(_hdbc, False)
elif (op == "ON"):
ibm_db.autocommit (_hdbc, True)
return
except Exception as err:
db2_error(False)
return
return
In [ ]:
def setFlags(inSQL):
global _flags
_flags = [] # Delete all of the current flag settings
pos = 0
end = len(inSQL)-1
inFlag = False
ignore = False
outSQL = ""
flag = ""
while (pos <= end):
ch = inSQL[pos]
if (ignore == True):
outSQL = outSQL + ch
else:
if (inFlag == True):
if (ch != " "):
flag = flag + ch
else:
_flags.append(flag)
inFlag = False
else:
if (ch == "-"):
flag = "-"
inFlag = True
elif (ch == ' '):
outSQL = outSQL + ch
else:
outSQL = outSQL + ch
ignore = True
pos += 1
if (inFlag == True):
_flags.append(flag)
return outSQL
In [ ]:
def flag(inflag):
global _flags
if isinstance(inflag,list):
for x in inflag:
if (x in _flags):
return True
return False
else:
if (inflag in _flags):
return True
else:
return False
In [ ]:
def splitSQL(inputString, delimiter):
pos = 0
arg = ""
results = []
quoteCH = ""
inSQL = inputString.strip()
if (len(inSQL) == 0): return(results) # Not much to do here - no args found
while pos < len(inSQL):
ch = inSQL[pos]
pos += 1
if (ch in ('"',"'")): # Is this a quote characters?
arg = arg + ch # Keep appending the characters to the current arg
if (ch == quoteCH): # Is this quote character we are in
quoteCH = ""
elif (quoteCH == ""): # Create the quote
quoteCH = ch
else:
None
elif (quoteCH != ""): # Still in a quote
arg = arg + ch
elif (ch == delimiter): # Is there a delimiter?
results.append(arg)
arg = ""
else:
arg = arg + ch
if (arg != ""):
results.append(arg)
return(results)
In [ ]:
@magics_class
class DB2(Magics):
@needs_local_scope
@line_cell_magic
def sql(self, line, cell=None, local_ns=None):
# Before we event get started, check to see if you have connected yet. Without a connection we
# can't do anything. You may have a connection request in the code, so if that is true, we run those,
# otherwise we connect immediately
# If your statement is not a connect, and you haven't connected, we need to do it for you
global _settings, _environment
global _hdbc, _hdbi, _connected, _runtime, sqlstate, sqlerror, sqlcode, sqlelapsed
# If you use %sql (line) we just run the SQL. If you use %%SQL the entire cell is run.
flag_cell = False
flag_output = False
sqlstate = "0"
sqlerror = ""
sqlcode = 0
sqlelapsed = 0
start_time = time.time()
end_time = time.time()
# Macros gets expanded before anything is done
SQL1 = setFlags(line.strip())
SQL1 = checkMacro(SQL1) # Update the SQL if any macros are in there
SQL2 = cell
if flag("-sampledata"): # Check if you only want sample data loaded
if (_connected == False):
if (db2_doConnect() == False):
errormsg('A CONNECT statement must be issued before issuing SQL statements.')
return
db2_create_sample(flag(["-q","-quiet"]))
return
if SQL1 == "?" or flag(["-h","-help"]): # Are you asking for help
sqlhelp()
return
if len(SQL1) == 0 and SQL2 == None: return # Nothing to do here
# Check for help
if SQL1.upper() == "? CONNECT": # Are you asking for help on CONNECT
connected_help()
return
sqlType,remainder = sqlParser(SQL1,local_ns) # What type of command do you have?
if (sqlType == "CONNECT"): # A connect request
parseConnect(SQL1,local_ns)
return
elif (sqlType == "DEFINE"): # Create a macro from the body
result = setMacro(SQL2,remainder)
return
elif (sqlType == "OPTION"):
setOptions(SQL1)
return
elif (sqlType == 'COMMIT' or sqlType == 'ROLLBACK' or sqlType == 'AUTOCOMMIT'):
parseCommit(remainder)
return
elif (sqlType == "PREPARE"):
pstmt = parsePExec(_hdbc, remainder)
return(pstmt)
elif (sqlType == "EXECUTE"):
result = parsePExec(_hdbc, remainder)
return(result)
elif (sqlType == "CALL"):
result = parseCall(_hdbc, remainder, local_ns)
return(result)
else:
pass
sql = SQL1
if (sql == ""): sql = SQL2
if (sql == ""): return # Nothing to do here
if (_connected == False):
if (db2_doConnect() == False):
errormsg('A CONNECT statement must be issued before issuing SQL statements.')
return
if _settings["maxrows"] == -1: # Set the return result size
pandas.reset_option('display.max_rows')
else:
pandas.options.display.max_rows = _settings["maxrows"]
runSQL = re.sub('.*?--.*$',"",sql,flags=re.M)
remainder = runSQL.replace("\n"," ")
if flag(["-d","-delim"]):
sqlLines = splitSQL(remainder,"@")
else:
sqlLines = splitSQL(remainder,";")
flag_cell = True
# For each line figure out if you run it as a command (db2) or select (sql)
for sqlin in sqlLines: # Run each command
sqlin = checkMacro(sqlin) # Update based on any macros
sqlType, sql = sqlParser(sqlin,local_ns) # Parse the SQL
if (sql.strip() == ""): continue
if flag(["-e","-echo"]): debug(sql,False)
if flag("-t"):
cnt = sqlTimer(_hdbc, _settings["runtime"], sql) # Given the sql and parameters, clock the time
if (cnt >= 0): print("Total iterations in %s second(s): %s" % (_settings["runtime"],cnt))
return(cnt)
elif flag(["-pb","-bar","-pp","-pie","-pl","-line"]): # We are plotting some results
plotData(_hdbi, sql) # Plot the data and return
return
else:
try: # See if we have an answer set
stmt = ibm_db.prepare(_hdbc,sql)
if (ibm_db.num_fields(stmt) == 0): # No, so we just execute the code
result = ibm_db.execute(stmt) # Run it
if (result == False): # Error executing the code
db2_error(flag(["-q","-quiet"]))
continue
rowcount = ibm_db.num_rows(stmt)
if (rowcount == 0 and flag(["-q","-quiet"]) == False):
errormsg("No rows found.")
continue # Continue running
elif flag(["-r","-array","-j","-json"]): # raw, json, format json
row_count = 0
resultSet = []
try:
result = ibm_db.execute(stmt) # Run it
if (result == False): # Error executing the code
db2_error(flag(["-q","-quiet"]))
return
if flag("-j"): # JSON single output
row_count = 0
json_results = []
while( ibm_db.fetch_row(stmt) ):
row_count = row_count + 1
jsonVal = ibm_db.result(stmt,0)
jsonDict = json.loads(jsonVal)
json_results.append(jsonDict)
flag_output = True
if (row_count == 0): sqlcode = 100
return(json_results)
else:
return(fetchResults(stmt))
except Exception as err:
db2_error(flag(["-q","-quiet"]))
return
else:
try:
df = pandas.read_sql(sql,_hdbi)
except Exception as err:
db2_error(False)
return
if (len(df) == 0):
sqlcode = 100
if (flag(["-q","-quiet"]) == False):
errormsg("No rows found")
continue
flag_output = True
if flag("-grid") or _settings['display'] == 'GRID': # Check to see if we can display the results
if (_environment['qgrid'] == False):
with pandas.option_context('display.max_rows', None, 'display.max_columns', None):
print(df.to_string())
else:
try:
pdisplay(qgrid.show_grid(df))
except:
errormsg("Grid cannot be used to display data with duplicate column names. Use option -a or %sql OPTION DISPLAY PANDAS instead.")
return
else:
if flag(["-a","-all"]) or _settings["maxrows"] == -1 : # All of the rows
pandas.options.display.max_rows = None
pandas.options.display.max_columns = None
return df # print(df.to_string())
else:
pandas.options.display.max_rows = _settings["maxrows"]
pandas.options.display.max_columns = None
return df # pdisplay(df) # print(df.to_string())
except:
db2_error(flag(["-q","-quiet"]))
continue # return
end_time = time.time()
sqlelapsed = end_time - start_time
if (flag_output == False and flag(["-q","-quiet"]) == False): print("Command completed.")
# Register the Magic extension in Jupyter
ip = get_ipython()
ip.register_magics(DB2)
load_settings()
success("Db2 Extensions Loaded.")
In [ ]:
%%sql define LIST
#
# The LIST macro is used to list all of the tables in the current schema or for all schemas
#
var syntax Syntax: LIST TABLES [FOR ALL | FOR SCHEMA name]
#
# Only LIST TABLES is supported by this macro
#
if {^1} <> 'TABLES'
exit {syntax}
endif
#
# This SQL is a temporary table that contains the description of the different table types
#
WITH TYPES(TYPE,DESCRIPTION) AS (
VALUES
('A','Alias'),
('G','Created temporary table'),
('H','Hierarchy table'),
('L','Detached table'),
('N','Nickname'),
('S','Materialized query table'),
('T','Table'),
('U','Typed table'),
('V','View'),
('W','Typed view')
)
SELECT TABNAME, TABSCHEMA, T.DESCRIPTION FROM SYSCAT.TABLES S, TYPES T
WHERE T.TYPE = S.TYPE
#
# Case 1: No arguments - LIST TABLES
#
if {argc} == 1
AND OWNER = CURRENT USER
ORDER BY TABNAME, TABSCHEMA
return
endif
#
# Case 2: Need 3 arguments - LIST TABLES FOR ALL
#
if {argc} == 3
if {^2}&{^3} == 'FOR&ALL'
ORDER BY TABNAME, TABSCHEMA
return
endif
exit {syntax}
endif
#
# Case 3: Need FOR SCHEMA something here
#
if {argc} == 4
if {^2}&{^3} == 'FOR&SCHEMA'
AND TABSCHEMA = '{^4}'
ORDER BY TABNAME, TABSCHEMA
return
else
exit {syntax}
endif
endif
#
# Nothing matched - Error
#
exit {syntax}
In [ ]:
%%sql define describe
#
# The DESCRIBE command can either use the syntax DESCRIBE TABLE <name> or DESCRIBE TABLE SELECT ...
#
var syntax Syntax: DESCRIBE [TABLE name | SELECT statement]
#
# Check to see what count of variables is... Must be at least 2 items DESCRIBE TABLE x or SELECT x
#
if {argc} < 2
exit {syntax}
endif
CALL ADMIN_CMD('{*0}');
Set the table formatting to left align a table in a cell. By default, tables are centered in a cell. Remove this cell if you don't want to change Jupyter notebook formatting for tables. In addition, we skip this code if you are running in a shell environment rather than a Jupyter notebook
In [ ]:
#%%html
#<style>
# table {margin-left: 0 !important; text-align: left;}
#</style>