There are two parts of ibis that users typically want to extend:
This notebook will show you how to add a new elementwise operation, sha1, to an existing backend (PostgreSQL).
We're going to add a sha1 method to ibis. SHA1 is a hash algorithm, employed in systems such as git.
Let's define the sha1 operation as a function that takes one string input argument and returns a hexadecimal string.
sha1 :: String -> String
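To make the signature concrete, here is a minimal plain-Python sketch of the same String -> String function, built on the standard library's hashlib. The name sha1_hex and the choice of UTF-8 encoding are assumptions made for illustration; they are not part of ibis.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of `text` as a lowercase hexadecimal string."""
    # Hash functions operate on bytes, so encode the string first
    # (UTF-8 is an assumption of this sketch).
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# "abc" is a standard SHA-1 test vector.
print(sha1_hex("abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

A SHA-1 digest is 160 bits, so the hexadecimal string is always 40 characters long.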
In [ ]:
import ibis.expr.datatypes as dt
import ibis.expr.rules as rlz
from ibis.expr.operations import ValueOp, Arg
class SHA1(ValueOp):
    arg = Arg(rlz.string)
    output_type = rlz.shape_like('arg', 'string')
We just defined a SHA1 class that takes one argument of type string and returns a string. This matches the String -> String signature given above.
Because we know the output type of the operation, to make an expression out of SHA1 we simply need to construct it and call its ibis.expr.types.Node.to_expr method.
We still need to add a method to StringValue (this needs to work on both scalars and columns).
When you add a method to any of the expression classes whose name matches *Value, both the scalar and column child classes will pick it up, making it easy to define operations for both scalars and columns in one place.
We can do this by defining a function and assigning it to the appropriate class of expressions.
In [ ]:
from ibis.expr.types import StringValue, BinaryValue
def sha1(string_value):
    return SHA1(string_value).to_expr()
StringValue.sha1 = sha1
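The assignment above relies on ordinary Python attribute lookup: a function attached to a base class is visible to every subclass. A minimal sketch with hypothetical stand-in classes (ToyStringValue, ToyStringScalar and ToyStringColumn are illustrative names, not ibis classes) shows the mechanism:

```python
class ToyStringValue:
    """Hypothetical stand-in for a *Value expression class."""

    def __init__(self, name):
        self.name = name


class ToyStringScalar(ToyStringValue):
    """Hypothetical scalar child class."""


class ToyStringColumn(ToyStringValue):
    """Hypothetical column child class."""


def sha1(string_value):
    # A real implementation would build an operation node here;
    # this sketch only returns a description of the call.
    return f"sha1({string_value.name})"


# Attach the function to the parent class only.
ToyStringValue.sha1 = sha1

# Both child classes pick the method up through inheritance.
print(ToyStringScalar("'abcdefg'").sha1())
print(ToyStringColumn("string_col").sha1())
```

Because the method lives on the parent, there is a single definition to maintain for both scalars and columns.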
In [ ]:
import ibis
In [ ]:
t = ibis.table([('string_col', 'string')], name='t')
In [ ]:
t.string_col.sha1()
In [ ]:
import sqlalchemy as sa
@ibis.postgres.compiles(SHA1)
def compile_sha1(translator, expr):
    # pull out the arguments to the expression
    arg, = expr.op().args
    # compile the argument
    compiled_arg = translator.translate(arg)
    # return a SQLAlchemy expression that calls into the PostgreSQL pgcrypto extension
    return sa.func.encode(sa.func.digest(compiled_arg, 'sha1'), 'hex')
NOTE:
To be able to execute the rest of this notebook you need to run the following command from your ibis clone:
ci/build.sh
In [ ]:
import ibis
con = ibis.postgres.connect(
    database='ibis_testing', user='postgres', host='postgres', password='postgres')
See https://www.postgresql.org/docs/10/static/pgcrypto.html for details about the pgcrypto extension.
In [ ]:
# the output here is an AlchemyProxy instance that cannot iterate
# (because there's no output from the database) so we hide it with a semicolon
con.raw_sql('CREATE EXTENSION IF NOT EXISTS pgcrypto');
In [ ]:
t = con.table('functional_alltypes')
t
In [ ]:
sha1_expr = t.string_col.sha1()
sha1_expr
In [ ]:
sql_expr = sha1_expr.compile()
print(sql_expr)
In [ ]:
result = sha1_expr.execute()
In [ ]:
result.head()
Because we've defined our operation on StringValue, and not just on StringColumn, we get operations on both string scalars and string columns for free.
In [ ]:
string_scalar = ibis.literal('abcdefg')
string_scalar
In [ ]:
sha1_scalar = string_scalar.sha1()
In [ ]:
con.execute(sha1_scalar)