Setup database connectivity

We'll reuse our module from the previous notebook (00_database_connectivity_setup.ipynb) to establish connectivity to the database


In [18]:
%run '00_database_connectivity_setup.ipynb'
IPython.display.clear_output()

Your connection object is conn:

  1. Queries: You can run your queries using psql.read_sql("""<YOUR SQL>""", conn). Alternatively, if you don't want a handle to the resulting dataframe, you can run the code inline using the magic command we defined in previously in cell: %%showsql.
  2. Create/Delete/Updates: You can run these statements using psql.execute("""<YOUR SQL>""", conn), followed by a conn.commit() command to ensure your transaction is committed. Otherwise your changes will be rolledback if you terminate your kernel. Alternatively, you could use the magic command that we previously defined in the cell: %%execsql.

If you created a new connection object (say to connect to a new cluster) as shown in the last section of 00_database_connectivity_setup.ipynb notebook, use that connection object where needed.

Data Exploration

CREATE/UPDATE/DELETE query


In [19]:
%%execsql
drop table if exists gp_ds_sample_table;
create temp table gp_ds_sample_table 
as 
(
    select 
        random() as x,
        random() as y
    from 
        generate_series(1, 10) x
) distributed randomly;

SELECT query


In [20]:
%%showsql
select
    *
from
    gp_ds_sample_table;


x y
0 0.701585 0.425520
1 0.449032 0.751700
2 0.208922 0.632537
3 0.214895 0.250567
4 0.554919 0.080594
5 0.039295 0.551742
6 0.884097 0.467457
7 0.490939 0.805070
8 0.474168 0.628187
9 0.672156 0.594877