This Spark notebook connects to BigInsights on Cloud using BigSQL.

This notebook runs successfully on standalone spark-1.6.1-bin-hadoop2.6 and prints the first ten rows of the resulting DataFrame, like this:

[Row(F1=77.0, F2=-16.200000762939453, F3=7.81678581237793), Row(F1=77.0, F2=-16.200000762939453, F3=7.528648376464844), Row(F1=77.0, F2=-16.200000762939453, F3=7.240304946899414), Row(F1=77.0, F2=-16.200000762939453, F3=6.9515509605407715), Row(F1=77.0, F2=-16.200000762939453, F3=6.6621809005737305), Row(F1=77.0, F2=-16.200000762939453, F3=8.371989250183105), Row(F1=77.0, F2=-16.200000762939453, F3=10.080772399902344), Row(F1=77.0, F2=-16.200000762939453, F3=11.788325309753418), Row(F1=77.0, F2=-16.200000762939453, F3=13.494444847106934), Row(F1=77.0, F2=-16.200000762939453, F3=15.198928833007812)]

The notebook environment is:

Notebook server: 3.2.0-8b0eef4 | Python 2.7.11 |Anaconda 2.3.0 (x86_64)| (default, Dec  6 2015, 18:57:58) 
[GCC 4.2.1 (Apple Inc. build 5577)]

Credentials - keep this secret!


In [11]:
cluster  = '10451'    #  E.g. 10000
username = 'biadmin'  #  E.g. biadmin
password = ''         #  Please request password from chris.snow@uk.ibm.com
table    = 'biadmin.rowapplyout'  #  BigSQL table to query
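
Because the password must stay secret, one option is to keep it out of the notebook source altogether. The following is a minimal sketch, not part of the original notebook: the helper name `read_password` and the environment variable `BIGSQL_PASSWORD` are hypothetical names chosen for illustration.

```python
import getpass
import os


def read_password(env_var='BIGSQL_PASSWORD'):
    """Return the password from the environment, or prompt for it."""
    # env_var is a hypothetical variable name; use whatever suits your setup
    value = os.environ.get(env_var)
    if value:
        return value
    # getpass hides the characters as they are typed
    return getpass.getpass('BigSQL password: ')
```

In the credentials cell this would replace the string literal with `password = read_password()`, so a saved or shared copy of the notebook never contains the secret.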

Code to connect to BigInsights on Cloud via Hive and BigSQL ...


In [12]:
import os
cwd = os.getcwd()

cls_host = 'ehaasp-{0}-mastermanager.bi.services.bluemix.net'.format(cluster)
sql_host = 'ehaasp-{0}-master-2.bi.services.bluemix.net'.format(cluster)

Get the cluster certificate


In [13]:
!openssl s_client -showcerts -connect {cls_host}:9443 < /dev/null | openssl x509 -outform PEM > certificate
    
# uncomment this for debugging
#!cat certificate


depth=0 CN = ehaasp-10451-mastermanager.bi.services.bluemix.net, O = IBM, C = US
verify error:num=18:self signed certificate
verify return:1
depth=0 CN = ehaasp-10451-mastermanager.bi.services.bluemix.net, O = IBM, C = US
verify return:1
DONE
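
If the `openssl` command line tool is not available, the same certificate can probably be fetched with Python's standard `ssl` module. This is a sketch under that assumption rather than the method the notebook was tested with; the function name `fetch_cluster_certificate` is hypothetical, and the port and file name simply mirror the cell above.

```python
import ssl


def fetch_cluster_certificate(host, port=9443, path='certificate'):
    """Download the server certificate as PEM text and save it to `path`."""
    # Without a ca_certs argument the certificate is returned without being
    # validated, so a self-signed certificate is accepted as well.
    pem = ssl.get_server_certificate((host, port))
    with open(path, 'w') as handle:
        handle.write(pem)
    return pem
```

Calling `fetch_cluster_certificate(cls_host)` would write the PEM file that the next step imports into the truststore. As with the `openssl` command, the certificate is trusted on first use, so it is worth inspecting before relying on it.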

Add the cluster certificate to a truststore


In [14]:
!rm -f truststore.jks
!keytool -import -trustcacerts -alias biginsights -file certificate -keystore truststore.jks -storepass mypassword -noprompt


Certificate was added to keystore
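
A `!` shell command does not stop the notebook when it fails, so a broken truststore may only show up later as a JDBC error. The sketch below runs the same `keytool` import through `subprocess` so that a failure raises immediately. The function names are hypothetical; the arguments are the ones used in the cell above.

```python
import os
import subprocess


def keytool_import_command(cert_path, store_path, store_password,
                           alias='biginsights'):
    """Build the argument list for the keytool import shown above."""
    return [
        'keytool', '-import', '-trustcacerts', '-noprompt',
        '-alias', alias,
        '-file', cert_path,
        '-keystore', store_path,
        '-storepass', store_password,
    ]


def build_truststore(cert_path='certificate', store_path='truststore.jks',
                     store_password='mypassword'):
    """Recreate the truststore; raises CalledProcessError if keytool fails."""
    # Equivalent of `rm -f truststore.jks`: start from an empty store
    if os.path.exists(store_path):
        os.remove(store_path)
    command = keytool_import_command(cert_path, store_path, store_password)
    subprocess.check_call(command)
```

Calling `build_truststore()` with no arguments reproduces the two shell commands above, assuming `keytool` is on the `PATH`.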

Now attempt to connect to BigInsights on Cloud


In [15]:
# test bigsql
# sslTrustStorePassword must match the -storepass value given to keytool
url  = 'jdbc:db2://{0}:51000/bigsql:user={1};password={2};sslConnection=true;sslTrustStoreLocation={3}/truststore.jks;sslTrustStorePassword=mypassword;'.format(sql_host, username, password, cwd)
df = sqlContext.read.format('jdbc').options(url=url, driver='com.ibm.db2.jcc.DB2Driver', dbtable=table).load()

print(df.take(10))


[Row(F1=77.0, F2=-16.200000762939453, F3=7.81678581237793), Row(F1=77.0, F2=-16.200000762939453, F3=7.528648376464844), Row(F1=77.0, F2=-16.200000762939453, F3=7.240304946899414), Row(F1=77.0, F2=-16.200000762939453, F3=6.9515509605407715), Row(F1=77.0, F2=-16.200000762939453, F3=6.6621809005737305), Row(F1=77.0, F2=-16.200000762939453, F3=8.371989250183105), Row(F1=77.0, F2=-16.200000762939453, F3=10.080772399902344), Row(F1=77.0, F2=-16.200000762939453, F3=11.788325309753418), Row(F1=77.0, F2=-16.200000762939453, F3=13.494444847106934), Row(F1=77.0, F2=-16.200000762939453, F3=15.198928833007812)]
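
The JDBC URL is long enough that a missing semicolon or a misspelled property is easy to overlook. As a sketch, the URL can be assembled from named parts instead. The helper `bigsql_jdbc_url` is a hypothetical name, and it passes the truststore password as `sslTrustStorePassword`, the property the IBM JDBC driver documents for that purpose.

```python
def bigsql_jdbc_url(host, user, password, truststore_path,
                    truststore_password, port=51000, database='bigsql'):
    """Assemble a DB2 JDBC URL with SSL properties for BigSQL."""
    properties = [
        ('user', user),
        ('password', password),
        ('sslConnection', 'true'),
        ('sslTrustStoreLocation', truststore_path),
        ('sslTrustStorePassword', truststore_password),
    ]
    # Properties follow the database name after a colon, each as "name=value;"
    suffix = ''.join('{0}={1};'.format(name, value)
                     for name, value in properties)
    return 'jdbc:db2://{0}:{1}/{2}:{3}'.format(host, port, database, suffix)
```

With the variables defined earlier, `bigsql_jdbc_url(sql_host, username, password, cwd + '/truststore.jks', 'mypassword')` gives a URL suitable for the `url` option of the JDBC reader. The sketch does no escaping, so it assumes none of the values contain a semicolon.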
