In [1]:
#SKIP_COMPARE_OUTPUT
import pixiedust

pixiedust.installPackage("com.databricks:spark-csv_2.10:1.5.0")
pixiedust.installPackage("org.apache.commons:commons-csv:0")


Pixiedust database opened successfully
Pixiedust version 1.1.10
Package already installed: com.databricks:spark-csv_2.10:1.5.0
Package already installed: org.apache.commons:commons-csv:0
Out[1]:
<pixiedust.packageManager.package.Package at 0x110763050>

In [2]:
pixiedust.sampleData()


Id Name Topic Publisher
1 Car performance data Transportation IBM
2 Sample retail sales transactions, January 2009 Economy & Business IBM Cloud Data Services
3 Total population by country Society IBM Cloud Data Services
4 GoSales Transactions for Naive Bayes Model Leisure IBM
5 Election results by County Society IBM
6 Million dollar home sales in Massachusetts, USA Feb 2017 through Jan 2018 Economy & Business Redfin.com
7 Boston Crime data, 2-week sample Society City of Boston

In [4]:
#SKIP_COMPARE_OUTPUT
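from pyspark.sql import SQLContext  # explicit import, in case the kernel has not pre-imported it

sqlContext = SQLContext(sc)  # `sc` is the SparkContext provided by the notebook kernel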
dd = pixiedust.sampleData(1)


Creating pySpark DataFrame for 'Car performance data'. Please wait...
Loading file using 'com.databricks.spark.csv'
Successfully created pySpark DataFrame for 'Car performance data'

In [5]:
dd.count()


Out[5]:
406

In [6]:
#SKIP_COMPARE_OUTPUT
display(dd, no_gen_tests='true')


Some labels are not displayed because of a lack of space. Click on Stretch image to see them all
Warning: While great, D3 rendering is using MPLD3 library which has limitations that have not yet been fixed

In [7]:
display(dd,cell_id='174EF8FEFACF47F9811183C2C0EE3DC3',showLegend='true',rowCount='25',mpld3='true',aggregation='SUM',valueFields='mpg,engine',charttype='subplots',keyFields='name',handlerId='barChart',rendererId='matplotlib',nostore_figureOnly='true',nostore_cw='1098',nostore_bokeh='false',prefix='850663c0')


Some labels are not displayed because of a lack of space. Click on Stretch image to see them all
Warning: While great, D3 rendering is using MPLD3 library which has limitations that have not yet been fixed
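
The `display` call above combines `keyFields='name'`, `valueFields='mpg,engine'` and `aggregation='SUM'`. As a rough illustration of what such a setting appears to request, the sketch below sums each value field per distinct key in plain Python. It assumes the aggregation behaves like a group-by followed by a sum; the rows and the helper `sum_by_key` are hypothetical and are not part of PixieDust or of the car data set.

```python
from collections import defaultdict

# Hypothetical rows standing in for the car data; the values are illustrative only.
rows = [
    {"name": "car a", "mpg": 18.0, "engine": 307.0},
    {"name": "car b", "mpg": 15.0, "engine": 350.0},
    {"name": "car a", "mpg": 20.0, "engine": 300.0},
]


def sum_by_key(rows, key_field, value_fields):
    """Sum each value field per distinct key (assumed meaning of aggregation='SUM')."""
    totals = defaultdict(lambda: {field: 0.0 for field in value_fields})
    for row in rows:
        for field in value_fields:
            totals[row[key_field]][field] += row[field]
    return dict(totals)


totals = sum_by_key(rows, "name", ["mpg", "engine"])
for name, sums in sorted(totals.items()):
    print(name, sums)
```

Each key then yields one bar per value field, which is the data a bar chart handler would draw.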

In [8]:
display(dd,cell_id='174EF8FEFACF47F9811183C2C0EE3DC3',showLegend='true',rowCount='25',mpld3='true',aggregation='SUM',valueFields='mpg,engine',charttype='stacked',keyFields='name',handlerId='barChart',rendererId='matplotlib',nostore_figureOnly='true',nostore_cw='1098',nostore_bokeh='false',prefix='7e232629')


Some labels are not displayed because of a lack of space. Click on Stretch image to see them all
Warning: While great, D3 rendering is using MPLD3 library which has limitations that have not yet been fixed

In [9]:
display(dd,cell_id='174EF8FEFACF47F9811183C2C0EE3DC3',showLegend='true',rowCount='25',mpld3='true',aggregation='SUM',valueFields='mpg,engine',charttype='grouped',keyFields='name',handlerId='barChart',rendererId='matplotlib',nostore_figureOnly='true',nostore_cw='1098',nostore_bokeh='false',prefix='3cdb283f')


Some labels are not displayed because of a lack of space. Click on Stretch image to see them all
Warning: While great, D3 rendering is using MPLD3 library which has limitations that have not yet been fixed
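
The last two cells differ only in `charttype` (`'stacked'` versus `'grouped'`). The sketch below shows, in plain Python, how the two layouts conventionally place bars: stacked bars share an x position and start where the previous series ended, while grouped bars all start at zero and sit side by side. This is a generic illustration of the two layouts, not PixieDust's rendering code; the helpers `stacked_layout` and `grouped_layout`, the `group_width` parameter, and the sample values are all hypothetical.

```python
def stacked_layout(series):
    """Return (x, bottom, height) per bar; each series stacks on top of the previous one."""
    bottoms = [0.0] * len(series[0])
    bars = []
    for values in series:
        for i, value in enumerate(values):
            bars.append((float(i), bottoms[i], value))
            bottoms[i] += value
    return bars


def grouped_layout(series, group_width=0.8):
    """Return (x, bottom, height) per bar; series sit side by side around each key slot."""
    width = group_width / len(series)
    bars = []
    for s, values in enumerate(series):
        for i, value in enumerate(values):
            x = i - group_width / 2 + width * (s + 0.5)
            bars.append((x, 0.0, value))
    return bars


# Illustrative values for two keys and two value fields.
mpg = [18.0, 15.0]
engine = [307.0, 350.0]

stacked = stacked_layout([mpg, engine])
grouped = grouped_layout([mpg, engine])
```

In the stacked layout the total bar height per key is the sum of the fields, so fields with very different scales (as here) can make the smaller one hard to read; the grouped layout keeps each field's own baseline.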
