Great Expectations Basics

This example walks through using basic expectations with crop data from the Food and Agriculture Organization of the United Nations (FAO).

Data are available here: http://www.fao.org/faostat/en/#home


In [0]:
import pandas as pd

import great_expectations as ge

In [0]:
df = pd.read_csv('tests/examples/FAO-Rice-Production-Asia.csv')

Exploratory Data Analysis

We typically need to investigate some properties of our data to understand what we can do with them. Jupyter makes that easy, and we will make extensive use of its features, including autocomplete.


In [0]:
df.head()

Reshape data


In [0]:
pivoted = df.pivot(index='Year', columns='Area', values='Value')

In [0]:
pivoted.head()
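To make the reshape concrete, here is a minimal sketch of the same `pivot` call on a tiny long-format table. The area names and values are made up for illustration and are not taken from the FAO file; only the column names `Area`, `Year`, and `Value` follow the call above.

```python
import pandas as pd

# Made-up long-format data: one row per (Area, Year) pair
long_df = pd.DataFrame({
    "Area": ["A", "A", "B", "B"],
    "Year": [2000, 2001, 2000, 2001],
    "Value": [10.0, 12.0, 20.0, 24.0],
})

# One row per year, one column per area, cells filled from Value
wide_df = long_df.pivot(index="Year", columns="Area", values="Value")
print(wide_df)
```

Each area becomes its own column, which is what lets us state one expectation per country later on. Note that `pivot` raises a `ValueError` if the same (Year, Area) pair appears more than once.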

Initialize the new dataset to work with Great Expectations


In [0]:
# Wrap the pivoted DataFrame so that expect_* methods are available on it
df = ge.df(pivoted)

In [0]:
df.expect_column_mean_to_be_between('Afghanistan', 15000, 25000)
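Conceptually, the expectation above computes the column mean and checks that it falls within the given bounds. The following plain-Python sketch illustrates that idea only; `check_mean_between` is a hypothetical helper, not part of Great Expectations, and the `observed_value` field is an assumption about what a useful result might carry. The `success` key matches the one read from the result later in this notebook.

```python
from statistics import mean


def check_mean_between(values, min_value, max_value):
    """Hypothetical stand-in for a mean-between check (inclusive bounds)."""
    observed = mean(values)
    return {
        "success": min_value <= observed <= max_value,
        "result": {"observed_value": observed},
    }


# Made-up numbers, chosen so that one check passes and one fails
inside = check_mean_between([18000, 20000, 22000], 15000, 25000)
outside = check_mean_between([100, 200, 300], 15000, 25000)
print(inside["success"], outside["success"])
```

The real method additionally records the expectation on the dataset, which is why it can be listed and saved afterwards.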

We might want to make expectations about lots of columns

In [0]:
for column in df.columns:
    # Check every column against the same bounds and report only the failures
    result = df.expect_column_mean_to_be_between(column, 15000, 25000)
    if not result['success']:
        print(column)
        print(result)
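When there are many columns, it can be easier to collect the failures than to print them one by one. The sketch below shows that pattern in plain Python, with made-up column names and values standing in for the pivoted data and a direct mean comparison standing in for the expectation call.

```python
from statistics import mean

# Made-up columns; in the notebook these would be the per-country series
columns = {
    "Alpha": [18000, 20000, 22000],
    "Beta": [100, 200, 300],
    "Gamma": [15000, 16000, 17000],
}
min_value, max_value = 15000, 25000

# Map each failing column name to its observed mean
failures = {}
for name, values in columns.items():
    observed = mean(values)
    if not (min_value <= observed <= max_value):
        failures[name] = observed

print(failures)
```

A dictionary of failures is easy to sort, count, or turn into a report, and it makes clear that a single range is unlikely to suit every country.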

Now we can view the expectations that we have created; the JSON output can also be saved to a file.


In [0]:
import json

In [0]:
print(json.dumps(df.get_expectation_suite(), indent=2))
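Saving the expectations amounts to writing that JSON to disk. The sketch below uses only the standard library; the `suite` dictionary is a simplified, hand-written stand-in for the printed output (a real suite carries more fields), and the file name is hypothetical.

```python
import json
import os
import tempfile

# Simplified stand-in for the suite printed above (assumed structure)
suite = {
    "expectations": [
        {
            "expectation_type": "expect_column_mean_to_be_between",
            "kwargs": {
                "column": "Afghanistan",
                "min_value": 15000,
                "max_value": 25000,
            },
        }
    ]
}

# Write the suite to a temporary directory; the file name is hypothetical
path = os.path.join(tempfile.mkdtemp(), "rice_expectations.json")
with open(path, "w") as f:
    json.dump(suite, f, indent=2)

# Read it back to confirm that the round trip preserves the content
with open(path) as f:
    reloaded = json.load(f)
```

Keeping the suite in a JSON file makes it straightforward to version alongside the data pipeline and to reload later.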