pandas-profiling: School Food Orders


In [1]:
!pip install pandas-profiling


Requirement already satisfied: pandas-profiling in /Users/monkee/anaconda3/lib/python3.6/site-packages
Requirement already satisfied: pandas>=0.19 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas-profiling)
Requirement already satisfied: matplotlib>=1.4 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas-profiling)
Requirement already satisfied: six>=1.9 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas-profiling)
Requirement already satisfied: jinja2>=2.8 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas-profiling)
Requirement already satisfied: python-dateutil>=2 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas>=0.19->pandas-profiling)
Requirement already satisfied: pytz>=2011k in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas>=0.19->pandas-profiling)
Requirement already satisfied: numpy>=1.7.0 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from pandas>=0.19->pandas-profiling)
Requirement already satisfied: cycler>=0.10 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from matplotlib>=1.4->pandas-profiling)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=1.5.6 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from matplotlib>=1.4->pandas-profiling)
Requirement already satisfied: MarkupSafe>=0.23 in /Users/monkee/anaconda3/lib/python3.6/site-packages (from jinja2>=2.8->pandas-profiling)

Import libraries


In [2]:
import pandas as pd
import pandas_profiling
import glob

Load and prepare dataset


In [3]:
# Single-file alternative:
# df = pd.read_csv("School_Food_Orders.csv", parse_dates=['Delivery_Time'], encoding='UTF-8')

# Read every part file and combine them into one DataFrame.
all_files = sorted(glob.glob("School_Food_Orders_Parts.*"))
concatenated_df = pd.concat((pd.read_csv(f, parse_dates=['Delivery_Time']) for f in all_files), ignore_index=True)
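
If the glob pattern matches no files, `pd.concat` fails with `ValueError: No objects to concatenate`, which does not point at the real cause. Below is a minimal sketch of a guarded loader, not the notebook's own method. The helper name `load_parts`, the `Quantity` column, and the two sample part files written to a temporary directory are all hypothetical and exist only to make the example self-contained.

```python
import glob
import os
import tempfile

import pandas as pd


def load_parts(pattern, date_column="Delivery_Time"):
    """Read every CSV matching `pattern` and stack them into one DataFrame."""
    files = sorted(glob.glob(pattern))  # sorted for a reproducible row order
    if not files:
        # Fail early with a message that names the pattern.
        raise FileNotFoundError(f"no files match {pattern!r}")
    frames = (pd.read_csv(f, parse_dates=[date_column]) for f in files)
    return pd.concat(frames, ignore_index=True)


# Demo: two tiny made-up part files (hypothetical data, not the real dataset).
tmp_dir = tempfile.mkdtemp()
samples = [("a", "2018-01-02 08:00:00,10\n"), ("b", "2018-01-03 09:30:00,20\n")]
for suffix, row in samples:
    path = os.path.join(tmp_dir, f"School_Food_Orders_Parts.{suffix}")
    with open(path, "w") as fh:
        fh.write("Delivery_Time,Quantity\n" + row)

demo_df = load_parts(os.path.join(tmp_dir, "School_Food_Orders_Parts.*"))
print(demo_df.dtypes)
```

Sorting the file list makes the row order of the combined frame independent of the order in which the file system returns matches.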

Inline report without saving object


In [ ]:
pandas_profiling.ProfileReport(concatenated_df)



Save report to file


In [ ]:
pfr = pandas_profiling.ProfileReport(concatenated_df)
pfr.to_file("school_food_orders.html")

In [ ]:
# Display the existing ProfileReport object inline
pfr