Coal Mining

Coal mining data from eia.gov

Goal: combine and clean the raw CSV files into a cleaned data set and a coherent database.

It is generally a good idea to keep the raw data in its own data folder.

When you clean the raw data, leave the raw files in place and write a separate cleaned version, with the cleaning steps recorded alongside it (an ideal use for a notebook).
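As a minimal sketch of that workflow: read from a raw folder, apply the cleaning steps in code, and write the result to a new file so the raw file is never overwritten. The file name `coal_prod_raw.csv`, the `data/raw` layout, the helper `clean_raw_csv`, and the toy rows are all assumptions for illustration, not the actual EIA files or the cleaning steps used for this data set.

```python
import tempfile
from pathlib import Path

import pandas as pd


def clean_raw_csv(raw_path, cleaned_path):
    """Read a raw CSV, apply cleaning steps, and write the result to a new file."""
    df = pd.read_csv(raw_path)
    # Step 1 (illustrative): normalise column names, replacing spaces with underscores.
    df = df.rename(columns=lambda c: c.strip().replace(" ", "_"))
    # Step 2 (illustrative): drop exact duplicate rows.
    df = df.drop_duplicates()
    # Write to a separate path; the raw file is only ever read.
    cleaned_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(cleaned_path, index=False)
    return df


# Demo on a throwaway directory with toy data (hypothetical layout and values).
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    raw_path = data_dir / "raw" / "coal_prod_raw.csv"
    cleaned_path = data_dir / "coal_prod_cleaned.csv"
    raw_path.parent.mkdir(parents=True)

    raw_text = (
        "MSHA ID,Mine Name,Year\n"
        "102838,Hebron Mine,2002\n"
        "102838,Hebron Mine,2002\n"
        "103184,Berry Mine,2002\n"
    )
    raw_path.write_text(raw_text)

    cleaned = clean_raw_csv(raw_path, cleaned_path)
    # The raw file should be byte-for-byte what it was before cleaning.
    raw_unchanged = raw_path.read_text() == raw_text
    print(cleaned)
```

Because every step lives in the function, rerunning the notebook regenerates the cleaned file from the untouched raw data.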


In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

import numpy as np
import pandas as pd

In [2]:
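# DataFrame.from_csv was removed in pandas 1.0; read_csv with index_col=0
# keeps the first column (MSHA_ID) as the index, as from_csv did by default.
df = pd.read_csv("../data/coal_prod_cleaned.csv", index_col=0)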

In [3]:
df.head()


Out[3]:
MSHA_ID | Average_Employees | Company_Type | Labor_Hours | Mine_Basin | Mine_County | Mine_Name | Mine_State | Mine_Status | Mine_Type | Operating_Company | Operating_Company_Address | Operation_Type | Production_short_tons | Union_Code | Year
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
102838 | 4 | Independent Producer Operator | 2712 | Appalachia Southern | Bibb | Hebron Mine | Alabama | Permanently abandoned | Surface | Birmingham Coal & Coke Company | 2477 Valleydale Rd. S. B3, Birmingham, AL 35244 | Mine only | 10572 | NaN | 2002
103184 | 5 | Independent Producer Operator | 2480 | Appalachia Southern | Fayette | Berry Mine | Alabama | Temporarily closed | Surface | Midas Coal Company Incorporate | 401 10th Avenue, S. E, Cullman, AL 35055 | Mine only | 9725 | NaN | 2002
100329 | 55 | Operating Subsidiary | 123618 | Appalachia Southern | Jefferson | Concord Mine | Alabama | Active | Underground | U S Steel Mining Company Llc | 8800 Oak Grove Mine Road, Adger, AL 35006 | Preparation Plant | 0 | United Mine Workers of America | 2002
100851 | 331 | Operating Subsidiary | 748182 | Appalachia Southern | Jefferson | Oak Grove Mine | Alabama | Active | Underground | U S Steel Mining Company Llc | 8800 Oak Grove Mine Rd, Adger, AL 35006 | Mine only | 1942153 | United Mine Workers of America | 2002
102354 | 28 | Independent Producer Operator | 55306 | Appalachia Southern | Jefferson | Lindbergh | Alabama | Active | Surface | C & H Mining Company Inc | P.O. Box 70250, Tuscaloosa, AL 35407 | Mine only | 168446 | NaN | 2002
