This tutorial shows how to use the Feed Grains open data source, provided by the USDA Economic Research Service, to learn about grain production and its history. It also explores grain prices in relation to production levels to see whether there is evidence of a direct correlation, as supply and demand would suggest.
Before we start the tutorial though, let's just give a simple introduction to the different grain types we'll be talking about!
Now that you know what these grains are, we can get started with the tutorial!
The following topics will be covered in this tutorial:
Before we start, first run the lines below to import the necessary libraries. numpy will be used for numerical calculations, pandas will be used to load the data and do most of the parsing and processing, and matplotlib will be used to graph the data.
In [1]:
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
Now that the necessary libraries are loaded, let's load the grain data. The data is in CSV (comma-separated values) format, in which the cells of each row are separated by commas. No manual reading or formatting is necessary, since the file can be parsed with pandas' read_csv method.
Download the FeedGrains.zip file from the USDA Economic Research Service website: https://www.ers.usda.gov/data-products/feed-grains-database/feed-grains-yearbook-tables.aspx. Unzip FeedGrains.zip so that FeedGrains.csv sits in a FeedGrains folder (a separate folder isn't strictly necessary for a single file, but it keeps things tidy). Copy the folder into the same folder as this notebook and load the data using the command below.
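If you'd rather fetch and unpack the archive from Python instead of by hand, a sketch along these lines works; the download URL below is a placeholder, so substitute the actual FeedGrains.zip link from the ERS page above.

import os
import urllib.request
import zipfile

url = 'https://www.ers.usda.gov/.../FeedGrains.zip'   # placeholder: use the real link from the page above

os.makedirs('FeedGrains', exist_ok=True)              # folder next to this notebook
urllib.request.urlretrieve(url, 'FeedGrains.zip')     # download the archive
with zipfile.ZipFile('FeedGrains.zip') as z:
    z.extractall('FeedGrains')                        # unzip so FeedGrains/FeedGrains.csv exists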
In [2]:
df = pd.read_csv("FeedGrains/FeedGrains.csv")
df.head()
Out[2]:
One line is all it takes to load a CSV file with pandas! Next, we will process the data to make it easier to work with.
As you can see, the column names are odd and meaningless without prior knowledge of the dataset. To help with that, here is a list describing what each column represents:
Column name | Description
---|---
SC_Group_ID | numeric ID of the data group
SC_Group_Desc | description of the data group
SC_GroupCommod_ID | numeric ID of the commodity group
SC_GroupCommod_Desc | description of the commodity group
SC_Geography_ID | numeric ID of the data's location
SortOrder | sort-order field (appears to be a weighted sort order, possibly by country; its exact meaning is unclear)
SC_GeographyIndented_Desc | description of the location
SC_Commodity_ID | numeric ID of the specific commodity type
SC_Commodity_Desc | description of the specific commodity type
SC_Attribute_ID | numeric ID of the data's attribute
SC_Attribute_Desc | description of the data's attribute
SC_Unit_ID | numeric ID of the unit type
SC_Unit_Desc | description of the unit type
Year_ID | year the data represents
SC_Frequency_ID | numeric ID of the frequency at which the data is collected
SC_Frequency_Desc | description of the frequency at which the data is collected
Timeperiod_ID | numeric ID of the time period the data represents
Timeperiod_Desc | description of the time period the data represents
Amount | amount, in the given unit, of the attribute for the given commodity type
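You can check these column names and their types against your copy of the file directly from the loaded dataframe:

df.info()   # column names, non-null counts, and dtypes for every column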
Note that 'commodity' is roughly synonymous with 'grain' for the most part. However, since the data also includes products such as malted barley and alcohol, which aren't grains, we will stick with 'commodity'.
You probably noticed that each field is described by two columns (an ID and a description). To make the table easier to work with and narrower, we will now delete all the 'duplicate' ID columns using the code below.
In [3]:
del df['SC_Group_ID']
del df['SC_GroupCommod_ID']
del df['SortOrder']
del df['SC_Geography_ID']
del df['SC_Commodity_ID']
del df['SC_Attribute_ID']
del df['SC_Unit_ID']
del df['SC_Frequency_ID']
del df['Timeperiod_ID']
df.head()
Out[3]:
You can see that the table is much cleaner now that there are no more duplicate columns. Also note that you can only run the above code once on the dataframe; running it again will raise an error, since the columns have already been deleted and can't be found.
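If you'd rather have a cell that's safe to re-run, pandas' drop method takes errors='ignore', which silently skips columns that have already been removed. A sketch of the equivalent, re-runnable version:

id_cols = ['SC_Group_ID', 'SC_GroupCommod_ID', 'SortOrder', 'SC_Geography_ID',
           'SC_Commodity_ID', 'SC_Attribute_ID', 'SC_Unit_ID',
           'SC_Frequency_ID', 'Timeperiod_ID']
# errors='ignore' skips any column that has already been deleted
df = df.drop(columns=id_cols, errors='ignore')
df.head()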
Now that we have cleared out the duplicate columns, let's rename the remaining columns to be more descriptive and intuitive for our purposes. You can use the following line to manually change the column names.
In [4]:
df.columns = ['Group', 'Commod_Group', 'Location', 'Commod', 'Attribute', 'Unit', 'Year', 'Frequency', 'Time Period', 'Amount']
print(df.shape)
df.head()
Out[4]:
This looks much more pleasant than the initial table (in my opinion; you might think differently, but at least the table is smaller).
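Assigning to df.columns works, but it relies on the columns being in exactly this order. An order-independent alternative is to pass an explicit mapping to rename; a sketch of the equivalent call, based on the column list above:

df = df.rename(columns={
    'SC_Group_Desc': 'Group',
    'SC_GroupCommod_Desc': 'Commod_Group',
    'SC_GeographyIndented_Desc': 'Location',
    'SC_Commodity_Desc': 'Commod',
    'SC_Attribute_Desc': 'Attribute',
    'SC_Unit_Desc': 'Unit',
    'Year_ID': 'Year',
    'SC_Frequency_Desc': 'Frequency',
    'Timeperiod_Desc': 'Time Period',
})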
You might have noticed that there's an extra line before the table above. This is the shape of the dataframe, a tuple of (number of rows, number of columns). There is a huge amount of data, totaling over 400,000 rows. Modern computers can process that amount relatively quickly, but we will still extract only the data we need. It is a good habit: the dataframe is easier to work with once unnecessary data is cleared out, and the habit pays off when working with even larger datasets in the future.
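If you're curious how heavy the full table is before trimming it down, pandas can report its shape and approximate in-memory size (the exact numbers will depend on your copy of the file):

print(df.shape)                                       # (rows, columns)
print(df.memory_usage(deep=True).sum() / 1e6, 'MB')   # rough in-memory footprint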
For the purposes of this tutorial, we will only be looking at the data for the US, so we can begin by extracting all the rows whose Location is United States. Run the following code to achieve this.
In [5]:
df_us = df[df.Location.isin(['United States'])]
print(df_us.shape)
You can see that now the number of rows has decreased from over 400,000 to just a little over 40,000.
We saved the new extracted dataframe into a new variable df_us. This way, if we still need to find something in the original dataset, we can easily reference it without having to load the data again.
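One small caution with this pattern: df_us is a slice of df, so pandas may emit a SettingWithCopyWarning if you later modify it in place. Taking an explicit copy, as sketched below, sidesteps that.

# .copy() gives df_us its own data, so later edits can't warn about (or touch) the original df
df_us = df[df.Location.isin(['United States'])].copy()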
There are multiple groups that the data is categorized into. Before we can look at them, let's first sort the dataframe and get a list of the unique groups.
In [6]:
df_us = df_us.sort_values(by=['Group'])
names = df_us['Group'].unique().tolist()
names
Out[6]:
We now have a list of unique groups in names. The next step before we can begin to analyze our data is to filter the dataframe again. We will achieve this by creating 6 different dataframes, one for each data group.
In [7]:
data = {}
for i in range(0, 6):
    data[names[i]] = df_us[df_us.Group.isin([names[i]])]
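As an aside, groupby can build the same dictionary in a single expression (equivalent here, assuming names holds all six groups):

# one sub-dataframe per data group, keyed by the group name
data = {name: sub_df for name, sub_df in df_us.groupby('Group')}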
Now that we have separated the data into their respective groups, we can begin analyzing and visualizing it.
In [8]:
data['Supply and use'].head()
Out[8]:
We can see that even within this data group, there are many other variables we can use to analyze the data. Next, we will analyze and graph the annual beginning stocks for each grain.
To start, let's look at the annual beginning stock of each commodity. Once again, before we can look at the data, we need to filter and sort it.
In the Attribute column, you can see that there are a variety of attributes. For this section, we will only be looking at Beginning stocks, so first let's extract only those rows.
In [9]:
sau = data['Supply and use']
sau = sau[sau.Attribute.isin(['Beginning stocks'])]
You will notice that in the Frequency column, there are Annual and Quarterly data values. For simplicity, we will only be looking at annual values so we will extract only the rows with Annual.
In [10]:
sau = sau[sau.Frequency.isin(['Annual'])]
Now that we have the data we need, let's sort it by Year and take a look!
In [11]:
sau = sau.sort_values(by='Year')
sau.head()
Out[11]:
Finally, we can start to graph and visualize the data! To begin, we will make line graphs for each commodity to view the change in the amount of beginning stock for each year. Hopefully we will find some relationship!
In [12]:
for i, group in sau.groupby('Commod_Group'):
    plot = group.plot(x='Year', y='Amount', title=str(i))
    y = 'Beginning Stocks (' + group.Unit.iloc[0] + ')'
    plot.set(xlabel='Year', ylabel=y)
    plt.show()
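Separate panels make each series easy to read, but it can also help to see the shapes side by side. A sketch that pivots the table so each commodity group becomes its own column and plots them together (the units differ between groups, so this only compares the shapes of the curves, and pivot_table averages any duplicate Year/group entries):

# rows become years, columns become commodity groups, values are beginning stocks
wide = sau.pivot_table(index='Year', columns='Commod_Group', values='Amount')
ax = wide.plot(title='Beginning stocks by commodity group')
ax.set(xlabel='Year', ylabel='Beginning stocks (units vary by group)')
plt.show()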
Personally, I would have expected the beginning stock of grains to increase each year, or at least show an increasing trend, due to improvements in agricultural techniques. However, the graphs make it clear that something happened between 1986 and 1988: one event must have led to a massive surplus in 1986, while another must have led to a sharp drop by 1988. A quick search on "corn supply 1986" and "corn supply 1987" turned up two articles that explain what happened.
http://agfax.com/2016/11/29/grain-market-reminded-of-1986-king-daddy-year-of-corn-stocks-dtn/
The article linked above describes how "four consecutive years of good crop weather" left not only the US but the entire world with excess grain supplies. Only about 66% of the corn stock was used, which explains the massive surplus in that period.
http://articles.latimes.com/1988-07-13/business/fi-5672_1_food-prices
This article contrasts grain production in 1988 with 1987. A severe drought was estimated to kill off 30%-45% of the projected harvest. As a result, the surplus that had built up over the previous years would have been depleted quickly over the following years of bad weather.
Based on this, it seems that weather has a much greater effect on grain supply than agricultural techniques do. However, that correlation can't be concluded without looking at weather data and trying to find a relation between the two datasets, and we won't be doing that today, since this is a tutorial on getting started with the grain data. Instead, we will explore more subsets of FeedGrains.csv, specifically prices!
Looking at stocks alone only goes so far, so next let's take a look at the prices of the grains above. We will begin by graphing annual grain prices to see if we can find a basic visual correlation with the stocks.
Prices won't be found in the Supply and use group, so we will be looking at the Prices data group for this part.
In [13]:
data['Prices'].head()
Out[13]:
Similar to the Supply and use data group, Prices contains a variety of information. Once again, we will start by filtering out unnecessary data.
Let's start by sorting the data by year and see where to go from there.
In [21]:
p = data['Prices']
p = p.sort_values(by='Year')
p.head()
Out[21]:
Once again, we can see that there is detailed data for each quarter or month of the year, but for simplicity we will again be looking at annual prices. Staying consistent with the annual grain stocks data will also make the comparison easy. Let's now extract only the annual data.
In [22]:
p = p[p.Frequency.isin(['Annual'])]
p.head()
Out[22]:
You may have noticed a difference between the two datasets: the Prices data has a Hay group, while the Supply and use data has Feed grains. These appear to be the same group of data, considering that hay is used as animal feed (most people choose not to eat hay, preferring oats, sorghum, barley, and corn). Before coming to this conclusion, let's take a look at what groups are in the Prices data.
In [16]:
p['Commod_Group'].unique().tolist()
Out[16]:
Just like the Supply and use data, there are 5 commodity groups, so we will assume that Hay corresponds to Feed grains.
You probably also noticed that the specific commodities in Prices differ from those in the Supply and use data. Although there are many specific grain types, and farmers are paid different prices depending on the specific product being made from their produce, we will look at the grain groups as a whole so they are easier to compare.
Next, we see that more recent years have additional attributes distinguishing between farmers' pay on the high end, low end, and average. We will take only the average pay rate in order to maximize the usable data (early years don't distinguish between high and low end). We can extract only the average prices with:
In [23]:
p = p[p.Attribute.isin(['Prices received by farmers'])]
p.head()
Out[23]:
After filtering some more of the data, we can see that there is still an inconsistency: Sorghum is listed under two units, Dollars per bushel and Dollars per cwt. We will only be looking at Dollars per bushel, so next we will remove the Dollars per cwt rows. The code below is slightly different from the previous extractions: instead of keeping rows with a certain value, we keep rows that don't contain a certain value, effectively removing the unwanted data.
In [24]:
p = p[p.Unit != 'Dollars per cwt']
p.head()
Out[24]:
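For reference, the same removal can be written with isin and the ~ (negation) operator, which scales better if several unit types ever need to be dropped:

# ~ inverts the mask, keeping every row whose Unit is NOT in the list
p = p[~p.Unit.isin(['Dollars per cwt'])]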
There don't seem to be any more data inconsistencies, so we can finally graph the data. We will graph by Commod this time instead of Commod_Group, since Commod is more consistent. Hopefully we can find some trends!
In [19]:
for i, group in p.groupby('Commod'):
    if i in ['Corn', 'Hay all (dry)', 'Barley', 'Oats', 'Sorghum']:
        plot = group.plot(x='Year', y='Amount', title=str(i))
        plot.set(xlabel='Year', ylabel=group.Unit.iloc[0])
        plt.show()
We can see that the datasets start in different years, so we will set an x-limit to make them easier to compare.
In [20]:
for i, group in p.groupby('Commod'):
    if i in ['Corn', 'Hay all (dry)', 'Barley', 'Oats', 'Sorghum']:
        plot = group.plot(x='Year', y='Amount', title=str(i), xlim=(1975, 2017))
        plot.set(xlabel='Year', ylabel=group.Unit.iloc[0])
        plt.show()
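To put the two datasets directly side by side for one commodity, you can merge the annual corn beginning stocks with the annual corn price on Year and plot them on twin y-axes. This is only a sketch: it assumes the sau and p dataframes built above, that 'Corn' appears as a Commod_Group in sau and as a Commod in p, and it averages any duplicate values per year.

corn_stock = (sau[sau.Commod_Group == 'Corn']
              .groupby('Year', as_index=False)['Amount'].mean()
              .rename(columns={'Amount': 'Stock'}))
corn_price = (p[p.Commod == 'Corn']
              .groupby('Year', as_index=False)['Amount'].mean()
              .rename(columns={'Amount': 'Price'}))
corn = corn_stock.merge(corn_price, on='Year').sort_values('Year')

fig, ax1 = plt.subplots()
ax1.plot(corn.Year, corn.Stock, color='tab:blue')
ax1.set_xlabel('Year')
ax1.set_ylabel('Beginning stocks', color='tab:blue')
ax2 = ax1.twinx()                      # second y-axis sharing the same x-axis
ax2.plot(corn.Year, corn.Price, color='tab:orange')
ax2.set_ylabel('Price (dollars per bushel)', color='tab:orange')
plt.title('Corn: beginning stocks vs. price received by farmers')
plt.show()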
Simply looking at the graphs, there appears to be no relation between stocks and prices. The clearest indicators are the spike in grain stocks around 1987 with no corresponding change in prices, and the spike in prices around 2012 with no apparent change in average grain stocks.
Some research suggests that since around 2010 there have been severe weather conditions, such as droughts in the US, causing a supply deficit (even though the supply looks fairly consistent in our data). Supposedly this is the reason for the rise in prices, despite the fact that our grain supply data seems to tell a different story.
The resulting conclusion seems to be that weather and grain supply do not have a severe impact on the price of grain, at least in the short term. Although this seems trivial, it affects virtually everyone. Whether prices are high for some other reason despite a consistent supply, or demand has simply grown beyond past years and pushed up the required supply, the numbers tell more of the truth than a news article.
Note:
After coming this far, I could not think of a way to incorporate machine learning usefully into this dataset without choosing either a completely different dataset or a different subset of the data. I initially expected there to be data for other countries as well, in which case a linear regression would be useful for determining the global trend of price and supply, and a relation could be checked by comparing the regression lines; however, the only data in the dataset is for the US.
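For what it's worth, a simple trend line can still be fit to the US series alone. A minimal sketch using numpy's polyfit on the annual corn price (assuming the filtered p dataframe from above):

corn_price = p[p.Commod == 'Corn']
slope, intercept = np.polyfit(corn_price.Year, corn_price.Amount, 1)   # degree-1 (linear) fit
print('average price change per year: ${:.3f} per bushel'.format(slope))

ax = corn_price.plot(x='Year', y='Amount', title='Corn price with linear trend')
ax.plot(corn_price.Year, slope * corn_price.Year + intercept, linestyle='--')
plt.show()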
This tutorial has looked at the changes in grain supply, as well as grain prices, from 1975 to 2016. Based on the graphs, there does not seem to be evidence of a correlation between grain price and supply, no matter how large the surplus or deficit. There may be a correlation between the two datasets through the economy or the weather, but that was not explored in this tutorial.
In conclusion, there is a lot that data science can accomplish in any field. New patterns or unexpected trends may be found that were previously unseen, and further research and data can then be applied to determine correlations between datasets. Although correlation does not imply causation, new discoveries may spark new ideas, especially considering the sheer amount of data available, even for something seemingly simple like grains.
Data source: https://www.ers.usda.gov/data-products/feed-grains-database/feed-grains-yearbook-tables.aspx