Filter a CSV

We're going to use built-in Python modules (small programs, really) to download a CSV file from the Internet, save it locally, and then filter it.

CSV stands for comma-separated values. It's a common file format that resembles a spreadsheet or database table stored in a text file.

So first, let's import two built-in Python modules: urllib and csv.

  • urllib is a module that allows Python to make HTTP requests to URLs on the web and fetch whatever they return. It contains a submodule called request, and inside it we want a specific function called urlretrieve.

  • csv is a module that helps Python work with tabular data extracted from spreadsheets and databases
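
As a minimal sketch, the two imports described above could look like this. Both modules ship with Python 3, so nothing needs to be installed.

```python
# urlretrieve lives in the request submodule of urllib
from urllib.request import urlretrieve

# csv provides the reader and writer helpers for tabular data
import csv
```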


In [ ]:

We're going to download a csv file. What should we name it?
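
One way to answer that is to store the name in a variable so we can reuse it later. The variable name and file name here are assumptions chosen for illustration; any valid file name would work.

```python
# Hypothetical name for the local copy of the file we are about to download
downloaded_file = "banklist.csv"
```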


In [ ]:

Now we need a URL to a CSV file out on the Internet.

For this project we're going to download a CSV file that the FDIC compiles of all the banks that have failed since October 1, 2000.

The file we want is at https://s3.amazonaws.com/datanicar/banklist.csv.

If the Internet is uncooperative, we can also use the local version of the file in the project1/data/ directory, and structure our code a little differently.

To download the file, we use that function within the urllib module and save the result to our project folder. It's called urlretrieve, and for our purposes starting out, think of it as a way to download a file from the Internet.

For our purposes, urlretrieve takes two arguments to download a file. First we specify our target URL, and then we give it a name for the file we want to create.
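
Here is a hedged sketch of that call wrapped in a small helper. The helper name download_csv and its optional fallback argument are our own inventions for illustration, not part of urllib; the fallback is one possible way to fall back on a local copy when the download fails.

```python
from urllib.error import URLError
from urllib.request import urlretrieve


def download_csv(url, filename, fallback=None):
    """Save the file at `url` as `filename` and return the path to use.

    `fallback` is a hypothetical optional path to a local copy that is
    returned instead if the download fails.
    """
    try:
        # urlretrieve returns a (local_path, headers) tuple
        saved_path, _headers = urlretrieve(url, filename)
        return saved_path
    except URLError:
        if fallback is None:
            raise
        return fallback
```

With the URL above, a call would look like download_csv("https://s3.amazonaws.com/datanicar/banklist.csv", "banklist.csv"), and the return value is the path of the file to open next.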


In [ ]:

The output shows we successfully downloaded the file and saved it.

Let's open a new file so we can filter just the data we want.


In [ ]:

We will use the csv module's writer function to write data to a file, passing in the new file we just opened as the first argument and the delimiter as the second.

Then we will go ahead and use Python's csv reader to open the file and see what is inside.

We specify the name of the file we just created, and we add a setting so we can open and read almost any CSV file.
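
To make the writer and reader concrete, here is a small self-contained sketch. The file name and rows are made up for illustration, and it writes to a temporary directory rather than our project folder. It assumes Python 3, where the csv documentation recommends opening files with newline="".

```python
import csv
import os
import tempfile

# Hypothetical file name and rows, used only for illustration
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# csv.writer wraps an open file object; the delimiter is a keyword argument
with open(path, "w", newline="") as outfile:
    writer = csv.writer(outfile, delimiter=",")
    writer.writerow(["name", "state"])
    writer.writerow(["Example Bank, N.A.", "CA"])

# csv.reader also wraps an open file object and yields one list per row
with open(path, newline="") as infile:
    reader = csv.reader(infile)
    header = next(reader)  # the first row holds the column names
    rows = list(reader)    # every remaining row

print(header)
print(rows)
```

Note that the comma inside the bank name survives the round trip, because the writer quotes any field that contains the delimiter.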


In [2]:
# create our output


# open our downloaded file


    # use python's csv reader to access the contents
    # and create an object that represents the data

    
    # write our header row to the output csv
    
    
    
    
    # loop through each row of the csv

    
        # now we're going to use an IF statement
        # to find items where the state field
        # is equal to California
            
            
            # write the row to the new csv file
        
        
            # and print the row to the terminal

            
            # print the data type to the terminal

            
            # print the length of the row to the terminal
            
            
        # otherwise continue on
            
            
            
# close the output file
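
The steps outlined in the cell above could be sketched roughly as follows. This is one possible version, not the only way to write it. It assumes Python 3, and it assumes the state abbreviation sits in the third column (index 2) with the value "CA"; check the header row of the downloaded file before relying on that. The function name filter_rows and its parameters are hypothetical.

```python
import csv


def filter_rows(in_path, out_path, column=2, value="CA"):
    """Copy the header plus every row whose `column` equals `value`.

    Returns the number of data rows written.
    """
    kept = 0
    # open our downloaded file
    with open(in_path, newline="") as infile:
        # create our output
        with open(out_path, "w", newline="") as outfile:
            reader = csv.reader(infile)
            writer = csv.writer(outfile, delimiter=",")

            # write our header row to the output csv
            writer.writerow(next(reader))

            # loop through each row of the csv
            for row in reader:
                # keep rows where the state field matches;
                # rows too short to have that column are skipped
                if len(row) > column and row[column] == value:
                    writer.writerow(row)
                    print(row)        # the row itself
                    print(type(row))  # its data type (a list)
                    print(len(row))   # how many fields it has
                    kept += 1
    # both files are closed automatically when the with blocks end
    return kept
```

Because the files are opened in with blocks, there is no separate close step; an explicit close() call is only needed when a file is opened without one.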