Read a CSV

We're going to use built-in Python modules (programs, really) to download a CSV file from the Internet and save it locally.

CSV stands for comma-separated values. It's a common file format that resembles a spreadsheet or database table stored in a text file.

So first, let's import two built-in Python modules: urllib and csv.

  • urllib is a module that allows Python to make HTTP requests to URLs on the web and fetch what they return, such as HTML pages or files. It contains a submodule called request, and inside it we want a specific function called urlretrieve.

  • csv is a module that helps Python work with tabular data extracted from spreadsheets and databases.


In [ ]:
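
One way to fill in this cell, as a minimal sketch: import the csv module, and pull urlretrieve out of urllib.request so it can be called by name.

```python
# urlretrieve lives in the request submodule of urllib
from urllib.request import urlretrieve

# csv helps Python read tabular data
import csv
```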

We're going to download a CSV file. What should we name it?


In [ ]:
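
A possible answer, offered as a sketch: both the variable name target_file and the value banklist.csv are assumptions made here. The value simply mirrors the name of the FDIC file introduced below.

```python
# name for the local copy; any name ending in .csv would do
target_file = "banklist.csv"
```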

Now we need a URL to a CSV file out on the Internet.

For this project we're going to download a CSV file that the FDIC compiles of all the banks that have failed since October 1, 2000.

The file we want is at https://s3.amazonaws.com/datanicar/banklist.csv.

If the Internet is uncooperative, we can also use the local version of the file in the project1/data/ directory, and structure our code a little differently.

To download the file, we use that function within the urllib module and save the result to our project folder. It's called urlretrieve, and for our purposes starting out, think of it as a way to download a file from the Internet.

We give urlretrieve two arguments to download a file. First we specify our target URL, and then we give it a name for the file we want to create.


In [ ]:
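
A sketch of how this cell might look. The call is wrapped in a small helper, download_csv (a hypothetical name, not part of urllib), so nothing is fetched until the helper is called. The variable target_file and its value banklist.csv are also assumptions.

```python
from urllib.request import urlretrieve

# the FDIC failed-bank list mentioned above
url = "https://s3.amazonaws.com/datanicar/banklist.csv"

# assumed name for the local copy
target_file = "banklist.csv"


def download_csv(url, target_file):
    """Download url, save it as target_file, and return urlretrieve's result."""
    # urlretrieve returns a tuple: (path of the saved file, response headers)
    return urlretrieve(url, target_file)
```

Running download_csv(url, target_file) as the last line of the cell should display that tuple, with the saved file name first.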

The output shows we successfully downloaded the file and saved it.

Now we want to go ahead and use Python's csv reader to open the file and see what is inside.

We specify the name of the file we just created, and we add a setting so we can open and read almost any CSV file.


In [ ]:
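# csv is imported here as well so this cell runs on its own
import csv

# open the downloaded file (assumed here to be saved as banklist.csv);
# newline='' leaves line-ending handling to the csv module
with open('banklist.csv', newline='') as csv_file:

    # use python's csv reader to access the contents
    # and create an object that represents the data
    reader = csv.reader(csv_file)

    # loop through each row of the csv
    for row in reader:

        # and print the row to the terminal
        print(row)

        # print the data type to the terminal
        print(type(row))

        # print the length of the row to the terminal
        print(len(row))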