2: Opening files

Instructions

Use the open() function to create a File object. The name of the file is "crime_rates.csv" and we want the file to be accessed in read mode ("r"). Assign this File object to the variable f.

Answer


In [7]:
! touch test.txt
a = open("test.txt", "r")
print(a)
f = open("crime_rates.csv", "r")


<_io.TextIOWrapper name='test.txt' mode='r' encoding='UTF-8'>
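
The cell above never closes the files it opens. A minimal sketch of the same step using a with block, which closes the file automatically when the block ends. The file name demo_rates.csv and its two rows are hypothetical stand-ins so the example runs without crime_rates.csv.

```python
import os
import tempfile

# Hypothetical stand-in for crime_rates.csv so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo_rates.csv")
with open(path, "w") as out:
    out.write("Albuquerque,749\nAnaheim,371\n")

# Open in read mode; the file is closed automatically when the block ends.
with open(path, "r") as f:
    mode = f.mode
    first_line = f.readline()

print(mode)      # r
print(f.closed)  # True
```

Inside the block f behaves like the File object from the exercise; after the block, f.closed is True.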

3: Reading in files

Instructions

Run the read() method on the File object f to return the string representation of crime_rates.csv. Assign the resulting string to the variable data.


In [9]:
f = open("crime_rates.csv", "r")
data = f.read()

In [10]:
print(data)


Albuquerque,749
Anaheim,371
Anchorage,828
Arlington,503
Atlanta,1379
Aurora,425
Austin,408
Bakersfield,542
Baltimore,1405
Boston,835
Buffalo,1288
Charlotte-Mecklenburg,647
Cincinnati,974
Cleveland,1383
Colorado Springs,455
Corpus Christi,658
Dallas,675
Denver,615
Detroit,2122
El Paso,423
Fort Wayne,362
Fort Worth,587
Fresno,543
Greensboro,563
Henderson,168
Houston,992
Indianapolis,1185
Jacksonville,617
Jersey City,734
Kansas City,1263
Las Vegas,784
Lexington,352
Lincoln,397
Long Beach,575
Los Angeles,481
Louisville Metro,598
Memphis,1750
Mesa,399
Miami,1172
Milwaukee,1294
Minneapolis,992
Mobile,522
Nashville,1216
New Orleans,815
New York,639
Newark,1154
Oakland,1993
Oklahoma City,919
Omaha,594
Philadelphia,1160
Phoenix,636
Pittsburgh,752
Plano,130
Portland,517
Raleigh,423
Riverside,443
Sacramento,738
San Antonio,503
San Diego,413
San Francisco,704
San Jose,363
Santa Ana,401
Seattle,597
St. Louis,1776
St. Paul,722
Stockton,1548
Tampa,616
Toledo,1171
Tucson,724
Tulsa,990
Virginia Beach,169
Washington,1177
Wichita,742

4: Splitting

Instructions

Split the string object data on the new-line character "\n" and store the result in a variable named rows. Then use the print() function to display the first 5 elements in rows.

Answer


In [11]:
# We can split a string into a list.
sample = "john,plastic,joe"
split_list = sample.split(",")
print(split_list)

# Here's another example.
string_two = "How much wood\ncan a woodchuck chuck\nif a woodchuck\ncan chuck wood?"
split_string_two = string_two.split('\n')
print(split_string_two)

# Code from previous cells
f = open('crime_rates.csv', 'r')
data = f.read()
rows = data.split('\n')
print(rows[0:5])


['john', 'plastic', 'joe']
['How much wood', 'can a woodchuck chuck', 'if a woodchuck', 'can chuck wood?']
['Albuquerque,749', 'Anaheim,371', 'Anchorage,828', 'Arlington,503', 'Atlanta,1379']
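
One detail worth checking when splitting on "\n": if the text ends with a newline, split("\n") leaves an empty string as the last element. The sketch below uses a hypothetical two-row sample (not the real file) and compares split("\n") with the built-in splitlines() method, which does not produce that trailing empty string.

```python
# Hypothetical sample text that ends with a newline.
sample_data = "Albuquerque,749\nAnaheim,371\n"

rows_split = sample_data.split("\n")
rows_lines = sample_data.splitlines()

print(rows_split)  # ['Albuquerque,749', 'Anaheim,371', '']
print(rows_lines)  # ['Albuquerque,749', 'Anaheim,371']
```

Whether crime_rates.csv itself ends with a newline is not visible in the output above, so treat this as something to verify rather than a given.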

5: Loops

Instructions

...

Answer

6: Practice, loops

Instructions

The variable ten_rows contains the first 10 elements in rows. Write a for loop that iterates over each element in ten_rows and uses the print() function to display each element.

Answer


In [12]:
ten_rows = rows[0:10]
for row in ten_rows:
    print(row)


Albuquerque,749
Anaheim,371
Anchorage,828
Arlington,503
Atlanta,1379
Aurora,425
Austin,408
Bakersfield,542
Baltimore,1405
Boston,835

7: List of lists

Instructions

For now, explore and run the code we dissected in this step in the code cell below.

Answer


In [13]:
three_rows = ["Albuquerque,749", "Anaheim,371", "Anchorage,828"]
final_list = []
for row in three_rows:
    split_list = row.split(',')
    final_list.append(split_list)
print(final_list)
for elem in final_list:
    print(elem)
print(final_list[0])
print(final_list[1])
print(final_list[2])


[['Albuquerque', '749'], ['Anaheim', '371'], ['Anchorage', '828']]
['Albuquerque', '749']
['Anaheim', '371']
['Anchorage', '828']
['Albuquerque', '749']
['Anaheim', '371']
['Anchorage', '828']
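
Because each element of final_list is itself a list, a second index reaches inside it. A short sketch, reusing the three rows from the cell above:

```python
final_list = [["Albuquerque", "749"], ["Anaheim", "371"], ["Anchorage", "828"]]

first_row = final_list[0]      # the inner list for the first city
first_city = final_list[0][0]  # index 0 of that inner list: the city name
first_rate = final_list[0][1]  # index 1 of that inner list: the crime rate, still a string

print(first_row)   # ['Albuquerque', '749']
print(first_city)  # Albuquerque
print(first_rate)  # 749
```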

8: Practice, splitting elements in a list

Let's now convert the full dataset, rows, into a list of lists using the same logic from the step before.

Instructions

Write a for loop that splits each element in rows on the comma delimiter and appends the resulting list to a new list named final_data. Then, display the first 5 elements in final_data using list slicing and the print() function.

Answer


In [16]:
f = open('crime_rates.csv', 'r')
data = f.read()
rows = data.split('\n')
final_data = [row.split(",") for row in rows]
print(final_data[0:5])


[['Albuquerque', '749'], ['Anaheim', '371'], ['Anchorage', '828'], ['Arlington', '503'], ['Atlanta', '1379']]
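
The instructions ask for a for loop, while the answer above uses a list comprehension. Both build the same list. A sketch of the loop form, using three hypothetical sample rows in place of the full rows list:

```python
# Hypothetical sample standing in for the full rows list.
rows = ["Albuquerque,749", "Anaheim,371", "Anchorage,828"]

final_data = []
for row in rows:
    # Split "City,rate" into ["City", "rate"] and collect it.
    final_data.append(row.split(","))

print(final_data[0:2])  # [['Albuquerque', '749'], ['Anaheim', '371']]
```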

9: Accessing elements in a list of lists, the manual way

Instructions

five_elements contains the first 5 elements from final_data. Create a list of strings named cities_list that contains the city names from each list in five_elements.

Answer


In [25]:
five_elements = final_data[:5]
print(five_elements)
cities_list = [city for city,_ in five_elements]


[['Albuquerque', '749'], ['Anaheim', '371'], ['Anchorage', '828'], ['Arlington', '503'], ['Atlanta', '1379']]

10: Looping through a list of lists

Instructions

Create a list of strings named cities_list that contains just the city names from final_data. Recall that the city name is located at index 0 for each list in final_data.

Answer


In [27]:
crime_rates = []

for row in five_elements:
    # row is a list variable, not a string.
    crime_rate = row[1]
    # crime_rate is a string, the crime rate for that city.
    crime_rates.append(crime_rate)
    
cities_list = [row[0] for row in final_data]

11: Practice

Instructions

Create a list of integers named int_crime_rates that contains just the crime rates as integers from the list rows.

First create an empty list and assign it to int_crime_rates. Then, write a for loop that iterates over rows and, for each row, does the following:

  • uses the split() method to convert each string in rows into a list on the comma delimiter
  • converts the value at index 1 from that list to an integer using the int() function
  • then uses the append() method to add each integer to int_crime_rates

In [28]:
f = open('crime_rates.csv', 'r')
data = f.read()
rows = data.split('\n')
print(rows[0:5])


['Albuquerque,749', 'Anaheim,371', 'Anchorage,828', 'Arlington,503', 'Atlanta,1379']

In [32]:
int_crime_rates = []

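for row in rows:
    # Split once and reuse the result; a separate name avoids overwriting data.
    fields = row.split(",")
    # Skip rows without a second field, such as an empty last row.
    if len(fields) < 2:
        continue
    int_crime_rates.append(int(fields[1]))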

print(int_crime_rates)


[749, 371, 828, 503, 1379, 425, 408, 542, 1405, 835, 1288, 647, 974, 1383, 455, 658, 675, 615, 2122, 423, 362, 587, 543, 563, 168, 992, 1185, 617, 734, 1263, 784, 352, 397, 575, 481, 598, 1750, 399, 1172, 1294, 992, 522, 1216, 815, 639, 1154, 1993, 919, 594, 1160, 636, 752, 130, 517, 423, 443, 738, 503, 413, 704, 363, 401, 597, 1776, 722, 1548, 616, 1171, 724, 990, 169, 1177, 742]
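
The length check in the loop above matters because an empty row cannot be converted. A small sketch of what happens with an empty string, which is what a blank row would be (an assumption about the data, since the raw file is not shown):

```python
# Splitting an empty string still returns a list with one element.
blank_fields = "".split(",")
print(blank_fields)       # ['']
print(len(blank_fields))  # 1

# int() cannot convert an empty string, so it raises ValueError.
try:
    int("")
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)  # True
```

Without the check, indexing position 1 of a one-element list would raise an IndexError before int() is even reached.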
