How do we make computers seem intelligent? One approach is to use term extraction. Term extraction is a type of information extraction where we attempt to find relevant terms in text. The relevant terms come from a corpus, or set of plausible terms we want to extract.
For example, suppose we have the text:
One day I would like to visit Syracuse
We as smart humans can be fairly confident that Syracuse is a place, more specifically a city.
A rudimentary method to make the computer interpret Syracuse as a place is to provide a corpus of cities and have the computer look up Syracuse in that corpus.
In this code exercise we will do just that. Let's first write a function to read cities from the file NYC2-cities.txt into a corpus of cities, which will be represented in Python as a list.
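One possible shape for the file-reading function is sketched below. It assumes NYC2-cities.txt holds one city name per line and is UTF-8 encoded; neither detail is stated in the exercise, and the `filename` parameter with its default is likewise an assumption made for illustration.

```python
def load_city_corpus(filename="NYC2-cities.txt"):
    """Read city names into a list, assuming one city per line."""
    cities = []
    with open(filename, "r", encoding="utf-8") as city_file:
        for line in city_file:
            city = line.strip()  # drop the trailing newline and stray spaces
            if city:             # skip blank lines
                cities.append(city)
    return cities
```

Calling `load_city_corpus()` with the file in the notebook's folder should return a plain Python list of strings, which is all the rest of the program needs.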
Then we will write a main program loop that inputs some text and splits it into a list of words. If any of the words match a city in the corpus list, the program will output that the word is a city.
The program should handle upper / lower case matching. A good approach is to title case the input.
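The title-casing idea can be sketched as follows. The helper name `find_cities` and the three-city list are hypothetical stand-ins for illustration; the sketch also assumes the corpus itself stores names in title case.

```python
def find_cities(text, city_list):
    """Return the title-cased words of text that appear in city_list."""
    words = text.title().split()  # "syracuse" becomes "Syracuse"
    return [word for word in words if word in city_list]

cities = ["Syracuse", "Rochester", "Austin"]  # hypothetical stand-in corpus
for city in find_cities("one day I would like to visit syracuse and rochester", cities):
    print(f"{city} is a city")
```

Because the whole input is title cased before splitting, "syracuse", "SYRACUSE" and "Syracuse" all end up as the same word. A word with punctuation attached, such as "syracuse.", would still not match in this sketch.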
IMPORTANT: Please note that our program will ONLY work for one word cities, like Syracuse and will not work for multiple-word cities like San Diego. Don't worry about that now.
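A short illustration of why the one-word limitation arises, using a hypothetical two-city corpus and a made-up input sentence: splitting on whitespace breaks San Diego into two separate words, and neither half matches the corpus entry.

```python
cities = ["Syracuse", "San Diego"]  # hypothetical stand-in corpus

words = "i flew from syracuse to san diego".title().split()
print(words)    # "San" and "Diego" are now separate list items

# Only whole-word, exact matches are found.
matches = [word for word in words if word in cities]
print(matches)
```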
SAMPLE RUN
Enter some text (or ENTER to quit): one day I would like to visit syracuse and rochester
Syracuse is a city
Rochester is a city
Enter some text (or ENTER to quit): austin is in texas
Austin is a city
Enter some text (or ENTER to quit):
Quitting...
Once again we will solve this problem using the problem simplification approach. First we will write the load_city_corpus function to build our city list. Second we will write the is_a_city function which, given a word and a city list, will return True when the word is a city. Finally we conclude with the main program which finds cities in our text, as demonstrated in our sample run.
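As a rough sketch of the second step, is_a_city can be little more than a membership test on the list. The version below assumes an exact, case-sensitive match and leaves the title casing to the caller; the parameter names are chosen here for illustration.

```python
def is_a_city(word, city_list):
    """Return True when word appears in city_list, False otherwise."""
    return word in city_list

sample_cities = ["Syracuse", "Rochester", "Austin"]  # hypothetical stand-in corpus
print(is_a_city("Syracuse", sample_cities))
print(is_a_city("Visit", sample_cities))
```

Keeping the function this small makes it easy to test on its own before wiring it into the main program loop.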
## Step 2: write the definition for the load_city_corpus function
## Step 4: write the definition for the is_a_city function
## Step 6: Write complete program, making sure to use your two functions.
Reflect upon your experience completing this assignment. This should be a personal narrative, in your own voice, and cite specifics relevant to the activity to help the grader understand how you arrived at the code you submitted. Things to consider touching upon: Elaborate on the process itself. Did your original problem analysis work as designed? How many iterations did you go through before you arrived at the solution? Where did you struggle along the way and how did you overcome it? What did you learn from completing the assignment? What do you need to work on to get better? What was most valuable and least valuable about this exercise? Do you have any suggestions for improvements?
To make a good reflection, you should journal your thoughts, questions and comments while you complete the exercise.
Keep your response to between 100 and 250 words.
--== Write Your Reflection Below Here ==--