Note: Keep in mind that CSV files saved on a Mac can use a different delimiter to mark the end of a row than files saved on Windows. Since the Python csv module we will use works well with Windows CSV files, we will save and use a Windows CSV file in our program. So on a Mac, you have to save the file as a "Windows CSV" file rather than a plain CSV file.
Let us write a program to read a CSV file (word_sentiment.csv). This file contains a list of 2000+ words and their sentiment scores, ranging from -5 to +5. Write a function "word_sentiment" that checks whether the entered word is found in the word_sentiment.csv file and returns the corresponding sentiment. If the word is not found, it returns 0.
In [1]:
In [2]:
Before we read a file, we need to open it. The "with open()" statement is very handy since it opens the file and gives you a handle with which you can read it. One of the benefits of the "with" statement is that (unlike a plain open() call) it automatically closes the file when the block ends, allowing pending write operations to be completed. The syntax is:
with open('filename', 'mode', encoding='encoding') as fileobj:
Here fileobj is the file object returned by open() and filename is the name of the file as a string. mode indicates what you want to do with the file, and encoding defines the character encoding with which you want to open it. Note that encoding must be passed as a keyword argument, because the third positional argument of open() is the buffering setting.
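Mode could be 'r' (read), 'w' (write, overwriting any existing content), 'x' (write, but only if the file does not already exist), or 'a' (append to the end of the file).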
For each, adding the suffix 't' means reading/writing as text and the suffix 'b' means reading/writing as bytes.
Encoding could be, for example, 'ascii', 'utf-8', or 'latin-1'. It applies only when the file is opened as text.
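To make the syntax concrete, here is a minimal sketch that writes, appends to, and then reads a small text file. The file name example.txt and its contents are made up for illustration and are not part of the lesson's data.

```python
import os
import tempfile

# Hypothetical file name, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# 'w' creates the file (or empties an existing one) and opens it for writing.
with open(path, "w", encoding="utf-8") as fileobj:
    fileobj.write("good,3\n")

# 'a' adds to the end of the file instead of overwriting it.
with open(path, "a", encoding="utf-8") as fileobj:
    fileobj.write("bad,-3\n")

# 'r' opens the file for reading.
with open(path, "r", encoding="utf-8") as fileobj:
    lines = fileobj.read().splitlines()

print(lines)
# The file object still exists after the block, but it has been closed.
print(fileobj.closed)
```

Each "with" block closes the file as soon as the indented code finishes, which is why the later blocks can safely reopen the same file.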
After opening the file, we call the csv.reader() function to read the data. It returns a reader object that yields each row as a list of strings. Converting it with list() gives a data structure similar to a two-dimensional list, which we can use to read any cell in the CSV file.
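One possible way to put these pieces together is sketched below. It assumes each row of the file holds a word followed by an integer score (for example "good,3"); the real layout of word_sentiment.csv may differ. So that the sketch runs on its own, it first writes a tiny stand-in file called sample_sentiment.csv, which is not part of the lesson's data.

```python
import csv
import os
import tempfile

# Assumption: each row holds a word and an integer score, e.g. "good,3".
# A tiny stand-in file is created here so the sketch runs on its own;
# in the lesson you would point SENTIMENT_CSV at word_sentiment.csv instead.
SENTIMENT_CSV = os.path.join(tempfile.mkdtemp(), "sample_sentiment.csv")
with open(SENTIMENT_CSV, "w", newline="", encoding="utf-8") as fileobj:
    csv.writer(fileobj).writerows([["good", 3], ["bad", -3], ["awesome", 4]])


def word_sentiment(word, filename=SENTIMENT_CSV):
    """Return the sentiment score for `word`, or 0 if it is not in the file."""
    target = word.strip().lower()
    with open(filename, "r", newline="", encoding="utf-8") as fileobj:
        for row in csv.reader(fileobj):
            # Skip blank or malformed rows rather than failing on them.
            if len(row) >= 2 and row[0].strip().lower() == target:
                return int(row[1])
    return 0


print(word_sentiment("good"))
print(word_sentiment("unknown"))
```

Passing newline="" to open() is what the csv module's documentation recommends, since it lets the module handle row endings itself regardless of which system produced the file.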
In [ ]:
In [5]:
Now let us update this code so that we ask the user to enter a sentence. We then break the sentence into words and find the sentiment of each word. Finally, we aggregate the sentiments across all the words to calculate the sentiment of the sentence and report whether the sentence entered is positive or negative. Hint: Use the split() command we saw in lesson 1.
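The aggregation step could look something like the sketch below. To keep it self-contained, a small dictionary (SAMPLE_SCORES, made up for this example) stands in for the scores that would be looked up in word_sentiment.csv, and a fixed sentence stands in for user input. Summing the scores is only one way to aggregate; averaging would be another.

```python
# Assumption: a small in-memory dictionary stands in for the scores that
# would normally be looked up in word_sentiment.csv.
SAMPLE_SCORES = {"good": 3, "bad": -3, "awesome": 4, "terrible": -3}


def sentence_sentiment(sentence, scores=SAMPLE_SCORES):
    """Sum the scores of the words in `sentence`; unknown words count as 0."""
    total = 0
    for word in sentence.split():
        # Lowercase and drop surrounding punctuation so "Good!" matches "good".
        total += scores.get(word.lower().strip(".,!?;:\"'"), 0)
    return total


def describe(total):
    """Turn an aggregate score into a label."""
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


sentence = "The food was good but the service was terrible, truly bad."
total = sentence_sentiment(sentence)
print(total, describe(total))
```

To try it interactively, replace the fixed sentence with sentence = input("Enter a sentence: ").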
In [6]:
Can you improve this code to handle negations like "not"? E.g. "Poonacha is not good" should return a negative sentiment rather than a positive one.
In [ ]:
# Enter code here
Do you think we can build a rudimentary learning algorithm to improve the corpus of sentiments?