Note: Keep in mind that a Mac can use a different delimiter than Windows to mark the end of a row in a CSV file. Since the Python csv module we will use works well with Windows CSV files, we will save and use a Windows CSV file in our program. So on a Mac, save the file as a "Windows CSV" file rather than as a plain CSV file.
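As a small illustration of the row delimiter mentioned in the note, the sketch below parses the same two rows terminated Windows-style (\r\n) and Unix-style (\n). The sample rows and the helper name read_rows are made up for this example. Passing newline='' follows the recommendation in the csv module documentation, so the reader sees the line endings unchanged.

```python
import csv
import io

# Two in-memory "files" with the same rows but different row terminators.
windows_style = "good,3\r\nbad,-3\r\n"   # CRLF row endings
unix_style = "good,3\nbad,-3\n"          # LF row endings

def read_rows(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    # newline='' leaves the line endings untouched so csv can handle them itself.
    return list(csv.reader(io.StringIO(text, newline='')))

print(read_rows(windows_style))
print(read_rows(unix_style))
```

If the rows printed for both variants look the same, the row delimiter is not what is causing trouble with a given file.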
Let us write a program to read a CSV file (word_sentiment.csv). This file contains a list of 2000+ words and their sentiment scores, ranging from -5 to +5. Write a function "word_sentiment" which checks whether the entered word is found in the word_sentiment.csv file and returns the corresponding sentiment. If the word is not found, it returns 0.
In [1]:
import csv
In [2]:
# Raw string (r"...") so that the backslashes are not treated as escape characters.
SENTIMENT_CSV = r"C:\Users\kmpoo\Dropbox\HEC\Teaching\Python for PhD May 2019\python4phd\Session 1\Sent\word_sentiment.csv"
Before we read a file, we need to open it. The "with open()" statement is very handy since it opens the file and gives you a handle with which you can read it. One of the benefits of the "with" statement is that (unlike a plain open() call) it automatically closes the file, which ensures that pending write operations are completed. The syntax is:
with open(filename, mode, encoding=encoding) as fileobj:
Here fileobj is the file object returned by open() and filename is the name of the file as a string. mode indicates what you want to do with the file, and encoding defines the text encoding with which you want to open the file. Pass encoding as a keyword argument, because the third positional argument of open() is buffering, not encoding.
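Mode could be: 'r' (read), 'w' (write, overwriting any existing content), 'x' (create a new file and fail if it already exists), or 'a' (append to the end of the file).
For each, adding the suffix 't' means read/write as text and the suffix 'b' means read/write as bytes.
Encoding could be, for example: 'utf-8', 'ascii', or 'latin-1'.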
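To make the mode and encoding arguments concrete, here is a minimal sketch that writes, appends to, and reads back a small file. The file name demo.txt and its contents are made up for illustration, and the file is created in a temporary directory.

```python
import os
import tempfile

# Hypothetical scratch file, created in a temporary directory for this demo.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'wt': write as text; creates the file or overwrites existing content.
with open(demo_path, 'wt', encoding='utf-8') as fileobj:
    fileobj.write("café\n")

# 'at': append as text; new content goes after what is already there.
with open(demo_path, 'at', encoding='utf-8') as fileobj:
    fileobj.write("thé\n")

# 'rt': read as text; bytes are decoded into str using the given encoding.
with open(demo_path, 'rt', encoding='utf-8') as fileobj:
    text = fileobj.read()

# 'rb': read as bytes; nothing is decoded, so no encoding is passed.
with open(demo_path, 'rb') as fileobj:
    raw = fileobj.read()

print(text.splitlines())
print(type(text).__name__, type(raw).__name__)
```

Text mode returns str objects, while bytes mode returns bytes objects that you would have to decode yourself.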
After opening the file, we call the csv.reader() function to read the data. It returns a reader object that yields each row as a list of strings (collected together, the rows resemble a multidimensional list), which we can use to read any cell in the CSV file.
In [ ]:
with open(SENTIMENT_CSV, 'rt', encoding='utf-8') as senti_data:
    sentiment = csv.reader(senti_data)
    for data_row in sentiment:
        print(data_row)
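To read one particular cell rather than print every row, one option is to collect the rows into a list and index it by row and then by column. The sketch below uses a few made-up rows held in memory in place of word_sentiment.csv, so the words and scores shown are assumptions and not the real file contents.

```python
import csv
import io

# Made-up sample data standing in for word_sentiment.csv (word, score).
sample = io.StringIO("abandon,-2\nadmire,3\nawful,-3\n", newline='')

# csv.reader yields one list of strings per row; list() collects them all.
rows = list(csv.reader(sample))

print(rows[1])      # the second row
print(rows[1][0])   # first column of the second row (the word)
print(rows[1][1])   # second column of the second row (the score, still a string)
```

Note that every cell comes back as a string, so a score has to be converted with int() before doing arithmetic on it.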
In [5]:
import csv
# Update the path to point to your file. The path format depends on your operating system.
SENTIMENT_CSV = "C:/Users/kmpoo/Dropbox/HEC/Teaching/Python for PhD Mar 2018/python4phd/Session 1/Sent/word_sentiment.csv"
def word_sentiment(word):
    '''This function uses the word_sentiment.csv file to find the sentiment of the word
    entered'''
    with open(SENTIMENT_CSV, 'rt', encoding='utf-8') as senti_data:
        sentiment = csv.reader(senti_data)
        for data_row in sentiment:
            if data_row[0] == word:
                sentiment_val = data_row[1]
                return sentiment_val
        return 0

def main():
    word_in = input("enter the word: ").lower()
    return_val = word_sentiment(word_in)
    print("the sentiment of the word ", word_in, " is: ", return_val)

main()
Now let us update this code so that we ask the user to enter a sentence. We then break the sentence into words and find the sentiment of each word. Finally, we aggregate the sentiments across all the words to calculate the sentiment of the sentence and tell whether the sentence entered is positive or negative. Hint: Use the split() command we saw in lesson 1.
In [6]:
import csv
SENTIMENT_CSV = "C:/Users/kmpoo/Dropbox/HEC/Teaching/Python for PhD Mar 2018/python4phd/Session 1/Sent/word_sentiment.csv"  # Update the path to point to your file.
'''The path format depends on your operating system. For a Windows system the format
of the path will be "C:/Users/User/Desktop/word_sentiment.csv" '''
def word_sentiment(word):
    """This function uses the word_sentiment.csv file to find the sentiment of the word
    entered"""
    with open(SENTIMENT_CSV, 'rt', encoding='utf-8') as senti_data:
        sentiment = csv.reader(senti_data)
        for data_row in sentiment:
            if data_row[0] == word:
                sentiment_val = data_row[1]
                return sentiment_val
        return 0

def main():
    """This function asks the user to input a sentence and tries to calculate the sentiment
    of the sentence"""
    sentiment = 0
    sentence_in = input("enter the sentence: ").lower()
    words_list = sentence_in.split()
    for word in words_list:
        sentiment = sentiment + int(word_sentiment(word))
    if sentiment > 0:
        print("The entered sentence has a positive sentiment")
    elif sentiment == 0:
        print("The entered sentence has a neutral sentiment")
    else:
        print("The entered sentence has a negative sentiment")

main()
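The program above reopens and rescans the CSV file once for every word in the sentence. One possible variation, sketched here, reads the file a single time into a dictionary and then looks each word up in it. The helper names load_sentiments and sentence_sentiment are hypothetical, the sample rows are made up, and the sketch assumes the file has no header row and stores integer scores in the second column.

```python
import csv
import io

def load_sentiments(csv_file):
    """Read (word, score) rows from an open CSV file into a dict of ints."""
    scores = {}
    for data_row in csv.reader(csv_file):
        if len(data_row) >= 2:          # skip blank or malformed rows
            scores[data_row[0]] = int(data_row[1])
    return scores

def sentence_sentiment(sentence, scores):
    """Sum the scores of the words in the sentence; unknown words count as 0."""
    return sum(scores.get(word, 0) for word in sentence.lower().split())

# Made-up sample rows standing in for word_sentiment.csv.
sample_file = io.StringIO("good,3\nbad,-3\nlove,3\n", newline='')
scores = load_sentiments(sample_file)
print(sentence_sentiment("I love good food", scores))
```

With the real file, you would pass the file object from with open(SENTIMENT_CSV, 'rt', encoding='utf-8') to load_sentiments instead of the in-memory sample.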
Can you improve this code to handle negations such as "not"? E.g. "poonacha is not good" should return a negative sentiment rather than a positive one.
In [ ]:
# Enter code here
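One possible approach, offered as a rough sketch rather than a full solution: remember when a negation word has been seen and flip the sign of the next word that carries a sentiment score. The NEGATIONS set and the small scores dictionary are assumptions made for this example; in the real program the scores would come from word_sentiment.csv.

```python
# Assumed list of negation words; extend it as needed.
NEGATIONS = {"not", "no", "never"}

# Made-up scores standing in for the contents of word_sentiment.csv.
scores = {"good": 3, "bad": -3}

def sentence_sentiment(sentence, scores):
    """Sum word scores, flipping the sign of the scored word after a negation."""
    total = 0
    negate = False
    for word in sentence.lower().split():
        if word in NEGATIONS:
            negate = not negate       # two negations in a row cancel out
            continue
        value = scores.get(word, 0)
        if negate and value != 0:
            value = -value            # apply the pending negation once
            negate = False
        total += value
    return total

print(sentence_sentiment("poonacha is not good", scores))
print(sentence_sentiment("poonacha is good", scores))
```

This rule is deliberately crude: it flips the next scored word no matter how far away it is, so a sentence where the negation refers to something else will be misjudged.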
Do you think we can build a rudimentary learning algorithm to improve the corpus of sentiments?
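As one very rough idea of what such an algorithm could look like, the sketch below assumes the user tells us whether a sentence was positive (+1) or negative (-1), and nudges a learned score for every word that is missing from the corpus. The function name update_learned, the label convention, and the sample data are all hypothetical. Words that appear in both positive and negative sentences drift back towards 0.

```python
def update_learned(sentence, label, corpus, learned):
    """Nudge the learned score of every word missing from the corpus.

    label is +1 for a sentence marked positive and -1 for one marked negative.
    """
    for word in sentence.lower().split():
        if word in corpus:
            continue                  # leave the original corpus entries alone
        new_score = learned.get(word, 0) + label
        learned[word] = max(-5, min(5, new_score))   # stay within -5..+5
    return learned

# Made-up corpus and labelled sentences for illustration.
corpus = {"good": 3, "bad": -3}
learned = {}
update_learned("the movie was superb", +1, corpus, learned)
update_learned("the food was dreadful", -1, corpus, learned)
print(learned)
```

Keeping the learned scores in a separate dictionary makes it easy to inspect or discard them without touching the original corpus.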