Exercise - Working With CSV

Using the CSV module

A CSV file is a commonly used exchange format for spreadsheets and databases.
Each line is called a record, and each field within a record is separated by a delimiter such as a comma or a tab.
We use the module "csv", which is included in the standard library of Python.

Note: Keep in mind that Mac and Windows can use different end-of-line characters to mark the end of a row in a CSV file. Since the Python csv module we will use works well with Windows CSV files, we will save and use a Windows CSV file in our program. So on a Mac, save the file as a "Windows CSV" file rather than a plain CSV file.

Let us write a program to read a CSV file (word_sentiment.csv). This file contains a list of 2000+ words and their sentiment scores, ranging from -5 to +5. Write a function "word_sentiment" which checks whether the entered word is found in the word_sentiment.csv file and returns the corresponding sentiment. If the word is not found, it returns 0.

Step 1: Import the csv module.

Note: The csv module ships with Python, so it does not need to be installed. If some other module is not available on your computer, run "pip install <module name>" in the terminal (on Mac) or in the command prompt (on Windows).
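
A minimal version of this step might look like the following. Because csv is part of the standard library, a plain import is enough.

```python
# csv is part of Python's standard library, so no installation is needed.
import csv
```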



In [1]:

Step 2: Assign the path of the file to a global variable "SENTIMENT_CSV"
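
A sketch of this step is shown here. The path is an assumption: it expects word_sentiment.csv to sit in the same folder as the notebook, so adjust it if your file lives elsewhere.

```python
# Assumption: the CSV file is in the same folder as this notebook.
# Use a full path instead if the file is stored somewhere else.
SENTIMENT_CSV = "word_sentiment.csv"
```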



In [2]:

Step 3: Open the file using the "with open()" command and read the file

Before we read a file, we need to open it. The "with open()" statement is very handy since it opens the file and gives you a handle with which you can read the file. One of the benefits of the "with" statement is that (unlike a plain open() call) it automatically closes the file when the block ends, which ensures that pending write operations are completed. The syntax is:

with open('filename', 'mode', encoding='encoding') as fileobj:

Where fileobj is the file object returned by open(), filename is the string name of the file, mode indicates what you want to do with the file, and encoding defines the text encoding with which you want to open the file. Pass encoding as a keyword argument, because the third positional argument of open() is buffering, not encoding.

Mode could be:

w -> write. If the file exists, it is overwritten
r -> read
a -> append. Write at the end of the file
x -> exclusive create. Write only if the file does not exist; an existing file is never overwritten

For each mode, adding the suffix 't' reads/writes the file as text (this is the default) and the suffix 'b' reads/writes it as bytes.
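
To see how the modes behave, here is a small sketch that works inside a throwaway temporary folder. The file name demo.txt is made up for illustration only.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    # 'w' creates the file (or overwrites it if it already exists).
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("first\n")

    # 'a' keeps the existing content and writes at the end.
    with open(demo_path, "a", encoding="utf-8") as f:
        f.write("second\n")

    # 'r' reads the file back as text.
    with open(demo_path, "r", encoding="utf-8") as f:
        contents = f.read()

    # 'x' refuses to open a file that already exists.
    try:
        open(demo_path, "x", encoding="utf-8")
        exclusive_create_failed = False
    except FileExistsError:
        exclusive_create_failed = True
```

After this runs, contents holds both lines, and exclusive_create_failed is True because demo.txt already existed when 'x' was tried.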

Encoding could be:

'ascii'
'utf-8'
'latin-1'
'cp1252'
'unicode-escape'

After opening the file, we call the csv.reader() function to read the data. It returns a reader object that yields each row as a list of strings. Collecting the rows gives a data structure similar to a two-dimensional list, which we can use to read any cell in the CSV file.
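
One way this step could look is sketched here. The helper name read_rows, the 'utf-8' encoding and the default path are assumptions made for illustration. Passing newline="" follows the csv module's own recommendation for files handed to csv.reader().

```python
import csv

SENTIMENT_CSV = "word_sentiment.csv"  # assumed path, see Step 2


def read_rows(path=SENTIMENT_CSV):
    """Return every row of the CSV file as a list of strings (hypothetical helper)."""
    # Assumption: the file is UTF-8 encoded; change the encoding if yours differs.
    with open(path, "rt", encoding="utf-8", newline="") as csv_file:
        # list() collects all rows while the file is still open.
        return list(csv.reader(csv_file))
```

With rows = read_rows(), rows[0] is the first record and rows[0][0] is the first cell of that record. Every cell is a string, so numbers need to be converted with int() or float() before doing arithmetic.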



In [ ]:

The full code

Let us package all of this into a nice function which:

reads the word_sentiment.csv file
searches for a particular given word
returns the sentiment value of the word given to it. If the word is not found, it returns 0.
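
A possible version of this function is sketched here. It assumes that each row holds the word in the first column and an integer sentiment in the second, that the file has no header row, and that it is UTF-8 encoded. The optional path parameter and the case-insensitive comparison are our own additions.

```python
import csv

SENTIMENT_CSV = "word_sentiment.csv"  # assumed path, see Step 2


def word_sentiment(word, path=SENTIMENT_CSV):
    """Return the sentiment of word, or 0 if the word is not in the file."""
    target = word.strip().lower()
    with open(path, "rt", encoding="utf-8", newline="") as csv_file:
        for row in csv.reader(csv_file):
            # Assumed layout: column 0 is the word, column 1 is the score.
            if len(row) >= 2 and row[0].strip().lower() == target:
                return int(row[1])
    # The loop finished without a match.
    return 0
```

To reproduce an interactive session, you could read the word with input("enter the word: ") and print the value returned by word_sentiment().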



In [5]:

enter the word: good
the sentiment of the word  good  is:  3

Now let us update this code so that we ask the user to enter a sentence. We then break the sentence into words and find the sentiment of each word. We then aggregate the sentiments across all the words to calculate the sentiment of the sentence and tell whether the sentence entered is positive or negative. Hint: Use the split() command we saw in lesson 1.
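
One way to approach this is sketched here. To avoid re-reading the file for every word, it first loads the whole file into a dictionary. The helper names load_sentiments and sentence_sentiment are made up, the file layout (word, integer score, no header) is an assumption, and the "neutral" label for a total of 0 is our own addition.

```python
import csv

SENTIMENT_CSV = "word_sentiment.csv"  # assumed path, see Step 2


def load_sentiments(path=SENTIMENT_CSV):
    """Read the CSV file into a dictionary of word -> integer score."""
    sentiments = {}
    with open(path, "rt", encoding="utf-8", newline="") as csv_file:
        for row in csv.reader(csv_file):
            if len(row) >= 2:
                sentiments[row[0].strip().lower()] = int(row[1])
    return sentiments


def sentence_sentiment(sentence, sentiments):
    """Add up the scores of all words; unknown words count as 0."""
    total = sum(sentiments.get(word.lower(), 0) for word in sentence.split())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

Note that split() only separates on whitespace, so a word followed directly by punctuation (for example "bad.") will not match its entry in the file.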



In [6]:

enter the sentence: Poonacha is bad
The entered sentence has a negative sentiment

Can you improve this code to handle negations like "not"? E.g. "poonacha is not good" should return a negative sentiment rather than a positive one.
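
As one possible starting point (not the only way to solve this), the sketch here flips the sign of the next sentiment-bearing word after a negation. The NEGATIONS set and the function name are illustrative assumptions, and sentiments is expected to be a dictionary of word -> integer score.

```python
# Illustrative list of negation words; extend it as needed.
NEGATIONS = {"not", "no", "never"}


def sentence_score_with_negation(sentence, sentiments):
    """Sum word scores, inverting the first scored word after a negation."""
    total = 0
    negate = False
    for word in sentence.lower().split():
        if word in NEGATIONS:
            negate = True
            continue
        score = sentiments.get(word, 0)
        if negate and score != 0:
            # Flip the sign once, then go back to normal scoring.
            score = -score
            negate = False
        total += score
    return total
```

This simple rule does not handle cases such as double negatives or negations that apply to a whole clause, so there is plenty of room to improve it.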



In [ ]:

    
# Enter code here

Do you think we can build a rudimentary learning algorithm to improve the corpus of sentiments?