Predicting Election 2016

1. Data Collection

1-A Twitter API

1-B Extracting fields from JSON data

Extracting Location:

Palm Beach Gardens, FL
Wisconsin, USA
Fishers, IN
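The extraction step can be sketched as follows. This is a minimal sketch, assuming the location strings above come from the "full_name" key of the tweet's "place" object, as in the Twitter v1.1 tweet JSON. The helper name extract_fields and the sample tweet are made up for illustration.

```python
import json

def extract_fields(raw_json):
    """Pull the fields used in this project out of one tweet's JSON text."""
    tweet = json.loads(raw_json)
    # "place" is null for tweets without a tagged place, so fall back to {}
    place = tweet.get("place") or {}
    return {
        "id_str": tweet.get("id_str"),
        "created_at": tweet.get("created_at"),
        "text": tweet.get("text"),
        "lang": tweet.get("lang"),
        "possibly_sensitive": tweet.get("possibly_sensitive", False),
        "loc_name": place.get("full_name"),   # assumed source of the location
        "country": place.get("country"),
    }

# Hypothetical sample tweet, trimmed to the keys of interest
sample = '''{
  "id_str": "000000000000000001",
  "created_at": "Tue Nov 08 23:15:02 +0000 2016",
  "text": "example tweet text",
  "lang": "en",
  "possibly_sensitive": false,
  "place": {"full_name": "Fishers, IN", "country": "United States"}
}'''

print(extract_fields(sample)["loc_name"])  # Fishers, IN
```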

1-C Storing Tweets into MySQL Database

Tweets are stored only if they pass these filters:

  • Language is English: "lang":"en"
  • Country is the United States: "country":"United States"
  • Not flagged as sensitive: "possibly_sensitive":false

mysql> create table ElectionTweets (id_str CHAR(18) PRIMARY KEY, month INT(2), day INT(2), loc_name VARCHAR(20), text VARCHAR(140));
Query OK, 0 rows affected (0.00 sec)

mysql> describe ElectionTweets;
+----------+--------------+------+-----+---------+-------+
| Field    | Type         | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+-------+
| id_str   | char(18)     | NO   | PRI | NULL    |       |
| month    | int(2)       | YES  |     | NULL    |       |
| day      | int(2)       | YES  |     | NULL    |       |
| loc_name | varchar(20)  | YES  |     | NULL    |       |
| text     | varchar(140) | YES  |     | NULL    |       |
+----------+--------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
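The filtering and storing steps might be combined as in the sketch below. It assumes each tweet has already been flattened into a dict with the keys shown; passes_filters, store_tweet and that dict layout are hypothetical names, not the original code. The cursor is any DB-API cursor that accepts %s placeholders, as MySQLdb does.

```python
INSERT_SQL = (
    "INSERT INTO ElectionTweets (id_str, month, day, loc_name, text) "
    "VALUES (%s, %s, %s, %s, %s)"
)

def passes_filters(fields):
    """Apply the three filters: English, United States, not sensitive."""
    return (
        fields.get("lang") == "en"
        and fields.get("country") == "United States"
        and not fields.get("possibly_sensitive", False)
    )

def store_tweet(cursor, fields, month, day):
    """Insert one tweet if it passes the filters; return True if a row was written."""
    if not passes_filters(fields):
        return False
    row = (fields["id_str"], month, day, fields["loc_name"], fields["text"])
    # Passing the values separately lets the driver escape the tweet text
    cursor.execute(INSERT_SQL, row)
    return True
```

With MySQLdb, the cursor would come from conn.cursor(), and conn.commit() is needed after the inserts so the rows are saved. Note that loc_name is VARCHAR(20), so a value such as "Palm Beach Gardens, FL" (22 characters) would be truncated or rejected, depending on the server's SQL mode.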

Handling the time of tweets:
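One way to fill the month and day columns is to parse the tweet's "created_at" string. This is a minimal sketch, assuming the Twitter v1.1 timestamp format (for example "Wed Oct 10 20:19:24 +0000 2018") and an English locale for the day and month names. The helper month_and_day is a hypothetical name.

```python
from datetime import datetime

# Layout of the Twitter v1.1 "created_at" field (timestamps are in UTC)
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def month_and_day(created_at):
    """Return (month, day) for a created_at string, without converting time zones."""
    parsed = datetime.strptime(created_at, CREATED_AT_FORMAT)
    return parsed.month, parsed.day

print(month_and_day("Tue Nov 08 23:15:02 +0000 2016"))  # (11, 8)
```

Because the timestamp is in UTC, a tweet sent late in the evening in a US time zone can fall on the next calendar day.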

2. Analysis

Counting the number of tweets in each state

import MySQLdb

# Read per-location tweet counts from the database and sum them by state code
def process_locations():
    conn = MySQLdb.connect(host="localhost",
                           user="USERNAME",
                           passwd="PASSWORD",
                           db="DATABASE")
    cursor = conn.cursor()
    cursor.execute("SELECT loc_name, COUNT(*) AS count FROM ElectionTweets GROUP BY loc_name;")

    state_dict = {}

    for loc_name, count in cursor.fetchall():
        # loc_name is nullable, so skip rows without a location
        if loc_name is None:
            continue
        # Locations look like "Fishers, IN": take the part after the last comma,
        # so multi-word cities such as "Palm Beach Gardens, FL" are also counted
        parts = loc_name.strip().split(',')
        state_code = parts[-1].strip() if len(parts) >= 2 else ''
        # Keep only two-letter codes; this drops values such as "Wisconsin, USA"
        if len(state_code) == 2:
            state_dict[state_code] = state_dict.get(state_code, 0) + int(count)

    cursor.close()
    conn.close()

    return state_dict
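The location rule can be checked without a database. The sketch below assumes location values follow the "City, ST" pattern of the examples in section 1-B. The helper state_code_from is a hypothetical name, and it splits on the last comma so that multi-word city names are handled.

```python
def state_code_from(loc_name):
    """Return the two-letter code after the last comma, or None if there is none."""
    if not loc_name or "," not in loc_name:
        return None
    code = loc_name.rsplit(",", 1)[1].strip()
    return code if len(code) == 2 else None

for name in ["Palm Beach Gardens, FL", "Wisconsin, USA", "Fishers, IN"]:
    print(name, "->", state_code_from(name))
# Palm Beach Gardens, FL -> FL
# Wisconsin, USA -> None
# Fishers, IN -> IN
```

State-level places such as "Wisconsin, USA" carry no two-letter code, so they are left out of the per-state counts under this rule.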

3. Web-app

