Reaching Beyond the Stars In Recommending Thai Restaurants in Las Vegas: A Sentiment Detecting Approach to Rating Reviews as a Complement to User Ratings

ABSTRACT

In line with career prospects in social networking applications, this classification project deals with recommender systems and focuses on the problem of matching a user to a new item. Using the business and review academic datasets from the 2015 Yelp Dataset Challenge, the task is to classify a Thai restaurant in Las Vegas that is new to a user as one the user should or should not try. Performance is measured as classification accuracy: whether a restaurant rating is correctly predicted for a user from user ratings and from lexicon-extracted sentiment scores in review text. The merged working dataset has 990,627 reviews, of which 405,760 target one of the 4,960 restaurants in Las Vegas. To stay compatible with the “Yelp Restaurant Lexicon” (Kiritchenko, Zhu, Cherry, & Mohammad, 2014b), whose units of analysis are reviews of Yelp restaurants in Phoenix (AZ), we narrow the reviews to those written by users who have reviewed restaurants in Phoenix, although not exclusively there. We further narrow the reviews to Thai restaurants to accommodate R and RAM limits. Using the restaurant ratings given by reviewers together with the sentiment scores, we select a user-based collaborative filtering approach to predict missing ratings and to deliver top recommendations to the user.

CODE AND RESULTS FOR THE SENTIMENT DETECTION EXERCISE

The objectives of this kernel are to extract sentiment scores from Thai restaurant reviews using lexicons and to make a preliminary comparison between these review scores and the restaurant ratings given by individual reviewers.
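As a rough illustration of what lexicon-based scoring means here, the sketch below sums per-word scores over a review. The lexicon entries, their values, and the function names `tokenize` and `lexicon_score` are hypothetical and chosen only for the example; the lexicons used in this kernel are the downloaded files listed further down, and the scoring rule may differ from this simple sum.

```python
import re

# Hypothetical toy lexicon: term -> sentiment score (made-up values).
TOY_LEXICON = {"delicious": 1.5, "friendly": 1.0, "salty": -0.8, "worst": -2.0}


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon):
    """Sum the scores of all tokens found in the lexicon; unknown tokens add 0."""
    return sum(lexicon.get(token, 0.0) for token in tokenize(text))


reviews = [
    "The food was delicious and the staff friendly.",
    "Worst duck ever, way too salty.",
]
scores = [lexicon_score(review, TOY_LEXICON) for review in reviews]
```

A positive total suggests a favorable review and a negative total an unfavorable one, which is what makes the scores comparable, at least in sign, with star ratings.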

In this kernel, we rely on a working dataset created in a previous phase of the project by merging a slightly transformed version of Yelp_academic_dataset_business into Yelp_academic_dataset_review (key: business_id, method: left, left dataset: Yelp_academic_dataset_review). We filtered the working dataset twice: for lexicon compatibility, we kept users who have rated restaurants in Phoenix (AZ); for RAM limitations, we kept restaurants in the "Thai" category. Before filtering by Thai restaurants, we combined review text by unique users and averaged review_ratings by unique users. The restaurant review data files in this kernel were derived from this working dataset. The lexicon data files were downloaded from the internet.
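The preparation steps described above can be sketched with pandas on toy data. This is only an approximation of the earlier phase: the `city` and `categories` column names, the toy rows, and the choice to aggregate per (user_id, business_id) pair are assumptions made for illustration, not the project's actual code.

```python
import pandas as pd

# Hypothetical toy stand-ins for the review and business tables.
reviews = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u3"],
    "business_id": ["b1", "b1", "b2", "b2", "b3"],
    "review_ratings": [4, 5, 3, 2, 3],
    "text": ["Great curry.", "Still great.", "Decent.", "Too salty.", "Fine burger."],
})
businesses = pd.DataFrame({
    "business_id": ["b1", "b2", "b3"],
    "city": ["Phoenix", "Las Vegas", "Phoenix"],
    "categories": ["Thai, Restaurants", "Thai, Restaurants", "Burgers, Restaurants"],
})

# Left merge: keep every review and attach its business attributes.
working = reviews.merge(businesses, on="business_id", how="left")

# Keep only users who have rated at least one Phoenix business.
phoenix_users = working.loc[working["city"] == "Phoenix", "user_id"].unique()
working = working[working["user_id"].isin(phoenix_users)]

# Combine review text and average ratings per (user, business) pair.
combined = working.groupby(
    ["user_id", "business_id", "categories"], as_index=False
).agg(text=("text", " ".join), review_ratings=("review_ratings", "mean"))

# Keep only businesses whose category string mentions "Thai".
thai = combined[combined["categories"].str.contains("Thai")].reset_index(drop=True)
```

In this toy run, user u2 is dropped because they never rated a Phoenix business, and the two reviews by u1 of b1 collapse into one row with an averaged rating.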

In the next few code cells, we will be reading four files:

1- ThaiTextByUniqueUserBiz.csv, which has three separate columns of interest: user_id, business_id and text

2- Yelp-restaurant-reviews-AFFLEX-NEGLEX-unigrams.txt, which was downloaded from http://saifmohammad.com/Lexicons/Yelp-restaurant-reviews.zip

3- AFINN-emoticons-en-165.txt, which is a lexicon derived from the content of the AFINN_emoticon-8 downloaded from https://github.com/fnielsen/afinn/blob/master/afinn/data/AFINN-emoticon-8.txt and manually copied to the AFINN-en-165 lexicon downloaded from https://github.com/fnielsen/afinn/blob/master/afinn/data/AFINN-en-165.txt

4- ThaiReviewRatingsByUserBiz.pkl, which has three columns of interest: user_id, business_id and review_ratings.

The files in items 1 and 4 were derived from the same working dataset. To prevent confusion, we decided to re-merge the two on the keys 'user_id' and 'business_id' at the end of this kernel, which also lets us compare the various sentiment scores with the review_ratings.
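A re-merge on the two keys might look like the sketch below. The toy frames and the `sentiment_score` column are hypothetical; `validate="one_to_one"` is an optional pandas safeguard that raises an error if a (user_id, business_id) pair appears more than once on either side.

```python
import pandas as pd

# Hypothetical toy versions of the text-derived scores and the ratings.
text_df = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "business_id": ["b1", "b2", "b1"],
    "sentiment_score": [2.5, -0.8, 1.0],  # hypothetical column
})
ratings_df = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "business_id": ["b1", "b2", "b1"],
    "review_ratings": [4.5, 2.0, 4.0],
})

# Merge on the composite key so each score lines up with its rating.
merged = text_df.merge(
    ratings_df,
    on=["user_id", "business_id"],
    how="inner",
    validate="one_to_one",
)
```

Merging on both keys matters because a user can review several restaurants and a restaurant is reviewed by many users, so neither key alone identifies a row.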

Import Utility Packages

We acknowledge that not all of these packages are strictly needed. You may wish to import and download only what you use, to reduce the load on memory and disk.
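One way to lighten the setup is to download only selected NLTK packages instead of the whole 'all' collection. The sketch below assumes, purely for illustration, that 'punkt' and 'stopwords' are the ones needed; the helper name `download_needed` is hypothetical, and the right list depends on which NLTK features your analysis calls.

```python
# Hypothetical selection; adjust to the resources your analysis actually uses.
NEEDED_NLTK_PACKAGES = ["punkt", "stopwords"]


def download_needed(packages=NEEDED_NLTK_PACKAGES):
    """Download only the listed NLTK packages and return a status per package."""
    import nltk  # imported lazily so this cell loads even before NLTK is installed

    # nltk.download returns True on success; quiet=True suppresses the log.
    return {name: nltk.download(name, quiet=True) for name in packages}
```

Calling `download_needed()` in a cell would then fetch just those packages, and the returned dictionary shows which downloads succeeded.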


In [1]:
import sys
import re
import os
import shutil
import subprocess  # replaces the Python 2-only 'commands' module, removed in Python 3
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import itertools
import nltk
nltk.download('all')


[nltk_data] Downloading collection u'all'
[nltk_data]    | 
[nltk_data]    | Downloading package abc to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package abc is already up-to-date!
[nltk_data]    | Downloading package alpino to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package alpino is already up-to-date!
[nltk_data]    | Downloading package biocreative_ppi to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package biocreative_ppi is already up-to-date!
[nltk_data]    | Downloading package brown to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package brown is already up-to-date!
[nltk_data]    | Downloading package brown_tei to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package brown_tei is already up-to-date!
[nltk_data]    | Downloading package cess_cat to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package cess_cat is already up-to-date!
[nltk_data]    | Downloading package cess_esp to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package cess_esp is already up-to-date!
[nltk_data]    | Downloading package chat80 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package chat80 is already up-to-date!
[nltk_data]    | Downloading package city_database to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package city_database is already up-to-date!
[nltk_data]    | Downloading package cmudict to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package cmudict is already up-to-date!
[nltk_data]    | Downloading package comparative_sentences to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package comparative_sentences is already up-to-
[nltk_data]    |       date!
[nltk_data]    | Downloading package comtrans to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package comtrans is already up-to-date!
[nltk_data]    | Downloading package conll2000 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package conll2000 is already up-to-date!
[nltk_data]    | Downloading package conll2002 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package conll2002 is already up-to-date!
[nltk_data]    | Downloading package conll2007 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package conll2007 is already up-to-date!
[nltk_data]    | Downloading package crubadan to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package crubadan is already up-to-date!
[nltk_data]    | Downloading package dependency_treebank to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package dependency_treebank is already up-to-date!
[nltk_data]    | Downloading package europarl_raw to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package europarl_raw is already up-to-date!
[nltk_data]    | Downloading package floresta to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package floresta is already up-to-date!
[nltk_data]    | Downloading package framenet_v15 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package framenet_v15 is already up-to-date!
[nltk_data]    | Downloading package gazetteers to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package gazetteers is already up-to-date!
[nltk_data]    | Downloading package genesis to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package genesis is already up-to-date!
[nltk_data]    | Downloading package gutenberg to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package gutenberg is already up-to-date!
[nltk_data]    | Downloading package ieer to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package ieer is already up-to-date!
[nltk_data]    | Downloading package inaugural to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package inaugural is already up-to-date!
[nltk_data]    | Downloading package indian to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package indian is already up-to-date!
[nltk_data]    | Downloading package jeita to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package jeita is already up-to-date!
[nltk_data]    | Downloading package kimmo to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package kimmo is already up-to-date!
[nltk_data]    | Downloading package knbc to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package knbc is already up-to-date!
[nltk_data]    | Downloading package lin_thesaurus to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package lin_thesaurus is already up-to-date!
[nltk_data]    | Downloading package mac_morpho to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package mac_morpho is already up-to-date!
[nltk_data]    | Downloading package machado to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package machado is already up-to-date!
[nltk_data]    | Downloading package masc_tagged to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package masc_tagged is already up-to-date!
[nltk_data]    | Downloading package moses_sample to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package moses_sample is already up-to-date!
[nltk_data]    | Downloading package movie_reviews to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package movie_reviews is already up-to-date!
[nltk_data]    | Downloading package names to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package names is already up-to-date!
[nltk_data]    | Downloading package nombank.1.0 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package nombank.1.0 is already up-to-date!
[nltk_data]    | Downloading package nps_chat to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package nps_chat is already up-to-date!
[nltk_data]    | Downloading package oanc_masc to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package oanc_masc is already up-to-date!
[nltk_data]    | Downloading package omw to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\omw.zip.
[nltk_data]    | Downloading package opinion_lexicon to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package opinion_lexicon is already up-to-date!
[nltk_data]    | Downloading package paradigms to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package paradigms is already up-to-date!
[nltk_data]    | Downloading package pil to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package pil is already up-to-date!
[nltk_data]    | Downloading package pl196x to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package pl196x is already up-to-date!
[nltk_data]    | Downloading package ppattach to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package ppattach is already up-to-date!
[nltk_data]    | Downloading package problem_reports to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package problem_reports is already up-to-date!
[nltk_data]    | Downloading package propbank to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package propbank is already up-to-date!
[nltk_data]    | Downloading package ptb to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package ptb is already up-to-date!
[nltk_data]    | Downloading package oanc_masc to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package oanc_masc is already up-to-date!
[nltk_data]    | Downloading package product_reviews_1 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package product_reviews_1 is already up-to-date!
[nltk_data]    | Downloading package product_reviews_2 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package product_reviews_2 is already up-to-date!
[nltk_data]    | Downloading package pros_cons to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package pros_cons is already up-to-date!
[nltk_data]    | Downloading package qc to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package qc is already up-to-date!
[nltk_data]    | Downloading package reuters to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package reuters is already up-to-date!
[nltk_data]    | Downloading package rte to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package rte is already up-to-date!
[nltk_data]    | Downloading package semcor to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package semcor is already up-to-date!
[nltk_data]    | Downloading package senseval to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package senseval is already up-to-date!
[nltk_data]    | Downloading package sentiwordnet to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package sentiwordnet is already up-to-date!
[nltk_data]    | Downloading package sentence_polarity to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package sentence_polarity is already up-to-date!
[nltk_data]    | Downloading package shakespeare to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package shakespeare is already up-to-date!
[nltk_data]    | Downloading package sinica_treebank to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package sinica_treebank is already up-to-date!
[nltk_data]    | Downloading package smultron to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package smultron is already up-to-date!
[nltk_data]    | Downloading package state_union to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package state_union is already up-to-date!
[nltk_data]    | Downloading package stopwords to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package stopwords is already up-to-date!
[nltk_data]    | Downloading package subjectivity to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package subjectivity is already up-to-date!
[nltk_data]    | Downloading package swadesh to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package swadesh is already up-to-date!
[nltk_data]    | Downloading package switchboard to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package switchboard is already up-to-date!
[nltk_data]    | Downloading package timit to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package timit is already up-to-date!
[nltk_data]    | Downloading package toolbox to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package toolbox is already up-to-date!
[nltk_data]    | Downloading package treebank to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package treebank is already up-to-date!
[nltk_data]    | Downloading package twitter_samples to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package twitter_samples is already up-to-date!
[nltk_data]    | Downloading package udhr to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package udhr is already up-to-date!
[nltk_data]    | Downloading package udhr2 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package udhr2 is already up-to-date!
[nltk_data]    | Downloading package unicode_samples to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package unicode_samples is already up-to-date!
[nltk_data]    | Downloading package universal_treebanks_v20 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package universal_treebanks_v20 is already up-to-
[nltk_data]    |       date!
[nltk_data]    | Downloading package verbnet to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package verbnet is already up-to-date!
[nltk_data]    | Downloading package webtext to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package webtext is already up-to-date!
[nltk_data]    | Downloading package wordnet to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\wordnet.zip.
[nltk_data]    | Downloading package wordnet_ic to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package wordnet_ic is already up-to-date!
[nltk_data]    | Downloading package words to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package words is already up-to-date!
[nltk_data]    | Downloading package ycoe to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package ycoe is already up-to-date!
[nltk_data]    | Downloading package rslp to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package rslp is already up-to-date!
[nltk_data]    | Downloading package hmm_treebank_pos_tagger to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package hmm_treebank_pos_tagger is already up-to-
[nltk_data]    |       date!
[nltk_data]    | Downloading package maxent_treebank_pos_tagger to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package maxent_treebank_pos_tagger is already up-
[nltk_data]    |       to-date!
[nltk_data]    | Downloading package universal_tagset to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package universal_tagset is already up-to-date!
[nltk_data]    | Downloading package maxent_ne_chunker to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package maxent_ne_chunker is already up-to-date!
[nltk_data]    | Downloading package punkt to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package punkt is already up-to-date!
[nltk_data]    | Downloading package book_grammars to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package book_grammars is already up-to-date!
[nltk_data]    | Downloading package sample_grammars to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package sample_grammars is already up-to-date!
[nltk_data]    | Downloading package spanish_grammars to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package spanish_grammars is already up-to-date!
[nltk_data]    | Downloading package basque_grammars to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package basque_grammars is already up-to-date!
[nltk_data]    | Downloading package large_grammars to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package large_grammars is already up-to-date!
[nltk_data]    | Downloading package tagsets to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package tagsets is already up-to-date!
[nltk_data]    | Downloading package snowball_data to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package snowball_data is already up-to-date!
[nltk_data]    | Downloading package bllip_wsj_no_aux to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package bllip_wsj_no_aux is already up-to-date!
[nltk_data]    | Downloading package word2vec_sample to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package word2vec_sample is already up-to-date!
[nltk_data]    | Downloading package panlex_swadesh to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package panlex_swadesh is already up-to-date!
[nltk_data]    | Downloading package mte_teip5 to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package mte_teip5 is already up-to-date!
[nltk_data]    | Downloading package averaged_perceptron_tagger to
[nltk_data]    |     C:\Users\Luc\AppData\Roaming\nltk_data...
[nltk_data]    |   Package averaged_perceptron_tagger is already up-
[nltk_data]    |       to-date!
[nltk_data]    | 
[nltk_data]  Done downloading collection all
Out[1]:
True

Read the Thai Restaurant Review File

We choose a csv file rather than a pkl file and encode it with utf-8 in order to avoid complications with accented characters, such as 'ê' in crêpe.
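The following sketch, using a made-up one-row frame and an in-memory buffer instead of a file on disk, shows an accented character surviving a UTF-8 CSV round trip. It also shows where a column named "Unnamed: 0" comes from: it appears when the CSV was written with its unnamed row index included.

```python
import io

import pandas as pd

# Hypothetical one-row frame containing an accented character.
df = pd.DataFrame({"user_id": ["u1"], "text": ["The crêpe was lovely."]})

# Write without the row index, encode as UTF-8 bytes, and read back.
csv_bytes = df.to_csv(index=False).encode("utf-8")
restored = pd.read_csv(io.BytesIO(csv_bytes), encoding="utf-8")

# Writing with the default index=True adds an unnamed first column,
# which read_csv then labels "Unnamed: 0".
with_index = pd.read_csv(io.BytesIO(df.to_csv().encode("utf-8")), encoding="utf-8")
```

Passing `index=False` when writing, or `index_col=0` when reading, are two common ways to avoid carrying the extra column around.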


In [2]:
ThaiTextByUniqueUserBiz = pd.read_csv("ThaiTextByUniqueUserBiz.csv", encoding='utf-8')
ThaiTextByUniqueUserBiz


Out[2]:
Unnamed: 0 user_id business_id text
0 0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i...
1 1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce...
2 2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo...
3 3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ...
4 4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-...
5 5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama...
6 6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ...
7 7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad...
8 8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite!
9 9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm...
10 10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ...
11 11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p...
12 12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ...
13 13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ...
14 14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best...
15 15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i...
16 16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own...
17 17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O...
18 18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ...
19 19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway...
20 20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha...
21 21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ...
22 22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre...
23 23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was...
24 24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil...
25 25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f...
26 26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f...
27 27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li...
28 28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an...
29 29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ...
... ... ... ... ...
10881 10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little...
10882 10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a...
10883 10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w...
10884 10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food...
10885 10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ...
10886 10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca...
10887 10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ...
10888 10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo...
10889 10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh...
10890 10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ...
10891 10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ...
10892 10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ...
10893 10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur...
10894 10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T...
10895 10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi...
10896 10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the...
10897 10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm...
10898 10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ...
10899 10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ...
10900 10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val...
10901 10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp...
10902 10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo...
10903 10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura...
10904 10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'...
10905 10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here....
10906 10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp...
10907 10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr...
10908 10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ...
10909 10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret...
10910 10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight...

10911 rows × 4 columns


In [3]:
# Drop the leftover index column that was written into the CSV.
ThaiTextByUniqueUserBiz = ThaiTextByUniqueUserBiz.drop(columns=["Unnamed: 0"])
ThaiTextByUniqueUserBiz


Out[3]:
user_id business_id text
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i...
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce...
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo...
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ...
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-...
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama...
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ...
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad...
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite!
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm...
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ...
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p...
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ...
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ...
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best...
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i...
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own...
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O...
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ...
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway...
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha...
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ...
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre...
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was...
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil...
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f...
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f...
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li...
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an...
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ...
... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little...
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a...
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w...
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food...
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ...
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca...
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ...
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo...
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh...
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ...
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ...
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ...
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur...
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T...
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi...
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the...
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm...
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ...
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ...
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val...
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp...
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo...
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura...
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'...
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here....
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp...
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr...
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ...
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret...
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight...

10911 rows × 3 columns

Read the Yelp Restaurant Review Lexicon File

Although the Yelp Restaurant Review Lexicon is also distributed as a bigram-based file, we use the unigram-based lexicon in this exercise for ease of comparison with the AFINN lexicon. We first assign column names to the four fields found in the unigram text file, although only the 'word' and 'score' values are used in this exercise.


In [4]:
# Column names for the four tab-separated fields of the unigram lexicon file
colnames = ['word', 'score', 'pos', 'neg']
YelpSentiment = pd.read_table("Yelp-restaurant-reviews-AFFLEX-NEGLEX-unigrams.txt", names=colnames)
YelpSentiment


Out[4]:
word score pos neg
0 overpowering_NEGFIRST 3.798 139 0
1 yumm 3.716 128 0
2 faves 3.306 256 2
3 yummmm 3.250 80 0
4 satisfies 3.238 79 0
5 disappoints_NEGFIRST 3.100 208 2
6 bosnian 3.061 66 0
7 combines 3.046 65 0
8 vinegars 2.967 60 0
9 exquisite 2.955 240 3
10 regret_NEGFIRST 2.953 360 5
11 yummm 2.942 118 1
12 seasonally 2.899 56 0
13 approachable 2.881 55 0
14 heavenly 2.875 667 11
15 yummo 2.863 54 0
16 payton 2.863 54 0
17 awsome 2.863 109 1
18 pablo 2.845 53 0
19 a++ 2.845 53 0
20 penzey's 2.845 53 0
21 awesomeness 2.833 159 2
22 camper 2.826 105 1
23 cutest 2.807 103 1
24 disappoint_NEGFIRST 2.802 982 18
25 burmese 2.768 49 0
26 popsicles 2.748 48 0
27 godiva 2.727 47 0
28 pushy_NEGFIRST 2.727 47 0
29 delicioso 2.695 92 1
... ... ... ... ...
38352 ym -3.542 0 10
38353 reply_NEGFIRST -3.542 0 10
38354 dirtiest -3.542 1 21
38355 cockroach -3.568 6 78
38356 unedible -3.586 1 22
38357 rudest -3.601 2 34
38358 uneatable -3.629 0 11
38359 insipid -3.629 0 11
38360 blandest -3.709 1 25
38361 inexcusable -3.709 2 38
38362 argumentative -3.709 0 12
38363 insulted_NEG -3.709 0 12
38364 return_NEGFIRST -3.719 43 577
38365 inedible_NEG -3.783 1 27
38366 oggis -3.783 0 13
38367 redeeming_NEGFIRST -3.783 0 13
38368 greeting_NEGFIRST -3.852 1 29
38369 compensation_NEG -3.852 0 14
38370 apology_NEGFIRST -3.852 6 104
38371 stars_NEGFIRST -3.878 4 76
38372 rudely -3.960 6 116
38373 inconvenienced -3.977 0 16
38374 apologized_NEGFIRST -3.977 0 16
38375 patronize_NEGFIRST -4.034 0 17
38376 pepper_NEGFIRST -4.034 0 17
38377 acknowledgement_NEGFIRST -4.034 0 17
38378 grossest -4.088 0 18
38379 apology_NEG -4.114 3 77
38380 refund_NEGFIRST -4.140 0 19
38381 returning_NEGFIRST -4.440 0 26

38382 rows × 4 columns
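
One point worth checking when a word list is loaded with pandas is that tokens such as 'null' or 'NA' are converted to missing values by default. The following is a minimal sketch on a few in-memory rows, not the project's code: the first and last rows are copied from the output above, the 'null' row is made up, and the use of keep_default_na=False is our assumption about how one might keep such tokens as literal words.

```python
import io
import pandas as pd

# Three rows in the same four-field, tab-separated layout.
# The 'null' row is invented for illustration only.
sample = "yumm\t3.716\t128\t0\nnull\t-0.5\t3\t5\nrudely\t-3.960\t6\t116\n"
colnames = ['word', 'score', 'pos', 'neg']

# Default parsing: the token 'null' is read as a missing value (NaN)
default_df = pd.read_csv(io.StringIO(sample), sep='\t', names=colnames)

# keep_default_na=False keeps every token as a literal string
literal_df = pd.read_csv(io.StringIO(sample), sep='\t', names=colnames,
                         keep_default_na=False)

print(default_df['word'].isna().sum())   # number of words turned into NaN
print(literal_df['word'].tolist())       # all three words kept as strings
```

Whether the real lexicon file contains such a token is not established here; the sketch only shows how to test for it.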

Filter out the NEGLEX elements from the Yelp Restaurant Review Lexicon

The unigram lexicon for Yelp restaurant reviews contains two scales, AFFLEX and NEGLEX, each with its own words (some shared with the other scale) and its own sentiment scores. NEGLEX entries carry a '_NEG' or '_NEGFIRST' suffix. For the purpose of this exercise, we filter out the NEGLEX scale because it assigns positive as well as negative scores to words with negative connotations (for example, 'disappoint_NEGFIRST' scores 2.802). We are seeking alignment with the AFINN lexicon for comparison purposes. The NEGLEX subset is displayed for reference only.


In [5]:
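# Keep only rows whose word lacks the "_NEG" marker (this covers both _NEG and _NEGFIRST)
YelpSentimentAFFLEX = YelpSentiment[YelpSentiment.word.str.contains("_NEG") == False]
# Renumber the remaining rows from 0
YelpSentimentAFFLEX = YelpSentimentAFFLEX.reset_index(drop=True)
YelpSentimentAFFLEX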


Out[5]:
word score pos neg
0 yumm 3.716 128 0
1 faves 3.306 256 2
2 yummmm 3.250 80 0
3 satisfies 3.238 79 0
4 bosnian 3.061 66 0
5 combines 3.046 65 0
6 vinegars 2.967 60 0
7 exquisite 2.955 240 3
8 yummm 2.942 118 1
9 seasonally 2.899 56 0
10 approachable 2.881 55 0
11 heavenly 2.875 667 11
12 yummo 2.863 54 0
13 payton 2.863 54 0
14 awsome 2.863 109 1
15 pablo 2.845 53 0
16 a++ 2.845 53 0
17 penzey's 2.845 53 0
18 awesomeness 2.833 159 2
19 camper 2.826 105 1
20 cutest 2.807 103 1
21 burmese 2.768 49 0
22 popsicles 2.748 48 0
23 godiva 2.727 47 0
24 delicioso 2.695 92 1
25 chino's 2.663 44 0
26 joel 2.663 44 0
27 timo 2.663 44 0
28 barbara 2.663 44 0
29 maui 2.651 177 3
... ... ... ... ...
29666 promoter -3.341 0 8
29667 unflavorful -3.341 0 8
29668 unacceptably -3.341 0 8
29669 xooro -3.341 0 8
29670 unacceptable -3.346 20 189
29671 imposition -3.447 0 9
29672 eggo -3.447 0 9
29673 livid -3.447 1 19
29674 curtly -3.447 0 9
29675 mildew -3.447 0 9
29676 spoonz -3.447 0 9
29677 over-rated -3.479 2 30
29678 tock -3.495 1 20
29679 disrespectful -3.511 2 31
29680 defended -3.542 0 10
29681 vomiting -3.542 2 32
29682 ym -3.542 0 10
29683 dirtiest -3.542 1 21
29684 cockroach -3.568 6 78
29685 unedible -3.586 1 22
29686 rudest -3.601 2 34
29687 uneatable -3.629 0 11
29688 insipid -3.629 0 11
29689 blandest -3.709 1 25
29690 inexcusable -3.709 2 38
29691 argumentative -3.709 0 12
29692 oggis -3.783 0 13
29693 rudely -3.960 6 116
29694 inconvenienced -3.977 0 16
29695 grossest -4.088 0 18

29696 rows × 4 columns


In [6]:
# Keep the complementary rows (NEGLEX); the original row labels are retained
YelpSentimentNEGFLEX = YelpSentiment[YelpSentiment.word.str.contains("_NEG") == True]
YelpSentimentNEGFLEX


Out[6]:
word score pos neg
0 overpowering_NEGFIRST 3.798 139 0
5 disappoints_NEGFIRST 3.100 208 2
10 regret_NEGFIRST 2.953 360 5
24 disappoint_NEGFIRST 2.802 982 18
28 pushy_NEGFIRST 2.727 47 0
30 pretentious_NEGFIRST 2.692 138 2
31 overpower_NEGFIRST 2.685 45 0
38 affordable_NEG 2.640 43 0
46 rushed_NEGFIRST 2.606 84 1
51 skimping_NEGFIRST 2.570 40 0
62 disappointed_NEGFIRST 2.497 990 25
63 failed_NEGFIRST 2.494 37 0
78 beat_NEGFIRST 2.432 821 22
81 detract_NEGFIRST 2.411 34 0
89 tour_NEG 2.382 33 0
96 stuffy_NEGFIRST 2.368 66 1
106 gem_NEG 2.330 128 3
132 dissapointed_NEGFIRST 2.257 29 0
136 devoured_NEG 2.223 28 0
148 pressure_NEGFIRST 2.223 28 0
152 overdone_NEGFIRST 2.206 56 1
154 hesitate_NEGFIRST 2.193 196 6
176 skimpy_NEGFIRST 2.152 26 0
178 pricey_NEGFIRST 2.152 26 0
179 carrot_NEG 2.152 26 0
198 closer_NEGFIRST 2.114 25 0
207 cartel_NEG 2.114 25 0
211 hassle_NEGFIRST 2.114 25 0
214 shaved_NEG 2.114 25 0
230 fastest_NEG 2.075 24 0
... ... ... ... ...
38322 acknowledge_NEGFIRST -3.341 2 26
38326 precooked_NEG -3.341 0 8
38328 voucher_NEG -3.341 1 17
38329 disrespected_NEG -3.341 0 8
38331 response_NEGFIRST -3.369 3 36
38332 spices_NEGFIRST -3.395 1 18
38333 eye_NEGFIRST -3.395 1 18
38334 flavor_NEGFIRST -3.410 54 529
38338 compensate_NEGFIRST -3.447 0 9
38339 comped_NEGFIRST -3.447 0 9
38344 again_NEGFIRST -3.483 26 279
38345 seasoning_NEGFIRST -3.495 5 62
38349 manager_NEGFIRST -3.542 2 32
38351 acknowledgment_NEGFIRST -3.542 0 10
38353 reply_NEGFIRST -3.542 0 10
38363 insulted_NEG -3.709 0 12
38364 return_NEGFIRST -3.719 43 577
38365 inedible_NEG -3.783 1 27
38367 redeeming_NEGFIRST -3.783 0 13
38368 greeting_NEGFIRST -3.852 1 29
38369 compensation_NEG -3.852 0 14
38370 apology_NEGFIRST -3.852 6 104
38371 stars_NEGFIRST -3.878 4 76
38374 apologized_NEGFIRST -3.977 0 16
38375 patronize_NEGFIRST -4.034 0 17
38376 pepper_NEGFIRST -4.034 0 17
38377 acknowledgement_NEGFIRST -4.034 0 17
38379 apology_NEG -4.114 3 77
38380 refund_NEGFIRST -4.140 0 19
38381 returning_NEGFIRST -4.440 0 26

8685 rows × 4 columns
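
The two filters above can be cross-checked by confirming that they partition the lexicon. In the output above, 29,696 + 8,685 = 38,381, one fewer than the 38,382 rows read, which would be consistent with one row matching neither filter (for instance, a row whose word was parsed as a missing value). The sketch below illustrates such a check on four rows copied from the output plus one made-up row with a missing word; the end-anchored pattern and the three-way split are our assumptions, not the project's code.

```python
import numpy as np
import pandas as pd

lexicon = pd.DataFrame({
    'word': ['yumm', 'disappoint_NEGFIRST', 'rudely', 'apology_NEG', np.nan],
    'score': [3.716, 2.802, -3.960, -4.114, 0.0],   # the last row is made up
})

# Suffix match anchored at the end of the word.
# A missing word matches neither of the two comparisons below.
is_negated = lexicon['word'].str.contains(r'_NEG(?:FIRST)?$')

afflex = lexicon[is_negated == False].reset_index(drop=True)
neglex = lexicon[is_negated == True]
unmatched = lexicon[lexicon['word'].isna()]

# Together the three pieces account for every row
assert len(afflex) + len(neglex) + len(unmatched) == len(lexicon)
print(len(afflex), len(neglex), len(unmatched))
```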

Create Two Python Dictionaries with Sentiment Scores for the Sentiment Detection Algorithm

In this section, the objectives of the code cells are (1) to identify the dictionary format used with the AFINN lexicon for the sentiment detection algorithm and (2) to create a Python dictionary in the same format from the YelpSentimentAFFLEX dataframe, which holds the Yelp Restaurant Review Lexicon. The Python dictionaries will be used in the sentiment detection algorithms in the next section.
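
To make the role of the dictionary format concrete, here is a minimal sketch of how a word-to-score dictionary might be applied to a review. The function name score_text, the lowercase whitespace tokenization, and the summing rule are illustrative assumptions; the algorithm actually used in the project is presented in the next section and may differ. The sample scores are copied from the AFINN and Yelp dictionaries displayed in this section.

```python
def score_text(text, lexicon):
    """Sum the lexicon scores of the tokens in text; unknown tokens count as 0."""
    tokens = text.lower().split()
    return sum(lexicon.get(token, 0) for token in tokens)

# A few entries copied from the two dictionaries (integer vs. float scores)
afinn_sample = {'amazing': 4, 'delicious': 3, 'worst': -3}
yelp_sample = {'rich': 1.38, 'rice': 0.052, 'horribly': -2.319}

review = "Amazing curry with rich sauce and delicious rice"
print(score_text(review, afinn_sample))   # matches 'amazing' and 'delicious'
print(score_text(review, yelp_sample))    # matches 'rich' and 'rice'
```

Because both dictionaries share the same {word: score} shape, the same function can be run with either lexicon, which is what makes a side-by-side comparison straightforward.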

Create the AFINN dictionary


In [7]:
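# Build a {term: integer score} dictionary from the tab-separated AFINN file
sentiment_dictionary = {}
with open('AFINN-emoticons-en-165.txt') as afinn_file:
    for line in afinn_file:
        word, score = line.split('\t')
        sentiment_dictionary[word] = int(score)
sentiment_dictionary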


Out[7]:
{'unimaginative': -2,
 'limited': -1,
 'unscientific': -2,
 'suicidal': -2,
 'pardon': 2,
 'desirable': 2,
 'foul': -3,
 'obstruction': -2,
 'protest': -2,
 'lurking': -1,
 'controversial': -2,
 'hating': -3,
 'ridiculous': -3,
 'hate': -3,
 '\\o/': 3,
 'aggression': -2,
 'poorly': -2,
 'stinks': -2,
 'infuriates': -2,
 'regretted': -2,
 'violate': -2,
 'granting': 1,
 'attracted': 1,
 'tremors': -2,
 'stinky': -2,
 'poorest': -2,
 'disability': -2,
 'condemns': -2,
 'sorry': -1,
 'regrets': -2,
 'struck': -1,
 'misreporting': -2,
 'compassion': 2,
 'misreports': -2,
 'hilarious': 2,
 'lurk': -1,
 'misunderstanding': -2,
 'distort': -2,
 'lololol': 4,
 'stolen': -2,
 'gratification': 2,
 'uncertain': -1,
 'stabbed': -2,
 'screaming': -2,
 'courageous': 2,
 'disturb': -2,
 'exaggerate': -2,
 'harried': -2,
 'solution': 1,
 'nigger': -5,
 'honor': 2,
 'pardons': 2,
 'delightfully': 3,
 'monopolized': -2,
 'illiteracy': -2,
 'triumph': 4,
 'enjoy': 2,
 'shithead': -4,
 'diverting': -1,
 'tired': -2,
 'warns': -2,
 'landmark': 2,
 'elegant': 2,
 'fabulous': 4,
 'rigorous': 3,
 'emptiness': -1,
 'loathing': -3,
 'misplaced': -2,
 'misconducting': -2,
 'errors': -2,
 'hide': -1,
 'wreck': -2,
 'desirous': 2,
 'integrity': 2,
 'beaten': -2,
 'jocular': 2,
 'poison': -2,
 'plodding': -2,
 'victims': -3,
 'defraud': -3,
 'endorse': 2,
 'shocks': -2,
 'unmotivated': -2,
 'hero': 2,
 'avert': -1,
 'festive': 2,
 'interrupting': -2,
 'prblms': -2,
 'tyrannic': -3,
 'active': 1,
 'goodmorning': 1,
 'celebration': 3,
 'oversells': -2,
 'kudos': 3,
 'uplifting': 2,
 'inefficiently': -2,
 'lugubrious': -2,
 'deferring': -1,
 'postpones': -1,
 'controversy': -2,
 'harass': -3,
 'launched': 1,
 'stressed': -2,
 'lonesome': -2,
 'lucrative': 3,
 'postponed': -1,
 'missing': -2,
 'rebel': -2,
 'criticism': -2,
 'appropriately': 2,
 'fantastic': 4,
 'secure': 2,
 'criticise': -2,
 'qualities': 2,
 'dehumanize': -2,
 'inflicting': -2,
 'cheats': -3,
 ':-)))))': 3,
 'sparkling': 3,
 'achievable': 1,
 'romantically': 2,
 'hoping': 2,
 'devastation': -2,
 'cherishing': 2,
 'cheating': -3,
 'appeasing': 2,
 'motivate': 1,
 'negative': -2,
 'insult': -2,
 'asset': 2,
 'recommend': 2,
 'strike': -1,
 'abusing': -3,
 'menaces': -2,
 'barbarous': -2,
 'supporters': 1,
 'solidifies': 2,
 'detracts': -1,
 'expose': -1,
 'haha': 3,
 'award': 3,
 'hurt': -2,
 'grief': -2,
 'disruption': -2,
 'sobering': 1,
 'frenzy': -3,
 'defeated': -2,
 'excellent': 3,
 'blocking': -1,
 'hostile': -2,
 'shoot': -1,
 'hoax': -2,
 'join': 1,
 'unreleting': -2,
 'accomplishment': 2,
 'somber': -2,
 'refusing': -2,
 'worn': -1,
 'worth': 2,
 'deficiencies': -2,
 'sweeter': 3,
 'criticizing': -2,
 'elegantly': 2,
 'collapsing': -2,
 'devastations': -2,
 'unsecured': -2,
 'pollute': -2,
 'joyous': 3,
 'unjust': -2,
 'bitterest': -2,
 'laughs': 1,
 'heartless': -2,
 'scandalous': -3,
 'impair': -2,
 'want': 1,
 'indifferent': -2,
 'attract': 1,
 'cocksucker': -5,
 'guarantee': 1,
 'farce': -1,
 'groaned': -2,
 'complaining': -2,
 'infecting': -2,
 'puzzled': -2,
 'nuts': -3,
 'damage': -3,
 'cutback': -2,
 'amazing': 4,
 'uncredited': -1,
 'trusted': 2,
 'aching': -2,
 'significance': 1,
 'moron': -3,
 'disappoint': -2,
 'bankster': -3,
 'undesirable': -2,
 'badly': -3,
 'fever': -2,
 'uselessness': -2,
 'crowding': -1,
 'beauty': 3,
 'mess': -2,
 'insecure': -2,
 'lag': -1,
 'laughting': 1,
 'demanding': -1,
 'astoundingly': 3,
 'wrong': -2,
 'memoriam': -2,
 'sentencing': -2,
 'sparkles': 3,
 'meaningful': 2,
 'misleads': -3,
 'ambivalent': -1,
 'complacent': -2,
 'splendid': 3,
 'effective': 2,
 'wins': 4,
 'attracts': 1,
 'appreciate': 2,
 'greet': 1,
 'stimulate': 1,
 'daredevil': 2,
 'steals': -2,
 'worst': -3,
 'atrocity': -3,
 'damn cute': 3,
 'greed': -3,
 'restriction': -2,
 'infested': -2,
 'dirtiest': -2,
 'consent': 2,
 'derides': -2,
 ';)))': 3,
 'rotfl': 4,
 'finest': 3,
 'ghastly': -2,
 'innovative': 2,
 'responsive': 2,
 'incapacitating': -2,
 'derided': -2,
 'cool stuff': 3,
 'welcomes': 2,
 'fit': 1,
 'polluters': -2,
 'succeeds': 3,
 'destroyed': -3,
 'confidently': 2,
 'harmony': 2,
 'affected': -1,
 'diseases': -1,
 'litigious': -2,
 'underperform': -2,
 'prevented': -1,
 'powerful': 2,
 'unemployed': -1,
 'safe': 1,
 'supporter': 1,
 'collide': -1,
 'disparaged': -2,
 'oks': 2,
 'mongering': -2,
 'aborted': -1,
 'interrupt': -2,
 'jesus': 1,
 'lenient': 1,
 'saddened': -2,
 'gloom': -1,
 'brightness': 1,
 'arrested': -3,
 'absolve': 2,
 'dumps': -1,
 'benevolent': 3,
 'spotless': 2,
 'misbehave': -2,
 'fitness': 1,
 'entitled': 1,
 'vitriolic': -3,
 'bitch': -5,
 'interruption': -2,
 'dour': -2,
 'exhilarated': 3,
 'constipation': -2,
 'crimes': -3,
 'unparliamentary': -2,
 'celebrating': 3,
 'exhilarates': 3,
 'forgot': -1,
 'substantially': 1,
 'tricked': -2,
 'absolves': 2,
 'contend': -1,
 'misfortune': -2,
 'touts': -2,
 'pesky': -2,
 'god': 1,
 'useless': -2,
 'godsend': 4,
 'revolting': -2,
 'sparkle': 3,
 'encourage': 2,
 'loathe': -3,
 'heavenly': 4,
 'ranter': -3,
 'sluggish': -2,
 'misrepresentations': -2,
 'disputed': -2,
 'barrier': -2,
 'overweight': -1,
 'threatened': -2,
 'revenge': -2,
 'free': 1,
 'mourning': -2,
 'admits': -1,
 'unconvinced': -1,
 'fervid': 2,
 'disputes': -2,
 'icky': -3,
 'struggle': -2,
 'violation': -2,
 'boosting': 1,
 'sadly': -2,
 'enlightened': 2,
 'obstructing': -2,
 'warmhearted': 2,
 'jubilant': 3,
 'exposing': -1,
 'blackmailing': -3,
 'absolved': 2,
 'infracts': -2,
 'repulsive': -2,
 'flawless': 2,
 'reprimanded': -2,
 'disturbed': -2,
 'hopeless': -2,
 'murky': -2,
 'ineffectively': -2,
 'fuckin': -4,
 'enraged': -2,
 'wronged': -2,
 'loool': 3,
 'attractions': 2,
 'reassure': 1,
 'encouraging': 2,
 'misplace': -2,
 ':))))))))))': 4,
 'lovable': 3,
 'rant': -3,
 'downside': -2,
 'delicious': 3,
 'comic': 1,
 'clarity': 2,
 'snubbing': -2,
 'dysfunction': -2,
 'cheerless': -2,
 'methodical': 2,
 'misusing': -2,
 'lawsuits': -2,
 'euphoric': 4,
 'euphoria': 3,
 'enthral': 3,
 'combats': -1,
 'tragic': -2,
 'inconvenient': -2,
 'bitter': -2,
 'lunatics': -3,
 'damaging': -3,
 'troubled': -2,
 'murder': -2,
 'collapse': -2,
 'conciliated': 2,
 'rejected': -1,
 'wisdom': 1,
 'conciliates': 2,
 'wasted': -2,
 'positively': 2,
 'mercy': 2,
 'anxiety': -2,
 'coward': -2,
 'matter': 1,
 'guilt': -3,
 'painful': -2,
 'spoiled': -2,
 'mirth': 3,
 'adopts': 1,
 'slashes': -2,
 'acclaim': 2,
 'groaning': -2,
 'willingness': 2,
 'pessimism': -2,
 'upset': -2,
 'taint': -2,
 'antagonistic': -2,
 'enchanted': 2,
 'refreshingly': 2,
 'unconcerned': -2,
 'strengthening': 2,
 'forced': -1,
 'kidnap': -2,
 'reassuring': 2,
 'responsible': 2,
 'isolated': -1,
 'impotent': -2,
 'elation': 3,
 'recommended': 2,
 'absorbed': 1,
 '<3333333': 4,
 'laughed': 1,
 'demoralizing': -2,
 'effectiveness': 2,
 'misclassify': -2,
 'indoctrinate': -2,
 'greatest': 3,
 'committing': 1,
 'distracts': -2,
 'visionary': 3,
 'vexing': -2,
 'debonair': 2,
 'disjointed': -2,
 'blackmailed': -3,
 'relieving': 2,
 'panic': -3,
 'ranters': -3,
 'sulking': -2,
 'exonerate': 2,
 'illogical': -2,
 'pretends': -1,
 'misrepresents': -2,
 'demoralizes': -2,
 'alarm': -2,
 'devastating': -2,
 'vindicate': 2,
 'solemn': -1,
 'demoralized': -2,
 'unsophisticated': -2,
 'entertaining': 2,
 'annoyed': -2,
 'incomplete': -1,
 'marvel': 3,
 'pardoned': 2,
 'bomb': -1,
 'inspire': 2,
 'hunger': -2,
 'rapture': 2,
 'misplacing': -2,
 'brilliance': 3,
 'sexist': -2,
 'dupe': -2,
 'disturbs': -2,
 'solutions': 1,
 'lethal': -2,
 'smog': -2,
 'enjoyed': 2,
 'adoration': 3,
 'acquit': 2,
 'unemployment': -2,
 'contagions': -2,
 'bliss': 3,
 'disgust': -3,
 'enslave': -2,
 'adequate': 1,
 'jailed': -2,
 'explorations': 1,
 'settlements': 1,
 'criticizes': -2,
 'inoperative': -2,
 'stop': -1,
 'denounce': -2,
 'unfair': -2,
 'criticized': -2,
 'haunted': -2,
 'oversell': -2,
 'fuking': -4,
 'happiest': 3,
 'misreported': -2,
 'angers': -3,
 'avenges': -2,
 'avenger': -2,
 'arrests': -2,
 'rescues': 2,
 'scold': -2,
 'idiotic': -3,
 'enraging': -2,
 'illegitimate': -3,
 'heavyhearted': -2,
 'bad': -3,
 'avenged': -2,
 'ban': -2,
 'ruins': -2,
 'ethical': 2,
 'erroneous': -2,
 'mandatory': -1,
 'disaster': -2,
 'fascinating': 3,
 'thwarting': -2,
 'well-established': 2,
 'sincerest': 2,
 'horrendous': -3,
 'fail': -2,
 'disturbing': -2,
 'resigned': -1,
 'best': 3,
 'clarifies': 2,
 'fucking fantastic': 4,
 'pressured': -2,
 'inappropriate': -2,
 'chiding': -3,
 'unable': -2,
 'green wash': -3,
 'succeeding': 3,
 'meditative': 1,
 'hopefully': 2,
 'scorn': -2,
 'tolerance': 2,
 'unprofessional': -2,
 'lethargic': -2,
 'indecisive': -2,
 'lazy': -1,
 'rotflol': 4,
 'extend': 1,
 'appalling': -2,
 'felony': -3,
 'restricting': -2,
 'weak': -2,
 'ignorance': -2,
 'weary': -2,
 'outages': -2,
 'wtf': -4,
 'contamination': -2,
 'dedication': 2,
 'debt': -2,
 'improve': 2,
 'pity': -2,
 'protect': 1,
 'accident': -2,
 'disdain': -2,
 'ill': -2,
 'penalizing': -2,
 'adventures': 2,
 'demanded': -1,
 'disastrous': -3,
 'misunderstand': -2,
 'agonise': -3,
 'indoctrinating': -2,
 'monopolize': -2,
 'disappears': -1,
 'overstatements': -2,
 'peacefully': 2,
 'obscenity': -2,
 'smuggled': -2,
 'likers': 2,
 'imaginative': 2,
 'honouring': 2,
 'favoured': 2,
 'greenwashing': -3,
 'regretful': -2,
 'greenwashers': -3,
 'smuggles': -2,
 'irresponsibly': -2,
 'trust': 1,
 'penalize': -2,
 'trickery': -2,
 'parley': -1,
 'condemned': -2,
 'victimizing': -3,
 'criticised': -2,
 'bothers': -2,
 'confident': 2,
 'charisma': 2,
 'impeding': -2,
 'interest': 1,
 'criticises': -2,
 'jackass': -4,
 'incompetent': -2,
 'deportations': -2,
 'easy': 1,
 'prosperous': 3,
 'attacked': -1,
 'hospitalized': -2,
 'sunshine': 2,
 'excite': 3,
 'wish': 1,
 'wowow': 4,
 'doubts': -1,
 'haters': -3,
 'censored': -2,
 'cunt': -5,
 'gracious': 3,
 'distorts': -2,
 'unsold': -1,
 'inhibit': -1,
 'save': 2,
 'harms': -2,
 'censors': -2,
 'rigorously': 3,
 'ugly': -3,
 'slam': -2,
 'stopping': -1,
 'murders': -2,
 'inspired': 2,
 'chilling': -1,
 'mistake': -2,
 'pessimistic': -2,
 'scum': -3,
 'stink': -2,
 'vulnerability': -2,
 'congratulation': 2,
 'unresearched': -2,
 'dumped': -2,
 'faker': -3,
 'fakes': -3,
 'shame': -2,
 'dizzy': -1,
 'moping': -1,
 'motivating': 2,
 'oxymoron': -1,
 'greenwash': -3,
 'disappear': -1,
 'damaged': -3,
 'constrained': -2,
 'traps': -1,
 'misunderstood': -2,
 'disparages': -2,
 'pardoning': 2,
 'damages': -3,
 'contaminate': -2,
 'avoids': -1,
 ':((': -3,
 'naive': -2,
 'harming': -2,
 'acquitting': 2,
 'condemnation': -2,
 'effortlessly': 2,
 ':)))))))))': 4,
 'inadvertently': -2,
 'unbearable': -2,
 'blames': -2,
 'misery': -2,
 'tops': 2,
 'evil': -3,
 'greeted': 1,
 'delight': 3,
 'consents': 2,
 'outmaneuvered': -2,
 'jealous': -2,
 'terribly': -3,
 'opportunity': 2,
 'blamed': -2,
 'suitable': 2,
 'shitty': -3,
 'failing': -2,
 'suffering': -2,
 'detracted': -1,
 'savings': 1,
 '<3333': 4,
 'detained': -2,
 'scam': -2,
 'flops': -2,
 'dipshit': -3,
 'outnumbered': -2,
 'bereave': -2,
 'appease': 2,
 'sentence': -2,
 'agog': 2,
 'impairs': -2,
 'envious': -2,
 'disguise': -1,
 'thanks': 2,
 'victim': -3,
 'swears': -2,
 'breaching': -2,
 'resigning': -1,
 'yes': 1,
 'convivial': 2,
 'enlighten': 2,
 'rapturous': 4,
 'chided': -3,
 'salutes': 2,
 'ease': 2,
 'assassination': -3,
 'passive': -1,
 'chides': -3,
 'advanced': 1,
 'beloved': 3,
 'prison': -2,
 'moody': -1,
 'expelled': -2,
 'harmoniously': 2,
 'mocked': -2,
 'insensitive': -2,
 'disagreement': -2,
 'perfects': 2,
 'celebrates': 3,
 'bastards': -5,
 'unlikely': -1,
 'befit': 2,
 'dreams': 1,
 'disadvantage': -2,
 'desire': 1,
 'dignity': 2,
 'gift': 2,
 'dishonest': -2,
 'jerk': -3,
 'disillusioned': -2,
 'underestimating': -1,
 'notorious': -2,
 'tyrannical': -3,
 'self-confident': 2,
 'honored': 2,
 'drowned': -2,
 'stereotype': -2,
 'macabre': -2,
 'successfully': 3,
 'proudly': 2,
 'dead': -3,
 'coziness': 2,
 'displeasure': -2,
 'escape': -1,
 'critique': -2,
 'prick': -5,
 'bore': -2,
 'lapsed': -1,
 'contaminating': -2,
 'enemies': -2,
 'denies': -2,
 'denier': -2,
 'humor': 2,
 'combat': -1,
 'inhuman': -2,
 'creative': 2,
 'disheartened': -2,
 'honoured': 2,
 'rebellion': -2,
 'denied': -2,
 'pollutes': -2,
 'polluter': -2,
 'beating': -1,
 'adorable': 3,
 'bamboozle': -2,
 'alarmists': -2,
 'bold': 2,
 'swearing': -2,
 'blackmail': -3,
 'disappeared': -1,
 'criminals': -3,
 'mediocrity': -3,
 'losing': -3,
 'memorable': 1,
 'insanity': -2,
 'super': 3,
 'trusts': 1,
 'empowerment': 2,
 'boycotts': -2,
 'derailed': -2,
 'attacks': -1,
 'accepts': 1,
 'donation': 2,
 'blithe': 2,
 'despair': -3,
 'ensure': 1,
 'cherish': 2,
 'commit': 1,
 'disqualified': -2,
 'donates': 2,
 'dilemma': -1,
 'lied': -2,
 'wasting': -2,
 'prisoners': -2,
 'detention': -2,
 'delighted': 3,
 'obscene': -2,
 'thankful': 2,
 'accomplishments': 2,
 'donated': 2,
 'balanced': 1,
 'refined': 1,
 'stall': -2,
 'frightened': -2,
 'squander': -2,
 'strangely': -1,
 'annoyance': -2,
 'deception': -3,
 'outbreaks': -2,
 'support': 2,
 'sneezing': -2,
 'jealousy': -2,
 'unfocused': -2,
 'fight': -1,
 'derision': -2,
 'playful': 2,
 'war': -2,
 'calm': 2,
 'lowest': -1,
 'shoody': -2,
 'underestimated': -1,
 'affectionateness': 3,
 'postponing': -1,
 'overjoyed': 4,
 'fascination': 3,
 'unloved': -2,
 'deny': -2,
 'vitality': 3,
 'failure': -2,
 'underestimates': -1,
 'astonished': 2,
 'true': 2,
 'dump': -1,
 'congratulations': 2,
 'injury': -2,
 'strangled': -2,
 'heartwarming': 3,
 'bargain': 2,
 'ruined': -2,
 'adore': 3,
 'torturing': -4,
 'faking': -3,
 'braveness': 2,
 'propaganda': -2,
 'cutbacks': -2,
 'promises': 1,
 'looms': -1,
 'disgrace': -2,
 'hysterical': -3,
 'sinful': -3,
 'assassinations': -3,
 'soothe': 3,
 'dear': 2,
 'promised': 1,
 'acclaimed': 2,
 'sprightly': 2,
 'pileup': -1,
 'annoying': -2,
 'harassment': -3,
 'shit': -4,
 'oversimplification': -2,
 'discriminate': -2,
 'faithful': 3,
 'dying': -3,
 'eerie': -2,
 'perfected': 2,
 'successful': 3,
 'falsely': -2,
 'looses': -3,
 'ambitious': 2,
 'subpoena': -2,
 'confusing': -2,
 'misinterpreted': -2,
 'evergreen': 2,
 'congratulate': 2,
 'outage': -2,
 'assfucking': -4,
 'outreach': 2,
 'smile': 2,
 'funerals': -1,
 'died': -3,
 'derail': -2,
 'warn': -2,
 'apologizing': -1,
 'cheated': -3,
 'apology': -1,
 'restores': 1,
 'stout': 2,
 'stealing': -2,
 'accolade': 2,
 'cheater': -3,
 'optionless': -2,
 'menaced': -2,
 'intimacy': 2,
 'profits': 2,
 ':))': 3,
 ':/': -2,
 ':(': -2,
 ':*': 2,
 'drained': -2,
 'rob': -2,
 'relieve': 1,
 'ouch': -2,
 'post-traumatic': -2,
 'grateful': 3,
 'battle': -1,
 'polluted': -2,
 'devoted': 3,
 'soothed': 3,
 'misread': -1,
 'anticipation': 1,
 'praises': 3,
 'deceitful': -3,
 'honoring': 2,
 'sufferer': -2,
 'disapproves': -2,
 'abhorrent': -3,
 'amaze': 2,
 'praised': 3,
 'bribed': -3,
 'negativity': -2,
 'sceptical': -2,
 'terror': -3,
 'pay': -1,
 'disapproved': -2,
 'oversight': -1,
 'overwrought': -3,
 'fatigues': -2,
 'slash': -2,
 'advantage': 2,
 'apeshit': -3,
 'sloppy': -2,
 ':|': -1,
 ':}': 2,
 'gloomy': -2,
 'mourn': -2,
 'fatigued': -2,
 'uneventful': -2,
 ':{': -2,
 'luxury': 2,
 'forgiving': 1,
 ':p': 3,
 '<333333': 4,
 'cool': 1,
 'impressive': 3,
 'solves': 1,
 ':D': 3,
 'die': -3,
 'accidentally': -2,
 ':\\': -2,
 ':]': 2,
 'engrossing': 3,
 'excellence': 3,
 'denounces': -2,
 ':[': -2,
 ':P': 3,
 'unaware': -2,
 ':S': -2,
 'distasteful': -2,
 'brave': 2,
 'fearing': -2,
 'insignificant': -2,
 'irritates': -3,
 'sigh': -2,
 'obsolete': -2,
 'comforting': 2,
 'helpless': -2,
 'irritated': -3,
 'acrimonious': -3,
 'rapist': -4,
 'self-deluded': -2,
 'curse': -1,
 'victimization': -3,
 'jeopardy': -2,
 'celebrated': 3,
 'favour': 2,
 'havoc': -2,
 'pleased': 3,
 'ostracizes': -2,
 'misrepresent': -2,
 'deadening': -2,
 'saluting': 2,
 'falling': -1,
 'kidnaps': -2,
 'axed': -1,
 'supporting': 1,
 'unsure': -1,
 'honour': 2,
 'funeral': -1,
 'victimized': -3,
 'inspiring': 3,
 'deceit': -3,
 'persecuted': -2,
 'oversimplify': -2,
 'environment-friendly': 2,
 'avenging': -2,
 'accomplishes': 2,
 'coerced': -2,
 'battling': -2,
 'brilliant': 4,
 'XOXOXO': 4,
 'guilty': -3,
 'proud': 2,
 'pseudoscience': -3,
 'accomplished': 2,
 'ineffective': -2,
 'exacerbating': -2,
 'ill-fated': -2,
 'weird': -2,
 'unintentional': -2,
 'motherfucking': -5,
 'breathtaking': 5,
 'charmless': -3,
 'infract': -2,
 'humour': 2,
 'defer': -1,
 'hardier': 2,
 'bloody': -3,
 'marvelous': 3,
 'betraying': -3,
 'penalized': -2,
 'sufferers': -2,
 'fake': -3,
 'embarrassing': -2,
 'forefront': 1,
 'crisis': -3,
 ...}
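
The output above shows that the AFINN dictionary mixes single words, multi-word phrases (for example 'damn cute') and emoticons (for example ':('). A tokenizer that splits on whitespace would never produce a multi-word key, so it can be useful to group the keys before comparing AFINN with a unigram lexicon. The sketch below is one rough way to do this; the helper name key_kind and its three categories are our own illustrative choices, and the sample entries are copied from the output above.

```python
def key_kind(key):
    """Classify a lexicon key as 'phrase', 'emoticon', or 'word'.

    Rough heuristic: emoticons that contain letters, such as ':D',
    are counted as words.
    """
    if ' ' in key:
        return 'phrase'
    if not any(ch.isalpha() for ch in key):
        return 'emoticon'
    return 'word'

afinn_sample = {'damn cute': 3, 'cool stuff': 3, ':(': -2, '<3333': 4,
                'delicious': 3, 'worst': -3}

# Count how many keys fall into each category
counts = {}
for key in afinn_sample:
    kind = key_kind(key)
    counts[kind] = counts.get(kind, 0) + 1
print(counts)
```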

Create the Yelp Sentiment dictionary


In [8]:
# Map each AFFLEX word to its score, mirroring the {term: score} format of the AFINN dictionary
Yelp_sentiment_dictionary = YelpSentimentAFFLEX.set_index('word')['score'].to_dict()
Yelp_sentiment_dictionary


Out[8]:
{'gai': 1.6890000000000001,
 'mid-week': 0.16800000000000001,
 'woods': 0.90200000000000002,
 'hanging': 0.13300000000000001,
 'woody': 0.59899999999999998,
 'northsight': -0.33299999999999996,
 'comically': -0.67400000000000004,
 'frou-frou': 0.35999999999999999,
 'cake-': 0.871,
 'originality': -0.098000000000000004,
 'calpico': 1.0529999999999999,
 'fattiness': 0.24199999999999999,
 'rawhide': -0.45100000000000001,
 'bringing': 0.089999999999999997,
 'tcby': -0.16300000000000001,
 'revelers': -0.45100000000000001,
 'caramels': 0.89300000000000002,
 'grueling': 0.155,
 'broiler': -0.90300000000000002,
 'caramely': 0.80200000000000005,
 'condessa': -0.29699999999999999,
 'wednesday': 0.56799999999999995,
 'broiled': 0.54600000000000004,
 'crotch': -0.044999999999999998,
 'stereotypical': -0.64800000000000002,
 'caramelo': 1.159,
 'bbqs': 0.35999999999999999,
 'chimichuri': 0.24199999999999999,
 "roscoe's": 0.92299999999999993,
 "tom's": -0.33799999999999997,
 'scrapes': 0.64800000000000002,
 '270': -1.3259999999999998,
 'francesca': 0.17800000000000002,
 "server's": -0.76500000000000001,
 'snuggled': 1.0529999999999999,
 'errors': -2.0600000000000001,
 'deferred': 0.24199999999999999,
 'cooking': 0.28000000000000003,
 'salsify': 0.109,
 'designing': 0.035000000000000003,
 'replaced': -0.64000000000000001,
 'succumb': 0.53000000000000003,
 'shocks': 0.109,
 'ching': -0.58399999999999996,
 'china': 0.20499999999999999,
 'hand-pulled': 1.5640000000000001,
 'wagyu': 1.679,
 'affiliated': -1.5119999999999998,
 'chino': 0.51700000000000002,
 'wiseguy': 1.4790000000000001,
 'natured': 0.109,
 'reggae': 1.107,
 'kids': 0.018000000000000002,
 'uplifting': 0.56100000000000005,
 'robata': 0.69700000000000006,
 'cupful': 0.64800000000000002,
 'controversy': -0.16300000000000001,
 'spotty': -0.122,
 'golden': 0.45700000000000002,
 'cobblers': -0.13200000000000001,
 'projection': -0.17499999999999999,
 "chilli's": -0.58399999999999996,
 'orangish': -0.044999999999999998,
 "justin's": -0.044999999999999998,
 'stern': -0.80799999999999994,
 'dna': 0.32200000000000001,
 'portugal': 0.24199999999999999,
 'catchy': -0.76900000000000002,
 'music': 0.41100000000000003,
 'therefore': -0.20199999999999999,
 'grahams': 0.64800000000000002,
 'yahoo': 0.90400000000000003,
 ',,,,': 0.088000000000000009,
 'circumstances': -1.03,
 'intake': 0.318,
 'locked': -1.0409999999999999,
 'tomahawk': -0.45100000000000001,
 "india's": 0.64800000000000002,
 'meat-eating': 1.254,
 'pints': 0.68200000000000005,
 'grass-fed': 0.57899999999999996,
 'kaprow': 0.80200000000000005,
 'w/rice': 0.871,
 '********': -0.96200000000000008,
 'want': 0.050999999999999997,
 'pinto': 0.73099999999999998,
 'cookery': -0.044999999999999998,
 'absolute': 0.66099999999999992,
 'mcmuffin': -0.76900000000000002,
 'sugarless': 0.64800000000000002,
 'travel': 0.40100000000000002,
 'copious': -0.128,
 "badman's": 0.80200000000000005,
 'barstools': 0.16800000000000001,
 'dared': -0.61299999999999999,
 'topings': -0.45100000000000001,
 'tutorial': 0.24199999999999999,
 'caldo': 0.83700000000000008,
 'fookin': -0.29699999999999999,
 'modest': 0.56100000000000005,
 'tomato-based': -1.026,
 'colorfully': 1.0529999999999999,
 'sickening': -1.6299999999999999,
 'tulip': 0.64800000000000002,
 '18th': 0.28300000000000003,
 'horribly': -2.319,
 'crackin': 1.0529999999999999,
 'hophead': 0.64800000000000002,
 'welcomed': 1.022,
 "tam's": 1.0529999999999999,
 'whizzing': -0.29699999999999999,
 'partnered': 0.17800000000000002,
 'tillamook': 0.35999999999999999,
 "luc's": -0.16300000000000001,
 'rewarded': 0.622,
 'stabbed': -0.53799999999999992,
 'welcomes': 1.9909999999999999,
 'fir': -1.01,
 'wickedly': 0.871,
 "yasu's": 0.72799999999999998,
 'screaming': -0.77700000000000002,
 'fix': -0.105,
 'non-vegan': 0.35999999999999999,
 'fig': 1.681,
 'fuddruckers': -0.59099999999999997,
 'fin': 1.0979999999999999,
 'fil': 0.217,
 'vouchers': -0.65799999999999992,
 'top-quality': 0.24199999999999999,
 'r+d': -0.7390000000000001,
 "pistachio's": 0.64800000000000002,
 'over-rated': -3.4789999999999996,
 'sixteen': 0.155,
 'chandler/gilbert': 1.254,
 'saddened': -0.68700000000000006,
 'laxpudding': 1.341,
 'wooden': 0.59200000000000008,
 'bartop': -0.45100000000000001,
 'arrow': 0.72799999999999998,
 'arroz': 0.371,
 'kincaids': -1.48,
 'windmill': 1.421,
 'allah': 0.64800000000000002,
 'oohed': 0.64800000000000002,
 'allan': 0.64800000000000002,
 'phoenicians': 0.21600000000000003,
 'grapefruits': -0.044999999999999998,
 'veracruz': 0.33799999999999997,
 "greek's": 0.80200000000000005,
 'touts': -1.01,
 'oprah': -0.51500000000000001,
 'smirk': -1.0390000000000001,
 'scrumptiously': 0.80200000000000005,
 'enviroment': 1.107,
 'mason': 1.0149999999999999,
 'encourage': 0.7659999999999999,
 "joe's!": 2.1880000000000002,
 'adapt': -0.044999999999999998,
 'frenchie': -0.60499999999999998,
 'pumpkins': 1.0529999999999999,
 'estimate': -0.71700000000000008,
 'universally': -0.13200000000000001,
 'chlorine': -0.7390000000000001,
 'silent': -0.27200000000000002,
 'competes': 1.5640000000000001,
 'sickeningly': -2.6839999999999997,
 'paneng': 0.80200000000000005,
 'disturbed': -1.2490000000000001,
 'competed': -1.837,
 'vermecelli': 0.24199999999999999,
 'loudness': -0.45100000000000001,
 'chronic': -0.29699999999999999,
 'breed': 0.035000000000000003,
 'aurelios': 0.80200000000000005,
 'filibertos': -0.747,
 'kfc': -1.357,
 'chevron': -0.044999999999999998,
 'sisig': 0.64800000000000002,
 'renovated': 0.68400000000000005,
 'service': -0.026000000000000002,
 'zinfandel': 0.98799999999999999,
 'reuben': 0.40399999999999997,
 'needed': -0.23399999999999999,
 'mind-boggling': -0.7390000000000001,
 'olde': 0.64800000000000002,
 'rewards': 1.008,
 'ph\xef\xbf\xbd': 1.0529999999999999,
 'complainers': 0.109,
 'chain-restaurant': 0.64800000000000002,
 'positively': 0.28499999999999998,
 'anniversaries': 1.5640000000000001,
 'non-beer': 0.64800000000000002,
 'idly': -0.58399999999999996,
 'idle': -1.395,
 'sheen': 0.080000000000000002,
 'shampoos': -0.044999999999999998,
 'feeling': 0.054000000000000006,
 "oscar's": 0.996,
 "wanda's": 0.24199999999999999,
 "lulu's": 0.109,
 "sarah's": 0.35999999999999999,
 'politely': -1.508,
 'spectrum': 0.36700000000000005,
 'thaw': -1.5490000000000002,
 'dozen': 0.254,
 'affairs': -0.044999999999999998,
 'scraped': -1.115,
 'wholesome': 1.0940000000000001,
 '2.75': -0.7390000000000001,
 'shipments': -0.044999999999999998,
 'committing': 0.32200000000000001,
 'sugarcane': -0.044999999999999998,
 'limitless': 0.53000000000000003,
 'disjointed': -0.98999999999999999,
 'mouth': 0.59999999999999998,
 'conceded': 0.109,
 'singer': 0.254,
 '-will': 0.109,
 'tech': -0.16300000000000001,
 'paneling': 0.64800000000000002,
 'scream': -0.55899999999999994,
 'saying': -0.67799999999999994,
 'condesa': 0.39100000000000001,
 'braising': 0.93500000000000005,
 'post-work': 0.32200000000000001,
 'teresa': 0.56100000000000005,
 'padded': 0.84799999999999998,
 'fennel': 1.145,
 'toys-r-us': 0.64800000000000002,
 'cheaply': -0.76900000000000002,
 'eliminated': -0.97199999999999998,
 'orleans': 0.41200000000000003,
 'flowers': 1.232,
 'clicked': -1.3109999999999999,
 "baby's": 0.20000000000000001,
 'rico': 0.82499999999999996,
 'lube': -1.367,
 'bliss': 1.0109999999999999,
 'rick': 0.019,
 'rich': 1.3799999999999999,
 'rice': 0.052000000000000005,
 'rica': 0.46500000000000002,
 'plate': -0.16600000000000001,
 'waaaaaay': -0.53799999999999992,
 'plato': 1.9909999999999999,
 '7-11': -0.45100000000000001,
 '7-10': -0.25700000000000001,
 "tomaso's": 0.753,
 'altogether': -0.68099999999999994,
 'rugged': 0.64800000000000002,
 'nicely': 0.57700000000000007,
 'boarder': -0.67400000000000004,
 'pretzel': 0.69400000000000006,
 'patch': 0.29100000000000004,
 'american-ized': 0.64800000000000002,
 'boarded': -0.67400000000000004,
 'circling': -0.044999999999999998,
 'heirloom': 1.2590000000000001,
 'clarified': -0.60499999999999998,
 'sensitivity': -0.45100000000000001,
 'pinon': 0.93500000000000005,
 'pinot': 1.0979999999999999,
 '10:00': -0.60499999999999998,
 'pinos': 0.93500000000000005,
 'pinoy': -0.29699999999999999,
 '48th': -0.17199999999999999,
 'lots': 0.76300000000000001,
 'droves': 0.24199999999999999,
 'irk': -1.5490000000000002,
 'letting': -0.377,
 'extend': 0.28699999999999998,
 'nature': 0.32400000000000001,
 '99cents': -0.78700000000000003,
 'optimist': -0.044999999999999998,
 'lapping': -0.044999999999999998,
 'extent': -0.66900000000000004,
 'gobble': 1.629,
 'tendons': -0.70200000000000007,
 'cus': -1.1440000000000001,
 'sweet/salty': 1.0529999999999999,
 '-rice': -1.3259999999999998,
 'veer': -0.16300000000000001,
 "cork's": -0.58399999999999996,
 'heating': -0.75599999999999989,
 'incense': 0.20000000000000001,
 'lot-': 0.64800000000000002,
 'himalayan': 0.80200000000000005,
 'mortified': -1.9709999999999999,
 'pasteles': -0.96200000000000008,
 'fresheezy': 0.80200000000000005,
 'humming': 0.89300000000000002,
 'frc': 0.93500000000000005,
 'fra': 0.191,
 'underdone': -1.3259999999999998,
 'milquetoast': -2.9360000000000004,
 'union': 0.32899999999999996,
 'fri': 0.498,
 'fro': 0.91299999999999992,
 'fair-trade': 1.159,
 '.': -0.0080000000000000002,
 'bothers': -0.155,
 'much': -0.18899999999999997,
 'sommelier': 0.83200000000000007,
 '-they': 0.96599999999999997,
 'fru': 0.39600000000000002,
 'fry': 0.36299999999999999,
 'tallest': -0.58399999999999996,
 'cub': -1.837,
 'obese': -0.76900000000000002,
 'retrospect': -1.2209999999999999,
 'spit': -1.4099999999999999,
 'spic': 0.64800000000000002,
 'dave': 0.17899999999999999,
 'doubts': -0.11900000000000001,
 'spin': 0.83400000000000007,
 'wildcat': 0.32200000000000001,
 'professionally': -0.45100000000000001,
 'employ': -0.40600000000000003,
 'real-deal': 0.93500000000000005,
 'reunion': 0.34799999999999998,
 'k': 0.111,
 'climbed': -0.13200000000000001,
 'saltiness': 0.33799999999999997,
 'expat': -0.29699999999999999,
 'conditioned': 0.55100000000000005,
 'eighteen': 0.24199999999999999,
 'conditioner': -0.67400000000000004,
 'oxymoron': -0.16300000000000001,
 'insatiable': 0.56100000000000005,
 'hone': 1.0529999999999999,
 'hong': 0.80200000000000005,
 'ogether': 0.20600000000000002,
 'portobella': 0.24199999999999999,
 'portobello': 0.60499999999999998,
 'split': 0.66799999999999993,
 "jungle's": 0.83700000000000008,
 "caf\xef\xbf\xbd's": 0.93500000000000005,
 'effortlessly': -0.26899999999999996,
 'low-grade': -2.2430000000000003,
 'qdoba': -0.314,
 'marched': -1.298,
 'refil': -1.1440000000000001,
 'supper': 0.88,
 "foodie's": 1.8,
 'nuanced': 1.159,
 '!!!!!!!!': -0.17100000000000001,
 'academic': 0.46500000000000002,
 'goofing': -2.125,
 'corporate': -0.77800000000000002,
 'appropriately': 0.19699999999999998,
 '????': -1.484,
 'hushpuppies': 0.24199999999999999,
 'lassi': 1.0529999999999999,
 'hah': -0.45100000000000001,
 'hai': -0.33299999999999996,
 'upsides': -0.044999999999999998,
 'ham': 0.42999999999999999,
 'out-of-town': 0.109,
 'haa': -0.044999999999999998,
 'had': -0.026000000000000002,
 'hag': -1.6140000000000001,
 'hay': -0.188,
 'waffels': 1.0529999999999999,
 'beloved': 0.39600000000000002,
 '2-3x': 0.35999999999999999,
 'has': 0.36200000000000004,
 'hat': -0.096999999999999989,
 'hav': -0.67400000000000004,
 'haw': 0.93500000000000005,
 "t's": -0.34499999999999997,
 'municipal': 1.254,
 "boylan's": 1.0529999999999999,
 'elders': 0.24199999999999999,
 'confection': 2.1519999999999997,
 'online': -0.29899999999999999,
 "rocket's": 0.109,
 'unequivocally': 0.46500000000000002,
 'mexican/latin': -0.044999999999999998,
 'otherworldly': 0.35999999999999999,
 'indicative': -0.17899999999999999,
 'shadow': -0.311,
 '12:30': -0.35600000000000004,
 "american's": -0.044999999999999998,
 'delievery': 0.64800000000000002,
 'kazmierz': 0.64800000000000002,
 'non-menu': 0.64800000000000002,
 'alice': -0.19399999999999998,
 'gangplank': 0.109,
 'festivities': 0.217,
 'unabashedly': 0.24199999999999999,
 'beneficial': 0.24199999999999999,
 'crowd': 0.32100000000000001,
 'czech': 0.93500000000000005,
 'crown': 0.28800000000000003,
 'begin': -0.64500000000000002,
 'crows': -0.13200000000000001,
 'karmeliet': 0.93500000000000005,
 'banchan': 1.8519999999999999,
 'hot/cold': 1.0529999999999999,
 'bottom': -0.623,
 'plucked': 0.13699999999999998,
 'yummie': 1.9909999999999999,
 'treadmill': 2.2230000000000003,
 'rep': -0.92900000000000005,
 'binder': -0.317,
 'starring': -0.20000000000000001,
 'benches': 0.79000000000000004,
 'cheese-filled': 0.64800000000000002,
 'benched': -0.45100000000000001,
 'stoked': 0.065000000000000002,
 'mini-shopping': 0.80200000000000005,
 'maxwell': 1.4950000000000001,
 'marshall': 1.341,
 'honeymoon': 1.901,
 'mba': 0.93500000000000005,
 'mbb': 1.8519999999999999,
 'beings': -0.53799999999999992,
 'mangos': 0.90400000000000003,
 'corn-fed': 0.64800000000000002,
 'grossest': -4.0880000000000001,
 'appealingly': 0.80200000000000005,
 'panninis': 1.159,
 'suffice': -0.39700000000000002,
 'daves': -0.39700000000000002,
 'tame': 0.376,
 'grasping': -1.1440000000000001,
 'greatness': 0.41399999999999998,
 'avocados': 0.376,
 "harlow's": 0.64800000000000002,
 'robeks': 0.871,
 'thesaurus': 0.64800000000000002,
 'verde': 0.83900000000000008,
 'peperoni': 0.0050000000000000001,
 'significantly': -0.745,
 'paneled': 0.56100000000000005,
 'humbled': 1.254,
 "else's": -0.77500000000000002,
 'smashes': 0.109,
 'azz': 0.24199999999999999,
 'open-air': 0.96599999999999997,
 'nicest': 1.3730000000000002,
 'servings': 0.745,
 'smashed': -0.30399999999999999,
 'duet': -0.16300000000000001,
 'refillable': 1.421,
 'passenger': -0.044999999999999998,
 'disgrace': -3.2230000000000003,
 'calzones': 1.1879999999999999,
 'bourdain': 0.080000000000000002,
 'females': -1.2009999999999998,
 'triangles': 0.109,
 'microwaving': -1.655,
 'friendly/helpful': 1.0529999999999999,
 'eventual': -0.96200000000000008,
 'cambodia': 0.61399999999999999,
 'pasadena': 2.1519999999999997,
 'role': -0.61899999999999999,
 'macaroon': 1.577,
 'roll': -0.042000000000000003,
 'intend': 0.45100000000000001,
 'palms': 0.24199999999999999,
 'transported': 1.1000000000000001,
 'acquaintance': -0.39000000000000001,
 'intent': -0.069000000000000006,
 'smelling': -0.28100000000000003,
 'variable': -1.026,
 'explosions': 1.254,
 'loren': 0.72799999999999998,
 "mike's": -0.13699999999999998,
 'president': -0.95299999999999996,
 'steelhead': 1.254,
 'gown': 0.109,
 '5:15': -0.64800000000000002,
 'cincinnati': 0.059999999999999998,
 'corps': 0.80200000000000005,
 'whoever': -0.52100000000000002,
 'osp': 1.341,
 'bandito': 0.24199999999999999,
 'bandits': 0.64800000000000002,
 'chair': -0.58099999999999996,
 'osf': -0.72099999999999997,
 'tilt': -0.58399999999999996,
 'ballet': 1.9909999999999999,
 'raining': -0.17899999999999999,
 "when's": 0.24199999999999999,
 'crates': -0.044999999999999998,
 'camarones': 0.85400000000000009,
 'macho': 0.14400000000000002,
 'oversight': -1.097,
 'machi': -0.044999999999999998,
 'over-the-top': 0.129,
 'portlandia': 1.0529999999999999,
 'jerk': 0.31900000000000001,
 'choice': 0.65900000000000003,
 'embark': -0.044999999999999998,
 'gloomy': -0.22800000000000001,
 'stays': 0.58399999999999996,
 'exact': -0.43700000000000006,
 'minute': -0.40799999999999997,
 "stacy's": 0.109,
 'cooks': -0.27600000000000002,
 'az88': 0.12,
 'minnie': 1.4950000000000001,
 'skewed': 0.24199999999999999,
 'skewer': 0.070999999999999994,
 'meadow': 0.24199999999999999,
 'trails': 1.159,
 '11:45': -0.188,
 'gnawed': 0.109,
 'chopping': 0.019,
 'trisha': 0.64800000000000002,
 'adorns': 1.4950000000000001,
 'bagging': -1.655,
 'bakeshop': 0.39600000000000002,
 'chimichanga': -0.02,
 "mother-in-law's": -0.96200000000000008,
 'celebrated': 1.1990000000000001,
 'ground': -0.33299999999999996,
 'celebrates': 1.254,
 'unintentionally': 0.64800000000000002,
 'oldies': 0.27300000000000002,
 'address': -0.38500000000000001,
 'chit-chat': -0.52500000000000002,
 'xtreme': -0.02,
 'benson': 0.64800000000000002,
 'mafioso': 0.93500000000000005,
 'dusty': -0.51500000000000001,
 'impacted': -0.55600000000000005,
 'queue': -0.0090000000000000011,
 'accomplished': 0.17800000000000002,
 'sprouted': 0.059999999999999998,
 'influx': -0.35600000000000004,
 'pibil': 0.59599999999999997,
 'tamp': 0.109,
 'randoms': -0.044999999999999998,
 'omelets': 0.998,
 'umph': 1.159,
 "grandpa's": 0.64800000000000002,
 'undergone': 0.59099999999999997,
 'working': -0.13300000000000001,
 'hyatt': -0.314,
 '!!!!!!': 0.222,
 'opposed': 0.313,
 'pavilions': 0.059999999999999998,
 'raspados': 1.9909999999999999,
 'ooooooo': 0.64800000000000002,
 'kreamery': 0.93500000000000005,
 'thompson': -0.25700000000000001,
 'cookouts': 0.64800000000000002,
 'riders': 0.30299999999999999,
 'charbroil': 0.93500000000000005,
 'originally': 0.253,
 'abortion': -2.3969999999999998,
 'service-wise': -0.13200000000000001,
 'harmonious': 1.421,
 'following': 0.072000000000000008,
 'admired': -0.65799999999999992,
 'mirrors': -0.13,
 'stetson': 1.0609999999999999,
 "gigi's": -0.33299999999999996,
 'locks': -0.35600000000000004,
 'matzo': 0.28800000000000003,
 'parking-': -0.044999999999999998,
 'admirer': 0.80200000000000005,
 'matza': 0.93500000000000005,
 'listens': 1.746,
 'litre': 0.93500000000000005,
 'thanking': 0.88,
 'mintues': -1.0640000000000001,
 'casualness': 0.35999999999999999,
 'teavana': 0.214,
 'fueled': 0.13699999999999998,
 'tqla': 0.0090000000000000011,
 'surfing': 1.2229999999999999,
 'conscious': 0.29600000000000004,
 'mango': 0.753,
 'swollen': -1.1440000000000001,
 'mange': 0.109,
 'pulled': 0.059999999999999998,
 'manga': 1.0529999999999999,
 'spicier': 1.0390000000000001,
 'webpage': 0.24199999999999999,
 'years': 0.19500000000000001,
 'professors': 1.5640000000000001,
 'yearn': 0.33799999999999997,
 'jig': -0.044999999999999998,
 'disconnect': -0.98999999999999999,
 'did-': 0.64800000000000002,
 'greeks': 0.13699999999999998,
 'jim': 0.83200000000000007,
 'troubles': -0.45100000000000001,
 'faves': 3.306,
 'no-brainer': 0.38200000000000001,
 'modestly': 0.28300000000000003,
 'dpov': 0.24199999999999999,
 'recipients': -0.044999999999999998,
 'indigenous': 0.46500000000000002,
 'biersch': -0.54899999999999993,
 "babaloo's": 1.159,
 'shhh': 1.0529999999999999,
 'hickory': 1.2229999999999999,
 'plunk': -0.45100000000000001,
 'awkwardly': -1.2869999999999999,
 'tortilleria': 1.159,
 'fisherman': 0.109,
 'quarter': -0.40500000000000003,
 'quartet': 0.24199999999999999,
 'salado': -0.044999999999999998,
 'retrieve': -1.2549999999999999,
 'bursting': 1.857,
 'salade': 1.159,
 'receipt': -1.8619999999999999,
 'over-sized': -0.26899999999999996,
 'sponsor': 0.059999999999999998,
 'entering': -0.002,
 'receipe': 0.64800000000000002,
 'salads': 0.72499999999999998,
 'disasters': -1.3259999999999998,
 'seriously': -0.21100000000000002,
 'trauma': 0.24199999999999999,
 'internet': 0.126,
 '(?)': -0.028999999999999998,
 'salad-': 0.39600000000000002,
 'disintegrated': -2.6480000000000001,
 'incentives': 0.80200000000000005,
 'inwardly': 0.64800000000000002,
 'crazies': -0.044999999999999998,
 'crazier': -0.044999999999999998,
 'grandma': 0.32400000000000001,
 '6:20': -0.7390000000000001,
 'marla': 0.64800000000000002,
 'downsides': 2.306,
 "couldn't!": -0.044999999999999998,
 'quibble': 1.901,
 'zuni': 0.64800000000000002,
 'neglect': -1.655,
 'emotion': -0.55600000000000005,
 'saving': -1.1279999999999999,
 'ono': 0.311,
 'spoken': -0.70200000000000007,
 'tolteca': 0.45000000000000001,
 "our's": -0.45100000000000001,
 'bomb-diggity': -0.16300000000000001,
 'one': -0.0040000000000000001,
 'tamari': -0.67400000000000004,
 'oogave': 0.64800000000000002,
 'mignon': 1.0309999999999999,
 'ons': -0.29699999999999999,
 'affords': -0.55600000000000005,
 'looong': 0.035000000000000003,
 "boston's": -0.57600000000000007,
 '1.50': 0.14400000000000002,
 'magnifique': 1.5640000000000001,
 'lingering': -0.027999999999999997,
 'pablano': 0.505,
 'on-': 0.39600000000000002,
 'shawn': 0.80200000000000005,
 'brewpub': 0.214,
 'snatch': 0.13699999999999998,
 'herds': 0.64800000000000002,
 'absorbs': 1.341,
 "con's:": 0.56100000000000005,
 'non-stop': -0.39000000000000001,
 'so-so': -1.1990000000000001,
 'rehab': 0.17800000000000002,
 'oooohhhh': 1.0529999999999999,
 'proactive': 0.13699999999999998,
 'illness': -1.2390000000000001,
 'stylings': 0.059999999999999998,
 'sumptuous': 0.871,
 'turned': -0.70200000000000007,
 'locations': 0.33200000000000002,
 'jewels': 0.53000000000000003,
 'uninterrupted': 1.159,
 'turner': 0.109,
 'politicos': -0.044999999999999998,
 '50cents': -0.33299999999999996,
 '$40.00': -0.45100000000000001,
 'zoe': -0.45100000000000001,
 'fashionable': -0.044999999999999998,
 'warriors': 0.035000000000000003,
 'straight-up': 0.42499999999999999,
 'zoo': 0.254,
 'evened': 0.35999999999999999,
 'fashionably': 0.64800000000000002,
 'printer': -1.5490000000000002,
 'opposite': -0.32200000000000001,
 'buffer': 0.93500000000000005,
 'discerning': 0.77800000000000002,
 'spewing': -2.125,
 'buffet': -0.12,
 'location-': 0.53000000000000003,
 'printed': -0.53299999999999992,
 'unpronounceable': 0.93500000000000005,
 'touchy': 0.109,
 'phil': -0.044999999999999998,
 'yucatan': 0.109,
 'ooohhh': 0.93500000000000005,
 'wynn': -0.044999999999999998,
 'inconsistent': -1.2,
 'aggressive': -1.696,
 'imagined': 0.70599999999999996,
 'ensembles': 0.64800000000000002,
 'recomended': 0.871,
 'try-': 1.159,
 'rejoiced': 1.0529999999999999,
 'simplistic': 0.80200000000000005,
 'monde': 0.64800000000000002,
 'awaiting': -0.0069999999999999993,
 'trappist': 0.80200000000000005,
 'pimp': -0.51500000000000001,
 'was/is': 0.80200000000000005,
 'trys': -0.89300000000000002,
 'tiki': 0.70200000000000007,
 "could've": -0.53600000000000003,
 'pima': 0.045999999999999999,
 'tika': 0.46500000000000002,
 'hotness': 0.64800000000000002,
 'vision': 0.318,
 '!!!!!!!!!!!!!!!!!': 1.0529999999999999,
 'enthralled': 0.46500000000000002,
 'impressions': -0.68599999999999994,
 'intoxicating': 1.9159999999999999,
 "alex's": -0.044999999999999998,
 'refresher': 0.16800000000000001,
 '830': -0.96200000000000008,
 'harvey': -0.45100000000000001,
 'dofino': 1.0529999999999999,
 'refreshed': 1.008,
 'enjoys': 1.0209999999999999,
 'awards': -0.07400000000000001,
 'concentrated': -0.22800000000000001,
 'busting': -0.28699999999999998,
 'circumference': -0.58399999999999996,
 'tastless': -1.6140000000000001,
 'fit': 0.433,
 'millionaire': -0.33299999999999996,
 'pre-show': -0.044999999999999998,
 'paring': 2.1139999999999999,
 's': 0.052999999999999999,
 'workplace': 0.035000000000000003,
 'unflattering': -0.98999999999999999,
 "can't": 0.375,
 'loveliest': 0.24199999999999999,
 'spitfire': 0.30299999999999999,
 "shady's": 1.159,
 'glutinous': -0.35600000000000004,
 'anthropologie': 1.0529999999999999,
 'west': 0.68200000000000005,
 'breckenridge': 1.421,
 '$8.50': -0.23899999999999999,
 'wants': -0.44700000000000001,
 'mekong': -0.059999999999999998,
 'formed': -0.45100000000000001,
 'readings': 0.56100000000000005,
 'photos': 0.28600000000000003,
 'wilco': 0.80200000000000005,
 'former': -0.0060000000000000001,
 'rarities': -0.044999999999999998,
 'consulted': -0.25700000000000001,
 'squeezes': 0.64800000000000002,
 'ives': 0.93500000000000005,
 'd-lish': 1.5640000000000001,
 "mcduffy's": 0.35999999999999999,
 'squeezed': 0.67799999999999994,
 'situation': -0.91799999999999993,
 'penthouse': 0.64800000000000002,
 'paht': -0.45100000000000001,
 'engaged': 1.327,
 'dubious': -0.59699999999999998,
 'moreso': 0.53000000000000003,
 'technology': 0.61399999999999999,
 "poliberto's": 0.93500000000000005,
 'verified': -0.88200000000000001,
 'deeeelish': 1.6890000000000001,
 'otto': -0.29699999999999999,
 'o-m-g': 1.421,
 'visually': -0.28800000000000003,
 'edged': 0.24199999999999999,
 'hideaway': 0.79299999999999993,
 'deft': 1.0529999999999999,
 'defy': 0.035000000000000003,
 'edges': -0.35700000000000004,
 'amuck': -0.044999999999999998,
 'deff': 1.0529999999999999,
 'advertisement': -1.2009999999999998,
 'teepee': -0.94299999999999995,
 'tracking': -0.89300000000000002,
 'yummmmmmy': 1.0529999999999999,
 'nothin': -0.39700000000000002,
 'yummmmmmm': 1.254,
 'sixer': -0.58399999999999996,
 'machacha': 0.64800000000000002,
 'dimension': 1.0249999999999999,
 'effects': -0.090999999999999998,
 'self-parking': 0.64800000000000002,
 'steamed': 0.24100000000000002,
 'being': -0.19500000000000001,
 'recycled': 0.24199999999999999,
 'steamer': -0.16300000000000001,
 'rover': 0.64800000000000002,
 'grounded': 0.035000000000000003,
 'excuses': -1.169,
 'location-wise': 0.64800000000000002,
 'haystack': -0.0050000000000000001,
 'dicks': 0.27699999999999997,
 'gestured': -1.732,
 'marshmellow': 0.60499999999999998,
 'sums': -1.0959999999999999,
 'unveil': 0.64800000000000002,
 'sumo': -0.33299999999999996,
 'traffic': 0.184,
 'preference': 0.71599999999999997,
 'sorry-': -2.2430000000000003,
 'world': 0.84799999999999998,
 'postal': -0.33299999999999996,
 'pizzaria': 0.59899999999999998,
 'sensational': 1.6890000000000001,
 "mojito's": 1.4950000000000001,
 'shutter': -1.704,
 'seating': 0.81499999999999995,
 "couldn't": -0.60999999999999999,
 'grub': 0.91099999999999992,
 'tvp': -0.044999999999999998,
 'tvs': 0.25600000000000001,
 'diving': 0.001,
 'stagecoach': -0.29699999999999999,
 'divine': 2.1230000000000002,
 'yesss': 0.93500000000000005,
 'excelent': 1.0529999999999999,
 'scuffed': 0.80200000000000005,
 'scottie': -0.044999999999999998,
 'drizzed': 0.64800000000000002,
 'well-spiced': 0.24199999999999999,
 'litttle': 0.64800000000000002,
 'restoring': 1.254,
 'retains': 0.17800000000000002,
 "partner's": -1.367,
 'leadership': -0.67400000000000004,
 'phoenix-metro': 1.341,
 'thailand': 0.26200000000000001,
 'luxurious': 1.107,
 'demarco': 0.56100000000000005,
 'paninis': 1.3330000000000002,
 'johnston': 1.341,
 'perturbed': -1.4319999999999999,
 'antidote': 0.64800000000000002,
 'rosse': 0.109,
 'bjs': -0.7390000000000001,
 'lively': 1.28,
 'bubbly': 0.56499999999999995,
 'gleam': 0.64800000000000002,
 'ooey-gooey': 0.39600000000000002,
 'lounging': 0.94900000000000007,
 'mindless': -0.29699999999999999,
 'missy': -2.125,
 'sealed': -0.29699999999999999,
 'brazilian': 1.6200000000000001,
 'bubble': 0.42299999999999999,
 'trusses': 0.80200000000000005,
 'continents': 0.80200000000000005,
 "child's": -1.1440000000000001,
 "sf's": 0.80200000000000005,
 'chilies': 0.622,
 'nickle': -1.462,
 'miss-': -0.044999999999999998,
 'pull': -0.34200000000000003,
 'rush': 0.0080000000000000002,
 'rage': -0.36099999999999999,
 'tripe': 0.90799999999999992,
 'on-tap': -0.044999999999999998,
 'tripa': 1.0529999999999999,
 'rags': -0.85599999999999998,
 'dirty': -1.389,
 'ragu': -0.63300000000000001,
 'asst': -2.8789999999999996,
 'trips': 0.314,
 'rust': -1.4319999999999999,
 'butthole': -1.704,
 'gratuitous': -0.16300000000000001,
 'serve-yourself': 0.109,
 'watches': -0.67400000000000004,
 'watcher': 0.80200000000000005,
 'ensuing': 0.155,
 'follow-up': -0.7390000000000001,
 'watched': -0.91200000000000003,
 'after-work': 0.35999999999999999,
 'cream': 0.60299999999999998,
 'yoga': 2.423,
 'ideally': 0.68900000000000006,
 'yogi': 0.17800000000000002,
 'unparalleled': 2.29,
 'friggin': -0.30399999999999999,
 'puppy': 0.434,
 'refunded': -2.1970000000000001,
 'waving': -1.4180000000000001,
 'sheepishly': -1.1440000000000001,
 'closeby': -0.67400000000000004,
 'tricky': 1.5269999999999999,
 'natalie': 0.46500000000000002,
 'tricks': 0.442,
 'dyed': 0.24199999999999999,
 'caused': -1.131,
 'beware': -0.26600000000000001,
 'landlocked': 0.109,
 "bertha's!": 1.0529999999999999,
 'c-fu': 0.14400000000000002,
 'acknowledging': -1.9909999999999999,
 'causes': -0.55600000000000005,
 'nori': 0.24199999999999999,
 'midwest': 0.46000000000000002,
 'norm': 0.31900000000000001,
 'checkin': 1.5640000000000001,
 'floated': -0.35600000000000004,
 '24th': 0.46500000000000002,
 'moines': -0.044999999999999998,
 'sans': -0.13,
 'shenanigans': -0.371,
 "lai's": 0.64800000000000002,
 'developing': -0.54600000000000004,
 'boozer': 0.80200000000000005,
 'prosciutto': 1.0549999999999999,
 'small': 0.13300000000000001,
 'sano': 0.64800000000000002,
 'chipolte': -0.044999999999999998,
 'sank': 1.9909999999999999,
 'abbreviated': -0.29699999999999999,
 'quicker': 0.01,
 'gnochi': -0.58399999999999996,
 'paso': 1.298,
 'food/atmosphere': 0.64800000000000002,
 'healed': 0.64800000000000002,
 'past': -0.22800000000000001,
 'displays': 0.53400000000000003,
 'pass': -0.62,
 'investment': 0.14400000000000002,
 'anywho': 0.29499999999999998,
 'clock': -0.51300000000000001,
 'corked': -0.29699999999999999,
 'prevailed': 1.0529999999999999,
 'pre-dinner': 0.59099999999999997,
 '----------------------': 0.64800000000000002,
 'full': 0.42200000000000004,
 'hash': 0.153,
 'diapers': -0.51500000000000001,
 'november': 0.14199999999999999,
 'experience': -0.14000000000000001,
 'prior': -0.40700000000000003,
 'ohhhhh': 1.4950000000000001,
 'skepticism': 0.109,
 "friday's": 0.115,
 'followed': 0.215,
 'reclaimed': 0.51400000000000001,
 'traumatized': -2.6480000000000001,
 'follower': 1.159,
 'full-sized': 1.107,
 'attendance': 0.33200000000000002,
 "tito's": 0.64800000000000002,
 'morn': -0.29699999999999999,
 "patio's": 1.8,
 'teeny-tiny': -0.45100000000000001,
 'reisling': 0.20000000000000001,
 'more': 0.091999999999999998,
 'door': -0.13,
 'initiated': -0.044999999999999998,
 'tester': -0.96200000000000008,
 'company': 0.08900000000000001,
 'corrected': -0.88200000000000001,
 'chucks': 0.088000000000000009,
 'tested': 0.27000000000000002,
 'dranks': -0.29699999999999999,
 'uncool': -0.7390000000000001,
 'negativity': -1.298,
 'leary': 1.254,
 'thickest': 0.019,
 'w/this': -0.044999999999999998,
 'learn': -0.28399999999999997,
 'knocked': 0.16800000000000001,
 'scramble': 0.45399999999999996,
 'winemaker': 0.64800000000000002,
 'vampiros': 0.56100000000000005,
 "schlotzky's": -0.16300000000000001,
 'bogo': 0.46500000000000002,
 'huge': 0.72499999999999998,
 'respective': 0.059999999999999998,
 'gringa': -0.58399999999999996,
 'hugo': -0.16300000000000001,
 'hugh': -0.58399999999999996,
 'dismissed': -1.732,
 'hugs': -0.16300000000000001,
 'higley': 0.27699999999999997,
 'sprinkle': 0.11800000000000001,
 'intended': 0.151,
 'thickened': -0.29699999999999999,
 'phabulous': 0.80200000000000005,
 'tamarindo': 1.5640000000000001,
 'fluffed': 0.64800000000000002,
 'jiang': 0.64800000000000002,
 'funday': 2.0340000000000003,
 'resemble': -0.41299999999999998,
 ...}

Observations

We note that the two dictionaries contain emoticons, that their word entries are lower-cased, and that verbs appear in various tenses. We also note that the Yelp Sentiment dictionary contains both correctly spelled and misspelled words.

Run the Sentiment Detection Algorithm with Dictionaries and NLTK's TweetTokenizer

As the restaurant reviews in our dataset contain emoticons, we choose NLTK's TweetTokenizer to tokenize the reviews so as not to sacrifice the emoticons, which carry key review sentiment. We acknowledge, however, that sarcasm may be difficult to detect with a simple dictionary method. We also lower-case all words and emoticons, as both dictionaries contain exclusively lower-case words and as the AFINN dictionary contains both upper-case emoticons and their lower-case equivalents. For ease of comprehension, we choose not to combine all the sentiment-score summing lines in the same algorithm. We acknowledge that combining these summing lines in a single algorithm may save significant computational resources with larger datasets.
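The idea of combining the summing lines can be sketched as follows: tokenize each review once, then sum its scores under every dictionary in the same pass. This is only a minimal illustration under stated assumptions, not the code used in this kernel. The function name `score_with_lexicons`, the toy lexicons and their values are hypothetical, and `str.split` is a stand-in tokenizer so that the sketch runs without NLTK.

```python
def score_with_lexicons(text, lexicons, tokenize=str.split):
    """Tokenize a review once and sum its score under every lexicon.

    `lexicons` maps a label to a {token: score} dict. `tokenize` is any
    callable that returns a list of tokens; str.split is only a stand-in.
    """
    tokens = tokenize(text.lower())  # lower-case first, as in the kernel
    return {
        label: sum(lexicon.get(token, 0) for token in tokens)
        for label, lexicon in lexicons.items()
    }


# Toy lexicons with hypothetical values (not taken from the real dictionaries).
toy_yelp = {"delicious": 1.5, "dirty": -1.25, ":)": 0.75}
toy_afinn = {"delicious": 3, "dirty": -2, ":)": 2}

scores = score_with_lexicons(
    "Delicious curry :) but the table was dirty",
    {"yelp": toy_yelp, "afinn": toy_afinn},
)
print(scores)
```

In the kernel itself, one would presumably pass `TweetTokenizer().tokenize` as the `tokenize` argument and the two loaded dictionaries as `lexicons`, so that each review is tokenized only once regardless of the number of dictionaries.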

Extract Sentiment Scores from Reviews Using Algorithm, TweetTokenizer and Yelp Restaurant Review Sentiment Dictionary


In [9]:
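from nltk.tokenize import TweetTokenizer

# TweetTokenizer keeps emoticons such as ":)" as single tokens.
tknzr = TweetTokenizer()
# Row access by position relies on the default 0..n-1 integer index.
for row in range(len(ThaiTextByUniqueUserBiz)):
    n = ThaiTextByUniqueUserBiz.loc[row, 'text']
    # Lower-case the review so that tokens match the lower-cased dictionary keys.
    words = tknzr.tokenize(n.lower())
    # Sum the scores of known tokens; tokens absent from the dictionary count as 0.
    ThaiTextByUniqueUserBiz.loc[row,"YRsentiment"] = sum(Yelp_sentiment_dictionary.get(word, 0) for word in words)
ThaiTextByUniqueUserBiz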


Out[9]:
user_id business_id text YRsentiment
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i... 43.539
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce... -9.641
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo... 0.105
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ... -16.096
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-... 5.802
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama... 12.067
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ... 14.353
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad... 3.156
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite! 4.719
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm... 15.490
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ... 16.382
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p... 13.661
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ... 11.081
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ... 10.130
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best... 30.053
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i... 10.124
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own... -3.684
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O... 17.372
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ... 9.072
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway... -12.582
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha... 5.605
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ... 33.168
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre... 22.912
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was... -28.467
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil... 18.329
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f... 21.893
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f... -9.253
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li... 38.506
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an... 15.682
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ... 40.291
... ... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little... 19.861
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a... 18.955
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w... 10.533
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food... 10.771
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ... -0.849
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca... -4.963
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ... -2.737
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo... 16.930
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh... 17.177
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ... 7.038
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ... 3.414
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ... 46.203
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur... 77.339
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T... 34.365
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi... 16.308
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the... 0.719
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm... 9.741
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ... 6.660
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val... 20.621
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp... 10.904
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo... 6.922
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura... -31.274
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'... 30.269
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here.... 14.770
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp... 30.799
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr... 13.315
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ... 10.809
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret... -1.059
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight... 22.703

10911 rows × 4 columns

Convert Yelp Review Sentiment Scores to a 1-to-5 Scale Using the Round Function without Decimals


In [10]:
# Linearly rescale the raw Yelp Restaurant Lexicon scores onto the 1-to-5 star range,
# then round to the nearest whole star.
OldMax = ThaiTextByUniqueUserBiz['YRsentiment'].max()
OldMin = ThaiTextByUniqueUserBiz['YRsentiment'].min()
NewMax = 5
NewMin = 1
OldRange = (OldMax - OldMin)
NewRange = (NewMax - NewMin)
# Vectorized over the whole column instead of looping row by row with .loc
ThaiTextByUniqueUserBiz['YRSentScore'] = (((ThaiTextByUniqueUserBiz['YRsentiment'] - OldMin) * NewRange / OldRange) + NewMin).round(decimals=0)
ThaiTextByUniqueUserBiz


Out[10]:
user_id business_id text YRsentiment YRSentScore
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i... 43.539 3
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce... -9.641 3
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo... 0.105 3
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ... -16.096 3
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-... 5.802 3
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama... 12.067 3
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ... 14.353 3
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad... 3.156 3
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite! 4.719 3
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm... 15.490 3
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ... 16.382 3
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p... 13.661 3
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ... 11.081 3
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ... 10.130 3
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best... 30.053 3
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i... 10.124 3
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own... -3.684 3
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O... 17.372 3
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ... 9.072 3
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway... -12.582 3
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha... 5.605 3
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ... 33.168 3
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre... 22.912 3
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was... -28.467 3
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil... 18.329 3
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f... 21.893 3
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f... -9.253 3
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li... 38.506 3
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an... 15.682 3
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ... 40.291 3
... ... ... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little... 19.861 3
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a... 18.955 3
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w... 10.533 3
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food... 10.771 3
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ... -0.849 3
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca... -4.963 3
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ... -2.737 3
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo... 16.930 3
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh... 17.177 3
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ... 7.038 3
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ... 3.414 3
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ... 46.203 3
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur... 77.339 4
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T... 34.365 3
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi... 16.308 3
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the... 0.719 3
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm... 9.741 3
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ... 6.660 3
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652 3
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val... 20.621 3
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp... 10.904 3
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo... 6.922 3
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura... -31.274 3
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'... 30.269 3
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here.... 14.770 3
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp... 30.799 3
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr... 13.315 3
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ... 10.809 3
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret... -1.059 3
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight... 22.703 3

10911 rows × 5 columns
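
The conversion in In [10] is a min-max (linear) mapping from the observed range of YRsentiment onto the 1-to-5 star range. As a minimal sketch, the same mapping can be written as a reusable, vectorized helper. The function name rescale_to_stars and the toy values below are illustrative assumptions and do not come from the notebook.

```python
import pandas as pd

def rescale_to_stars(scores, new_min=1.0, new_max=5.0):
    """Linearly map a numeric Series onto [new_min, new_max] (hypothetical helper)."""
    old_min, old_max = scores.min(), scores.max()
    old_range = old_max - old_min
    if old_range == 0:
        # All scores identical: the mapping is undefined, so use the midpoint.
        return pd.Series((new_min + new_max) / 2.0, index=scores.index)
    return (scores - old_min) * (new_max - new_min) / old_range + new_min

# Toy sentiment scores (made up for illustration)
toy = pd.DataFrame({"YRsentiment": [-30.0, -10.0, 10.0, 50.0]})
toy["YRSentScore"] = rescale_to_stars(toy["YRsentiment"]).round()
print(toy)
```

Because the mapping depends only on the column minimum and maximum, a single very long or very negative review changes the star value assigned to every other review.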

Convert Yelp Review Sentiment Scores to a 1-to-5 Scale with 0.5 Increments


In [11]:
# Same linear rescaling as above, but keep half-star resolution.
# Note: np.ceil rounds UP to the next 0.5 step (e.g. 3.1 becomes 3.5).
OldMax = ThaiTextByUniqueUserBiz['YRsentiment'].max()
OldMin = ThaiTextByUniqueUserBiz['YRsentiment'].min()
NewMax = 5
NewMin = 1
OldRange = (OldMax - OldMin)
NewRange = (NewMax - NewMin)
# Vectorized over the whole column instead of looping row by row with .loc
ThaiTextByUniqueUserBiz['YRSentScore2'] = 0.5 * np.ceil(2 * (((ThaiTextByUniqueUserBiz['YRsentiment'] - OldMin) * NewRange / OldRange) + NewMin))
ThaiTextByUniqueUserBiz


Out[11]:
user_id business_id text YRsentiment YRSentScore YRSentScore2
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i... 43.539 3 3.5
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce... -9.641 3 3.0
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo... 0.105 3 3.0
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ... -16.096 3 3.0
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-... 5.802 3 3.0
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama... 12.067 3 3.5
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ... 14.353 3 3.5
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad... 3.156 3 3.0
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite! 4.719 3 3.0
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm... 15.490 3 3.5
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ... 16.382 3 3.5
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p... 13.661 3 3.5
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ... 11.081 3 3.5
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ... 10.130 3 3.5
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best... 30.053 3 3.5
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i... 10.124 3 3.5
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own... -3.684 3 3.0
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O... 17.372 3 3.5
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ... 9.072 3 3.5
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway... -12.582 3 3.0
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha... 5.605 3 3.0
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ... 33.168 3 3.5
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre... 22.912 3 3.5
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was... -28.467 3 3.0
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil... 18.329 3 3.5
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f... 21.893 3 3.5
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f... -9.253 3 3.0
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li... 38.506 3 3.5
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an... 15.682 3 3.5
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ... 40.291 3 3.5
... ... ... ... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little... 19.861 3 3.5
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a... 18.955 3 3.5
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w... 10.533 3 3.5
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food... 10.771 3 3.5
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ... -0.849 3 3.0
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca... -4.963 3 3.0
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ... -2.737 3 3.0
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo... 16.930 3 3.5
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh... 17.177 3 3.5
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ... 7.038 3 3.0
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ... 3.414 3 3.0
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ... 46.203 3 3.5
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur... 77.339 4 4.0
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T... 34.365 3 3.5
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi... 16.308 3 3.5
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the... 0.719 3 3.0
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm... 9.741 3 3.5
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ... 6.660 3 3.0
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652 3 3.0
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val... 20.621 3 3.5
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp... 10.904 3 3.5
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo... 6.922 3 3.0
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura... -31.274 3 3.0
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'... 30.269 3 3.5
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here.... 14.770 3 3.5
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp... 30.799 3 3.5
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr... 13.315 3 3.5
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ... 10.809 3 3.5
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret... -1.059 3 3.0
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight... 22.703 3 3.5

10911 rows × 6 columns
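
In [11] obtains the 0.5 increments with np.ceil, which rounds every rescaled value up to the next half star, whereas In [10] rounds to the nearest whole star. The short sketch below contrasts rounding up with rounding to the nearest half star on a few made-up rescaled values. The nearest-half variant is shown only as an alternative for comparison and is not what the notebook uses.

```python
import numpy as np

# Hypothetical values already rescaled to the 1-to-5 range
raw = np.array([1.0, 3.1, 3.3, 3.6, 5.0])

# Expression used in In [11]: round up to the next 0.5 step
ceil_half = 0.5 * np.ceil(2 * raw)

# Alternative (assumption, not in the notebook): nearest 0.5 step
nearest_half = 0.5 * np.round(2 * raw)

print(ceil_half)     # [1.  3.5 3.5 4.  5. ]
print(nearest_half)  # [1.  3.  3.5 3.5 5. ]
```

Rounding up shifts scores upward by as much as half a star, which matters when the converted scores are later compared with user star ratings.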

Extract Sentiment Scores from Reviews Using Algorithm, TweetTokenizer and AFINN Dictionary


In [12]:
from nltk.tokenize import TweetTokenizer
tknzr = TweetTokenizer()
# Score each review as the sum of the AFINN valences of its lowercased tokens;
# tokens missing from sentiment_dictionary contribute 0.
ThaiTextByUniqueUserBiz['AFINNSentScore'] = ThaiTextByUniqueUserBiz['text'].apply(
    lambda review: sum(sentiment_dictionary.get(word, 0) for word in tknzr.tokenize(review.lower())))
ThaiTextByUniqueUserBiz


Out[12]:
user_id business_id text YRsentiment YRSentScore YRSentScore2 AFINNSentScore
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i... 43.539 3 3.5 22
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce... -9.641 3 3.0 -1
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo... 0.105 3 3.0 5
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ... -16.096 3 3.0 7
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-... 5.802 3 3.0 21
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama... 12.067 3 3.5 15
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ... 14.353 3 3.5 13
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad... 3.156 3 3.0 6
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite! 4.719 3 3.0 2
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm... 15.490 3 3.5 13
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ... 16.382 3 3.5 6
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p... 13.661 3 3.5 2
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ... 11.081 3 3.5 11
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ... 10.130 3 3.5 10
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best... 30.053 3 3.5 31
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i... 10.124 3 3.5 2
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own... -3.684 3 3.0 2
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O... 17.372 3 3.5 19
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ... 9.072 3 3.5 2
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway... -12.582 3 3.0 -2
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha... 5.605 3 3.0 9
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ... 33.168 3 3.5 39
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre... 22.912 3 3.5 13
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was... -28.467 3 3.0 3
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil... 18.329 3 3.5 11
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f... 21.893 3 3.5 12
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f... -9.253 3 3.0 -2
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li... 38.506 3 3.5 32
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an... 15.682 3 3.5 11
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ... 40.291 3 3.5 28
... ... ... ... ... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little... 19.861 3 3.5 11
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a... 18.955 3 3.5 23
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w... 10.533 3 3.5 23
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food... 10.771 3 3.5 9
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ... -0.849 3 3.0 12
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca... -4.963 3 3.0 4
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ... -2.737 3 3.0 6
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo... 16.930 3 3.5 12
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh... 17.177 3 3.5 10
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ... 7.038 3 3.0 3
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ... 3.414 3 3.0 8
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ... 46.203 3 3.5 32
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur... 77.339 4 4.0 40
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T... 34.365 3 3.5 9
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi... 16.308 3 3.5 20
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the... 0.719 3 3.0 1
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm... 9.741 3 3.5 11
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ... 6.660 3 3.0 4
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652 3 3.0 0
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val... 20.621 3 3.5 8
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp... 10.904 3 3.5 6
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo... 6.922 3 3.0 5
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura... -31.274 3 3.0 1
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'... 30.269 3 3.5 23
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here.... 14.770 3 3.5 20
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp... 30.799 3 3.5 19
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr... 13.315 3 3.5 7
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ... 10.809 3 3.5 8
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret... -1.059 3 3.0 4
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight... 22.703 3 3.5 24

10911 rows × 7 columns
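
The AFINN score in In [12] is a plain word-level sum: each token is looked up in sentiment_dictionary and unknown tokens count as 0. The sketch below reproduces that logic on toy inputs, assuming sentiment_dictionary maps lowercase words to integer valences. The lexicon entries here are illustrative values rather than the AFINN file itself, and a simple regular expression stands in for TweetTokenizer so the example has no external dependencies.

```python
import re

# Illustrative entries only; the real AFINN list is loaded elsewhere in the notebook
toy_lexicon = {"love": 3, "good": 3, "bad": -3, "worst": -3}

def lexicon_score(text, lexicon):
    """Sum the valence of every token found in the lexicon (unknown tokens score 0)."""
    tokens = re.findall(r"[a-z']+", text.lower())  # stand-in tokenizer (assumption)
    return sum(lexicon.get(tok, 0) for tok in tokens)

print(lexicon_score("I LOVE this place, good food!", toy_lexicon))  # 6
print(lexicon_score("Worst pad thai ever", toy_lexicon))            # -3
print(lexicon_score("The food was not good", toy_lexicon))          # 3
```

The last call illustrates two properties of a summed word score: negation is ignored, and because the sum is not normalized by length, longer reviews tend to reach larger absolute scores.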

Convert AFINN Sentiment Scores to a 1-to-5 Scale with 0.5 Increments


In [13]:
# Linearly rescale the raw AFINN sums onto the 1-to-5 star range with half-star resolution.
# Note: np.ceil rounds UP to the next 0.5 step.
OldMax = ThaiTextByUniqueUserBiz['AFINNSentScore'].max()
OldMin = ThaiTextByUniqueUserBiz['AFINNSentScore'].min()
NewMax = 5
NewMin = 1
OldRange = (OldMax - OldMin)
NewRange = (NewMax - NewMin)
# Vectorized over the whole column instead of looping row by row with .loc
ThaiTextByUniqueUserBiz['AFINNSentScore2'] = 0.5 * np.ceil(2 * (((ThaiTextByUniqueUserBiz['AFINNSentScore'] - OldMin) * NewRange / OldRange) + NewMin))
ThaiTextByUniqueUserBiz


Out[13]:
user_id business_id text YRsentiment YRSentScore YRSentScore2 AFINNSentScore AFINNSentScore2
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i... 43.539 3 3.5 22 2.0
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce... -9.641 3 3.0 -1 2.0
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo... 0.105 3 3.0 5 2.0
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ... -16.096 3 3.0 7 2.0
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-... 5.802 3 3.0 21 2.0
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama... 12.067 3 3.5 15 2.0
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ... 14.353 3 3.5 13 2.0
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad... 3.156 3 3.0 6 2.0
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite! 4.719 3 3.0 2 2.0
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm... 15.490 3 3.5 13 2.0
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ... 16.382 3 3.5 6 2.0
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p... 13.661 3 3.5 2 2.0
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ... 11.081 3 3.5 11 2.0
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ... 10.130 3 3.5 10 2.0
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best... 30.053 3 3.5 31 2.5
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i... 10.124 3 3.5 2 2.0
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own... -3.684 3 3.0 2 2.0
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O... 17.372 3 3.5 19 2.0
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ... 9.072 3 3.5 2 2.0
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway... -12.582 3 3.0 -2 2.0
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha... 5.605 3 3.0 9 2.0
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ... 33.168 3 3.5 39 2.5
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre... 22.912 3 3.5 13 2.0
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was... -28.467 3 3.0 3 2.0
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil... 18.329 3 3.5 11 2.0
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f... 21.893 3 3.5 12 2.0
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f... -9.253 3 3.0 -2 2.0
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li... 38.506 3 3.5 32 2.5
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an... 15.682 3 3.5 11 2.0
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ... 40.291 3 3.5 28 2.0
... ... ... ... ... ... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little... 19.861 3 3.5 11 2.0
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a... 18.955 3 3.5 23 2.0
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w... 10.533 3 3.5 23 2.0
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food... 10.771 3 3.5 9 2.0
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ... -0.849 3 3.0 12 2.0
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca... -4.963 3 3.0 4 2.0
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ... -2.737 3 3.0 6 2.0
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo... 16.930 3 3.5 12 2.0
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh... 17.177 3 3.5 10 2.0
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ... 7.038 3 3.0 3 2.0
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ... 3.414 3 3.0 8 2.0
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ... 46.203 3 3.5 32 2.5
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur... 77.339 4 4.0 40 2.5
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T... 34.365 3 3.5 9 2.0
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi... 16.308 3 3.5 20 2.0
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the... 0.719 3 3.0 1 2.0
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm... 9.741 3 3.5 11 2.0
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ... 6.660 3 3.0 4 2.0
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652 3 3.0 0 2.0
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val... 20.621 3 3.5 8 2.0
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp... 10.904 3 3.5 6 2.0
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo... 6.922 3 3.0 5 2.0
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura... -31.274 3 3.0 1 2.0
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'... 30.269 3 3.5 23 2.0
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here.... 14.770 3 3.5 20 2.0
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp... 30.799 3 3.5 19 2.0
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr... 13.315 3 3.5 7 2.0
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ... 10.809 3 3.5 8 2.0
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret... -1.059 3 3.0 4 2.0
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight... 22.703 3 3.5 24 2.0

10911 rows × 8 columns
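
In the rows displayed in Out[13], nearly every AFINNSentScore2 value is 2.0 or 2.5. One plausible cause is that a few extreme reviews stretch the minimum and maximum used by the linear mapping, which compresses the remaining scores into a narrow band. As a hedged sketch of one possible remedy, which is not used in this notebook, the scores could be clipped to chosen percentiles before rescaling. The helper name rescale_clipped and the 5th and 95th percentile cut-offs are assumptions made for illustration.

```python
import pandas as pd

def rescale_clipped(scores, lower_q=0.05, upper_q=0.95, new_min=1.0, new_max=5.0):
    """Clip scores to [lower_q, upper_q] quantiles, then map linearly onto [new_min, new_max]."""
    lo, hi = scores.quantile(lower_q), scores.quantile(upper_q)
    if hi == lo:
        # Degenerate case: no spread left after clipping, so use the midpoint.
        return pd.Series((new_min + new_max) / 2.0, index=scores.index)
    clipped = scores.clip(lower=lo, upper=hi)
    return (clipped - lo) * (new_max - new_min) / (hi - lo) + new_min

# Toy scores with one extreme value at each end (made up for illustration)
toy_scores = pd.Series([-100.0, 0.0, 5.0, 10.0, 15.0, 20.0, 200.0])
stars = rescale_clipped(toy_scores)
print(stars.round(2))
```

Whether clipping is appropriate depends on how the converted scores are used downstream; it trades fidelity at the extremes for more spread among typical reviews.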

Read the Thai Review Ratings File and Merge It with the Sentiment Score DataFrame (ThaiTextByUniqueUserBiz)

Read the Thai Review Ratings File


In [14]:
ThaiReviewRatingsByUserBiz = pd.read_pickle('ThaiReviewRatingsByUserBiz.pkl')
ThaiReviewRatingsByUserBiz


Out[14]:
user_id business_id review_ratings
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw 5
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg 1
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A 3
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ 3
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ 5
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA 4
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ 4
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg 3
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ 5
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw 5
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ 5
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ 4
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A 5
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ 5
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA 5
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw 3
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg 2
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA 5
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw 5
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ 1
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw 5
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA 4
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ 4
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA 1
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw 3
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w 4
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ 4
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ 5
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ 4
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw 3
... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg 4
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg 4
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w 2
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw 5
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA 3
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA 4
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ 2
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw 5
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw 5
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw 1
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA 3
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w 5
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg 5
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA 4
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw 5
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A 1
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ 4
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw 4
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ 3
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA 5
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og 5
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw 4
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA 1
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w 5
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ 5
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw 4
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og 5
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw 3
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA 2
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw 5

10911 rows × 3 columns

Merge Ratings to Sentiment Scores in the ThaiTextByUniqueUserBiz Dataframe


In [15]:
ThaiTextByUniqueUserBiz = pd.merge(ThaiTextByUniqueUserBiz, ThaiReviewRatingsByUserBiz, how='left', on=['user_id', 'business_id'])
ThaiTextByUniqueUserBiz


Out[15]:
user_id business_id text YRsentiment YRSentScore YRSentScore2 AFINNSentScore AFINNSentScore2 review_ratings
0 --65q1FpAL_UQtVZ2PTGew JiLK9QPjd53pOBEAaY83lw I'm a big fan of this place and have dropped i... 43.539 3 3.5 22 2.0 5
1 --ijvARuRJhZrBdS9_jF2A ApUCpJ9aa6yVgsde16gYrg Food was ok but the service was less than exce... -9.641 3 3.0 -1 2.0 1
2 --ohLoec6PU9_yxhbIlVWg 2XXwiASSS6685OhWWnIt_A I got the Penang curry and have to say the foo... 0.105 3 3.0 5 2.0 3
3 --qEXbk-cA0HmbPyhcffdA CVos739DJ06t8-dNiRMyeQ To sum up in one sentence: "I only go to Thai ... -16.096 3 3.0 7 2.0 3
4 --qEXbk-cA0HmbPyhcffdA jQST5lkLGX9L52-A10TGTQ I LOVE THIS PLACE!\r\r\n\r\r\nIt's a cute mom-... 5.802 3 3.0 21 2.0 5
5 -0fMBkX7QvWKQrtOp7H-GQ 3rqoxOasrRKxNubxjLSElA The food was delicious and the service was ama... 12.067 3 3.5 15 2.0 4
6 -2EuoueswhqEERWezJY8gw cInzGnaFZ3EIItvFXl1MvQ My Girlfriend and I eat here occasionally and ... 14.353 3 3.5 13 2.0 4
7 -2Ig3GSBkj8JQT8eETmDPg d-YNxMKL6ZhkiRhfUPxKHg Very friendly family business. We had the pad... 3.156 3 3.0 6 2.0 3
8 -3WzrbWjnaKg2QWAsouy_g jQST5lkLGX9L52-A10TGTQ Yellow curry w/tofu is my favorite! 4.719 3 3.0 2 2.0 5
9 -45GJdo8Ye8A1AStuUZp9Q -SNpLwJNup8N96yq7sBJyw Excellent food, reasonable price and great atm... 15.490 3 3.5 13 2.0 5
10 -4c_mgQdLH5axJ3j2In5_Q WPmamMTGAmNYXGoXW1mWyQ This place has the best green curry and spicy ... 16.382 3 3.5 6 2.0 5
11 -7R1u0HzHKmhLy9qE2MBpw KTF-E3NfkJy2wiwcgOPyVQ Just ordered carry-out while at work and was p... 13.661 3 3.5 2 2.0 4
12 -8BqfYouq3o_UoazAQWwNw a_wK-2KhPu-8DAwwRObr8A Came here for lunch time was sat quickly, the ... 11.081 3 3.5 11 2.0 5
13 -8BqfYouq3o_UoazAQWwNw jQST5lkLGX9L52-A10TGTQ This place is one of the best thai places you ... 10.130 3 3.5 10 2.0 5
14 -8gRkiYaVm3zfoQ4pcg75w UxiSHVZxMdey7vRwm1fQyA The food and the service was the absolute best... 30.053 3 3.5 31 2.5 5
15 -8pbvWZH7Czk9YW1UkW4Ng MDtjD14H1sGLc4tSg0sUhw The service was fast, and friendly. The thai i... 10.124 3 3.5 2 2.0 3
16 -9-fkZ72_Qg4E6YYYXMqSg 4nnMgD9X62YrMqkQKhx-Pg Ate here for first time. My wife's parents own... -3.684 3 3.0 2 2.0 2
17 -9g6w1xoj6-4iZH29P3h7g kGEW4XXJQ2FS94gZv_N7VA This is one of the best places in Phoenix!! O... 17.372 3 3.5 19 2.0 5
18 -A01aSKVuOm42FnhvOCdKA JiLK9QPjd53pOBEAaY83lw Cannot say enough about this place!! It's our ... 9.072 3 3.5 2 2.0 5
19 -ARd7byPUILfnFVlKcn0Yg wct7rZKyZqZftzmAU-vhWQ We used to go here a lot and service was alway... -12.582 3 3.0 -2 2.0 1
20 -BQFGG_hrORLkEs8oigCjg qcm7pfIdNn9XBuPEtoogbw Fantastic ambiance and food, I'm excited to ha... 5.605 3 3.0 9 2.0 5
21 -BVv1TDLLphHzgKw-eAJJQ fJzKYljToXOauSohw9cMIA I read the reviews and thought I should check ... 33.168 3 3.5 39 2.5 4
22 -BVv1TDLLphHzgKw-eAJJQ puFrm8eNizztqaWr_e32pQ As a resident of San Fransisco for almost thre... 22.912 3 3.5 13 2.0 4
23 -BrtOvg4tL1xcaQnQTaZow oXQmAzFj_qKNhUGYGNWLSA So first time here and the combo appetizer was... -28.467 3 3.0 3 2.0 1
24 -CyBQG3dc4UnpluY7UdMOA ujLZmyy11g1JHCQTxRA3Dw i visited on jan 19 and ordered the thai basil... 18.329 3 3.5 11 2.0 3
25 -DRza4wuHHWfQx5HcG6qaw NCtzWkMbE13r2M2Sg0wH9w We have been several times and have positive f... 21.893 3 3.5 12 2.0 4
26 -E7e4sTuVAHwwWjQYBG07w 5W48_DnrXVD7EbtmE4pxOQ I came in 15 minutes before close and got my f... -9.253 3 3.0 -2 2.0 4
27 -EFuxDYchSSVkb4Q9Iivpg shCdCHRbnY5FTMJbWl-myQ Can I get a wha what? This place ROCKS! I li... 38.506 3 3.5 32 2.5 5
28 -EyEj5BujVFisco6OwmR8A puFrm8eNizztqaWr_e32pQ A friend wanted to meet for an early dinner an... 15.682 3 3.5 11 2.0 4
29 -F32Vl8Rk4dwsmk0f2wRIw NH67MdKaFGNcP-dlu56pyw I ordered take-out from Thai Elephant tonight ... 40.291 3 3.5 28 2.0 3
... ... ... ... ... ... ... ... ... ...
10881 zkhOTlhe6dn-jrwDpYDN6Q LbBxrQJl-ny02-eCM1LYNg I grew up eating amazing Thai food at a little... 19.861 3 3.5 11 2.0 4
10882 zlcHQII8dyI8I0LHGj8nOA 90AXjqb4O-wrTHDKDoDUzg Came in on a Sunday evening, place was quiet a... 18.955 3 3.5 23 2.0 4
10883 zlcHQII8dyI8I0LHGj8nOA AsX-6ECbV83zGJLUVMre9w Roasted duck was way too salty. Papaya salad w... 10.533 3 3.5 23 2.0 2
10884 zlneJ82kppmQXOUGHqCLaQ RlfX4muX5LfJsvmI9qWGvw Amazing service, food, & decor. Best Thai food... 10.771 3 3.5 9 2.0 5
10885 zmZtT1T6-J4NcqP8j1L5jA Nz_AasmpsQ8MLSqhCTRVoA Great food and phenomenal service, but please ... -0.849 3 3.0 12 2.0 3
10886 zmZtT1T6-J4NcqP8j1L5jA joxWCp6dgN-kTE9GMziwjA Unfortunately I can't give this place a 5 beca... -4.963 3 3.0 4 2.0 4
10887 zn81QpflLDUaGZkCMUowCg 8qrICL2tS2Rq7b5gxUdQwQ I am not sure why there were great reviews of ... -2.737 3 3.0 6 2.0 2
10888 zn81QpflLDUaGZkCMUowCg I1rvqU2k5UQGo2lGdY6hyw Great authentic Thai food. So glad to have fo... 16.930 3 3.5 12 2.0 5
10889 zn81QpflLDUaGZkCMUowCg NH67MdKaFGNcP-dlu56pyw Ordered drunken noodles with seafood last nigh... 17.177 3 3.5 10 2.0 5
10890 zn81QpflLDUaGZkCMUowCg qyNtVViurIcChc35mfYIEw Just shoot me. A Touch of Thai is just that: ... 7.038 3 3.0 3 2.0 1
10891 zn81QpflLDUaGZkCMUowCg vtQOervVVTXjhvSZQiZ6PA Red curry was watery and rather average. Hot ... 3.414 3 3.0 8 2.0 3
10892 znDOmt2ifMXWiAkrhjiuig AsX-6ECbV83zGJLUVMre9w I love Thai food. It is my favorite food. The ... 46.203 3 3.5 32 2.5 5
10893 zp-DF3qfvOn5ko_vjpQLOg KPoTixdjoJxSqRSEApSAGg The Wild Thaiger is the best kept Thai restaur... 77.339 4 4.0 40 2.5 5
10894 zqHznU4iL06NziZIEGWHJw lliksv-tglfUz1T3B3vgvA I now understand what all the hype is about. T... 34.365 3 3.5 9 2.0 4
10895 zrBmkDDLS94GYexyp0LyqQ NH67MdKaFGNcP-dlu56pyw FANTASTIC. Service? 5 stars. Friendly, effi... 16.308 3 3.5 20 2.0 5
10896 zrO1ENicvYdPsQk8ykJOkg a1t31qMLd5fQocEjbSJ61A Saw that there were some mixed reviews for the... 0.719 3 3.0 1 2.0 1
10897 zs1msKnmTFD3iV2u69USuA PXmR1MgOAWB066XH20HjxQ Went here on a recent trip to AZ at the recomm... 9.741 3 3.5 11 2.0 4
10898 zs6wQGh1r726ZzaNKRa-bw JiLK9QPjd53pOBEAaY83lw Delicious. The vegetable penang curry was so ... 6.660 3 3.0 4 2.0 4
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652 3 3.0 0 2.0 3
10900 zs6wQGh1r726ZzaNKRa-bw xcxkEmy4CD-qaJUqprvpHA This is my favorite Thai restaurant in the Val... 20.621 3 3.5 8 2.0 5
10901 zssolmdLpaX1tpRRMDWYwg 0udEgNqy5rLR5pZ4kD19Og Yum Yum!\r\r\nI love that they actually use sp... 10.904 3 3.5 6 2.0 5
10902 zulNp3NWnv7sYODNZ1Xrow o15PeOAUzpcCl8ngk0lMHw One of my favorite places for thai takeout! Lo... 6.922 3 3.0 5 2.0 4
10903 zuoeE7GdXXlCgr995ImWfQ xcxkEmy4CD-qaJUqprvpHA Worst experience I have ever had at a restaura... -31.274 3 3.0 1 2.0 1
10904 zv4i7JjhI9v9j4ZzX7TGDw NCtzWkMbE13r2M2Sg0wH9w LOVE THIS PLACE! I've lived in NY and LA so I'... 30.269 3 3.5 23 2.0 5
10905 zvFDYEFo_xO8VqLQfmB-DA shCdCHRbnY5FTMJbWl-myQ Shopped at the market and decided to eat here.... 14.770 3 3.5 20 2.0 5
10906 zw5NmE_epbvJ22xOYLdIoQ cBwc3dhdHw0emmg9nd5SXw Decor is very nice and clean. In a small shopp... 30.799 3 3.5 19 2.0 4
10907 zxQaAt4awDFVWme2I9mFgg 0udEgNqy5rLR5pZ4kD19Og My friends and I joke that the Pad Thai has cr... 13.315 3 3.5 7 2.0 5
10908 zxcrlC3cmH5S2TGIxuLwBw MDtjD14H1sGLc4tSg0sUhw There really aren't that many Thai food spots ... 10.809 3 3.5 8 2.0 3
10909 zxcrlC3cmH5S2TGIxuLwBw kGEW4XXJQ2FS94gZv_N7VA This was my go-to for delivery Thai food. Pret... -1.059 3 3.0 4 2.0 2
10910 zyor9BbfHNjTsaRFfePRwQ apGVTRZRCQ9-89hu2qW-vw Had dinner at the Bangkok Thai Bar-B-Q tonight... 22.703 3 3.5 24 2.0 5

10911 rows × 9 columns
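The left merge above keeps all 10,911 rows, which suggests that each (user_id, business_id) pair appears only once in each dataframe. A minimal sketch of how that could be checked explicitly is given below. It uses the `validate` and `indicator` options of `pd.merge` on a toy dataframe; the short IDs (`u1`, `b1`, ...) are hypothetical placeholders, not values from the dataset.

```python
import pandas as pd

# Toy stand-ins for the two dataframes (IDs are made up for illustration).
scores = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "business_id": ["b1", "b2", "b1"],
    "YRsentiment": [43.539, -9.641, 0.105],
})
ratings = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "business_id": ["b1", "b2", "b1"],
    "review_ratings": [5, 1, 3],
})

# validate="one_to_one" raises MergeError if a key pair is duplicated on
# either side; indicator=True adds a "_merge" column that flags rows with
# no match in the ratings dataframe as "left_only".
merged = pd.merge(
    scores,
    ratings,
    how="left",
    on=["user_id", "business_id"],
    validate="one_to_one",
    indicator=True,
)

unmatched = int((merged["_merge"] == "left_only").sum())
print(len(merged), "rows,", unmatched, "without a rating")
```

If the row count after the merge equals the row count before it and no row is flagged `left_only`, the merge neither duplicated reviews nor left ratings missing.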

Save the Working Dataframe for Future Reference and for Recommender Activities


In [16]:
ThaiTextByUniqueUserBiz.to_pickle('ThaiTextByUniqueUserBizWithSentimentScoresRatings.pkl')

In [17]:
ThaiTextByUniqueUserBiz.to_csv('ThaiTextByUniqueUserBizWithSentimentScoresRatings.csv', encoding='utf-8')

Preliminary Descriptive Analysis of Sentiment Scores and Ratings

In this section of the kernel, we first summarize the score and rating dataframe created in the previous section. We then count the values in each scaled score column and in the rating column, and compute the median of each, to assess the distributions.

Draw up a Summary for the Score and Rating Working Dataframe


In [18]:
ThaiTextByUniqueUserBiz.describe()


Out[18]:
YRsentiment YRSentScore YRSentScore2 AFINNSentScore AFINNSentScore2 review_ratings
count 10911.000000 10911.000000 10911.000000 10911.000000 10911.000000 10911.000000
mean 13.940278 3.009898 3.332692 13.196957 2.034002 3.869112
std 19.472534 0.142946 0.260143 11.963478 0.185616 1.175408
min -233.857000 1.000000 1.000000 -36.000000 1.000000 1.000000
25% 4.834500 3.000000 3.000000 6.000000 2.000000 3.000000
50% 12.261000 3.000000 3.500000 11.000000 2.000000 4.000000
75% 21.616000 3.000000 3.500000 18.000000 2.000000 5.000000
max 249.665000 5.000000 5.000000 221.000000 5.000000 5.000000

Observations 1:

(1) The raw sentiment scores computed by the algorithm with the Yelp Restaurant Review Sentiment Lexicon (YRsentiment) range from about -234 to 250.

(2) The raw sentiment scores computed by the algorithm with the two AFINN Lexicons combined (AFINNSentScore) range from -36 to 221.

Observations 2:

(1) Like the review_ratings, the converted sentiment scores range from 1 to 5, but the means differ. The mean is 2.0 for the AFINN scores converted to a 0.5-increment scale, 3.0 for the Yelp lexicon scores converted with no decimals, 3.3 for the Yelp lexicon scores converted to a 0.5-increment scale, and 3.9 for the review ratings.

Perform a Count of Yelp Review Sentiment Scores Scaled to 1-to-5 with No Decimals and Compute the Median


In [19]:
pd.value_counts(ThaiTextByUniqueUserBiz.YRSentScore.ravel())


Out[19]:
3    10699
4      156
2       52
5        3
1        1
dtype: int64

In [20]:
ThaiTextByUniqueUserBiz.YRSentScore.median()


Out[20]:
3.0

Observations 3:

(1) With the Yelp Restaurant Review Lexicon, 10,699 of the 10,911 scaled scores equal 3, which is both the middle of the scale and the median. A raw score of 0 maps to this value; in other words, the median corresponds to an evenly balanced or neutral sentiment, which is neither positive nor negative.

(2) This leaves us with 159 (156 + 3) Thai restaurant reviews in the Yelp dataset (given reviewers of Phoenix restaurants) whose sentiment is positive or very positive and that can be used for recommendation purposes. A pertinent question is how many of these 159 positive-sentiment reviews are for Thai restaurants in Las Vegas.
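The counts and shares discussed above can also be obtained with `Series.value_counts`, which recent pandas releases favor over the top-level `pd.value_counts` call. The sketch below rebuilds a series from the counts printed in Out[19] (it does not read the original pickle file) and derives the neutral share and the number of positive reviews; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Counts copied from Out[19]: scaled score -> number of reviews.
counts_from_out19 = {3: 10699, 4: 156, 2: 52, 5: 3, 1: 1}

# Rebuild a series with one entry per review.
scores = pd.Series(
    np.repeat(list(counts_from_out19.keys()), list(counts_from_out19.values())),
    name="YRSentScore",
)

counts = scores.value_counts()                # absolute counts per score
shares = scores.value_counts(normalize=True)  # fraction of reviews per score

neutral_share = shares.loc[3]
positive_reviews = int(counts.loc[4] + counts.loc[5])
print(f"neutral share: {neutral_share:.1%}, positive reviews: {positive_reviews}")
```

Expressed as a share, about 98% of the reviews sit on the neutral value, which makes the concentration easier to compare across the scales used in the following cells.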

Perform a Count of Yelp Review Sentiment Scores Scaled to 1-to-5 with 0.5 Increments and Compute the Median


In [21]:
pd.value_counts(ThaiTextByUniqueUserBiz.YRSentScore2.ravel())


Out[21]:
3.5    6987
3.0    3712
4.0     147
2.5      48
4.5       9
2.0       4
5.0       3
1.0       1
dtype: int64

In [22]:
ThaiTextByUniqueUserBiz.YRSentScore2.median()


Out[22]:
3.5

In [23]:
ThaiTextByUniqueUserBiz[ThaiTextByUniqueUserBiz['YRsentiment']==0]


Out[23]:
user_id business_id text YRsentiment YRSentScore YRSentScore2 AFINNSentScore AFINNSentScore2 review_ratings
10722 ywf8LhV2jCvbksPx6wc7aw ApUCpJ9aa6yVgsde16gYrg Closed 0 3 3 0 2 1

Observations 4:

(1) This version of the Yelp Review Sentiment Scores, scaled 1-to-5 with 0.5 increments, offers more potential for a recommender system: 7,146 reviews (6,987 + 147 + 9 + 3) score positive, 3,712 reviews are neutral, and 53 reviews (48 + 4 + 1) score negative.

(2) The distribution is skewed toward positive scores, although not quite as dramatically as the review_ratings (see below).

(3) As there is no 1.5 score, we acknowledge that the one review with a score of 1 may be an outlier.

(4) Filtering on a raw score of 0 (the neutral value in the unscaled YRsentiment column) suggests that the neutral value is 3 on the scale without decimals and also 3 on the 0.5-increment scale.

(5) The median of 3.5 differs from the neutral value of 3 identified in item 4 for the 0.5-increment scale.
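Items 4 and 5 can be made concrete with a small worked example. The sketch below is an assumption, not the scaling code of this kernel: it supposes a min-max rescaling of the raw score onto 1-to-5, followed by rounding up to the next 0.5 step. With the minimum and maximum reported by describe() in Out[18], this assumption reproduces the neutral value of 3 and the median of 3.5, but it should be checked against the actual scaling cell. The function names are hypothetical.

```python
import math

def to_star_scale(raw, raw_min, raw_max):
    """Linearly map a raw score in [raw_min, raw_max] onto [1, 5]."""
    return 1 + 4 * (raw - raw_min) / (raw_max - raw_min)

def round_up_to_half(x):
    """Round up to the next multiple of 0.5 (assumed rounding rule)."""
    return math.ceil(x * 2) / 2

# Minimum, median and maximum of YRsentiment as reported in Out[18].
YR_MIN, YR_MEDIAN, YR_MAX = -233.857, 12.261, 249.665

neutral_scaled = to_star_scale(0.0, YR_MIN, YR_MAX)       # about 2.93
median_scaled = to_star_scale(YR_MEDIAN, YR_MIN, YR_MAX)  # about 3.04

print(round_up_to_half(neutral_scaled))  # neutral raw score on the 0.5 scale
print(round_up_to_half(median_scaled))   # median raw score on the 0.5 scale
```

Under this assumption, the two extreme raw scores stretch the scale so much that most reviews land within a fraction of a point of 3, which would explain why a small positive raw score is enough to move a review from 3.0 to 3.5.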

Perform a Count of AFINN Sentiment Scores Scaled to 1-to-5 with 0.5 Increments and Compute the Median


In [24]:
pd.value_counts(ThaiTextByUniqueUserBiz.AFINNSentScore2.ravel())


Out[24]:
2.0    9573
2.5     952
1.5     328
3.0      54
3.5       2
1.0       1
5.0       1
dtype: int64

In [25]:
ThaiTextByUniqueUserBiz.AFINNSentScore2.median()


Out[25]:
2.0

In [26]:
ThaiTextByUniqueUserBiz[ThaiTextByUniqueUserBiz['AFINNSentScore']==0]


Out[26]:
user_id business_id text YRsentiment YRSentScore YRSentScore2 AFINNSentScore AFINNSentScore2 review_ratings
172 08PXVzu6ysM93s-HaUPOIQ 90AXjqb4O-wrTHDKDoDUzg Very nicely appointed place, nicely done. The ... 3.434 3 3.0 0 2 3
192 0GYLQqgu7v_qMZGgflv5Hg 4xpACaa99_KFokYvNLXMBA Spring roll was chewy and oily. Soup flavor pr... -11.058 3 3.0 0 2 1
206 0ItqxKwFSTgkifR6Fv13dQ shlAd7PLzWlQrkQ0uYcBBg Been here a couple times and it continues to s... 1.855 3 3.0 0 2 4
233 0U6YfdxR2aw-_4vctk9eKA 2bdKR3l4o-S1CscLqqnvVw I used to come here all the time for take out ... 1.156 3 3.0 0 2 1
242 0WPwr3Jr4a_Ywg_hZxm4KQ 1621ir5mjVgbHwxCbMAEjg When you ask for HOT it means hot. Not take it... 0.166 3 3.0 0 2 2
328 0npnrzUAhaiOX2awq3dPUw 4nnMgD9X62YrMqkQKhx-Pg Standard wok place. My experience was that the... -1.213 3 3.0 0 2 2
351 0zcc8klD5N3kUvHnpA5ZlA TqVjy0dxvNh51BF9KePCoQ Below average Chinese food. Chow mein was very... -0.873 3 3.0 0 2 2
390 1AM1mfGPGIlQi78p_OtnPQ CVos739DJ06t8-dNiRMyeQ It's been a while since we've been here. Must ... 2.051 3 3.0 0 2 3
396 1CpyQ1VkgmNwA6RxHoq2bA KPoTixdjoJxSqRSEApSAGg I've been here several times, but have to admi... 1.654 3 3.0 0 2 3
410 1J3EW_WdsDoOeQv5j1Ws8w 17DI33J8TkcfzyoiIYLQIw Ok -1.151 3 3.0 0 2 3
429 1Sujllo0dj27Yn1GyjLc0A PXmR1MgOAWB066XH20HjxQ Its a hole in the wall but the food and.staff ... 1.141 3 3.0 0 2 5
504 1ykGYq1ERicaZWCl5-Vi0Q JiLK9QPjd53pOBEAaY83lw Average Thai cuisine. Not the speediest service. 0.139 3 3.0 0 2 3
596 2UPKuscn_Th2C72OSn7Z8Q KPoTixdjoJxSqRSEApSAGg Not the best Thai food I have had, but not the... -1.453 3 3.0 0 2 3
630 2fJXLk_fkVdLAQmHLCMi-Q 1621ir5mjVgbHwxCbMAEjg We were going to go to our normal Thai Restaur... -3.966 3 3.0 0 2 3
661 2rawUU9YgDUy9mTXj5NuSw oJpmYvLibGrYPDvcaUeMOw Not so much.... \r\r\nI also had green curry,... 1.808 3 3.0 0 2 2
732 3EyjZYtNuzA0YY-2sDULoQ mPGgxatANSPw9KbMluhXkA First time there, we went for dinner. Ordered ... -33.042 3 3.0 0 2 2
854 3nDDsqAEpRa_wiDEz7Y4Ww JiLK9QPjd53pOBEAaY83lw Asian girl approve. mmm yuuum. 2.710 3 3.0 0 2 5
885 3xmdCeTGOQLB9w2qXk5QGA QC18oZ4atjW1vtZo71ohxw Place is ok nothing to brag about with all the... -1.181 3 3.0 0 2 2
889 3xmdCeTGOQLB9w2qXk5QGA uY1hOM4pySx07Yle9NGAiQ Archis is was one of the better Thai food rest... 3.258 3 3.0 0 2 4
903 43czilYWn6VjRWskwN2_9Q E4b5OC_6mZ0V7B6Nyjncsg Food was great, service was beyond slow. Took ... -3.102 3 3.0 0 2 2
907 441ApXa2cQHpsYtdN4Jvxw nefBMCGgMPvOnASoyO_X1A ok so.. I'm obviously no thai connoisseur.. bu... -1.323 3 3.0 0 2 2
916 47jZmzne8zhWEuCm8aZj8A j2a5uJz76rK9uTRgLn5TdQ Yum! 3.246 3 3.0 0 2 5
936 4D6fUykocgc99jXPKyYzTA 90AXjqb4O-wrTHDKDoDUzg I found this place watching check please Arizo... 5.496 3 3.0 0 2 5
938 4DX3XjoOaR9e4y3EQPoHSw -SNpLwJNup8N96yq7sBJyw Sep 25 2011 2:10 pm\r\r\n\r\r\n1 item lunch co... 7.114 3 3.0 0 2 3
946 4E_nPWw89FLFHdNsEgMH-g a1t31qMLd5fQocEjbSJ61A Hmmm The Noodle Shop. Me and the girls wanted ... 5.185 3 3.0 0 2 4
967 4GdN3itvzmFc_6kib2UzlA 2bdKR3l4o-S1CscLqqnvVw ... -0.051 3 3.0 0 2 1
1005 4UUIpbOTPmu43wuC2aSGkg puFrm8eNizztqaWr_e32pQ I was alone during my lunch break which meant ... 7.563 3 3.0 0 2 3
1027 4aYStmYjksIS4PPf2_l49g km3g-wDO2KfhfnTvJrLJig I am posting this becuase I ordered from anoth... -80.022 2 2.5 0 2 1
1109 58qQ6J55I1SiCMHl4PpQ0w SMpL3z4FLF07bRA6-y22JQ Their Tom Yum is incredible! 6.528 3 3.0 0 2 5
1136 5Ka8MMYEoZfsU-jZzUt00Q JiLK9QPjd53pOBEAaY83lw Thai House is a 2 minute drive from my work. I... 19.868 3 3.5 0 2 4
... ... ... ... ... ... ... ... ... ...
9500 rj98ZLijdmoq6XdW_oodJQ 4dL8yzQu974eCtTI-V4Znw Much like the other posters have asserted, the... -4.241 3 3.0 0 2 2
9621 sPFh-si4S3Ii6A-1Ycg99g ApUCpJ9aa6yVgsde16gYrg Went when they said they were open (M-F 11-2pm... -1.475 3 3.0 0 2 1
9647 sYGCmAr5HWknrMoihaLJ8g UxiSHVZxMdey7vRwm1fQyA The food is very good but the portions are sma... 5.278 3 3.0 0 2 3
9752 tCVQyYHcmOrx2i81C47Tew K4DHwck_2ds1wURnmZdFiA Food and service mediocre. -2.336 3 3.0 0 2 2
9784 tQ4FIilp-7oz4A5hRMy4EQ NCtzWkMbE13r2M2Sg0wH9w They have gorgeoussettings both outside and in... 6.396 3 3.0 0 2 2
9854 tlSSQwfHYJany7wPoTH46A wct7rZKyZqZftzmAU-vhWQ This place is the equivalent of salt lick Thai... 2.581 3 3.0 0 2 2
9895 u1KWcbPMvXFEEYkZZ0Yktg RFeDe3fNr14kvUKlVx6_4w this place was so so. we went on easter they ... -4.804 3 3.0 0 2 2
9963 uQhrZB0oIJoMdpesLawWcw -SNpLwJNup8N96yq7sBJyw Yet another overrated restaurant in Vegas. I n... -8.162 3 3.0 0 2 3
9992 uZbTb-u-GVjTa2gtQfry5g puFrm8eNizztqaWr_e32pQ They give you huge portions of food for your m... 14.030 3 3.5 0 2 4
9996 uaLKzVZZY8B6_u_yfER3nw 5hfQ5cNFDFPhcjx2De9-qQ I have eaten here several times and for the mo... -9.261 3 3.0 0 2 2
10023 ukXbRRMYn6I22UQtQ6STdg JiLK9QPjd53pOBEAaY83lw All I can say is Panang Chicken!!! This stuff ... 10.923 3 3.5 0 2 5
10034 uqEQsMbvsQQcqKpZrPyYwQ vtQOervVVTXjhvSZQiZ6PA Can't go wrong with the Lunch Special. Chicken... 4.465 3 3.0 0 2 4
10169 vYmM4KTsC8ZfQBg-j5MWkw jQST5lkLGX9L52-A10TGTQ Hate to be the sour apple amongst all these gl... -8.329 3 3.0 0 2 1
10191 vhxFLqRok6r-D_aQz0s-JQ NBvrN_ZDpmBCsPI1qLj1Qw My experience was only for takeout and not din... -5.156 3 3.0 0 2 2
10259 wBl40prH1tWmtvjtoPGcaA SMpL3z4FLF07bRA6-y22JQ The Tom Ka Gai soup is so flavorful, I have go... 5.494 3 3.0 0 2 5
10310 wRyLeAg0R9WMFg3gqpnX-Q qyNtVViurIcChc35mfYIEw No, No No! What in the H...well, first of all ... -12.924 3 3.0 0 2 1
10333 waD2euOMPTnTzQ0thq3H6Q 3ddQnTTIY-mdsBMn9b53Cw OK for lunch specials but a little pricey for ... 9.269 3 3.5 0 2 3
10346 wbhLZERyuuFSxrk441waBQ IRxBQfA7FdHhzrPjtBxuuw Ordered the red curry with chicken. Tasted goo... -3.606 3 3.0 0 2 1
10387 wo4b_NZrBfWLoPE0c_9Qng oJpmYvLibGrYPDvcaUeMOw Overpriced, just mediocre Thai food. If the p... -2.336 3 3.0 0 2 2
10495 xaXarFsZsvd3OhhutFNcNA kaeF6dd_f_Hd7raszKWAWg I came here with my daughter, who said "I am a... -2.056 3 3.0 0 2 4
10531 xnVrFpRkvTffPdisdDzw_A wzuIHNOJrxSwfhKt-wV1TA I'm hooked on the Drunken Noodles! Ordered the... 12.936 3 3.5 0 2 4
10534 xoeDPWM2khcwMPfDoFzmKg KTF-E3NfkJy2wiwcgOPyVQ Perhaps it's my fault... We arrived 30 minutes... -14.261 3 3.0 0 2 2
10579 y3sOyQUiwL_dV38H652dfw 3ddQnTTIY-mdsBMn9b53Cw I had really high expectations after I read th... -24.769 3 3.0 0 2 2
10607 yCZWhNAVXeZHv_g08P4x_Q cT_rocMh92B9t62Disp6gA My girlfriend and I are Pad Thai enthusiasts -... -6.850 3 3.0 0 2 2
10623 yIYQewxT56mvIDLuBm7k9Q 1621ir5mjVgbHwxCbMAEjg Yuck. -2.457 3 3.0 0 2 2
10670 yfl_yB-Bv4BTgtTAVMFR6w lliksv-tglfUz1T3B3vgvA Unbelievably tasty Thai food. Enormous menu wh... 11.467 3 3.5 0 2 5
10722 ywf8LhV2jCvbksPx6wc7aw ApUCpJ9aa6yVgsde16gYrg Closed 0.000 3 3.0 0 2 1
10786 zHgmWv6NbwZepUSqA_FlLw VbXy3tH5RAu7HjT7VeMMgA the bad part on travelling is: have to keep on... 1.988 3 3.0 0 2 1
10879 zjWmhqGl_L1F8VzuuzaDUQ mPGgxatANSPw9KbMluhXkA Another trendy restaurant with over priced sma... 0.530 3 3.0 0 2 2
10899 zs6wQGh1r726ZzaNKRa-bw j2a5uJz76rK9uTRgLn5TdQ The food was fine, nothing special. The most ... -4.652 3 3.0 0 2 3

252 rows × 9 columns

Observations 5:

(1) The median is 2, rather than the 3 and 3.5 obtained with the Yelp Review Sentiment Scores above. The distribution is skewed toward positive scores, although 9,573 of the 10,911 reviews are neutral in sentiment.

(2) As there are no 4.0 or 4.5 scores, we acknowledge that the one review with a score of 5 may be an outlier.

(3) On the 0.5-increment scale, the AFINN Lexicon yields more neutral reviews (9,573) than the Yelp Restaurant Review Lexicon does (3,712).

(4) Only 1,009 reviews (952 + 54 + 2 + 1) score positive.

(5) Filtering on a raw score of 0 (the neutral value in the unscaled AFINNSentScore column) suggests that the neutral value on the 0.5-increment scale is 2. There are 252 reviews with a neutral raw score according to the AFINN Lexicon, whereas only one review has a neutral raw score according to the Yelp Restaurant Review Lexicon. That review also has the neutral value of 2 in the scaled AFINN column. Interestingly, the rating given by the user is "1", which is "strongly dislike". The only word in that review is "Closed", which likely means that the word is either absent from the lexicons or scored as neutral.
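The positive, neutral and negative tallies quoted in Observations 4 and 5 can be recomputed directly from the value counts, given the neutral value of each scale (3 for the Yelp lexicon, 2 for AFINN). The helper below is a small illustrative sketch; `polarity_breakdown` is a hypothetical name, and the dictionaries copy the counts printed in Out[21] and Out[24].

```python
def polarity_breakdown(counts, neutral):
    """Split score counts into negative / neutral / positive totals."""
    return {
        "negative": sum(n for score, n in counts.items() if score < neutral),
        "neutral": counts.get(neutral, 0),
        "positive": sum(n for score, n in counts.items() if score > neutral),
    }

# Counts copied from Out[21] (Yelp lexicon, 0.5-increment scale).
yelp_half_step = {3.5: 6987, 3.0: 3712, 4.0: 147, 2.5: 48,
                  4.5: 9, 2.0: 4, 5.0: 3, 1.0: 1}
# Counts copied from Out[24] (AFINN lexicons, 0.5-increment scale).
afinn_half_step = {2.0: 9573, 2.5: 952, 1.5: 328, 3.0: 54,
                   3.5: 2, 1.0: 1, 5.0: 1}

yelp = polarity_breakdown(yelp_half_step, neutral=3.0)
afinn = polarity_breakdown(afinn_half_step, neutral=2.0)
print("Yelp :", yelp)
print("AFINN:", afinn)
```

Each breakdown sums to the 10,911 reviews in the working dataframe, which is a quick consistency check on the figures in the observations.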

Perform a Count of Ratings Given by Individual Users and Compute the Median


In [27]:
pd.value_counts(ThaiTextByUniqueUserBiz.review_ratings.astype(int).ravel())


Out[27]:
5    3949
4    3775
3    1586
2     918
1     683
dtype: int64

In [28]:
ThaiTextByUniqueUserBiz.review_ratings.astype(int).median()


Out[28]:
4.0

Observations 6:

(1) The median is 4 (versus 2, 3 and 3.5 for the scores derived from the AFINN and Yelp lexicons), so the distribution is clearly skewed toward "liking": 7,724 reviews (3,949 + 3,775) lean toward "liking", 1,586 reviews are satisfactory, and 1,601 reviews (918 + 683) lean toward "disliking".

(2) Although Yelp does not use 0.5 increments (we selected integer as the type to prevent function-produced decimals), the ratings offer a good range of values. This range does not match the much narrower range of values produced by the sentiment detection methods above.
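One simple way to carry this comparison further is to tabulate the sentiment scores against the user ratings. The sketch below does so on the first ten rows displayed in Out[15] only, so it illustrates the technique and is not a result for the full dataset; the variable names are illustrative.

```python
import pandas as pd

# First ten rows of Out[15]: raw Yelp-lexicon score, 0.5-increment scaled
# score, and the rating given by the reviewer.
sample = pd.DataFrame({
    "YRsentiment": [43.539, -9.641, 0.105, -16.096, 5.802,
                    12.067, 14.353, 3.156, 4.719, 15.490],
    "YRSentScore2": [3.5, 3.0, 3.0, 3.0, 3.0, 3.5, 3.5, 3.0, 3.0, 3.5],
    "review_ratings": [5, 1, 3, 3, 5, 4, 4, 3, 5, 5],
})

# Mean raw sentiment per star rating (index sorted by rating).
mean_raw_by_rating = sample.groupby("review_ratings")["YRsentiment"].mean()

# Contingency table: star rating (rows) versus scaled sentiment (columns).
table = pd.crosstab(sample["review_ratings"], sample["YRSentScore2"])

print(mean_raw_by_rating)
print(table)
```

In this small sample the mean raw score rises with the rating, while the scaled score only takes the values 3.0 and 3.5. Running the same two calls on the full dataframe would show whether that pattern holds for all 10,911 reviews.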

CONCLUSIONS

The limited number of positively charged reviews under either lexicon may improve the odds of predicting top restaurants for a user. The very high number of neutral reviews under either lexicon is a little disappointing: we hoped to see a better range of sentiment scores. We anticipate that a normalizing step may be necessary before the scaling phase of this sentiment detection exercise to obtain better recommendation results; otherwise, the high number of neutral reviews may persist when the distribution is normalized during the recommender activities.
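As an illustration of what such a normalizing step could look like, the sketch below clips the raw scores to Tukey fences (1.5 times the interquartile range beyond the quartiles) before rescaling them to 1-to-5. Both the clipping rule and the min-max rescaling are assumptions chosen for this example, not the method used in this kernel; the quartiles, minimum and maximum come from the describe() output in Out[18], and the four raw scores are taken from rows displayed earlier.

```python
def clip(x, lo, hi):
    """Limit x to the interval [lo, hi]."""
    return max(lo, min(hi, x))

def to_star_scale(raw, lo, hi):
    """Clip raw to [lo, hi], then map it linearly onto [1, 5]."""
    return 1 + 4 * (clip(raw, lo, hi) - lo) / (hi - lo)

# Quartiles, minimum and maximum of YRsentiment from Out[18].
Q1, Q3 = 4.8345, 21.616
RAW_MIN, RAW_MAX = -233.857, 249.665

# Tukey fences (assumed clipping bounds).
IQR = Q3 - Q1
LOW_FENCE = Q1 - 1.5 * IQR
HIGH_FENCE = Q3 + 1.5 * IQR

raw_scores = [-31.274, 0.105, 12.261, 43.539]

unclipped = [to_star_scale(s, RAW_MIN, RAW_MAX) for s in raw_scores]
clipped = [to_star_scale(s, LOW_FENCE, HIGH_FENCE) for s in raw_scores]

spread_unclipped = max(unclipped) - min(unclipped)
spread_clipped = max(clipped) - min(clipped)
print([round(v, 2) for v in unclipped])
print([round(v, 2) for v in clipped])
```

With the fences as bounds, the same four reviews spread over most of the 1-to-5 scale instead of clustering around 3, at the cost of treating every score beyond a fence as equally extreme.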