The aim is to analyze the body gestures of Japanese speakers when speaking in Japanese and in English, and to compare the gestures produced in the two languages.
The history of studies on body language dates back to the 1950s, when anthropologist Ray Birdwhistell created the field of research called "kinesics", the study of nonverbal communication. Albert Mehrabian, another pioneering researcher of body language, later claimed that the total impact of a message is 7% verbal (words only), 38% vocal (including tone of voice, inflection and other sounds) and 55% nonverbal. (Pease, A., & Pease, B. (2004). The Definitive Book of Body Language. Bantam Dell.)
In general, there are three main types of body gestures: metaphorics, iconics (both kinetographs and pictographs) and deictics. An iconic gesture depicts aspects of an object or action, and can usually be understood without accompanying speech. A metaphoric gesture is a special kind of iconic gesture that represents an abstract idea. Deictic gestures refer to objects or locations in physical or conceptual space, and convey aspects of a speaker's meaning that are difficult to express in words.
Adams (1998) found that native speakers of English produced more deictic and iconic gestures when retelling a narrative to L2 English learners than to English-speaking interlocutors (Adams, T. W. (1998). Gesture in foreigner talk. University of Pennsylvania, Philadelphia, PA. Retrieved from http://repository.upenn.edu/dissertations/AAI9829850/)
Gullberg (2003) found that L2 learners are much more likely to produce iconic gestures when referring to previously mentioned entities in the target language than in their native language (Gullberg, M. (2003). Gestures, referents, and anaphoric linkage in learner varieties. Information structure and the dynamics of language acquisition. John Benjamins Publishing Company.)
Mori and Hayashi (2006) explain how one study revealed that L2 learners frequently gestured when they were unable to complete an utterance in Japanese, which at the same time encouraged their Japanese interlocutors to suggest appropriate completions for their sentences (Mori, J., & Hayashi, M. (2006). The achievement of intersubjectivity through embodied completions: A study of interactions between first and second language speakers. Applied Linguistics, 27(2), 195–219. doi:10.1093/applin/aml014)
Morett et al. (2012) showed experimentally that interlocutors use iconic gestures to facilitate communication in a novel second language (Morett, L. M., Gibbs, R. W., & MacWhinney, B. (2012). The role of gesture in second language learning: Communication, acquisition, and retention. Proceedings of CogSci.)
Gullberg explains in her paper how Kida (2005) reported that, like the language itself, an L2 learner's gestures develop over time (the original paper by Kida was, unfortunately, written in French). Kida examined Japanese learners of French residing in France, carefully noting the role played by the gestural properties of the source (Japanese) and target (French) cultures, by the situation and context of a particular type of interaction, and by individual preferences. He states that gestural development is "not linear but rather complex and multi-layered". (Kida, Tsuyoshi (2005). Appropriation du Geste par les Étrangers: Le Cas d'Étudiants Japonais Apprenant le Français [Appropriation of Gesture by Foreigners: The Case of Japanese Students Learning French]. Aix-en-Provence: Laboratoire Parole et Langage.) (Gullberg, M. Some reasons for studying gesture and second language acquisition (Hommage à Adam Kendon))
The studies above suggest that the subjects, all of whom are L2 speakers of English, would generally gesture more when they cannot come up with the right vocabulary to express themselves, and are also likely to use more gestures in English than in Japanese, since English is their second language.
When a Japanese person speaks in English, the frequency and size of their body gestures increase compared to when they speak in Japanese. For speakers who are fluent in English, the difference in body gesture between English and Japanese is less pronounced than for speakers who lack confidence in their English. In other words, fluent English speakers gesture roughly as much in Japanese as they do in English.
The data was collected by asking six subjects to hold a conversation for three minutes each in English and in Japanese.
All the subjects were Japanese students at KMD. The initial plan was to experiment with subjects of diverse races and nationalities, but body language may depend on the following three variables.
In order to reduce the number of variables to one, I chose the language being spoken as the variable and fixed the other two, which meant limiting the subjects to Japanese students whose mother tongue is Japanese.
The details of the subjects are as follows.
All six subjects were native Japanese speakers who have no problem communicating in Japanese.
In addition to the six subjects, I also collected data from one more subject as the "test data" for the classifier that will be created from the six subjects. This subject has also lived in an English-speaking country for more than five years, and is considered bilingual.
The Japanese conversation always started with the topic "Favourite countries/cities the subject has visited", and the English conversation always started with the topic "Description of the subject's mother". The subjects spoke facing me, with a table in between.
The accelerometer was attached to the wrist of the subject's dominant hand (y axis pointing towards the fingers, z axis pointing upwards from the back of the hand), and the gyroscope was attached to the subject's head with a headband (y axis pointing upwards from the head, x axis pointing towards the right ear). The data is raw (uncalibrated) sensor data. The data structure of each file is described in its header.
Below are the images of the subjects during the experiment.
In [1]:
import matplotlib.pyplot as plt
from pandas import read_csv
%pylab inline
dohi_jgx = read_csv('data/jap_english/dohi/japanese/gx.txt',delimiter=',', names=['date','gx'])
dohi_jgy = read_csv('data/jap_english/dohi/japanese/gy.txt',delimiter=',', names=['date','gy'])
dohi_jgz = read_csv('data/jap_english/dohi/japanese/gz.txt',delimiter=',', names=['date','gz'])
dohi_ja = read_csv('data/jap_english/dohi/japanese/a.txt',delimiter=',',skiprows=1,names=['ax','ay','az'])
dohi_egx = read_csv('data/jap_english/dohi/english/gx.txt',delimiter=',', names=['date','gx'])
dohi_egy = read_csv('data/jap_english/dohi/english/gy.txt',delimiter=',', names=['date','gy'])
dohi_egz = read_csv('data/jap_english/dohi/english/gz.txt',delimiter=',', names=['date','gz'])
dohi_ea = read_csv('data/jap_english/dohi/english/a.txt',delimiter=',',skiprows=1,names=['ax','ay','az'])
kohei_jgx = read_csv('data/jap_english/kohei/japanese/gx.txt',delimiter=',', names=['date','gx'])
kohei_jgy = read_csv('data/jap_english/kohei/japanese/gy.txt',delimiter=',', names=['date','gy'])
kohei_jgz = read_csv('data/jap_english/kohei/japanese/gz.txt',delimiter=',', names=['date','gz'])
kohei_ja = read_csv('data/jap_english/kohei/japanese/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
kohei_egx = read_csv('data/jap_english/kohei/english/gx.txt',delimiter=',', names=['date','gx'])
kohei_egy = read_csv('data/jap_english/kohei/english/gy.txt',delimiter=',', names=['date','gy'])
kohei_egz = read_csv('data/jap_english/kohei/english/gz.txt',delimiter=',', names=['date','gz'])
kohei_ea = read_csv('data/jap_english/kohei/english/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
nure_jgx = read_csv('data/jap_english/nure/japanese/gx.txt',delimiter=',', names=['date','gx'])
nure_jgy = read_csv('data/jap_english/nure/japanese/gy.txt',delimiter=',', names=['date','gy'])
nure_jgz = read_csv('data/jap_english/nure/japanese/gz.txt',delimiter=',', names=['date','gz'])
nure_ja = read_csv('data/jap_english/nure/japanese/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
nure_egx = read_csv('data/jap_english/nure/english/gx.txt',delimiter=',', names=['date','gx'])
nure_egy = read_csv('data/jap_english/nure/english/gy.txt',delimiter=',', names=['date','gy'])
nure_egz = read_csv('data/jap_english/nure/english/gz.txt',delimiter=',', names=['date','gz'])
nure_ea = read_csv('data/jap_english/nure/english/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
yamana_jgx = read_csv('data/jap_english/yamana/japanese/gx.txt',delimiter=',', names=['date','gx'])
yamana_jgy = read_csv('data/jap_english/yamana/japanese/gy.txt',delimiter=',', names=['date','gy'])
yamana_jgz = read_csv('data/jap_english/yamana/japanese/gz.txt',delimiter=',', names=['date','gz'])
yamana_ja = read_csv('data/jap_english/yamana/japanese/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
yamana_egx = read_csv('data/jap_english/yamana/english/gx.txt',delimiter=',', names=['date','gx'])
yamana_egy = read_csv('data/jap_english/yamana/english/gy.txt',delimiter=',', names=['date','gy'])
yamana_egz = read_csv('data/jap_english/yamana/english/gz.txt',delimiter=',', names=['date','gz'])
yamana_ea = read_csv('data/jap_english/yamana/english/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
toshi_jgx = read_csv('data/jap_english/toshi/japanese/gx.txt',delimiter=',', names=['date','gx'])
toshi_jgy = read_csv('data/jap_english/toshi/japanese/gy.txt',delimiter=',', names=['date','gy'])
toshi_jgz = read_csv('data/jap_english/toshi/japanese/gz.txt',delimiter=',', names=['date','gz'])
toshi_ja = read_csv('data/jap_english/toshi/japanese/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
toshi_egx = read_csv('data/jap_english/toshi/english/gx.txt',delimiter=',', names=['date','gx'])
toshi_egy = read_csv('data/jap_english/toshi/english/gy.txt',delimiter=',', names=['date','gy'])
toshi_egz = read_csv('data/jap_english/toshi/english/gz.txt',delimiter=',', names=['date','gz'])
toshi_ea = read_csv('data/jap_english/toshi/english/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
yukita_jgx = read_csv('data/jap_english/yukita/japanese/gx.txt',delimiter=',', names=['date','gx'])
yukita_jgy = read_csv('data/jap_english/yukita/japanese/gy.txt',delimiter=',', names=['date','gy'])
yukita_jgz = read_csv('data/jap_english/yukita/japanese/gz.txt',delimiter=',', names=['date','gz'])
yukita_ja = read_csv('data/jap_english/yukita/japanese/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
yukita_egx = read_csv('data/jap_english/yukita/english/gx.txt',delimiter=',', names=['date','gx'])
yukita_egy = read_csv('data/jap_english/yukita/english/gy.txt',delimiter=',', names=['date','gy'])
yukita_egz = read_csv('data/jap_english/yukita/english/gz.txt',delimiter=',', names=['date','gz'])
yukita_ea = read_csv('data/jap_english/yukita/english/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
miyuki_jgx = read_csv('data/jap_english/miyuki/japanese/gx.txt',delimiter=',', names=['date','gx'])
miyuki_jgy = read_csv('data/jap_english/miyuki/japanese/gy.txt',delimiter=',', names=['date','gy'])
miyuki_jgz = read_csv('data/jap_english/miyuki/japanese/gz.txt',delimiter=',', names=['date','gz'])
miyuki_ja = read_csv('data/jap_english/miyuki/japanese/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
miyuki_egx = read_csv('data/jap_english/miyuki/english/gx.txt',delimiter=',', names=['date','gx'])
miyuki_egy = read_csv('data/jap_english/miyuki/english/gy.txt',delimiter=',', names=['date','gy'])
miyuki_egz = read_csv('data/jap_english/miyuki/english/gz.txt',delimiter=',', names=['date','gz'])
miyuki_ea = read_csv('data/jap_english/miyuki/english/a_cut.txt',delimiter='\t',names=['date', 'ax','ay','az'])
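The repetitive loading above could be condensed into a loop over subjects. A sketch of such a loader (the function name and `base` parameter are mine; note that dohi's accelerometer file differs from the others in name, delimiter and header):

```python
import pandas as pd

SUBJECTS = ['dohi', 'kohei', 'nure', 'yamana', 'toshi', 'yukita', 'miyuki']

def load_subject(name, language, base='data/jap_english'):
    """Load the three gyroscope axis files and the accelerometer file
    for one subject/language pair into a dict of DataFrames."""
    d = f'{base}/{name}/{language}'
    data = {f'g{ax}': pd.read_csv(f'{d}/g{ax}.txt', delimiter=',',
                                  names=['date', f'g{ax}'])
            for ax in 'xyz'}
    if name == 'dohi':
        # dohi's accelerometer file is comma-separated, has a header row,
        # and no date column
        data['a'] = pd.read_csv(f'{d}/a.txt', delimiter=',', skiprows=1,
                                names=['ax', 'ay', 'az'])
    else:
        data['a'] = pd.read_csv(f'{d}/a_cut.txt', delimiter='\t',
                                names=['date', 'ax', 'ay', 'az'])
    return data
```

With this, `dohi_jgx` becomes `load_subject('dohi', 'japanese')['gx']`, and all subjects can be loaded with a dict comprehension over `SUBJECTS`.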
In [2]:
import pandas as pd
dohi_jgxy = pd.merge(dohi_jgx, dohi_jgy)
dohi_jgxy.head()
Out[2]:
In [3]:
dohi_jg = pd.merge(dohi_jgxy, dohi_jgz)
dohi_jg['gx'] = dohi_jg['gx'].abs()
dohi_jg['gy'] = dohi_jg['gy'].abs()
dohi_jg['gz'] = dohi_jg['gz'].abs()
In [5]:
dohi_jg.head()
Out[5]:
In [4]:
nure_jgxy = pd.merge(nure_jgx, nure_jgy)
nure_jg = pd.merge(nure_jgxy, nure_jgz)
nure_jg['gx'] = nure_jg['gx'].abs()
nure_jg['gy'] = nure_jg['gy'].abs()
nure_jg['gz'] = nure_jg['gz'].abs()
kohei_jgxy = pd.merge(kohei_jgx, kohei_jgy)
kohei_jg = pd.merge(kohei_jgxy, kohei_jgz)
kohei_jg['gx'] = kohei_jg['gx'].abs()
kohei_jg['gy'] = kohei_jg['gy'].abs()
kohei_jg['gz'] = kohei_jg['gz'].abs()
yamana_jgxy = pd.merge(yamana_jgx, yamana_jgy)
yamana_jg = pd.merge(yamana_jgxy, yamana_jgz)
yamana_jg['gx'] = yamana_jg['gx'].abs()
yamana_jg['gy'] = yamana_jg['gy'].abs()
yamana_jg['gz'] = yamana_jg['gz'].abs()
toshi_jgxy = pd.merge(toshi_jgx, toshi_jgy)
toshi_jg = pd.merge(toshi_jgxy, toshi_jgz)
toshi_jg['gx'] = toshi_jg['gx'].abs()
toshi_jg['gy'] = toshi_jg['gy'].abs()
toshi_jg['gz'] = toshi_jg['gz'].abs()
yukita_jgxy = pd.merge(yukita_jgx, yukita_jgy)
yukita_jg = pd.merge(yukita_jgxy, yukita_jgz)
yukita_jg['gx'] = yukita_jg['gx'].abs()
yukita_jg['gy'] = yukita_jg['gy'].abs()
yukita_jg['gz'] = yukita_jg['gz'].abs()
miyuki_jgxy = pd.merge(miyuki_jgx, miyuki_jgy)
miyuki_jg = pd.merge(miyuki_jgxy, miyuki_jgz)
miyuki_jg['gx'] = miyuki_jg['gx'].abs()
miyuki_jg['gy'] = miyuki_jg['gy'].abs()
miyuki_jg['gz'] = miyuki_jg['gz'].abs()
dohi_egxy = pd.merge(dohi_egx, dohi_egy)
dohi_eg = pd.merge(dohi_egxy, dohi_egz)
dohi_eg['gx'] = dohi_eg['gx'].abs()
dohi_eg['gy'] = dohi_eg['gy'].abs()
dohi_eg['gz'] = dohi_eg['gz'].abs()
nure_egxy = pd.merge(nure_egx, nure_egy)
nure_eg = pd.merge(nure_egxy, nure_egz)
nure_eg['gx'] = nure_eg['gx'].abs()
nure_eg['gy'] = nure_eg['gy'].abs()
nure_eg['gz'] = nure_eg['gz'].abs()
kohei_egxy = pd.merge(kohei_egx, kohei_egy)
kohei_eg = pd.merge(kohei_egxy, kohei_egz)
kohei_eg['gx'] = kohei_eg['gx'].abs()
kohei_eg['gy'] = kohei_eg['gy'].abs()
kohei_eg['gz'] = kohei_eg['gz'].abs()
yamana_egxy = pd.merge(yamana_egx, yamana_egy)
yamana_eg = pd.merge(yamana_egxy, yamana_egz)
yamana_eg['gx'] = yamana_eg['gx'].abs()
yamana_eg['gy'] = yamana_eg['gy'].abs()
yamana_eg['gz'] = yamana_eg['gz'].abs()
toshi_egxy = pd.merge(toshi_egx, toshi_egy)
toshi_eg = pd.merge(toshi_egxy, toshi_egz)
toshi_eg['gx'] = toshi_eg['gx'].abs()
toshi_eg['gy'] = toshi_eg['gy'].abs()
toshi_eg['gz'] = toshi_eg['gz'].abs()
yukita_egxy = pd.merge(yukita_egx, yukita_egy)
yukita_eg = pd.merge(yukita_egxy, yukita_egz)
yukita_eg['gx'] = yukita_eg['gx'].abs()
yukita_eg['gy'] = yukita_eg['gy'].abs()
yukita_eg['gz'] = yukita_eg['gz'].abs()
miyuki_egxy = pd.merge(miyuki_egx, miyuki_egy)
miyuki_eg = pd.merge(miyuki_egxy, miyuki_egz)
miyuki_eg['gx'] = miyuki_eg['gx'].abs()
miyuki_eg['gy'] = miyuki_eg['gy'].abs()
miyuki_eg['gz'] = miyuki_eg['gz'].abs()
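The per-subject merge-and-absolute-value steps above follow one pattern: merge the three axis DataFrames on their shared `date` column, then take absolute values of each axis. A hypothetical helper capturing that pattern (reading the files from one directory; the function name is mine):

```python
import pandas as pd

def load_gyro(base_dir):
    """Read gx/gy/gz files from one subject's directory, inner-join them
    on the shared 'date' column, and convert each axis to absolute values
    (only the magnitude of head rotation matters for gesture richness)."""
    frames = [
        pd.read_csv(f'{base_dir}/g{ax}.txt', delimiter=',',
                    names=['date', f'g{ax}'])
        for ax in 'xyz'
    ]
    merged = frames[0].merge(frames[1]).merge(frames[2])
    for ax in 'xyz':
        merged[f'g{ax}'] = merged[f'g{ax}'].abs()
    return merged
```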
In [5]:
dohi_ja['ax'] = (dohi_ja['ax'] - 498).abs()
dohi_ja['ay'] = (dohi_ja['ay'] - 536).abs()
dohi_ja['az'] = (dohi_ja['az'] - 715).abs()
dohi_ea['ax'] = (dohi_ea['ax'] - 498).abs()
dohi_ea['ay'] = (dohi_ea['ay'] - 536).abs()
dohi_ea['az'] = (dohi_ea['az'] - 715).abs()
nure_ja['ax'] = (nure_ja['ax'] - 498).abs()
nure_ja['ay'] = (nure_ja['ay'] - 536).abs()
nure_ja['az'] = (nure_ja['az'] - 715).abs()
nure_ea['ax'] = (nure_ea['ax'] - 498).abs()
nure_ea['ay'] = (nure_ea['ay'] - 536).abs()
nure_ea['az'] = (nure_ea['az'] - 715).abs()
kohei_ja['ax'] = (kohei_ja['ax'] - 498).abs()
kohei_ja['ay'] = (kohei_ja['ay'] - 536).abs()
kohei_ja['az'] = (kohei_ja['az'] - 715).abs()
kohei_ea['ax'] = (kohei_ea['ax'] - 498).abs()
kohei_ea['ay'] = (kohei_ea['ay'] - 536).abs()
kohei_ea['az'] = (kohei_ea['az'] - 715).abs()
yamana_ja['ax'] = (yamana_ja['ax'] - 498).abs()
yamana_ja['ay'] = (yamana_ja['ay'] - 536).abs()
yamana_ja['az'] = (yamana_ja['az'] - 715).abs()
yamana_ea['ax'] = (yamana_ea['ax'] - 498).abs()
yamana_ea['ay'] = (yamana_ea['ay'] - 536).abs()
yamana_ea['az'] = (yamana_ea['az'] - 715).abs()
toshi_ja['ax'] = (toshi_ja['ax'] - 498).abs()
toshi_ja['ay'] = (toshi_ja['ay'] - 536).abs()
toshi_ja['az'] = (toshi_ja['az'] - 715).abs()
toshi_ea['ax'] = (toshi_ea['ax'] - 498).abs()
toshi_ea['ay'] = (toshi_ea['ay'] - 536).abs()
toshi_ea['az'] = (toshi_ea['az'] - 715).abs()
yukita_ja['ax'] = (yukita_ja['ax'] - 498).abs()
yukita_ja['ay'] = (yukita_ja['ay'] - 536).abs()
yukita_ja['az'] = (yukita_ja['az'] - 715).abs()
yukita_ea['ax'] = (yukita_ea['ax'] - 498).abs()
yukita_ea['ay'] = (yukita_ea['ay'] - 536).abs()
yukita_ea['az'] = (yukita_ea['az'] - 715).abs()
miyuki_ja['ax'] = (miyuki_ja['ax'] - 498).abs()
miyuki_ja['ay'] = (miyuki_ja['ay'] - 536).abs()
miyuki_ja['az'] = (miyuki_ja['az'] - 715).abs()
miyuki_ea['ax'] = (miyuki_ea['ax'] - 498).abs()
miyuki_ea['ay'] = (miyuki_ea['ay'] - 536).abs()
miyuki_ea['az'] = (miyuki_ea['az'] - 715).abs()
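The cell above applies the same per-axis offsets (498, 536, 715) to every subject; these appear to be the resting-state accelerometer readings, so subtracting them and taking absolute values leaves the deviation from rest. A sketch of a helper applying the step uniformly (function name is mine; offsets are copied from the cell above):

```python
import pandas as pd

# Resting-state baseline per accelerometer axis, as used in the cell above.
BASELINE = {'ax': 498, 'ay': 536, 'az': 715}

def center_acc(df):
    """Return a copy with each axis converted to its absolute deviation
    from the resting-state baseline."""
    out = df.copy()
    for col, base in BASELINE.items():
        out[col] = (out[col] - base).abs()
    return out
```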
In [6]:
dohi_ja.describe()
Out[6]:
In [7]:
dohi_ea.describe()
Out[7]:
In [8]:
dohi_jg.describe()
Out[8]:
In [9]:
dohi_eg.describe()
Out[9]:
In [10]:
nure_ja.describe()
Out[10]:
In [11]:
nure_ea.describe()
Out[11]:
In [12]:
nure_jg.describe()
Out[12]:
In [13]:
nure_eg.describe()
Out[13]:
In [14]:
kohei_ja.describe()
Out[14]:
In [15]:
kohei_ea.describe()
Out[15]:
In [16]:
kohei_jg.describe()
Out[16]:
In [17]:
kohei_eg.describe()
Out[17]:
In [18]:
yamana_ja.describe()
Out[18]:
In [19]:
yamana_ea.describe()
Out[19]:
In [20]:
yamana_jg.describe()
Out[20]:
In [21]:
yamana_eg.describe()
Out[21]:
In [22]:
toshi_ja.describe()
Out[22]:
In [23]:
toshi_ea.describe()
Out[23]:
In [24]:
toshi_jg.describe()
Out[24]:
In [25]:
toshi_eg.describe()
Out[25]:
In [26]:
yukita_ja.describe()
Out[26]:
In [27]:
yukita_ea.describe()
Out[27]:
In [28]:
yukita_jg.describe()
Out[28]:
In [29]:
yukita_eg.describe()
Out[29]:
In visualizing the data, I chose three plotting methods.
Since the standard deviation of each subject's sensor data reflects the richness of their gestures in each language, comparing the standard deviations between languages should allow me to test my hypotheses.
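That comparison can be made explicit with a small helper (a sketch; the function name is mine) that places the per-axis standard deviations of the two recordings side by side:

```python
import pandas as pd

def std_comparison(jap_df, eng_df):
    """Per-axis standard deviations for the Japanese and English
    recordings side by side, with the English-minus-Japanese
    difference; positive values mean richer movement in English."""
    out = pd.DataFrame({'Japanese': jap_df.std(), 'English': eng_df.std()})
    out['diff'] = out['English'] - out['Japanese']
    return out
```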
In [30]:
dja = dohi_ja.plot(title="Wrist Accelerometer in Japanese")
dja.set_ylabel("Accelerometer value")
dja.set_xlabel("Data Index")
dea = dohi_ea.plot(title="Wrist Accelerometer in English")
dea.set_ylabel("Accelerometer value")
dea.set_xlabel("Data Index")
Out[30]:
In [31]:
dohi_ja.hist(bins=30, figsize=(10,10), sharey = True)
dohi_ea.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[31]:
In [32]:
djg = dohi_jg.plot(title="Head Gyroscope in Japanese")
djg.set_ylabel("Gyroscope value")
djg.set_xlabel("Data Index")
deg = dohi_eg.plot(title="Head Gyroscope in English")
deg.set_ylabel("Gyroscope value")
deg.set_xlabel("Data Index")
Out[32]:
In [33]:
dohi_jg.hist(bins=30, figsize=(10,10), sharey = True)
dohi_eg.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[33]:
In [34]:
hold(True)
#dohi_ja.std().plot(kind="bar")
#dohi_ea.std().plot(kind="bar")
plt.plot(dohi_ja.std(),label='Japanese')
plt.plot(dohi_ea.std(),label='English')
plt.legend(loc='upper right')
Out[34]:
In [35]:
hold(True)
plt.plot(dohi_jg.std(),label='Japanese')
plt.plot(dohi_eg.std(),label='English')
plt.legend(loc='upper right')
Out[35]:
In [36]:
kja = kohei_ja.plot(title="Wrist Accelerometer in Japanese")
kja.set_ylabel("Accelerometer value")
kja.set_xlabel("Data Index")
kea = kohei_ea.plot(title="Wrist Accelerometer in English")
kea.set_ylabel("Accelerometer value")
kea.set_xlabel("Data Index")
Out[36]:
In [37]:
kohei_ja.hist(bins=30, figsize=(10,10), sharey = True)
kohei_ea.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[37]:
In [38]:
kjg = kohei_jg.plot(title="Head Gyroscope in Japanese")
kjg.set_ylabel("Gyroscope value")
kjg.set_xlabel("Data Index")
keg = kohei_eg.plot(title="Head Gyroscope in English")
keg.set_ylabel("Gyroscope value")
keg.set_xlabel("Data Index")
Out[38]:
In [39]:
kohei_jg.hist(bins=30, figsize=(10,10), sharey = True)
kohei_eg.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[39]:
In [40]:
hold(True)
plt.plot(kohei_ja.std(),label='Japanese')
plt.plot(kohei_ea.std(),label='English')
plt.legend(loc='upper right')
Out[40]:
In [41]:
hold(True)
plt.plot(kohei_jg.std(),label='Japanese')
plt.plot(kohei_eg.std(),label='English')
plt.legend(loc='upper right')
Out[41]:
In [42]:
nja = nure_ja.plot(title="Wrist Accelerometer in Japanese")
nja.set_ylabel("Accelerometer value")
nja.set_xlabel("Data Index")
nea = nure_ea.plot(title="Wrist Accelerometer in English")
nea.set_ylabel("Accelerometer value")
nea.set_xlabel("Data Index")
Out[42]:
In [43]:
nure_ja.hist(bins=30, figsize=(10,10), sharey = True)
nure_ea.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[43]:
In [44]:
njg = nure_jg.plot(title="Head Gyroscope in Japanese")
njg.set_ylabel("Gyroscope value")
njg.set_xlabel("Data Index")
neg = nure_eg.plot(title="Head Gyroscope in English")
neg.set_ylabel("Gyroscope value")
neg.set_xlabel("Data Index")
Out[44]:
In [45]:
nure_jg.hist(bins=30, figsize=(10,10), sharey = True)
nure_eg.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[45]:
In [46]:
hold(True)
plt.plot(nure_ja.std(),label='Japanese')
plt.plot(nure_ea.std(),label='English')
plt.legend(loc='upper right')
Out[46]:
In [47]:
hold(True)
plt.plot(nure_jg.std(),label='Japanese')
plt.plot(nure_eg.std(),label='English')
plt.legend(loc='upper right')
Out[47]:
In [48]:
tja = toshi_ja.plot(title="Wrist Accelerometer in Japanese")
tja.set_ylabel("Accelerometer value")
tja.set_xlabel("Data Index")
tea = toshi_ea.plot(title="Wrist Accelerometer in English")
tea.set_ylabel("Accelerometer value")
tea.set_xlabel("Data Index")
Out[48]:
In [49]:
toshi_ja.hist(bins=30, figsize=(10,10), sharey = True)
toshi_ea.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[49]:
In [50]:
tjg = toshi_jg.plot(title="Head Gyroscope in Japanese")
tjg.set_ylabel("Gyroscope value")
tjg.set_xlabel("Data Index")
teg = toshi_eg.plot(title="Head Gyroscope in English")
teg.set_ylabel("Gyroscope value")
teg.set_xlabel("Data Index")
Out[50]:
In [51]:
toshi_jg.hist(bins=30, figsize=(10,10), sharey = True)
toshi_eg.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[51]:
In [52]:
hold(True)
plt.plot(toshi_ja.std(),label='Japanese')
plt.plot(toshi_ea.std(),label='English')
plt.legend(loc='upper right')
Out[52]:
In [53]:
hold(True)
plt.plot(toshi_jg.std(),label='Japanese')
plt.plot(toshi_eg.std(),label='English')
plt.legend(loc='upper right')
Out[53]:
In [54]:
yuja = yukita_ja.plot(title="Wrist Accelerometer in Japanese")
yuja.set_ylabel("Accelerometer value")
yuja.set_xlabel("Data Index")
yuea = yukita_ea.plot(title="Wrist Accelerometer in English")
yuea.set_ylabel("Accelerometer value")
yuea.set_xlabel("Data Index")
Out[54]:
In [55]:
yukita_ja.hist(bins=30, figsize=(10,10), sharey = True)
yukita_ea.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[55]:
In [56]:
yujg = yukita_jg.plot(title="Head Gyroscope in Japanese")
yujg.set_ylabel("Gyroscope value")
yujg.set_xlabel("Data Index")
yueg = yukita_eg.plot(title="Head Gyroscope in English")
yueg.set_ylabel("Gyroscope value")
yueg.set_xlabel("Data Index")
Out[56]:
In [57]:
yukita_jg.hist(bins=30, figsize=(10,10), sharey = True)
yukita_eg.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[57]:
In [58]:
hold(True)
plt.plot(yukita_ja.std(),label='Japanese')
plt.plot(yukita_ea.std(),label='English')
plt.legend(loc='upper right')
Out[58]:
In [59]:
hold(True)
plt.plot(yukita_jg.std(),label='Japanese')
plt.plot(yukita_eg.std(),label='English')
plt.legend(loc='upper right')
Out[59]:
In [60]:
yja = yamana_ja.plot(title="Wrist Accelerometer in Japanese")
yja.set_ylabel("Accelerometer value")
yja.set_xlabel("Data Index")
yea = yamana_ea.plot(title="Wrist Accelerometer in English")
yea.set_ylabel("Accelerometer value")
yea.set_xlabel("Data Index")
Out[60]:
In [61]:
yamana_ja.hist(bins=30, figsize=(10,10), sharey = True)
yamana_ea.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[61]:
In [62]:
yjg = yamana_jg.plot(title="Head Gyroscope in Japanese")
yjg.set_ylabel("Gyroscope value")
yjg.set_xlabel("Data Index")
yeg = yamana_eg.plot(title="Head Gyroscope in English")
yeg.set_ylabel("Gyroscope value")
yeg.set_xlabel("Data Index")
Out[62]:
In [63]:
yamana_jg.hist(bins=30, figsize=(10,10), sharey = True)
yamana_eg.hist(bins=30, facecolor='green', figsize=(10,10), sharey = True)
Out[63]:
In [64]:
hold(True)
plt.plot(yamana_ja.std(),label='Japanese')
plt.plot(yamana_ea.std(),label='English')
plt.legend(loc='upper right')
Out[64]:
In [65]:
hold(True)
plt.plot(yamana_jg.std(),label='Japanese')
plt.plot(yamana_eg.std(),label='English')
plt.legend(loc='upper right')
Out[65]:
In order to create a classifier, I created a vector for each subject. The vector contains the counts of 30 bins from each of the subject's accelerometer and gyroscope histograms. Since each subject has 12 histograms (ax, ay, az in Japanese and English, and gx, gy, gz in Japanese and English), each subject's vector has 12 × 30 = 360 dimensions. The bin range for each histogram was determined by the maximum and minimum values of every participant's ax, ay, az, gx, gy and gz.
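The construction just described can be sketched as a single helper (the function name and range tables are mine; the bin ranges are copied from the per-subject histogram calls in the cells that follow):

```python
import numpy as np

# Bin ranges per axis, spanning the min/max over all participants.
ACC_RANGES = {'ax': (0, 522), 'ay': (0, 460), 'az': (0, 710)}
GYRO_RANGES = {'gx': (0, 2.71), 'gy': (0, 6.18), 'gz': (0, 4.56)}

def feature_vector(ja, ea, jg, eg, bins=30):
    """Concatenate 12 histograms (30 bins each) into one 360-dimensional
    list, in the same order as the hand-written cells: Japanese
    accelerometer, English accelerometer, Japanese gyro, English gyro."""
    counts = []
    for df, ranges in ((ja, ACC_RANGES), (ea, ACC_RANGES),
                       (jg, GYRO_RANGES), (eg, GYRO_RANGES)):
        for col, rng in ranges.items():
            c, _ = np.histogram(df[col], bins=bins, range=rng)
            counts.extend(c.tolist())
    return counts
```

For example, `dohi_combined` would then be `feature_vector(dohi_ja, dohi_ea, dohi_jg, dohi_eg)`.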
In [66]:
import numpy as np
dohi_jax_count,dohi_jax_division = np.histogram(dohi_ja['ax'], bins=30, range=(0,522))
dohi_jay_count,dohi_jay_division = np.histogram(dohi_ja['ay'], bins=30, range=(0,460))
dohi_jaz_count,dohi_jaz_division = np.histogram(dohi_ja['az'], bins=30, range=(0,710))
dohi_eax_count,dohi_eax_division = np.histogram(dohi_ea['ax'], bins=30, range=(0,522))
dohi_eay_count,dohi_eay_division = np.histogram(dohi_ea['ay'], bins=30, range=(0,460))
dohi_eaz_count,dohi_eaz_division = np.histogram(dohi_ea['az'], bins=30, range=(0,710))
dohi_jgx_count,dohi_jgx_division = np.histogram(dohi_jg['gx'], bins=30, range=(0,2.71))
dohi_jgy_count,dohi_jgy_division = np.histogram(dohi_jg['gy'], bins=30, range=(0,6.18))
dohi_jgz_count,dohi_jgz_division = np.histogram(dohi_jg['gz'], bins=30, range=(0,4.56))
dohi_egx_count,dohi_egx_division = np.histogram(dohi_eg['gx'], bins=30, range=(0,2.71))
dohi_egy_count,dohi_egy_division = np.histogram(dohi_eg['gy'], bins=30, range=(0,6.18))
dohi_egz_count,dohi_egz_division = np.histogram(dohi_eg['gz'], bins=30, range=(0,4.56))
dohi_combined = (dohi_jax_count.tolist() + dohi_jay_count.tolist() + dohi_jaz_count.tolist() + dohi_eax_count.tolist() + dohi_eay_count.tolist() + dohi_eaz_count.tolist() + dohi_jgx_count.tolist() + dohi_jgy_count.tolist() + dohi_jgz_count.tolist() + dohi_egx_count.tolist() + dohi_egy_count.tolist() + dohi_egz_count.tolist())
kohei_jax_count,kohei_jax_division = np.histogram(kohei_ja['ax'], bins=30, range=(0,522))
kohei_jay_count,kohei_jay_division = np.histogram(kohei_ja['ay'], bins=30, range=(0,460))
kohei_jaz_count,kohei_jaz_division = np.histogram(kohei_ja['az'], bins=30, range=(0,710))
kohei_eax_count,kohei_eax_division = np.histogram(kohei_ea['ax'], bins=30, range=(0,522))
kohei_eay_count,kohei_eay_division = np.histogram(kohei_ea['ay'], bins=30, range=(0,460))
kohei_eaz_count,kohei_eaz_division = np.histogram(kohei_ea['az'], bins=30, range=(0,710))
kohei_jgx_count,kohei_jgx_division = np.histogram(kohei_jg['gx'], bins=30, range=(0,2.71))
kohei_jgy_count,kohei_jgy_division = np.histogram(kohei_jg['gy'], bins=30, range=(0,6.18))
kohei_jgz_count,kohei_jgz_division = np.histogram(kohei_jg['gz'], bins=30, range=(0,4.56))
kohei_egx_count,kohei_egx_division = np.histogram(kohei_eg['gx'], bins=30, range=(0,2.71))
kohei_egy_count,kohei_egy_division = np.histogram(kohei_eg['gy'], bins=30, range=(0,6.18))
kohei_egz_count,kohei_egz_division = np.histogram(kohei_eg['gz'], bins=30, range=(0,4.56))
kohei_combined = (kohei_jax_count.tolist() + kohei_jay_count.tolist() + kohei_jaz_count.tolist() + kohei_eax_count.tolist() + kohei_eay_count.tolist() + kohei_eaz_count.tolist() + kohei_jgx_count.tolist() + kohei_jgy_count.tolist() + kohei_jgz_count.tolist() + kohei_egx_count.tolist() + kohei_egy_count.tolist() + kohei_egz_count.tolist())
nure_jax_count,nure_jax_division = np.histogram(nure_ja['ax'], bins=30, range=(0,522))
nure_jay_count,nure_jay_division = np.histogram(nure_ja['ay'], bins=30, range=(0,460))
nure_jaz_count,nure_jaz_division = np.histogram(nure_ja['az'], bins=30, range=(0,710))
nure_eax_count,nure_eax_division = np.histogram(nure_ea['ax'], bins=30, range=(0,522))
nure_eay_count,nure_eay_division = np.histogram(nure_ea['ay'], bins=30, range=(0,460))
nure_eaz_count,nure_eaz_division = np.histogram(nure_ea['az'], bins=30, range=(0,710))
nure_jgx_count,nure_jgx_division = np.histogram(nure_jg['gx'], bins=30, range=(0,2.71))
nure_jgy_count,nure_jgy_division = np.histogram(nure_jg['gy'], bins=30, range=(0,6.18))
nure_jgz_count,nure_jgz_division = np.histogram(nure_jg['gz'], bins=30, range=(0,4.56))
nure_egx_count,nure_egx_division = np.histogram(nure_eg['gx'], bins=30, range=(0,2.71))
nure_egy_count,nure_egy_division = np.histogram(nure_eg['gy'], bins=30, range=(0,6.18))
nure_egz_count,nure_egz_division = np.histogram(nure_eg['gz'], bins=30, range=(0,4.56))
nure_combined = (nure_jax_count.tolist() + nure_jay_count.tolist() + nure_jaz_count.tolist() + nure_eax_count.tolist() + nure_eay_count.tolist() + nure_eaz_count.tolist() + nure_jgx_count.tolist() + nure_jgy_count.tolist() + nure_jgz_count.tolist() + nure_egx_count.tolist() + nure_egy_count.tolist() + nure_egz_count.tolist())
toshi_jax_count,toshi_jax_division = np.histogram(toshi_ja['ax'], bins=30, range=(0,522))
toshi_jay_count,toshi_jay_division = np.histogram(toshi_ja['ay'], bins=30, range=(0,460))
toshi_jaz_count,toshi_jaz_division = np.histogram(toshi_ja['az'], bins=30, range=(0,710))
toshi_eax_count,toshi_eax_division = np.histogram(toshi_ea['ax'], bins=30, range=(0,522))
toshi_eay_count,toshi_eay_division = np.histogram(toshi_ea['ay'], bins=30, range=(0,460))
toshi_eaz_count,toshi_eaz_division = np.histogram(toshi_ea['az'], bins=30, range=(0,710))
toshi_jgx_count,toshi_jgx_division = np.histogram(toshi_jg['gx'], bins=30, range=(0,2.71))
toshi_jgy_count,toshi_jgy_division = np.histogram(toshi_jg['gy'], bins=30, range=(0,6.18))
toshi_jgz_count,toshi_jgz_division = np.histogram(toshi_jg['gz'], bins=30, range=(0,4.56))
toshi_egx_count,toshi_egx_division = np.histogram(toshi_eg['gx'], bins=30, range=(0,2.71))
toshi_egy_count,toshi_egy_division = np.histogram(toshi_eg['gy'], bins=30, range=(0,6.18))
toshi_egz_count,toshi_egz_division = np.histogram(toshi_eg['gz'], bins=30, range=(0,4.56))
toshi_combined = (toshi_jax_count.tolist() + toshi_jay_count.tolist() + toshi_jaz_count.tolist() + toshi_eax_count.tolist() + toshi_eay_count.tolist() + toshi_eaz_count.tolist() + toshi_jgx_count.tolist() + toshi_jgy_count.tolist() + toshi_jgz_count.tolist() + toshi_egx_count.tolist() + toshi_egy_count.tolist() + toshi_egz_count.tolist())
yukita_jax_count,yukita_jax_division = np.histogram(yukita_ja['ax'], bins=30, range=(0,522))
yukita_jay_count,yukita_jay_division = np.histogram(yukita_ja['ay'], bins=30, range=(0,460))
yukita_jaz_count,yukita_jaz_division = np.histogram(yukita_ja['az'], bins=30, range=(0,710))
yukita_eax_count,yukita_eax_division = np.histogram(yukita_ea['ax'], bins=30, range=(0,522))
yukita_eay_count,yukita_eay_division = np.histogram(yukita_ea['ay'], bins=30, range=(0,460))
yukita_eaz_count,yukita_eaz_division = np.histogram(yukita_ea['az'], bins=30, range=(0,710))
yukita_jgx_count,yukita_jgx_division = np.histogram(yukita_jg['gx'], bins=30, range=(0,2.71))
yukita_jgy_count,yukita_jgy_division = np.histogram(yukita_jg['gy'], bins=30, range=(0,6.18))
yukita_jgz_count,yukita_jgz_division = np.histogram(yukita_jg['gz'], bins=30, range=(0,4.56))
yukita_egx_count,yukita_egx_division = np.histogram(yukita_eg['gx'], bins=30, range=(0,2.71))
yukita_egy_count,yukita_egy_division = np.histogram(yukita_eg['gy'], bins=30, range=(0,6.18))
yukita_egz_count,yukita_egz_division = np.histogram(yukita_eg['gz'], bins=30, range=(0,4.56))
yukita_combined = (yukita_jax_count.tolist() + yukita_jay_count.tolist() + yukita_jaz_count.tolist() + yukita_eax_count.tolist() + yukita_eay_count.tolist() + yukita_eaz_count.tolist() + yukita_jgx_count.tolist() + yukita_jgy_count.tolist() + yukita_jgz_count.tolist() + yukita_egx_count.tolist() + yukita_egy_count.tolist() + yukita_egz_count.tolist())
yamana_jax_count,yamana_jax_division = np.histogram(yamana_ja['ax'], bins=30, range=(0,522))
yamana_jay_count,yamana_jay_division = np.histogram(yamana_ja['ay'], bins=30, range=(0,460))
yamana_jaz_count,yamana_jaz_division = np.histogram(yamana_ja['az'], bins=30, range=(0,710))
yamana_eax_count,yamana_eax_division = np.histogram(yamana_ea['ax'], bins=30, range=(0,522))
yamana_eay_count,yamana_eay_division = np.histogram(yamana_ea['ay'], bins=30, range=(0,460))
yamana_eaz_count,yamana_eaz_division = np.histogram(yamana_ea['az'], bins=30, range=(0,710))
yamana_jgx_count,yamana_jgx_division = np.histogram(yamana_jg['gx'], bins=30, range=(0,2.71))
yamana_jgy_count,yamana_jgy_division = np.histogram(yamana_jg['gy'], bins=30, range=(0,6.18))
yamana_jgz_count,yamana_jgz_division = np.histogram(yamana_jg['gz'], bins=30, range=(0,4.56))
yamana_egx_count,yamana_egx_division = np.histogram(yamana_eg['gx'], bins=30, range=(0,2.71))
yamana_egy_count,yamana_egy_division = np.histogram(yamana_eg['gy'], bins=30, range=(0,6.18))
yamana_egz_count,yamana_egz_division = np.histogram(yamana_eg['gz'], bins=30, range=(0,4.56))
yamana_combined = (yamana_jax_count.tolist() + yamana_jay_count.tolist() + yamana_jaz_count.tolist() + yamana_eax_count.tolist() + yamana_eay_count.tolist() + yamana_eaz_count.tolist() + yamana_jgx_count.tolist() + yamana_jgy_count.tolist() + yamana_jgz_count.tolist() + yamana_egx_count.tolist() + yamana_egy_count.tolist() + yamana_egz_count.tolist())
miyuki_jax_count,miyuki_jax_division = np.histogram(miyuki_ja['ax'], bins=30, range=(0,522))
miyuki_jay_count,miyuki_jay_division = np.histogram(miyuki_ja['ay'], bins=30, range=(0,460))
miyuki_jaz_count,miyuki_jaz_division = np.histogram(miyuki_ja['az'], bins=30, range=(0,710))
miyuki_eax_count,miyuki_eax_division = np.histogram(miyuki_ea['ax'], bins=30, range=(0,522))
miyuki_eay_count,miyuki_eay_division = np.histogram(miyuki_ea['ay'], bins=30, range=(0,460))
miyuki_eaz_count,miyuki_eaz_division = np.histogram(miyuki_ea['az'], bins=30, range=(0,710))
miyuki_jgx_count,miyuki_jgx_division = np.histogram(miyuki_jg['gx'], bins=30, range=(0,2.71))
miyuki_jgy_count,miyuki_jgy_division = np.histogram(miyuki_jg['gy'], bins=30, range=(0,6.18))
miyuki_jgz_count,miyuki_jgz_division = np.histogram(miyuki_jg['gz'], bins=30, range=(0,4.56))
miyuki_egx_count,miyuki_egx_division = np.histogram(miyuki_eg['gx'], bins=30, range=(0,2.71))
miyuki_egy_count,miyuki_egy_division = np.histogram(miyuki_eg['gy'], bins=30, range=(0,6.18))
miyuki_egz_count,miyuki_egz_division = np.histogram(miyuki_eg['gz'], bins=30, range=(0,4.56))
miyuki_combined = (miyuki_jax_count.tolist() + miyuki_jay_count.tolist() + miyuki_jaz_count.tolist() + miyuki_eax_count.tolist() + miyuki_eay_count.tolist() + miyuki_eaz_count.tolist() + miyuki_jgx_count.tolist() + miyuki_jgy_count.tolist() + miyuki_jgz_count.tolist() + miyuki_egx_count.tolist() + miyuki_egy_count.tolist() + miyuki_egz_count.tolist())
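The per-subject histogram blocks above repeat the same twelve `np.histogram` calls with only the subject name changing. They can be collapsed into a small helper; this is a sketch assuming the same DataFrame layout as above (accelerometer columns `ax`/`ay`/`az`, gyroscope columns `gx`/`gy`/`gz`) and the same fixed bin ranges.

```python
import numpy as np
import pandas as pd

# Upper histogram bounds taken from the calls above, per sensor axis.
ACC_RANGES = {'ax': 522, 'ay': 460, 'az': 710}
GYRO_RANGES = {'gx': 2.71, 'gy': 6.18, 'gz': 4.56}

def histogram_features(ja, ea, jg, eg, bins=30):
    """Concatenate 30-bin histograms of the Japanese/English accelerometer
    and gyroscope recordings into one feature vector, in the same order as
    the hand-written *_combined lists (ja acc, ea acc, jg gyro, eg gyro)."""
    counts = []
    for df in (ja, ea):                      # accelerometer: Japanese, English
        for col, hi in ACC_RANGES.items():
            c, _ = np.histogram(df[col], bins=bins, range=(0, hi))
            counts.extend(c.tolist())
    for df in (jg, eg):                      # gyroscope: Japanese, English
        for col, hi in GYRO_RANGES.items():
            c, _ = np.histogram(df[col], bins=bins, range=(0, hi))
            counts.extend(c.tolist())
    return counts                            # 12 signals x 30 bins = 360 values
```

With this, each subject reduces to one line, e.g. `miyuki_combined = histogram_features(miyuki_ja, miyuki_ea, miyuki_jg, miyuki_eg)`.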
In [67]:
dohi_combined
Out[67]:
In [73]:
train_data = pd.DataFrame({'dohi': dohi_combined,'kohei':kohei_combined, 'nure':nure_combined, 'toshi':toshi_combined, 'yukita':yukita_combined, 'yamana':yamana_combined})
test_data = pd.DataFrame({'miyuki': miyuki_combined})
In [74]:
train_data.T.head()
Out[74]:
In [75]:
label = pd.DataFrame({'label': [1,1,2,2,3,3]})
In [76]:
label.head()
Out[76]:
In [77]:
state = np.random.RandomState(1)
import sklearn.svm as svm
svc = svm.LinearSVC(random_state=state)
svc.fit(train_data.T, label['label'])
Out[77]:
In [85]:
predicted = svc.predict(test_data.T)
predicted = pd.Series(predicted)
predicted
Out[85]:
The above result shows that the classifier returned the value 1, which is the label for subjects whose English level was rated "LOW". The correct label for the test data, however, is 3.
The comparison of standard deviations showed a significant difference in a Japanese speaker's body gestures when speaking in English versus Japanese. The differences, however, did not necessarily confirm the first hypothesis: there were cases where body gestures were significantly larger when speaking in Japanese than in English (e.g. arm movement for subject no.2, head movement for subjects no.4 and no.6).
The differences in arm movement were similar among the fluent English speakers, which supports the second hypothesis, but the head movement of subject no.6 was significantly larger in Japanese, which contradicts it.
Subject no.1, who was unconfident in English, showed significantly larger body gestures when speaking in English than in Japanese, but subject no.2, who was also unconfident in English, showed significantly larger arm movement when speaking in Japanese, which also contradicts the second hypothesis.
Although I was able to create a classifier, its prediction for the test data was incorrect. With only six training samples and one test sample it is difficult to discuss the accuracy of the classifier, but I presume the test sample was misclassified because the subject appeared to be nervous and moved less than usual. I chose a Support Vector Classifier because scikit-learn's SVM module builds on LIBSVM, an open-source SVM library developed in Taiwan that I was familiar with from my final undergraduate thesis, and because I was curious how an SVM would handle the collected data.
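With so few subjects, a single train/test split tells us little. A more informative check for this sample size is leave-one-out cross-validation: train on six subjects and test on the held-out one, repeated seven times. The sketch below uses synthetic random vectors as a stand-in for the real 360-dimensional `*_combined` feature vectors, and the label assignment shown is illustrative.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import LinearSVC

# Synthetic stand-in for the 7 subjects' 360-dimensional histogram features;
# in the real study, stack the *_combined vectors here instead.
rng = np.random.RandomState(1)
X = rng.rand(7, 360)
y = np.array([1, 1, 2, 2, 3, 3, 3])  # illustrative fluency labels

# Leave-one-out: 7 folds, each holding out one subject as the test sample.
scores = cross_val_score(LinearSVC(random_state=1), X, y, cv=LeaveOneOut())
print(scores.mean())  # fraction of held-out subjects classified correctly
```

Each fold scores 0 or 1 (one test sample), so the mean is the overall hit rate across subjects.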
Although I was able to find a significant difference in the body gestures, I was not able to confirm the initial hypothesis that Japanese people would have larger body gestures when speaking in English than in Japanese.
I was also unable to confirm the hypothesis that English fluency has a negative correlation with the difference in body gestures between Japanese and English.
Improvements could be made by placing accelerometers on both hands, since some subjects seemed to have a habit of gesturing with a specific hand. In addition, the weight of the Arduino may have discouraged the subjects from moving their arms, so collecting data without disturbing the subjects is another issue that must be addressed.