Collect
Here we show how to use the command line interface to collect tweets that will be used in the project:
Collect tweets containing given words
To collect fitness tweets, we use a list of fitness applications as a filter (shown below). The filter can be edited to track other applications or terms.
In [25]:
!cat collect/sport_tags
When we want to collect tweets that express a mood state, the filter we use is the expanded list of POMS words built in the previous step of the project. Below, we show the first 10 words of the POMS Tension/Anxiety dimension.
In [35]:
!head -n 10 collect/words_TA
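To make the idea of a word-list filter concrete, here is a minimal sketch of how a tweet text could be checked against such a list. It assumes one term per line and case-insensitive, whole-word matching; this is an illustrative assumption, not necessarily how sporty-cli or the Twitter API matches terms. The names `load_terms` and `matches_any` and the sample terms are hypothetical.

```python
import re

def load_terms(lines):
    """Build a lowercase term list from an iterable of lines (one term per line)."""
    return [line.strip().lower() for line in lines if line.strip()]

def matches_any(text, terms):
    """Return True if any term appears as a whole word in the text (case-insensitive)."""
    words = set(re.findall(r"\w+", text.lower()))
    return any(term in words for term in terms)

# Hypothetical placeholder terms; the real terms are read from a file such as collect/words_TA.
terms = load_terms(["tense\n", "nervous\n", "\n"])
print(matches_any("Feeling TENSE before the exam", terms))
print(matches_any("No pretense here", terms))
```

Because the text is split into single words, multi-word terms would never match in this sketch; a real filter would need to handle phrases separately.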
Here we collect 20 tweets that contain at least one term from the filter file passed on the command line (in this run, the fitness tags in 'collect/sport_tags') and store them in the file 'collect/output_tweets'.
In [16]:
!sporty-cli tweets collect collect/settings.json collect/output_tweets collect/sport_tags -c 20
We can easily inspect the collected tweets by loading them with the json module. Below, we display the user id and the abbreviated content of each tweet. With this method, we can detect exercising users, under the assumption that an exercising user uses a fitness application.
In [33]:
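import json
# Each line of the output file holds one JSON-encoded tweet.
with open("collect/output_tweets") as outtw:
    for line in outtw:
        tw = json.loads(line)
        # Print the user id with the first and last 30 characters of the text.
        print("%d\t%s ... %s" % (tw['user']['id'], tw['text'][:30], tw['text'][-30:]))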
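Detecting exercising users then amounts to gathering the distinct user ids that appear in the collected tweets. The following is a minimal sketch of that step, assuming the file holds one JSON-encoded tweet per line with a `user.id` field, as in the cell above. The function name `exercising_user_ids` and the sample records are made up for illustration.

```python
import json

def exercising_user_ids(lines):
    """Return the set of user ids found in an iterable of JSON-encoded tweets."""
    user_ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        user_ids.add(tweet["user"]["id"])
    return user_ids

# Made-up sample records standing in for lines of collect/output_tweets.
sample_lines = [
    '{"user": {"id": 1}, "text": "Morning run done"}',
    '{"user": {"id": 2}, "text": "Cycling after work"}',
    '{"user": {"id": 1}, "text": "Another run"}',
]
print(sorted(exercising_user_ids(sample_lines)))
```

To run it on the collected tweets, pass the open file handle instead of the sample list, for example `with open("collect/output_tweets") as outtw: ids = exercising_user_ids(outtw)`.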
In [34]:
!sporty-cli