This notebook introduces students to popular computational tools used in the Digital Humanities and Social Sciences and the research possibilities they create. It then provides an abbreviated introduction to Python that focuses on analysis and prepares students for the Twitter data analysis on Day 2.
Estimated Time: 180 minutes
Topics Covered:
Parts:
Online Point-And-Click Tools
Common Programming Languages for Research and Visualization
Qualitative Data Analysis
Geospatial Analysis
Data Management
Qualitative Analysis
Quantitative Analysis
Model Building and Machine Learning
Linguistic Analysis and Natural Language Processing (NLP)
Visualization
Pedagogy
In Python, the = symbol assigns the value on the right to the name on the left.
Here, Python assigns an age to a variable age
and a name in quotation marks to a variable first_name.
In [ ]:
age = 42
first_name = 'Ahmed'
Names that start with underscores, like __alistairs_real_age, have a special meaning,
so we won't do that until we understand the convention.
Use print to display values. Python has a built-in function called print that prints things as text.
In [ ]:
print(first_name, 'is', age, 'years old')
print automatically puts a single space between items to separate them.
In [ ]:
print(last_name)  # NameError: last_name has not been assigned a value yet
Name and name are different variables.
In [ ]:
flabadab = 42
ewr_422_yY = 'Ahmed'
print(ewr_422_yY, 'is', flabadab, 'years old')
In [ ]:
age = age + 3
print('Age in three years:', age)
In [ ]:
* Integer (int): counting numbers like 3 or -512.
* Floating point number (float): fractional numbers like 3.14159 or -2.5.
* Character string (str): text.

Use the built-in function type to find out what type a value has.
In [ ]:
print(type(52))
In [ ]:
pi = 3.14159
print(type(pi))
In [ ]:
fitness = 'average'
print(type(fitness))
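As a small sketch of the three types described above, the cell below stores one value of each kind and checks its type. The variable names and values are made up for illustration and are not used elsewhere in this notebook.

```python
# Hypothetical example values, one of each basic type
followers = 1200         # int: a whole number
average_likes = 3.5      # float: a number with a fractional part
username = 'dh_student'  # str: text inside quotation marks

print(type(followers))
print(type(average_likes))
print(type(username))

# A type can also be compared directly against int, float, or str
print(type(followers) == int)
```

Note that quotation marks decide the type: 1200 is an int, but '1200' would be a str.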
In [ ]:
print(5 - 3)
In [ ]:
print('hello' - 'h')  # TypeError: strings do not support subtraction
In [ ]:
full_name = 'Ahmed' + ' ' + 'Walsh'
print(full_name)
In [ ]:
separator = '=' * 10
print(separator)
In [ ]:
print(len(full_name))
In [ ]:
print(len(52))  # TypeError: an int has no length
In [ ]:
print(1 + '2')
This is not allowed because it is ambiguous: should 1 + '2' be 3 or '12'?
In [ ]:
print(1 + int('2'))
print(str(1) + '2')
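The same conversions come up whenever a number arrives as text, for example when it is read from a file. The following is a minimal sketch; count_text and the other names are hypothetical and chosen only for this illustration.

```python
# Hypothetical value that arrived as text rather than as a number
count_text = '17'

count = int(count_text)          # str -> int, so arithmetic works
total = count + 3
label = 'Total: ' + str(total)   # int -> str, so concatenation works
ratio = float(count_text) / 2    # str -> float

print(total)
print(label)
print(ratio)
```

Converting explicitly with int, float, or str tells Python which of the two possible meanings is intended.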
In [ ]:
print('half is', 1 / 2.0)
print('three squared is', 3.0 ** 2)
In [ ]:
first = 1
second = 5 * first
first = 2
print('first is', first, 'and second is', second)
Python reads the value of first when doing the multiplication,
creates a new value, and assigns it to second. After that, second does not remember where it came from.
Challenge: create a variable year and assign it the year you were born. Convert year to a float and assign the result to year_float. Then convert year_float to a string and assign it to a new variable year_string.
In [ ]:
In [ ]:
first_name = "Johan"
last_name = "Gambolputty"
full_name = first_name + last_name
print(full_name)
In [ ]:
full_name = first_name + " " + last_name
print(full_name)
Use square brackets [] to get a single character from a string by its position.
In [ ]:
full_name[1]
Gotcha: Python (and many other languages) starts counting from 0.
In [ ]:
full_name[0]
In [ ]:
full_name[4]
In [ ]:
full_name[0:4]
In [ ]:
full_name[0:5]
In [ ]:
full_name[:5]
In [ ]:
full_name[5:]
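The slices above follow one rule: a slice [start:stop] includes the character at start and stops just before stop. Here is a short sketch of that rule on a different string; the variable word is made up for this example.

```python
# Hypothetical string used only for this illustration
word = 'humanities'

first_five = word[:5]   # positions 0, 1, 2, 3, 4
rest = word[5:]         # position 5 through the end
middle = word[2:6]      # positions 2, 3, 4, 5

print(first_five)
print(rest)
print(middle)

# Splitting at the same position loses nothing
print(first_five + rest == word)

# The length of a slice is stop minus start
print(len(middle))
```

Because the stop position is excluded, word[:5] and word[5:] never overlap and always add back up to the original string.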
In [ ]:
str.
In [ ]:
str.upper?
So we can use it to upper-caseify a string.
In [ ]:
full_name.upper()
You have to use the parentheses at the end because upper is a method of the string class.
Don't forget: simply calling the method does not change the original variable; you must reassign it:
In [ ]:
print(full_name)
In [ ]:
full_name = full_name.upper()
print(full_name)
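To see why the reassignment matters, here is a small sketch with a separate, made-up variable. It assumes nothing beyond the upper method used above.

```python
# Hypothetical string used only for this illustration
city = 'berkeley'

# Calling the method returns a new string and leaves city alone
shouted = city.upper()

print(city)      # still lower case
print(shouted)   # the upper-case copy
```

Saving the result under a new name, as here, keeps both versions available; reassigning to the same name keeps only the new one.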
For what it's worth, you don't need a variable to use the upper() method; you can call it on the string itself.
In [ ]:
"Johann Gambolputty".upper()
What do you think should happen when you take upper of an int? What about a string representation of an int?
In [ ]:
In [ ]:
tweet = 'RT @JasonBelich: #March4Trump #berkeley elderly man pepper sprayed by #antifa https://t.co/5z3O6UZuhL'
Using this tweet, try seeing what the following string methods do:
* `split`
* `join`
* `replace`
* `strip`
* `find`
In [ ]:
In [ ]:
country_list = ["Afghanistan", "Canada", "Sierra Leone", "Denmark", "Japan"]
type(country_list)
Use len to find out how many values are in a list.
In [ ]:
len(country_list)
In [ ]:
print('the first item is:', country_list[0])
print('the fourth item is:', country_list[3])
In [ ]:
print(country_list[-1])
print(country_list[-2])
In [ ]:
print(country_list[1:4])
In [ ]:
print(country_list[:4])
In [ ]:
print(country_list[2:])
In [ ]:
country_list[0] = "Iran"
print('Country List is now:', country_list)
In [ ]:
mystring = "Donut"
mystring[0] = 'C'
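The assignment to mystring[0] raises a TypeError because, unlike lists, strings cannot be changed in place. One possible workaround, sketched here with a made-up variable name, is to build a new string from the pieces you want to keep.

```python
# Hypothetical string used only for this illustration
snack = 'Donut'

# snack[0] = 'C' would raise a TypeError, so build a new string instead:
# the new first letter plus everything from position 1 onward
new_snack = 'C' + snack[1:]

print(snack)      # the original is unchanged
print(new_snack)
```

The replace string method is another option when the text to swap out is known rather than its position.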
Use object_name.method_name to call methods.
In [ ]:
country_list.
Use the append method to add an item to the end of a list.
In [ ]:
country_list.append("United States")
print(country_list)
In [ ]:
print("original list was:", country_list)
del country_list[3]
print("the list is now:", country_list)
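One detail worth noticing about append and del is that both change the list in place, so no reassignment is needed. The sketch below uses a made-up list to show this, along with a common slip.

```python
# Hypothetical list used only for this illustration
cities = ['Oakland', 'Berkeley', 'Richmond']

# append changes the list itself and returns None
result = cities.append('Albany')
print(result)
print(cities)

# del removes the item at the given position
del cities[0]
print(cities)
print(len(cities))
```

Writing cities = cities.append('Albany') would therefore replace the list with None, which is the opposite of how string methods such as upper behave.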
In [ ]:
complex_list = ['life', 42, 'the universe', [1,2,3]]
print(complex_list)
In [ ]:
print(complex_list[3])
print(complex_list[3][0])
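Indexing a list inside a list works one step at a time: the first index picks the inner list and the second index picks an item from it. The following sketch spells out those two steps with a made-up record.

```python
# Hypothetical record: some text, a count, and a list of tags
record = ['a short post', 42, ['#first', '#second']]

tags = record[2]        # step 1: pick out the inner list
first_tag = tags[0]     # step 2: pick an item from the inner list

print(tags)
print(first_tag)

# Both steps can be written together
print(record[2][0] == first_tag)
```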
Use [] on its own to represent a list that doesn't contain any values.
Python reports an IndexError if we attempt to access a value that doesn't exist.
In [ ]:
print(country_list[99])
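One way to avoid an IndexError is to compare the position with the length of the list before indexing. This is only a sketch: it uses an if statement, which has not been covered yet in this notebook, and the list and variable names are made up.

```python
# Hypothetical list used only for this illustration
countries = ['Canada', 'Japan', 'Denmark']
position = 99

# Valid non-negative positions run from 0 to len(countries) - 1
if 0 <= position < len(countries):
    message = countries[position]
else:
    message = 'no item at position ' + str(position)

print(message)
```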
In [ ]:
hashtags = ['#March4Trump',
'#Fascism',
'#TwitterIsFascist',
'#majority',
'#CouldntEvenStopDeVos',
'#IsTrumpCompromised',
'#Berkeley',
'#NotMyPresident',
'#mondaymotivation',
'#BlueLivesMatter',
'#Action4Trump',
'#impeachtrump',
'#Periscope',
'#march',
'#TrumpRussia',
'#obamagate',
'#Resist',
'#sedition',
'#NeverTrump',
'#maga']
print(hashtags[::2])
print()
print(hashtags[::-1])
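The slices [::2] and [::-1] use a third slice value, the step. The sketch below shows the same two slices on a short made-up list, where the result is easier to read than with the full hashtags list.

```python
# Hypothetical list used only for this illustration
letters = ['a', 'b', 'c', 'd', 'e']

every_other = letters[::2]    # step 2: take every second item
backwards = letters[::-1]     # step -1: walk the list in reverse

print(every_other)
print(backwards)
print(letters)   # slicing makes a copy; the original is unchanged
```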
How long is the hashtags list?
In [ ]:
Use the .index() method to find out what the index number is for #Resist:
In [ ]:
Read the help file (or the Python documentation) for join(), a string method.
In [ ]:
str.join?
Using the join method, concatenate all the values in hashtags into one long string:
In [ ]:
Using the string replace method and the list index method, print 'Never Trump' without the '#'
In [ ]: