In [1]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white")
sns.set_context("talk")
In [2]:
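# Load the survey export. decimal=',' makes pandas treat ',' as the decimal mark;
# the file is read as ISO-8859-1 rather than UTF-8.
df = pd.read_csv('raw/2016-17-ClassCentral-Survey-data-noUserText.csv', decimal=',', encoding="ISO-8859-1")
df.head(10)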
Out[2]:
|   | Timestamp | How familiar are you with MOOCs? | How important is the ability to earn a certificate when you complete a MOOC? | How many MOOCs have you started? | # MOOCs Started | How many MOOCs have you finished? | # MOOCs Finished | When did you first start taking MOOCs? | Why are you interested in taking MOOCs? | Which of the following are important reasons for you to take MOOCs? | ... | Which of the following have a strong impact on your willingness to pay for a MOOC certificate? | Pay: The topic/subject | Pay: The institution/university offering the MOOC | Pay: The instructor/professor | Pay: The MOOC platform being used | Pay: A multi-course certification that the MOOC is a part of | (Selected answer) | Which region of the world are you in? | What is your level of formal education? | What is your age range? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2016/11/28 11:59:22 AM PST | 5.0 | 1.0 | 10-Jun | 8 | 05-Apr | 4.5 | 2+ years ago | I like learning more about topics that interes... | Personal interest | ... | Whether it's useful professionally. | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | United Kingdom | NaN | 30-35 years old |
| 1 | 2016/12/01 6:45:40 AM PST | 2.0 | 4.0 | 1 | 1 | Zero | 0 | Within past 6 months | Low cost and flexible schedule so can work | Learning skills for current career;Learning sk... | ... | The topic/subject;The institution/university o... | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | Yes | Western Europe (except UK) | Graduate school degree | 26-29 years old |
| 2 | 2016/12/01 6:54:23 AM PST | 5.0 | 2.0 | 20-Nov | 15 | 10-Jun | 8 | 2+ years ago | NaN | Learning skills for new career;Personal interest | ... | The topic/subject;The institution/university o... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | Yes | Eastern Europe | NaN | 26-29 years old |
| 3 | 2016/12/01 7:17:09 AM PST | 4.0 | 1.0 | 10-Jun | 8 | 05-Apr | 4.5 | 2+ years ago | Learning is keeping my mind sharp. | Personal interest | ... | I am retired so no interest in certificate. | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | United States | 2-year college degree | 56-65 years old |
| 4 | 2016/12/01 7:24:51 AM PST | 3.0 | 2.0 | 05-Apr | 4.5 | 03-Feb | 2.5 | 6 months to 1 year ago | Knowledge | Personal interest | ... | The institution/university offering the MOOC | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | Yes | NaN | Graduate school degree | 66+ years old |
| 5 | 2016/12/01 7:57:31 AM PST | 5.0 | 3.0 | 10-Jun | 8 | 05-Apr | 4.5 | 2+ years ago | NaN | Learning skills for current career;Learning sk... | ... | The topic/subject;The institution/university o... | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | Yes | Eastern Europe | 3 or 4 year college degree | 22-25 years old |
| 6 | 2016/12/01 8:07:15 AM PST | 5.0 | 2.0 | 20-Nov | 15 | 10-Jun | 8 | 2+ years ago | Basically, for fun. A chance to learn about a... | Learning skills for current career;Learning sk... | ... | The topic/subject;The institution/university o... | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | Yes | United States | Graduate school degree | 36-45 years old |
| 7 | 2016/12/01 8:09:17 AM PST | 4.0 | 1.0 | 05-Apr | 4.5 | 03-Feb | 2.5 | 1-2 years ago | Continuing education for personal reasons only... | Personal interest | ... | Will not pay ever | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | United States | Graduate school degree | 66+ years old |
| 8 | 2016/12/01 8:24:28 AM PST | 2.0 | 5.0 | 03-Feb | 2.5 | 03-Feb | 2.5 | 1-2 years ago | have a nex competences or skills | Learning skills for current career;Learning sk... | ... | A multi-course certification that the MOOC is ... | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | Yes | Africa | High school degree | 46-55 years old |
| 9 | 2016/12/01 8:28:00 AM PST | 5.0 | 3.0 | More than 20 | 25 | More than 20 | 25 | 2+ years ago | NaN | Learning skills for new career;Personal interest | ... | The topic/subject;A multi-course certification... | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | Yes | Western Europe (except UK) | Graduate school degree | 36-45 years old |
10 rows × 53 columns
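The answers `10-Jun`, `05-Apr`, `20-Nov` and `03-Feb` in the "How many MOOCs have you started/finished?" columns look like numeric ranges that a spreadsheet converted to dates. The companion `# MOOCs` columns (8, 4.5, 15, 2.5) are consistent with that reading, but the original labels are not confirmed anywhere in this notebook. A minimal sketch, assuming the mapping below (the dictionary `RANGE_FIX` and the toy Series are hypothetical), of how the labels could be restored before plotting:

```python
import pandas as pd

# Hypothetical mapping: assumes the original answer ranges were turned into dates
# by a spreadsheet; the notebook itself does not confirm these labels.
RANGE_FIX = {'03-Feb': '2-3', '05-Apr': '4-5', '10-Jun': '6-10', '20-Nov': '11-20'}

# Toy stand-in for df['How many MOOCs have you started?'].
started = pd.Series(['10-Jun', '1', '20-Nov', '05-Apr', 'More than 20', None])

# Values that are not keys of the mapping are left unchanged.
started_fixed = started.replace(RANGE_FIX)
print(started_fixed.tolist())
```

Answers such as `1`, `Zero` and `More than 20` pass through untouched, so the same call could be applied to both the "started" and the "finished" column.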
In [3]:
size_df = len(df)
size_df
Out[3]:
2491
In [4]:
df.describe().transpose()
Out[4]:
|   | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| How familiar are you with MOOCs? | 2468.0 | 3.343193 | 1.526727 | 1.0 | 2.0 | 4.0 | 5.0 | 5.0 |
| How important is the ability to earn a certificate when you complete a MOOC? | 2464.0 | 3.298701 | 1.446313 | 1.0 | 2.0 | 3.0 | 5.0 | 5.0 |
| Reasons: Learning skills for current career | 2491.0 | 0.510638 | 0.499987 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| Reasons: Learning skills for new career | 2491.0 | 0.480530 | 0.499721 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Reasons: School credit | 2491.0 | 0.086311 | 0.280879 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Reasons: Personal interest | 2491.0 | 0.792453 | 0.405632 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Reasons: Access to reference materials | 2491.0 | 0.279807 | 0.448995 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| (answered Reasons) | 2491.0 | 2.149739 | 1.114913 | 0.0 | 1.0 | 2.0 | 3.0 | 5.0 |
| Decide: Topic/Subject | 2491.0 | 0.910879 | 0.284975 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Decide: Instructor | 2491.0 | 0.160578 | 0.367215 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Decide: Institution/university | 2491.0 | 0.412686 | 0.492416 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Decide: Platform | 2491.0 | 0.271778 | 0.444966 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Decide: Ratings | 2491.0 | 0.328382 | 0.469719 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Decide: Others recommendations | 2491.0 | 0.252509 | 0.434539 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Aspects: Browsing discussion forums | 2491.0 | 0.538739 | 0.498597 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| Aspects: Actively contributing to discussion forums | 2491.0 | 0.304295 | 0.460201 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Aspects: Connecting with other learners in the course environment | 2491.0 | 0.382979 | 0.486211 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Aspects: Connecting with learners outside the course environment | 2491.0 | 0.153352 | 0.360399 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Aspects: Taking the course with other people you know (friends, colleagues, etc.) | 2491.0 | 0.151746 | 0.358847 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| (selected Aspects) | 2491.0 | 1.531112 | 1.180555 | 0.0 | 1.0 | 1.0 | 2.0 | 5.0 |
| Benefit: Have not taken MOOCs | 2491.0 | 0.160980 | 0.367586 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Benefit: Not Really | 2491.0 | 0.345243 | 0.475543 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Benefit: School credit towards a degree | 2491.0 | 0.030911 | 0.173112 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Benefit: Promotion at current organization | 2491.0 | 0.029305 | 0.168695 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Benefit: Higher performance evaluation at current job | 2491.0 | 0.114010 | 0.317888 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Benefit: Helped me get a new job in the same field | 2491.0 | 0.048976 | 0.215862 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Benefit: Helped me get a new job in a different field | 2491.0 | 0.042553 | 0.201888 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| (selected Benefits) | 2491.0 | 1.119229 | 0.855314 | 0.0 | 1.0 | 1.0 | 1.0 | 6.0 |
| (selected except Not really - Benefits) | 2491.0 | 0.234845 | 0.574427 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 |
| How much do you think employers value MOOC certificates? | 2372.0 | 3.005902 | 1.124460 | 1.0 | 2.0 | 3.0 | 4.0 | 5.0 |
| Pay: The topic/subject | 2090.0 | 0.554545 | 0.497135 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| Pay: The institution/university offering the MOOC | 2090.0 | 0.474641 | 0.499476 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| Pay: The instructor/professor | 2090.0 | 0.166986 | 0.373052 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Pay: The MOOC platform being used | 2090.0 | 0.127273 | 0.333358 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| Pay: A multi-course certification that the MOOC is a part of | 2090.0 | 0.292823 | 0.455167 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
In [5]:
df.isnull().sum()
Out[5]:
Timestamp 0
How familiar are you with MOOCs? 23
How important is the ability to earn a certificate when you complete a MOOC? 27
How many MOOCs have you started? 52
# MOOCs Started 52
How many MOOCs have you finished? 94
# MOOCs Finished 94
When did you first start taking MOOCs? 17
Why are you interested in taking MOOCs? 933
Which of the following are important reasons for you to take MOOCs? 17
Reasons: Learning skills for current career 0
Reasons: Learning skills for new career 0
Reasons: School credit 0
Reasons: Personal interest 0
Reasons: Access to reference materials 0
(answered Reasons) 0
Which are the most important factors in deciding which MOOC to take? 40
Decide: Topic/Subject 0
Decide: Instructor 0
Decide: Institution/university 0
Decide: Platform 0
Decide: Ratings 0
Decide: Others recommendations 0
Which of the following are important aspects of the MOOC experience to you? 405
Aspects: Browsing discussion forums 0
Aspects: Actively contributing to discussion forums 0
Aspects: Connecting with other learners in the course environment 0
Aspects: Connecting with learners outside the course environment 0
Aspects: Taking the course with other people you know (friends, colleagues, etc.) 0
(selected Aspects) 0
Have you received any tangible benefits from taking MOOCs? 195
Benefit: Have not taken MOOCs 0
Benefit: Not Really 0
Benefit: School credit towards a degree 0
Benefit: Promotion at current organization 0
Benefit: Higher performance evaluation at current job 0
Benefit: Helped me get a new job in the same field 0
Benefit: Helped me get a new job in a different field 0
(selected Benefits) 0
(selected except Not really - Benefits) 0
How important are MOOC certificates to you? 44
How much do you think employers value MOOC certificates? 119
How willing are you to pay for a certificate for a MOOC? 73
Which of the following have a strong impact on your willingness to pay for a MOOC certificate? 249
Pay: The topic/subject 401
Pay: The institution/university offering the MOOC 401
Pay: The instructor/professor 401
Pay: The MOOC platform being used 401
Pay: A multi-course certification that the MOOC is a part of 401
(Selected answer) 841
Which region of the world are you in? 50
What is your level of formal education? 39
What is your age range? 27
dtype: int64
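The raw counts above are easier to compare as a share of all rows; for example, 933 missing answers out of 2491 is roughly 37%. A minimal sketch of that conversion, using a small made-up frame (`toy` and its column names are hypothetical stand-ins for `df`):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the survey DataFrame.
toy = pd.DataFrame({
    'region': ['Africa', np.nan, 'United States', 'Africa'],
    'age': ['26-29 years old', '66+ years old', np.nan, np.nan],
})

# isnull() yields booleans, so the column mean is the fraction of missing values.
missing_pct = toy.isnull().mean().mul(100).sort_values(ascending=False)
print(missing_pct)
```

Applied to `df`, the same expression would list the free-text and "Pay:" columns first, since they have the most missing entries.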
In [6]:
df['How willing are you to pay for a certificate for a MOOC?'].unique()
Out[6]:
array(['Generally not that willing.', '3', '1', '2', '5', '4', nan], dtype=object)
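The column mixes digit strings, one free-text answer and NaN, so pandas stores it as `object` and `describe()` skips it. A minimal sketch of one way to get a numeric scale, assuming the free-text answer should simply be treated as missing (the notebook does not say how it should be scored); the toy Series mirrors the `unique()` output above:

```python
import numpy as np
import pandas as pd

# Toy stand-in for df['How willing are you to pay for a certificate for a MOOC?'].
willing = pd.Series(['Generally not that willing.', '3', '1', '2', '5', '4', np.nan])

# errors='coerce' converts anything that cannot be parsed as a number to NaN.
# Assumption: the free-text answer is dropped rather than mapped to a score.
willing_num = pd.to_numeric(willing, errors='coerce')

print(willing_num.dtype)
print(willing_num.isna().sum())
```

If the free-text answer should count as a low score instead, it would need an explicit `replace` before the conversion.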
In [7]:
print(df.columns, len(df.columns))
Index(['Timestamp', 'How familiar are you with MOOCs?',
'How important is the ability to earn a certificate when you complete a MOOC?',
'How many MOOCs have you started?', '# MOOCs Started',
'How many MOOCs have you finished?', '# MOOCs Finished',
'When did you first start taking MOOCs?',
'Why are you interested in taking MOOCs?',
'Which of the following are important reasons for you to take MOOCs?',
'Reasons: Learning skills for current career',
'Reasons: Learning skills for new career', 'Reasons: School credit',
'Reasons: Personal interest', 'Reasons: Access to reference materials',
'(answered Reasons)',
'Which are the most important factors in deciding which MOOC to take?',
'Decide: Topic/Subject', 'Decide: Instructor',
'Decide: Institution/university', 'Decide: Platform', 'Decide: Ratings',
'Decide: Others recommendations',
'Which of the following are important aspects of the MOOC experience to you?',
'Aspects: Browsing discussion forums',
'Aspects: Actively contributing to discussion forums',
'Aspects: Connecting with other learners in the course environment',
'Aspects: Connecting with learners outside the course environment',
'Aspects: Taking the course with other people you know (friends, colleagues, etc.)',
'(selected Aspects)',
'Have you received any tangible benefits from taking MOOCs?',
'Benefit: Have not taken MOOCs', 'Benefit: Not Really',
'Benefit: School credit towards a degree',
'Benefit: Promotion at current organization',
'Benefit: Higher performance evaluation at current job',
'Benefit: Helped me get a new job in the same field',
'Benefit: Helped me get a new job in a different field',
'(selected Benefits)', '(selected except Not really - Benefits)',
'How important are MOOC certificates to you?',
'How much do you think employers value MOOC certificates?',
'How willing are you to pay for a certificate for a MOOC?',
'Which of the following have a strong impact on your willingness to pay for a MOOC certificate?',
'Pay: The topic/subject',
'Pay: The institution/university offering the MOOC',
'Pay: The instructor/professor', 'Pay: The MOOC platform being used',
'Pay: A multi-course certification that the MOOC is a part of',
'(Selected answer)', 'Which region of the world are you in?',
'What is your level of formal education?', 'What is your age range?'],
dtype='object') 53
In [8]:
# Share of all respondents per region, in percent (rows without an answer stay in the denominator)
regions = df['Which region of the world are you in?'].value_counts() / size_df * 100
In [9]:
regions.sort_values().plot.barh()
plt.xlabel('percentage of respondents')
plt.title('Which region of the world are you in?')
plt.show()
In [10]:
education = df['What is your level of formal education?'].value_counts() / size_df * 100
In [11]:
education.sort_values().plot.barh()
plt.xlabel('percentage of respondents')
plt.title('Education level of the respondents')
plt.show()
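Both charts divide the counts by `size_df`, so respondents who skipped the question stay in the denominator and the bars sum to less than 100%. A minimal sketch, on a made-up Series, comparing that approach with `value_counts(normalize=True)`, which by default uses only the answered rows:

```python
import numpy as np
import pandas as pd

# Hypothetical toy answers; one respondent skipped the question.
region = pd.Series(['Africa', 'United States', 'United States', np.nan])

# Approach used in the cells above: share of all respondents.
share_all = region.value_counts() / len(region) * 100

# Share among respondents who answered.
share_answered = region.value_counts(normalize=True) * 100

# Share of all respondents, with missing answers shown as their own category.
share_with_missing = region.value_counts(normalize=True, dropna=False) * 100

print(share_all.sum(), share_answered.sum(), share_with_missing.sum())
```

Which denominator is appropriate depends on whether non-response is itself of interest; the axis label should state which one was used.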
In [12]:
df['# MOOCs Started'].describe()
Out[12]:
count 2439
unique 7
top 0
freq 668
Name: # MOOCs Started, dtype: object
In [13]:
df['# MOOCs Started'].value_counts()
Out[13]:
0 668
2.5 395
8 330
4.5 311
25 259
1 241
15 235
Name: # MOOCs Started, dtype: int64
In [14]:
df['# MOOCs Finished'].value_counts()
Out[14]:
0 916
2.5 435
1 301
4.5 275
8 200
25 139
15 131
Name: # MOOCs Finished, dtype: int64
In [15]:
reasons = df['(answered Reasons)'].value_counts() / size_df * 100
reasons.sort_values().plot.barh()
plt.title('number of reasons to take a MOOC')
plt.xlabel('percentage of respondents')
plt.ylabel('number of reasons given')
plt.show()
In [16]:
# Share of respondents by number of tangible benefits selected
benefits = df['(selected Benefits)'].value_counts() / size_df * 100
benefits.sort_values().plot.barh()
plt.title('number of benefits from taking MOOCs')
plt.xlabel('percentage of respondents')
plt.ylabel('number of benefits given')
plt.show()
In [17]:
# Share of respondents by number of important aspects selected
aspects = df['(selected Aspects)'].value_counts() / size_df * 100
aspects.sort_values().plot.barh()
plt.title('number of important aspects of the MOOC experience')
plt.xlabel('percentage of respondents')
plt.ylabel('number of aspects given')
plt.show()
Content source: ronnydw/data-science-projects