Analysing Public Member Info on Meetup.com

In this notebook we do some simple analysis of public information about members registered on meetup.com. We extract the data using the official Meetup API; registered members can obtain an API key there.

N.B. This is a work in progress.

Getting Started



In [1]:

    
%matplotlib inline

import re
import os
import json

import requests
import pandas as pd



In [2]:

    
server = 'https://api.meetup.com'
group_urlname = 'Python-Users-Berlin-PUB'
from meetup_api_key import key

Get information about a group on Meetup.com:



In [3]:

    
requests.get("%s/%s?key=%s" % (server, group_urlname, key)).json()









    Out[3]:





{'category': {'id': 34,
  'name': 'Tech',
  'shortname': 'Tech',
  'sort_name': 'Tech'},
 'city': 'Berlin',
 'country': 'DE',
 'created': 1352071340000,
 'description': '<p>This is the Python Users Group Berlin (PUB). We host presentations and discussions about Python and Python-related software development (mostly framework-independent).</p>\n<p><span>Currently most talks are held in English language but we usually switch to German if all visitors speak German. Please indicate your interest on the mailing list (</span><a href="http://starship.python.net/cgi-bin/mailman/listinfo/python-berlin" class="linkified">http://starship.python.net/cgi-bin/mailman/...</a><span>) if you are interested in any specific language so the speakers can prepare for that. We meet every 2nd Thursday of the month at 7pm at <a href="http://co-up.de">co.up</a>&nbsp;(close to Kottbusser Tor or "Kotti" ;-).</span>&nbsp;<span>Since attendance rate in Berlin is usually under 50% only we\'ve stopped limiting the number of RSVPs. As we have space for 40-60 people at </span><a href="http://co-up.de">co.up</a><span> the rule is: first come, first seated.</span></p>\n<p><b>New:</b> There is a new <a href="http://slack.com">slack</a> team for us named&nbsp;<a href="http://pythonberlin.slack.com">pythonberlin</a> for which you can invite yourself here:&nbsp;<a href="https://pythonberlin.ngrok.io/">https://pythonberlin.eu.ngrok.io</a>&nbsp;(a service almost available 24/7, though, since it\'s running on a Raspberry Pi.).</p>\n<p>You might also want to subscribe to our&nbsp;<a href="http://starship.python.net/mailman/listinfo/python-berlin">mailing list</a>&nbsp;(usually in German) which has become rather low-traffic over the last years.</p>\n<br>',
 'group_photo': {'base_url': 'http://photos3.meetupstatic.com',
  'highres_link': 'http://photos3.meetupstatic.com/photos/event/7/b/f/4/highres_186571732.jpeg',
  'id': 186571732,
  'photo_link': 'http://photos3.meetupstatic.com/photos/event/7/b/f/4/600_186571732.jpeg',
  'thumb_link': 'http://photos3.meetupstatic.com/photos/event/7/b/f/4/thumb_186571732.jpeg',
  'type': 'event'},
 'id': 5700582,
 'join_mode': 'open',
 'key_photo': {'base_url': 'http://photos1.meetupstatic.com',
  'highres_link': 'http://photos1.meetupstatic.com/photos/event/e/4/d/9/highres_450958585.jpeg',
  'id': 450958585,
  'photo_link': 'http://photos1.meetupstatic.com/photos/event/e/4/d/9/600_450958585.jpeg',
  'thumb_link': 'http://photos1.meetupstatic.com/photos/event/e/4/d/9/thumb_450958585.jpeg',
  'type': 'event'},
 'lat': 52.52,
 'link': 'https://www.meetup.com/Python-Users-Berlin-PUB/',
 'localized_country_name': 'Germany',
 'lon': 13.38,
 'members': 2394,
 'name': 'Python Users Berlin (PUB)',
 'next_event': {'id': 'xmdjfmywdbmb',
  'name': 'Image Processing',
  'time': 1486663200000,
  'utc_offset': 3600000,
  'yes_rsvp_count': 98},
 'organizer': {'bio': '',
  'id': 72223432,
  'name': 'Veit Schiele',
  'photo': {'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos2.meetupstatic.com/photos/member/d/e/6/4/highres_85556932.jpeg',
   'id': 85556932,
   'photo_link': 'http://photos2.meetupstatic.com/photos/member/d/e/6/4/member_85556932.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/member/d/e/6/4/thumb_85556932.jpeg',
   'type': 'member'}},
 'photos': [{'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos4.meetupstatic.com/photos/event/6/a/0/6/highres_247467142.jpeg',
   'id': 247467142,
   'photo_link': 'http://photos2.meetupstatic.com/photos/event/6/a/0/6/600_247467142.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/event/6/a/0/6/thumb_247467142.jpeg',
   'type': 'event'},
  {'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos2.meetupstatic.com/photos/event/8/4/8/2/highres_306693922.jpeg',
   'id': 306693922,
   'photo_link': 'http://photos2.meetupstatic.com/photos/event/8/4/8/2/600_306693922.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/event/8/4/8/2/thumb_306693922.jpeg',
   'type': 'event'},
  {'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos2.meetupstatic.com/photos/event/d/5/0/4/highres_306714532.jpeg',
   'id': 306714532,
   'photo_link': 'http://photos2.meetupstatic.com/photos/event/d/5/0/4/600_306714532.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/event/d/5/0/4/thumb_306714532.jpeg',
   'type': 'event'},
  {'base_url': 'http://photos3.meetupstatic.com',
   'highres_link': 'http://photos3.meetupstatic.com/photos/event/9/7/3/2/highres_452138706.jpeg',
   'id': 452138706,
   'photo_link': 'http://photos3.meetupstatic.com/photos/event/9/7/3/2/600_452138706.jpeg',
   'thumb_link': 'http://photos3.meetupstatic.com/photos/event/9/7/3/2/thumb_452138706.jpeg',
   'type': 'event'}],
 'state': '',
 'timezone': 'Europe/Berlin',
 'urlname': 'Python-Users-Berlin-PUB',
 'visibility': 'public',
 'who': 'Pythonistas'}
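
The `created` and `time` fields in the output above look like Unix timestamps in milliseconds, and `utc_offset` looks like an offset in milliseconds as well. A minimal sketch for making them readable, assuming that interpretation is correct (the helper name `from_epoch_ms` is made up for this example):

```python
from datetime import datetime, timedelta, timezone

def from_epoch_ms(ms, utc_offset_ms=0):
    """Convert milliseconds since the Unix epoch to an aware datetime.

    utc_offset_ms is assumed to be the local offset from UTC in milliseconds.
    """
    tz = timezone(timedelta(milliseconds=utc_offset_ms))
    return datetime.fromtimestamp(ms / 1000, tz=tz)

# Values copied from the group info above.
created = from_epoch_ms(1352071340000)
next_event = from_epoch_ms(1486663200000, utc_offset_ms=3600000)
print(created.isoformat())
print(next_event.isoformat())
```

Under this reading the next event starts at 19:00 local time, which agrees with the "7pm" mentioned in the group description.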

Get information about two members of that group:



In [4]:

    
url = server + "/2/members?offset=1&page=2&order=name&group_urlname=%s&key=%s" % (group_urlname, key)
info = requests.get(url).json()
# Hide the key so it doesn't show up in some repository:
for f in ('next', 'url'):
    info['meta'][f] = re.sub(r'key=\w+', 'key=******', info['meta'][f])



In [5]:

    
info









    Out[5]:





{'meta': {'count': 2,
  'description': 'API method for accessing members of Meetup Groups',
  'id': '',
  'lat': '',
  'link': 'https://api.meetup.com/2/members',
  'lon': '',
  'method': 'Members',
  'next': 'https://api.meetup.com/2/members?offset=2&format=json&group_urlname=Python-Users-Berlin-PUB&page=2&key=******&order=name',
  'title': 'Meetup Members v2',
  'total_count': 2394,
  'updated': 1486231644000,
  'url': 'https://api.meetup.com/2/members?offset=1&format=json&group_urlname=Python-Users-Berlin-PUB&page=2&key=******&order=name'},
 'results': [{'city': 'Berlin',
   'country': 'de',
   'id': 214130302,
   'joined': 1475364688000,
   'lat': 52.52,
   'link': 'http://www.meetup.com/members/214130302',
   'lon': 13.38,
   'name': '/f',
   'other_services': {},
   'photo': {'base_url': 'http://photos1.meetupstatic.com',
    'highres_link': 'http://photos1.meetupstatic.com/photos/member/2/2/3/4/highres_260648756.jpeg',
    'photo_id': 260648756,
    'photo_link': 'http://photos3.meetupstatic.com/photos/member/2/2/3/4/member_260648756.jpeg',
    'thumb_link': 'http://photos1.meetupstatic.com/photos/member/2/2/3/4/thumb_260648756.jpeg',
    'type': 'member'},
   'self': {'common': {}},
   'status': 'active',
   'topics': [],
   'visited': 1475364688000},
  {'city': 'Berlin',
   'country': 'de',
   'id': 170339852,
   'joined': 1457638728000,
   'lat': 52.52,
   'link': 'http://www.meetup.com/members/170339852',
   'lon': 13.38,
   'name': 'A S Aditya',
   'other_services': {},
   'photo': {'base_url': 'http://photos3.meetupstatic.com',
    'highres_link': 'http://photos3.meetupstatic.com/photos/member/d/f/d/highres_254643581.jpeg',
    'photo_id': 254643581,
    'photo_link': 'http://photos1.meetupstatic.com/photos/member/d/f/d/member_254643581.jpeg',
    'thumb_link': 'http://photos3.meetupstatic.com/photos/member/d/f/d/thumb_254643581.jpeg',
    'type': 'member'},
   'self': {'common': {}},
   'status': 'active',
   'topics': [{'id': 85, 'name': 'Science', 'urlkey': 'science'},
    {'id': 86, 'name': 'Physics', 'urlkey': 'physics'},
    {'id': 88, 'name': 'Chemistry', 'urlkey': 'chemistry'},
    {'id': 18551, 'name': 'Quantum Physics', 'urlkey': 'quantum-physics'},
    {'id': 33876, 'name': 'Mathematics', 'urlkey': 'mathematics'},
    {'id': 28034, 'name': 'BioInformatics', 'urlkey': 'bioinformatics'}],
   'visited': 1484230753000}]}
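
The key masking used before displaying `info` can be wrapped in a small helper so it is easy to reuse on any URL that gets printed or stored. This is only a sketch; `redact_key` is a hypothetical name, and the pattern assumes the key is passed as a `key=` query parameter, as in the URLs above.

```python
import re

def redact_key(url):
    """Replace the value of a `key` query parameter with asterisks."""
    # Match `key=` only when it starts a parameter (after ? or &),
    # so parameters such as `apikey=` are left alone.
    return re.sub(r'([?&]key=)[^&#]*', r'\1******', url)

example = 'https://api.meetup.com/2/members?offset=1&key=abc123&order=name'
print(redact_key(example))
```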



In [6]:

    
def get_all_members(group_urlname, verbose=False):
    "Read members info from a sequence of pages."

    total = []
    # NOTE: if the API counts page offsets from 0, offset = 1 skips the first page.
    offset = 1
    page = 200
    url = "{server}/2/members?offset={offset}&format=json&group_urlname={group_urlname}&page={page}&key={key}&order=name"
    url = url.format(server=server, offset=offset, page=page, group_urlname=group_urlname, key=key)
    info = requests.get(url).json()
    total += info['results']
    if verbose:
        print(url)
        print(len(total), info['meta']['count'])
    while True:
        # An empty 'next' link marks the last page.
        next_url = info['meta']['next']
        if not next_url:
            break
        # Rebind info so the loop advances to the following page.
        info = requests.get(next_url).json()
        total += info['results']
        if verbose:
            print(next_url)
            print(len(total), info['meta']['count'])
    if verbose:
        print('found %d members' % len(total))
    return total
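
The paging pattern in `get_all_members` (follow `meta['next']` until it is empty) can be tried out without network access or an API key. The sketch below is an illustration only: `collect_pages` and the fake pages are made up, and the `fetch` argument stands in for something like `lambda url: requests.get(url).json()`.

```python
def collect_pages(fetch, first_url):
    """Collect 'results' from every page, following meta['next'] links."""
    results = []
    url = first_url
    while url:  # an empty or missing 'next' link ends the loop
        page = fetch(url)
        results.extend(page['results'])
        url = page['meta'].get('next')
    return results

# Made-up pages in the same shape as the API responses shown above.
fake_pages = {
    'page-0': {'results': [{'id': 1}, {'id': 2}], 'meta': {'next': 'page-1'}},
    'page-1': {'results': [{'id': 3}], 'meta': {'next': ''}},
}
members_demo = collect_pages(fake_pages.__getitem__, 'page-0')
print(len(members_demo))
```

Passing the fetch function in as a parameter keeps the loop itself free of HTTP details, which makes it easy to check that every page is visited exactly once.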



In [7]:

    
path = 'pub-members.json'
if os.path.exists(path):
    # Reuse the cached copy instead of hitting the API again.
    with open(path) as f:
        members = json.load(f)
else:
    members = get_all_members('Python-Users-Berlin-PUB')
    with open(path, 'w') as f:
        json.dump(members, f)



In [8]:

    
members[0]









    Out[8]:





{'bio': 'loves numbers, data, and code',
 'city': 'Berlin',
 'country': 'de',
 'hometown': 'Kolkata, IN',
 'id': 193168948,
 'joined': 1452637632000,
 'lat': 52.52,
 'link': 'http://www.meetup.com/members/193168948',
 'lon': 13.38,
 'name': 'Arnab Dutta',
 'other_services': {},
 'photo': {'base_url': 'http://photos1.meetupstatic.com',
  'highres_link': 'http://photos1.meetupstatic.com/photos/member/d/e/6/d/highres_250016941.jpeg',
  'photo_id': 250016941,
  'photo_link': 'http://photos3.meetupstatic.com/photos/member/d/e/6/d/member_250016941.jpeg',
  'thumb_link': 'http://photos1.meetupstatic.com/photos/member/d/e/6/d/thumb_250016941.jpeg',
  'type': 'member'},
 'self': {'common': {}},
 'status': 'active',
 'topics': [{'id': 563, 'name': 'Open Source', 'urlkey': 'opensource'},
  {'id': 3833, 'name': 'Software Development', 'urlkey': 'softwaredev'},
  {'id': 9696, 'name': 'New Technology', 'urlkey': 'newtech'},
  {'id': 10209, 'name': 'Web Technology', 'urlkey': 'web'},
  {'id': 18062, 'name': 'Big Data', 'urlkey': 'big-data'},
  {'id': 38660, 'name': 'Lean Startup', 'urlkey': 'lean-startup'},
  {'id': 15167, 'name': 'Cloud Computing', 'urlkey': 'cloud-computing'},
  {'id': 29971, 'name': 'Machine Learning', 'urlkey': 'machine-learning'},
  {'id': 30928, 'name': 'Data Analytics', 'urlkey': 'data-analytics'},
  {'id': 37381, 'name': 'Data Visualization', 'urlkey': 'data-visualization'},
  {'id': 98137,
   'name': 'Machine Intelligence',
   'urlkey': 'machine-intelligence'},
  {'id': 153481,
   'name': 'Support Vector Machines',
   'urlkey': 'support-vector-machines'},
  {'id': 1481522,
   'name': 'Data Science using Python',
   'urlkey': 'data-science-using-python'},
  {'id': 1502, 'name': 'Art', 'urlkey': 'art'},
  {'id': 15236,
   'name': 'professional-networking',
   'urlkey': 'business-networking'}],
 'visited': 1461359873000}

PUB Members' Interests



In [9]:

    
members[0]['topics']









    Out[9]:





[{'id': 563, 'name': 'Open Source', 'urlkey': 'opensource'},
 {'id': 3833, 'name': 'Software Development', 'urlkey': 'softwaredev'},
 {'id': 9696, 'name': 'New Technology', 'urlkey': 'newtech'},
 {'id': 10209, 'name': 'Web Technology', 'urlkey': 'web'},
 {'id': 18062, 'name': 'Big Data', 'urlkey': 'big-data'},
 {'id': 38660, 'name': 'Lean Startup', 'urlkey': 'lean-startup'},
 {'id': 15167, 'name': 'Cloud Computing', 'urlkey': 'cloud-computing'},
 {'id': 29971, 'name': 'Machine Learning', 'urlkey': 'machine-learning'},
 {'id': 30928, 'name': 'Data Analytics', 'urlkey': 'data-analytics'},
 {'id': 37381, 'name': 'Data Visualization', 'urlkey': 'data-visualization'},
 {'id': 98137,
  'name': 'Machine Intelligence',
  'urlkey': 'machine-intelligence'},
 {'id': 153481,
  'name': 'Support Vector Machines',
  'urlkey': 'support-vector-machines'},
 {'id': 1481522,
  'name': 'Data Science using Python',
  'urlkey': 'data-science-using-python'},
 {'id': 1502, 'name': 'Art', 'urlkey': 'art'},
 {'id': 15236,
  'name': 'professional-networking',
  'urlkey': 'business-networking'}]



In [10]:

    
pd.DataFrame(members[0]['topics'])









    Out[10]:






         id  name                       urlkey
0       563  Open Source                opensource
1      3833  Software Development       softwaredev
2      9696  New Technology             newtech
3     10209  Web Technology             web
4     18062  Big Data                   big-data
5     38660  Lean Startup               lean-startup
6     15167  Cloud Computing            cloud-computing
7     29971  Machine Learning           machine-learning
8     30928  Data Analytics             data-analytics
9     37381  Data Visualization         data-visualization
10    98137  Machine Intelligence       machine-intelligence
11   153481  Support Vector Machines    support-vector-machines
12  1481522  Data Science using Python  data-science-using-python
13     1502  Art                        art
14    15236  professional-networking    business-networking

Now build a dataframe with this information for all members:



In [11]:

    
df = pd.concat([pd.DataFrame(m['topics']) for m in members])



In [12]:

    
len(df)









    Out[12]:





36130
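
Each of these rows is one (member, topic) pair, so ranking topics comes down to counting topic names across all members. As a rough cross-check that does not need pandas, the same count can be sketched with `collections.Counter`; `count_topics` and the sample records are made up for illustration and only mimic the shape of the member records above.

```python
from collections import Counter

def count_topics(members):
    """Count how often each topic name occurs across all members."""
    return Counter(
        topic['name']
        for member in members
        for topic in member.get('topics', [])
    )

# Tiny made-up sample in the shape of the API's member records.
sample_members = [
    {'name': 'A', 'topics': [{'name': 'Big Data'}, {'name': 'Art'}]},
    {'name': 'B', 'topics': [{'name': 'Big Data'}]},
    {'name': 'C', 'topics': []},
]
counts = count_topics(sample_members)
print(counts.most_common(1))
```

With the real data, `count_topics(members).most_common(20)` should list the same topics as a top-20 selection on the grouped dataframe, apart from how ties are ordered.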



In [13]:

    
s = df.groupby('name').size().sort_values(ascending=True)[-20:]
s.plot.barh(title='Most cited topics people are interested in', figsize=(10, 5))









    Out[13]:





<matplotlib.axes._subplots.AxesSubplot at 0x113939b38>

PyData Members' Interests



In [14]:

    
path = 'pydata-members.json'
if os.path.exists(path):
    # Reuse the cached copy instead of hitting the API again.
    with open(path) as f:
        members = json.load(f)
else:
    members = get_all_members('PyData-Berlin')
    with open(path, 'w') as f:
        json.dump(members, f)



In [15]:

    
df = pd.concat([pd.DataFrame(m['topics']) for m in members])
s = df.groupby('name').size().sort_values(ascending=True)[-20:]
s.plot.barh(title='Most cited topics people are interested in', figsize=(10, 5))









    Out[15]:





<matplotlib.axes._subplots.AxesSubplot at 0x10e2582b0>

Members' Groups?

Information about the groups a member has joined seems harder to find: the member records retrieved above do not include it.
