Analysing Public Member Info on Meetup.com

In this notebook we run a simple analysis of public information about members registered on meetup.com. We extract the data with the official Meetup API; registered members can get an API key from the same place.

N.B. This is a work in progress.

Getting Started


In [1]:
%matplotlib inline

import re
import os
import json

import requests
import pandas as pd

In [2]:
server = 'https://api.meetup.com'
group_urlname = 'Python-Users-Berlin-PUB'
from meetup_api_key import key

Get information about a group on Meetup.com:


In [3]:
requests.get("%s/%s?key=%s" % (server, group_urlname, key)).json()


Out[3]:
{'category': {'id': 34,
  'name': 'Tech',
  'shortname': 'Tech',
  'sort_name': 'Tech'},
 'city': 'Berlin',
 'country': 'DE',
 'created': 1352071340000,
 'description': '<p>This is the Python Users Group Berlin (PUB). We host presentations and discussions about Python and Python-related software development (mostly framework-independent).</p>\n<p><span>Currently most talks are held in English language but we usually switch to German if all visitors speak German. Please indicate your interest on the mailing list (</span><a href="http://starship.python.net/cgi-bin/mailman/listinfo/python-berlin" class="linkified">http://starship.python.net/cgi-bin/mailman/...</a><span>) if you are interested in any specific language so the speakers can prepare for that. We meet every 2nd Thursday of the month at 7pm at <a href="http://co-up.de">co.up</a>&nbsp;(close to Kottbusser Tor or "Kotti" ;-).</span>&nbsp;<span>Since attendance rate in Berlin is usually under 50% only we\'ve stopped limiting the number of RSVPs. As we have space for 40-60 people at </span><a href="http://co-up.de">co.up</a><span> the rule is: first come, first seated.</span></p>\n<p><b>New:</b> There is a new <a href="http://slack.com">slack</a> team for us named&nbsp;<a href="http://pythonberlin.slack.com">pythonberlin</a> for which you can invite yourself here:&nbsp;<a href="https://pythonberlin.ngrok.io/">https://pythonberlin.eu.ngrok.io</a>&nbsp;(a service almost available 24/7, though, since it\'s running on a Raspberry Pi.).</p>\n<p>You might also want to subscribe to our&nbsp;<a href="http://starship.python.net/mailman/listinfo/python-berlin">mailing list</a>&nbsp;(usually in German) which has become rather low-traffic over the last years.</p>\n<br>',
 'group_photo': {'base_url': 'http://photos3.meetupstatic.com',
  'highres_link': 'http://photos3.meetupstatic.com/photos/event/7/b/f/4/highres_186571732.jpeg',
  'id': 186571732,
  'photo_link': 'http://photos3.meetupstatic.com/photos/event/7/b/f/4/600_186571732.jpeg',
  'thumb_link': 'http://photos3.meetupstatic.com/photos/event/7/b/f/4/thumb_186571732.jpeg',
  'type': 'event'},
 'id': 5700582,
 'join_mode': 'open',
 'key_photo': {'base_url': 'http://photos1.meetupstatic.com',
  'highres_link': 'http://photos1.meetupstatic.com/photos/event/e/4/d/9/highres_450958585.jpeg',
  'id': 450958585,
  'photo_link': 'http://photos1.meetupstatic.com/photos/event/e/4/d/9/600_450958585.jpeg',
  'thumb_link': 'http://photos1.meetupstatic.com/photos/event/e/4/d/9/thumb_450958585.jpeg',
  'type': 'event'},
 'lat': 52.52,
 'link': 'https://www.meetup.com/Python-Users-Berlin-PUB/',
 'localized_country_name': 'Germany',
 'lon': 13.38,
 'members': 2394,
 'name': 'Python Users Berlin (PUB)',
 'next_event': {'id': 'xmdjfmywdbmb',
  'name': 'Image Processing',
  'time': 1486663200000,
  'utc_offset': 3600000,
  'yes_rsvp_count': 98},
 'organizer': {'bio': '',
  'id': 72223432,
  'name': 'Veit Schiele',
  'photo': {'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos2.meetupstatic.com/photos/member/d/e/6/4/highres_85556932.jpeg',
   'id': 85556932,
   'photo_link': 'http://photos2.meetupstatic.com/photos/member/d/e/6/4/member_85556932.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/member/d/e/6/4/thumb_85556932.jpeg',
   'type': 'member'}},
 'photos': [{'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos4.meetupstatic.com/photos/event/6/a/0/6/highres_247467142.jpeg',
   'id': 247467142,
   'photo_link': 'http://photos2.meetupstatic.com/photos/event/6/a/0/6/600_247467142.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/event/6/a/0/6/thumb_247467142.jpeg',
   'type': 'event'},
  {'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos2.meetupstatic.com/photos/event/8/4/8/2/highres_306693922.jpeg',
   'id': 306693922,
   'photo_link': 'http://photos2.meetupstatic.com/photos/event/8/4/8/2/600_306693922.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/event/8/4/8/2/thumb_306693922.jpeg',
   'type': 'event'},
  {'base_url': 'http://photos2.meetupstatic.com',
   'highres_link': 'http://photos2.meetupstatic.com/photos/event/d/5/0/4/highres_306714532.jpeg',
   'id': 306714532,
   'photo_link': 'http://photos2.meetupstatic.com/photos/event/d/5/0/4/600_306714532.jpeg',
   'thumb_link': 'http://photos2.meetupstatic.com/photos/event/d/5/0/4/thumb_306714532.jpeg',
   'type': 'event'},
  {'base_url': 'http://photos3.meetupstatic.com',
   'highres_link': 'http://photos3.meetupstatic.com/photos/event/9/7/3/2/highres_452138706.jpeg',
   'id': 452138706,
   'photo_link': 'http://photos3.meetupstatic.com/photos/event/9/7/3/2/600_452138706.jpeg',
   'thumb_link': 'http://photos3.meetupstatic.com/photos/event/9/7/3/2/thumb_452138706.jpeg',
   'type': 'event'}],
 'state': '',
 'timezone': 'Europe/Berlin',
 'urlname': 'Python-Users-Berlin-PUB',
 'visibility': 'public',
 'who': 'Pythonistas'}

Get information about two members of that group:


In [4]:
url = server + "/2/members?offset=1&page=2&order=name&group_urlname=%s&key=%s" % (group_urlname, key)
info = requests.get(url).json()
# hide key so it doesn't show up in some repository:
for f in ('next', 'url'):
    info['meta'][f] = re.sub(r'key=\w+', 'key=******', info['meta'][f])
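The substitution above can be checked in isolation on a made-up URL. This is a minimal sketch; the helper name `mask_key` and the sample key `abc123` are hypothetical and only serve the illustration.

```python
import re

def mask_key(url):
    """Replace the value of the `key` query parameter with asterisks."""
    # \w+ matches letters, digits and underscores, so it stops at the next '&'
    return re.sub(r'key=\w+', 'key=******', url)

masked = mask_key('https://api.meetup.com/2/members?page=2&key=abc123&order=name')
print(masked)
```

Because the pattern stops at the next `&`, parameters that follow the key are left intact.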

In [5]:
info


Out[5]:
{'meta': {'count': 2,
  'description': 'API method for accessing members of Meetup Groups',
  'id': '',
  'lat': '',
  'link': 'https://api.meetup.com/2/members',
  'lon': '',
  'method': 'Members',
  'next': 'https://api.meetup.com/2/members?offset=2&format=json&group_urlname=Python-Users-Berlin-PUB&page=2&key=******&order=name',
  'title': 'Meetup Members v2',
  'total_count': 2394,
  'updated': 1486231644000,
  'url': 'https://api.meetup.com/2/members?offset=1&format=json&group_urlname=Python-Users-Berlin-PUB&page=2&key=******&order=name'},
 'results': [{'city': 'Berlin',
   'country': 'de',
   'id': 214130302,
   'joined': 1475364688000,
   'lat': 52.52,
   'link': 'http://www.meetup.com/members/214130302',
   'lon': 13.38,
   'name': '/f',
   'other_services': {},
   'photo': {'base_url': 'http://photos1.meetupstatic.com',
    'highres_link': 'http://photos1.meetupstatic.com/photos/member/2/2/3/4/highres_260648756.jpeg',
    'photo_id': 260648756,
    'photo_link': 'http://photos3.meetupstatic.com/photos/member/2/2/3/4/member_260648756.jpeg',
    'thumb_link': 'http://photos1.meetupstatic.com/photos/member/2/2/3/4/thumb_260648756.jpeg',
    'type': 'member'},
   'self': {'common': {}},
   'status': 'active',
   'topics': [],
   'visited': 1475364688000},
  {'city': 'Berlin',
   'country': 'de',
   'id': 170339852,
   'joined': 1457638728000,
   'lat': 52.52,
   'link': 'http://www.meetup.com/members/170339852',
   'lon': 13.38,
   'name': 'A S Aditya',
   'other_services': {},
   'photo': {'base_url': 'http://photos3.meetupstatic.com',
    'highres_link': 'http://photos3.meetupstatic.com/photos/member/d/f/d/highres_254643581.jpeg',
    'photo_id': 254643581,
    'photo_link': 'http://photos1.meetupstatic.com/photos/member/d/f/d/member_254643581.jpeg',
    'thumb_link': 'http://photos3.meetupstatic.com/photos/member/d/f/d/thumb_254643581.jpeg',
    'type': 'member'},
   'self': {'common': {}},
   'status': 'active',
   'topics': [{'id': 85, 'name': 'Science', 'urlkey': 'science'},
    {'id': 86, 'name': 'Physics', 'urlkey': 'physics'},
    {'id': 88, 'name': 'Chemistry', 'urlkey': 'chemistry'},
    {'id': 18551, 'name': 'Quantum Physics', 'urlkey': 'quantum-physics'},
    {'id': 33876, 'name': 'Mathematics', 'urlkey': 'mathematics'},
    {'id': 28034, 'name': 'BioInformatics', 'urlkey': 'bioinformatics'}],
   'visited': 1484230753000}]}
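The `joined` and `visited` values in the output above look like Unix timestamps in milliseconds. A minimal sketch of a conversion, assuming that interpretation and UTC; the helper name `ms_to_datetime` is ours and not part of the Meetup API.

```python
from datetime import datetime, timezone

def ms_to_datetime(ms):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# 'joined' value of the first member shown above
joined = ms_to_datetime(1475364688000)
print(joined.isoformat())
```

If the assumption holds, the same conversion applies to `created`, `updated` and the event `time` fields.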

In [6]:
def get_all_members(group_urlname, verbose=False):
    "Read members info from a sequence of pages."

    total = []
    # offset counts pages of size `page` (see 'next' in the output above);
    # the value 1 is kept from the original, use 0 if the API numbers pages from zero
    offset = 1
    page = 200
    url = "{server}/2/members?offset={offset}&format=json&group_urlname={group_urlname}&page={page}&key={key}&order=name"
    url = url.format(server=server, offset=offset, page=page, group_urlname=group_urlname, key=key)
    info = requests.get(url).json()
    total += info['results']
    if verbose:
        print(url)
        print(len(total), info['meta']['count'])
    while True:
        # an empty 'next' URL marks the last page
        next_url = info['meta']['next']
        if not next_url:
            break
        if verbose:
            print(next_url)
        # rebind `info` so that both the results and 'next' advance to the new page
        info = requests.get(next_url).json()
        total += info['results']
        if verbose:
            print(len(total), info['meta']['count'])
    if verbose:
        print('found %d members' % len(total))
    return total
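The paging logic of `get_all_members` can be tried out without network access or an API key. A minimal sketch, assuming every response carries `results` plus a `meta` entry whose `next` URL is empty on the last page (as the output of `In [5]` suggests); `collect_pages` and the fake `pages` dictionary are hypothetical stand-ins.

```python
def collect_pages(first_url, fetch):
    """Follow meta['next'] links, starting at first_url, and gather all results.

    `fetch` is any callable that maps a URL to a decoded JSON dict.
    """
    results = []
    url = first_url
    while url:
        info = fetch(url)
        results += info['results']
        url = info['meta'].get('next')  # empty or missing on the last page
    return results

# Fake two-page API, used here instead of requests.get(url).json()
pages = {
    'page0': {'results': ['a', 'b'], 'meta': {'next': 'page1'}},
    'page1': {'results': ['c'], 'meta': {'next': ''}},
}
all_results = collect_pages('page0', pages.__getitem__)
print(all_results)
```

Against the real API, `fetch` could be something like `lambda u: requests.get(u).json()`.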

In [7]:
path = 'pub-members.json'
if os.path.exists(path):
    with open(path) as f:
        members = json.load(f)
else:
    members = get_all_members('Python-Users-Berlin-PUB')
    with open(path, 'w') as f:
        json.dump(members, f)
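The load-or-download pattern above is used again for a second group further down, so it may be worth wrapping in a function. A minimal sketch; `load_or_fetch` and `fake_fetch` are hypothetical names, and the demonstration writes to a temporary directory instead of calling the API.

```python
import json
import os
import tempfile

def load_or_fetch(path, fetch):
    """Return cached JSON from `path`, or call `fetch()` and cache its result."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    data = fetch()
    with open(path, 'w') as f:
        json.dump(data, f)
    return data

# Demonstration with a throwaway directory and a fake fetch function
calls = []

def fake_fetch():
    calls.append(1)  # record each call, to show the cache is used
    return [{'name': 'example member'}]

with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, 'members.json')
    first = load_or_fetch(cache, fake_fetch)   # fetches and writes the file
    second = load_or_fetch(cache, fake_fetch)  # reads the cached file
```

In the notebook, `fetch` could be `lambda: get_all_members('Python-Users-Berlin-PUB')`.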

In [8]:
members[0]


Out[8]:
{'bio': 'loves numbers, data, and code',
 'city': 'Berlin',
 'country': 'de',
 'hometown': 'Kolkata, IN',
 'id': 193168948,
 'joined': 1452637632000,
 'lat': 52.52,
 'link': 'http://www.meetup.com/members/193168948',
 'lon': 13.38,
 'name': 'Arnab Dutta',
 'other_services': {},
 'photo': {'base_url': 'http://photos1.meetupstatic.com',
  'highres_link': 'http://photos1.meetupstatic.com/photos/member/d/e/6/d/highres_250016941.jpeg',
  'photo_id': 250016941,
  'photo_link': 'http://photos3.meetupstatic.com/photos/member/d/e/6/d/member_250016941.jpeg',
  'thumb_link': 'http://photos1.meetupstatic.com/photos/member/d/e/6/d/thumb_250016941.jpeg',
  'type': 'member'},
 'self': {'common': {}},
 'status': 'active',
 'topics': [{'id': 563, 'name': 'Open Source', 'urlkey': 'opensource'},
  {'id': 3833, 'name': 'Software Development', 'urlkey': 'softwaredev'},
  {'id': 9696, 'name': 'New Technology', 'urlkey': 'newtech'},
  {'id': 10209, 'name': 'Web Technology', 'urlkey': 'web'},
  {'id': 18062, 'name': 'Big Data', 'urlkey': 'big-data'},
  {'id': 38660, 'name': 'Lean Startup', 'urlkey': 'lean-startup'},
  {'id': 15167, 'name': 'Cloud Computing', 'urlkey': 'cloud-computing'},
  {'id': 29971, 'name': 'Machine Learning', 'urlkey': 'machine-learning'},
  {'id': 30928, 'name': 'Data Analytics', 'urlkey': 'data-analytics'},
  {'id': 37381, 'name': 'Data Visualization', 'urlkey': 'data-visualization'},
  {'id': 98137,
   'name': 'Machine Intelligence',
   'urlkey': 'machine-intelligence'},
  {'id': 153481,
   'name': 'Support Vector Machines',
   'urlkey': 'support-vector-machines'},
  {'id': 1481522,
   'name': 'Data Science using Python',
   'urlkey': 'data-science-using-python'},
  {'id': 1502, 'name': 'Art', 'urlkey': 'art'},
  {'id': 15236,
   'name': 'professional-networking',
   'urlkey': 'business-networking'}],
 'visited': 1461359873000}

PUB Members' Interests


In [9]:
members[0]['topics']


Out[9]:
[{'id': 563, 'name': 'Open Source', 'urlkey': 'opensource'},
 {'id': 3833, 'name': 'Software Development', 'urlkey': 'softwaredev'},
 {'id': 9696, 'name': 'New Technology', 'urlkey': 'newtech'},
 {'id': 10209, 'name': 'Web Technology', 'urlkey': 'web'},
 {'id': 18062, 'name': 'Big Data', 'urlkey': 'big-data'},
 {'id': 38660, 'name': 'Lean Startup', 'urlkey': 'lean-startup'},
 {'id': 15167, 'name': 'Cloud Computing', 'urlkey': 'cloud-computing'},
 {'id': 29971, 'name': 'Machine Learning', 'urlkey': 'machine-learning'},
 {'id': 30928, 'name': 'Data Analytics', 'urlkey': 'data-analytics'},
 {'id': 37381, 'name': 'Data Visualization', 'urlkey': 'data-visualization'},
 {'id': 98137,
  'name': 'Machine Intelligence',
  'urlkey': 'machine-intelligence'},
 {'id': 153481,
  'name': 'Support Vector Machines',
  'urlkey': 'support-vector-machines'},
 {'id': 1481522,
  'name': 'Data Science using Python',
  'urlkey': 'data-science-using-python'},
 {'id': 1502, 'name': 'Art', 'urlkey': 'art'},
 {'id': 15236,
  'name': 'professional-networking',
  'urlkey': 'business-networking'}]

In [10]:
pd.DataFrame(members[0]['topics'])


Out[10]:
id name urlkey
0 563 Open Source opensource
1 3833 Software Development softwaredev
2 9696 New Technology newtech
3 10209 Web Technology web
4 18062 Big Data big-data
5 38660 Lean Startup lean-startup
6 15167 Cloud Computing cloud-computing
7 29971 Machine Learning machine-learning
8 30928 Data Analytics data-analytics
9 37381 Data Visualization data-visualization
10 98137 Machine Intelligence machine-intelligence
11 153481 Support Vector Machines support-vector-machines
12 1481522 Data Science using Python data-science-using-python
13 1502 Art art
14 15236 professional-networking business-networking

Now build a dataframe with this information for all members:


In [11]:
df = pd.concat([pd.DataFrame(m['topics']) for m in members])

In [12]:
len(df)


Out[12]:
36130
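Each of these rows is one (member, topic) pair, so ranking topics amounts to counting topic names. A minimal sketch of that count without pandas, assuming every member record has a `topics` list of dicts with a `name` key as shown above; `top_topics` and the `sample` data are made up for illustration.

```python
from collections import Counter

def top_topics(members, n=20):
    """Count topic names over all members and return the n most common."""
    counts = Counter(t['name'] for m in members for t in m.get('topics', []))
    return counts.most_common(n)

sample = [
    {'topics': [{'name': 'Big Data'}, {'name': 'Open Source'}]},
    {'topics': [{'name': 'Big Data'}]},
    {'topics': []},  # members without topics contribute nothing
]
ranking = top_topics(sample, n=2)
print(ranking)
```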

In [13]:
s = df.groupby('name').size().sort_values(ascending=True)[-20:]
s.plot.barh(title='Most cited topics people are interested in', figsize=(10, 5))


Out[13]:
<matplotlib.axes._subplots.AxesSubplot at 0x113939b38>

PyData Members' Interests


In [14]:
path = 'pydata-members.json'
if os.path.exists(path):
    with open(path) as f:
        members = json.load(f)
else:
    members = get_all_members('PyData-Berlin')
    with open(path, 'w') as f:
        json.dump(members, f)

In [15]:
df = pd.concat([pd.DataFrame(m['topics']) for m in members])
s = df.groupby('name').size().sort_values(ascending=True)[-20:]
s.plot.barh(title='Most cited topics people are interested in', figsize=(10, 5))


Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x10e2582b0>

Members' Groups?

Information about the groups a member has joined seems to be harder to find; this is still an open question.