Author: pascal@bayes.org
Date: 2017-08-13
In July 2017, the WorkUp team gave us programmatic access to their events. This notebook explores how we could use it to show examples of professional events to our users, to motivate them to attend some of them.
The whole dataset is directly downloadable at https://www.workuper.com/events/index_json.json, and it is also available via the command docker-compose run --rm data-analysis-prepare data/workuper.json.
First, let's load the JSON file:
In [1]:
import json
import os
from os import path
import pandas as pd
DATA_FOLDER = os.getenv('DATA_FOLDER')
events = pd.read_json(path.join(DATA_FOLDER, 'workup.json'))
events.head()
Out[1]:
Cool! Before exploring each individual field, let's see how many events there are, and whether those fields are always set:
In [2]:
events.describe(include='all').head(3)
Out[2]:
Hmm, there are not that many rows: only 13. However, all fields seem to be set, except for favorite, which is never set.
From a quick glance at the data above, we can classify the fields as useful, irrelevant, or to be explored further.

The ones that seem directly useful:

- title
- address, combined with latitude and longitude
- date and dateend
- organiser

And then, to a lesser extent (too detailed for what we want to do with them):

- description
- subscription_link
- website
- time

However, the following ones are irrelevant to us, as they seem only useful for the WorkUp database:

- favorite
- id
- status
- created_at
- updated_at
- user_id

So that leaves 3 fields that we should explore a bit more: category, price and slug.
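The classification above suggests a simple cleanup step: dropping the WorkUp-internal columns before going further. A minimal sketch, using a hypothetical sample row in place of the real dataset:

```python
import pandas as pd

# Hypothetical sample row standing in for the real dataset.
events = pd.DataFrame({
    'title': ['Salon des 10 000 emplois'],
    'favorite': [None],
    'id': [42],
    'status': ['published'],
    'created_at': ['2017-06-01'],
    'updated_at': ['2017-07-01'],
    'user_id': [7],
})

# Drop the fields that are only meaningful inside the WorkUp database.
INTERNAL_FIELDS = ['favorite', 'id', 'status', 'created_at', 'updated_at', 'user_id']
events = events.drop(INTERNAL_FIELDS, axis=1)
```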
Let's check quickly that the obvious fields have useful values. The titles:
In [3]:
pd.options.display.max_colwidth = 100
events.title.to_frame()
Out[3]:
Perfect, we can show it directly to our users as a title. Note that the title frequently includes the city, and sometimes the timing. We can also see that the capitalization is less than ideal.
The addresses:
In [4]:
events[['address', 'latitude', 'longitude']]
Out[4]:
Quickly comparing two addresses in Bordeaux, we can see that the latitude/longitude pair is probably exact (pretty cool). For our application we could filter out events that are too far from the user's target city. As the address is not always formatted the same way, we will rely mainly on the lat/lng.
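The "not too far from the user's target city" filter could be sketched with a haversine distance on the lat/lng fields. The coordinates below are made up for illustration, and the 50 km threshold is an arbitrary choice:

```python
import numpy as np
import pandas as pd

# Hypothetical sample events (roughly Bordeaux and Paris).
events = pd.DataFrame({
    'title': ['Event in Bordeaux', 'Event in Paris'],
    'latitude': [44.8378, 48.8566],
    'longitude': [-0.5792, 2.3522],
})

def distance_km(lat1, lng1, lat2, lng2):
    """Approximate great-circle distance using the haversine formula."""
    lat1, lng1, lat2, lng2 = map(np.radians, [lat1, lng1, lat2, lng2])
    a = np.sin((lat2 - lat1) / 2) ** 2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2
    return 2 * 6371 * np.arcsin(np.sqrt(a))

# Keep only events within 50 km of the user's target city (here: Bordeaux).
user_lat, user_lng = 44.84, -0.58
nearby = events[distance_km(events.latitude, events.longitude, user_lat, user_lng) < 50]
```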
The dates:
In [5]:
events[['date', 'dateend']]
Out[5]:
The dates are all in the future, which probably indicates that WorkUp already filters out past ones. Note that dateend is almost always the same as date, which probably indicates that those events last only one day. For our purpose we will ignore dateend for now.
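Even if WorkUp seems to filter past events already, we could keep a defensive filter on our side. A minimal sketch on hypothetical dates:

```python
import pandas as pd

# Hypothetical sample dates; in the real dataset they all seem to be upcoming.
events = pd.DataFrame({
    'title': ['Past event', 'Upcoming event'],
    'date': pd.to_datetime(['2017-07-01', '2017-09-01']),
})

# Defensive filter in case WorkUp ever stops removing past events itself.
today = pd.Timestamp('2017-08-13')
upcoming = events[events.date >= today]
```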
The organisers:
In [6]:
events.organiser.to_frame()
Out[6]:
As for the title, this is pretty clean and useful.
The description, exact time, website, and direct subscription_link are not required for what we want. Our goal is not to speed up our users' subscription to a specific event, but to have them realize that there are many such events and that they should dig deeper into this way of enlarging their network or improving their job search.
However, it would be useful for them to get all those details from a secondary page if they wanted to. On the WorkUp website, there are pages with full details, accessible at a URL like this one: https://www.workuper.com/events/salon-des-10-000-emplois-marseille. The good news is that the dataset contains the last part of the URL in the slug field, so we will keep this field to rebuild the full page URLs.
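Rebuilding the full page URL from the slug is then a one-liner; a sketch using the slug from the example URL above:

```python
import pandas as pd

# Hypothetical row containing the slug from the example URL above.
events = pd.DataFrame({'slug': ['salon-des-10-000-emplois-marseille']})

# Rebuild the full details-page URL from the slug field.
events['link'] = 'https://www.workuper.com/events/' + events.slug
```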
Let's check the price field: we could decide to hide events that are not free, or at least warn our users early.
In [7]:
events[['title', 'price']]
Out[7]:
Cool! Most of them are free. However, some are not, and one of them is really expensive. Here we would need a product decision on whether and how to show them.
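One possible product decision could be sketched like this, assuming the price field can be read as a numeric amount in euros (the values below are made up):

```python
import pandas as pd

# Hypothetical prices, assuming the field is a numeric amount in euros.
events = pd.DataFrame({
    'title': ['Free meetup', 'Cheap workshop', 'Expensive conference'],
    'price': [0, 10, 500],
})

# Show free events directly, and flag the paid ones for a warning.
events['is_free'] = events.price == 0
free_events = events[events.is_free]
```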
Finally, let's check the category field:
In [8]:
events.category.iloc[0]
Out[8]:
Whoops, this looks like a JSON-encoded string (which means there was double JSON encoding, as we already decoded once when creating the dataset). Let's decode it:
In [9]:
events['categories'] = events.category.apply(json.loads)
events.categories.iloc[0]
Out[9]:
Now let's list all the available categories:
In [10]:
all_categories = set(c for categories in events.categories.tolist() for c in categories)
all_categories
Out[10]:
Great! Some of them match exactly some questions that our users have answered, so we could directly filter the events that might interest them.
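Such a filter could keep the events sharing at least one category with the user's interests. A minimal sketch, with hypothetical category names standing in for the real ones listed above:

```python
import pandas as pd

# Hypothetical events with already-decoded category lists.
events = pd.DataFrame({
    'title': ['Job fair', 'Freelance Day'],
    'categories': [["Recherche d'emploi"], ['Freelance']],
})

# Keep events sharing at least one category with the user's interests.
user_interests = {"Recherche d'emploi"}
matching = events[events.categories.apply(
    lambda cats: bool(user_interests & set(cats)))]
```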
Despite the very small amount of data today, the API provided by WorkUp is perfect for us and could easily be integrated into Bob Emploi.
A few things that WorkUp could fix (apart from getting more events across the country):

- make sure the title field stays clean (the capitalization is currently inconsistent);
- remove the double JSON encoding of the category field.

After that, there are many fields that we can use out of the box to display the events, and a few others that can be used to filter them as appropriate for a given user:
- latitude and longitude, to narrow the list of events to the ones close to the user;
- price, to select only the free or cheap ones;
- category, to select the ones that match what the user is trying to do.

Ultimately we might also want to filter out some events that are linked to certain industries or certain kinds of jobs (e.g. "Freelance Day"), but as most events are actually fairly generic for now, this is not a priority.