Author: Marie Laure, marielaure@bayesimpact.org

Application modes (IMT) Dataset retrieved from emploi-store-dev

The IMT dataset provides regional statistics about different jobs. Here we are interested in the "application modes" subset of this dataset, which records the channels through which people found their jobs. Previously, we retrieved IMT data by scraping the IMT website. For application modes, the present dataset provides not only the rank of each application mode (as before) but also its percentage per FAP code. As an exploratory step, we want to understand what the percentages add compared to the ranks.

This dataset can be obtained with the following command: docker-compose run --rm data-analysis-prepare make data/imt/application_modes.csv

Loading and General View

First let's load the csv file:


In [1]:
import os
from os import path

import pandas as pd
import seaborn as _

DATA_FOLDER = os.getenv('DATA_FOLDER')

modes = pd.read_csv(path.join(DATA_FOLDER, 'imt/application_modes.csv'))
modes.head()


Out[1]:
APPLICATION_TYPE_CODE APPLICATION_TYPE_NAME APPLICATION_TYPE_ORDER FAP_CODE FAP_NAME RECRUT_PERCENT
0 R3 Candidature spontanée 1 A0Z00 Agriculteurs indépendants 100.00
1 R2 Réseau personnel ou professionnel 1 A0Z40 Agriculteurs salariés 41.76
2 R3 Candidature spontanée 2 A0Z40 Agriculteurs salariés 29.26
3 R4 Autres canaux 3 A0Z40 Agriculteurs salariés 15.22
4 R1 Intermédiaires du placement 4 A0Z40 Agriculteurs salariés 13.76

Yeah! Here are the percentages!

Dataset Sanity

We will check for data discrepancies. Some variables (FAP, application type) are expressed as both a code and a name, so we will check that the two agree. We'll also look for missing or odd values.

First, let's have a quick summary of the data.


In [2]:
modes.describe(include='all').head(2)


Out[2]:
APPLICATION_TYPE_CODE APPLICATION_TYPE_NAME APPLICATION_TYPE_ORDER FAP_CODE FAP_NAME RECRUT_PERCENT
count 676 676 676.0 676 676 676.0
unique 4 4 NaN 195 195 NaN

Good news, everything seems to be filled in: every column counts 676 values. Let's see if there are any discrepancies between names and codes, starting with FAP (job types).


In [3]:
modes.groupby('FAP_CODE').FAP_NAME.nunique().value_counts()


Out[3]:
1    195
Name: FAP_NAME, dtype: int64

So far so good: perfect concordance between FAP codes and FAP names. It is worth noting that these 195 FAPs represent only a subset of the full FAP list (225), as described here.

What about concordance between application type codes and names?


In [4]:
modes.groupby('APPLICATION_TYPE_CODE').APPLICATION_TYPE_NAME.nunique().value_counts()


Out[4]:
1    4
Name: APPLICATION_TYPE_NAME, dtype: int64

We also have a one-to-one correspondence between application type codes and names.
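
The two cells above check one direction only (each code has a single name). The reverse direction follows from the equal number of unique codes and names in the summary, but it can also be tested explicitly. Below is a minimal sketch on a few rows copied from the head of the dataset; `is_one_to_one` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

# A few rows copied from the head of the dataset shown above.
sample = pd.DataFrame({
    'APPLICATION_TYPE_CODE': ['R3', 'R2', 'R3', 'R4', 'R1'],
    'APPLICATION_TYPE_NAME': [
        'Candidature spontanée',
        'Réseau personnel ou professionnel',
        'Candidature spontanée',
        'Autres canaux',
        'Intermédiaires du placement',
    ],
})


def is_one_to_one(frame, left, right):
    """Return True if each `left` value maps to one `right` value and back."""
    left_to_right = frame.groupby(left)[right].nunique()
    right_to_left = frame.groupby(right)[left].nunique()
    return bool((left_to_right == 1).all() and (right_to_left == 1).all())


codes_match_names = is_one_to_one(
    sample, 'APPLICATION_TYPE_CODE', 'APPLICATION_TYPE_NAME')
```

On the full dataset, the same call could be made with `modes` instead of `sample`, and with `'FAP_CODE'` and `'FAP_NAME'` for the job types.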

Is there anything weird between FAP codes and application types? Like the same application type appearing twice for one FAP?


In [5]:
modes.groupby('FAP_CODE').APPLICATION_TYPE_CODE.value_counts().value_counts()


Out[5]:
1    676
Name: APPLICATION_TYPE_CODE, dtype: int64

Nothing like that here.
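
Note that the cell above counts application type codes per FAP, not ranks. To look for duplicated ranks directly (two modes ranked #1 for the same FAP), one option is `DataFrame.duplicated` on the (FAP, rank) pair. This is a sketch on a few rows copied from the head of the dataset, not a cell from the original notebook.

```python
import pandas as pd

# FAP codes and ranks copied from the head of the dataset.
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40'],
    'APPLICATION_TYPE_ORDER': [1, 1, 2, 3, 4],
})

# keep=False flags every row that shares its (FAP, rank) pair with another row.
duplicated_ranks = sample[
    sample.duplicated(['FAP_CODE', 'APPLICATION_TYPE_ORDER'], keep=False)]
```

An empty `duplicated_ranks` means that no rank is used twice within a FAP.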

Are there any weird values for ranks?


In [6]:
modes.APPLICATION_TYPE_ORDER.unique()


Out[6]:
array([1, 2, 3, 4])

Nope, they go from first to fourth, and we've already seen that they are set for every FAP.

However, some may be misassigned (e.g. an application type ranked fourth showing up alone).


In [7]:
def check_order(fap_modes):
    """Check ranks and percentages of one FAP, given rows sorted by rank."""
    num_modes = len(fap_modes)

    if num_modes == 1:
        if fap_modes.iloc[0].RECRUT_PERCENT != 100:
            raise Exception('Single observations should have 100% percentage')
        if fap_modes.iloc[0].APPLICATION_TYPE_ORDER != 1:
            raise Exception('Single observations should be ranked first')
        return
    # Compare each row with the next one; the last row's rank is not checked here.
    for i in range(num_modes - 1):
        if int(fap_modes.APPLICATION_TYPE_ORDER.iloc[i]) != i + 1:
            raise Exception('Rank order not consistent')
        if fap_modes.RECRUT_PERCENT.iloc[i] < fap_modes.RECRUT_PERCENT.iloc[i + 1]:
            raise Exception('Percentage order not consistent')

modes.sort_values('APPLICATION_TYPE_ORDER').groupby('FAP_CODE').apply(check_order);

Everything is in order!
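
The same consistency check can also be sketched without an explicit loop, returning the offending rows instead of raising. This is an alternative formulation, not the notebook's own code: `find_order_issues` is a hypothetical helper, and unlike `check_order` it also tests the rank of the last row of each FAP.

```python
import pandas as pd

# Rows copied from the head of the dataset.
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40'],
    'APPLICATION_TYPE_ORDER': [1, 1, 2, 3, 4],
    'RECRUT_PERCENT': [100.0, 41.76, 29.26, 15.22, 13.76],
})


def find_order_issues(modes):
    """Return rows whose rank or percentage breaks the expected ordering."""
    ordered = modes.sort_values(['FAP_CODE', 'APPLICATION_TYPE_ORDER'])
    by_fap = ordered.groupby('FAP_CODE')
    # Within a FAP, the n-th row should carry rank n.
    expected_rank = by_fap.cumcount() + 1
    bad_rank = ordered.APPLICATION_TYPE_ORDER != expected_rank
    # A positive difference means a percentage grows as the rank gets worse.
    increasing = by_fap['RECRUT_PERCENT'].diff() > 0
    return ordered[bad_rank | increasing]


issues = find_order_issues(sample)
```

An empty `issues` frame corresponds to `check_order` raising nothing.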

Let's take care of the newcomer: the percentage. Basic stats?


In [8]:
modes.RECRUT_PERCENT.describe()


Out[8]:
count    676.000000
mean      28.845769
std       18.307057
min        0.590000
25%       17.030000
50%       25.635000
75%       36.837500
max      100.000000
Name: RECRUT_PERCENT, dtype: float64

Application modes that were not observed (0%) are not represented here, so the mean has no real meaning.

Let's end with a manual check of the dataset against the Pôle Emploi website. The application mode ranking for "Tuyauteurs" on the IMT website on 09/01/2017 is:
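
To see why the mean is not informative, one can fill in the unobserved modes with 0%. If the percentages of each FAP sum to 100% over the 4 possible modes, the mean including zeros is 25% by construction (the summary above is consistent with this: 676 × 28.85 is about 19,500, i.e. 195 × 100). A small sketch on rows copied from the head of the dataset:

```python
import pandas as pd

# Rows copied from the head of the dataset.
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40'],
    'APPLICATION_TYPE_CODE': ['R3', 'R2', 'R3', 'R4', 'R1'],
    'RECRUT_PERCENT': [100.0, 41.76, 29.26, 15.22, 13.76],
})

# One row per FAP, one column per application type, 0 where not observed.
wide = (sample.set_index(['FAP_CODE', 'APPLICATION_TYPE_CODE'])
        .RECRUT_PERCENT.unstack(fill_value=0))

observed_mean = sample.RECRUT_PERCENT.mean()   # mean over observed modes only
mean_with_zeros = wide.to_numpy().mean()       # mean over all FAP x mode cells
```

On this sample the observed mean is pulled up by the single-mode FAP, while the mean with zeros only reflects the number of modes.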

  1. Candidature spontanée

Here we have:


In [9]:
modes[modes.FAP_CODE == 'D2Z41']


Out[9]:
APPLICATION_TYPE_CODE APPLICATION_TYPE_NAME APPLICATION_TYPE_ORDER FAP_CODE FAP_NAME RECRUT_PERCENT
131 R3 Candidature spontanée 1 D2Z41 Tuyauteurs 100.0

Yay! We have a match!

Maybe we should check that the percentages sum to about 100% for each FAP?


In [10]:
def sum_percentages(fap_modes):
    # Print the FAP code and its total when the percentages do not sum to ~100%.
    total = fap_modes.RECRUT_PERCENT.sum()
    if total < 99.9 or total > 100.1:
        print('{} {}'.format(fap_modes.FAP_CODE.iloc[0], total))

modes.groupby('FAP_CODE').apply(sum_percentages)


Out[10]:

Nothing is printed: for every FAP, the percentages sum to ~100%.
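
The same check can be sketched as a single aggregation that returns the FAPs that are off target, which is easier to inspect than printed lines. The 0.1 tolerance mirrors the bounds used above; the sample rows are copied from the head of the dataset.

```python
import pandas as pd

# Rows copied from the head of the dataset.
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40'],
    'RECRUT_PERCENT': [100.0, 41.76, 29.26, 15.22, 13.76],
})

# Total percentage per FAP, then keep the FAPs further than 0.1 from 100.
totals = sample.groupby('FAP_CODE').RECRUT_PERCENT.sum()
off_target = totals[(totals - 100).abs() > 0.1]
```

An empty `off_target` series corresponds to the empty output of the cell above.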

Conclusion

I guess we're done for basic sanity checks. Everything looks sane!

Basic Overview

First let's see what these application modes are.


In [11]:
pd.options.display.max_colwidth = 100
modes.APPLICATION_TYPE_NAME.drop_duplicates().to_frame()


Out[11]:
APPLICATION_TYPE_NAME
0 Candidature spontanée
1 Réseau personnel ou professionnel
3 Autres canaux
4 Intermédiaires du placement

The different possibilities are not super precise... Only 'Candidature spontanée' (spontaneous application), 'Réseau personnel ou professionnel' (personal or professional network) and 'Intermédiaires du placement' (placement intermediaries) can be directly useful. For some FAPs, only one, two or three application modes have been observed.

Let's have a look at how often this happens.


In [12]:
modes.groupby('FAP_CODE').size().value_counts(normalize=True).plot(kind='bar');


70% of the job types have data for all 4 application modes. But we can still find some for which only 1 (<10%) or 2 modes (~10%) are observed.

So what is the application mode that is the most frequently ranked first?


In [13]:
modes[modes.APPLICATION_TYPE_ORDER == 1]\
    .APPLICATION_TYPE_NAME.value_counts(normalize=True)\
    .plot.pie(figsize=(6, 6), label='');


Network seems to be slightly ahead of the other modes. Nothing new for the Bayesian people... However, placement agencies also account for 30% of the first-ranked modes.

Let's use percentages now by having a glimpse at the modes that represent at least half of the observations for a job type.


In [14]:
modes[modes.RECRUT_PERCENT >= 50].APPLICATION_TYPE_NAME\
    .value_counts(normalize=True)\
    .plot.pie(figsize=(6, 6), label='');


No doubt, when one channel is doing most of the job, it's pretty often the network (36%). But apart from the "others" category, the other modes also appear to be successful.

Conclusion

Application mode definitions are not very granular. As we already knew, network ranks first, both when considering the ranks and when considering application modes with more than 50% of the observations. This makes us think that there is room for personalisation. The next step explores how this dataset could be useful to us.

Recommendations

1. Basic

For example, we could investigate which job types are the ones for which the application mode really makes a difference. Let's start with the easiest case: when only one mode shows up.


In [15]:
total_modes = modes.groupby('FAP_CODE').size()
modes['total_modes'] = modes.FAP_CODE.map(total_modes)
modes[modes.total_modes == 1][['APPLICATION_TYPE_NAME','FAP_NAME']]


Out[15]:
APPLICATION_TYPE_NAME FAP_NAME
0 Candidature spontanée Agriculteurs indépendants
131 Candidature spontanée Tuyauteurs
197 Autres canaux Ouvriers qualifiés du travail artisanal du textile et du cuir
224 Intermédiaires du placement Carrossiers automobiles
225 Candidature spontanée Mécaniciens et électroniciens de véhicules
279 Candidature spontanée Contrôleurs des transports
294 Intermédiaires du placement Cadres des transports
295 Candidature spontanée Personnels navigants de l'aviation
358 Intermédiaires du placement Ingénieurs et cadres d'administration, maintenance en informatique
389 Candidature spontanée Magistrats
484 Réseau personnel ou professionnel Charcutiers, traiteurs
675 Réseau personnel ou professionnel Professionnels de la politique

The case of independent farmers, who mostly apply spontaneously for jobs, raises the question of the scope of this application mode ('Candidature spontanée'). Maybe it also includes people who create or take over companies. Here, for clear-cut combinations of job type and application mode, use of the professional or personal network is less represented.

What are the other jobs for which the application mode is really decisive? First we'll look at the gap between the modes ranked first and second. How clear-cut is it?


In [16]:
def compute_top2_diff(fap_modes):
    # fap_modes holds the percentages of one FAP, sorted by rank.
    if len(fap_modes) == 1:
        return 100
    return fap_modes.iloc[0] - fap_modes.iloc[1]

top2_diff = modes.sort_values('APPLICATION_TYPE_ORDER')\
    .groupby('FAP_CODE').RECRUT_PERCENT.apply(compute_top2_diff)
top2_diff.hist();


Most of the time (124/195) there is less than a 20% difference between the first and the second application mode. But on the right-hand side we can clearly see that some application modes are highly recommended for certain jobs.

Let's have a look at the job types with a difference of at least 60% between the first and second modes.
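
The count quoted above is not produced by any cell shown here. A possible way to compute it is sketched below on a few rows copied from the tables of this notebook; the pivot on rank is an alternative to `compute_top2_diff`, and treating a missing second mode as 0% reproduces the gap of 100 used for single-mode FAPs.

```python
import pandas as pd

# Rows copied from the tables of this notebook (A0Z00, A0Z40 and D1Z40).
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40', 'D1Z40', 'D1Z40'],
    'APPLICATION_TYPE_ORDER': [1, 1, 2, 3, 4, 1, 2],
    'RECRUT_PERCENT': [100.0, 41.76, 29.26, 15.22, 13.76, 52.98, 47.02],
})

# One row per FAP, one column per rank.
by_rank = sample.pivot(
    index='FAP_CODE', columns='APPLICATION_TYPE_ORDER', values='RECRUT_PERCENT')

# Gap between the first and second modes; a missing second mode counts as 0%.
gap = by_rank[1] - by_rank[2].fillna(0)

num_close = int((gap < 20).sum())
share_close = num_close / len(gap)
```

With the `top2_diff` series of the notebook, the equivalent count would be `(top2_diff < 20).sum()`.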


In [17]:
modes['top2_diff'] = modes.FAP_CODE.map(top2_diff)
modes[modes.top2_diff >= 60].APPLICATION_TYPE_NAME.value_counts()


Out[17]:
Intermédiaires du placement          8
Candidature spontanée                7
Réseau personnel ou professionnel    7
Autres canaux                        3
Name: APPLICATION_TYPE_NAME, dtype: int64

Network is still there. But there are some job types for which we could recommend having a look at placement agencies. Same for spontaneous applications: maybe we could suggest hints for investigating the region-specific ecosystem or job boards.

One of the main questions was: what is the added value of percentages over ranks? So let's look at the first rank. How diverse is it?
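
One caveat on the count above: `top2_diff` is mapped onto every row of a FAP, so the selection also includes the runner-up modes of the selected job types. A sketch of how to count only the leading mode is given below. The sample mixes two FAPs copied from the dataset with one made-up FAP (`X9Z99`, a hypothetical code with illustrative percentages) so that the difference between both counts is visible.

```python
import pandas as pd

sample = pd.DataFrame({
    # A0Z00 and A0Z40 come from the dataset; X9Z99 is made up for illustration.
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40', 'X9Z99', 'X9Z99'],
    'APPLICATION_TYPE_NAME': [
        'Candidature spontanée',
        'Réseau personnel ou professionnel',
        'Candidature spontanée',
        'Autres canaux',
        'Intermédiaires du placement',
        'Réseau personnel ou professionnel',
        'Autres canaux',
    ],
    'APPLICATION_TYPE_ORDER': [1, 1, 2, 3, 4, 1, 2],
    'RECRUT_PERCENT': [100.0, 41.76, 29.26, 15.22, 13.76, 85.0, 15.0],
})

is_first = sample.APPLICATION_TYPE_ORDER == 1
is_second = sample.APPLICATION_TYPE_ORDER == 2
first = sample[is_first].set_index('FAP_CODE').RECRUT_PERCENT
second = sample[is_second].set_index('FAP_CODE').RECRUT_PERCENT

# A missing second mode counts as 0%, so single-mode FAPs get a gap of 100.
top2_diff = first.sub(second, fill_value=0)
sample['top2_diff'] = sample.FAP_CODE.map(top2_diff)

clear_cut = sample[sample.top2_diff >= 60]
# Counts every mode of the selected FAPs, runner-ups included.
all_rows = clear_cut.APPLICATION_TYPE_NAME.value_counts()
# Counts only the mode ranked first for each selected FAP.
leaders_only = clear_cut[
    clear_cut.APPLICATION_TYPE_ORDER == 1].APPLICATION_TYPE_NAME.value_counts()
```

Here 'Autres canaux' appears in `all_rows` only because it is the runner-up of the made-up FAP; it disappears from `leaders_only`.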


In [18]:
modes[modes.APPLICATION_TYPE_ORDER == 1].RECRUT_PERCENT.describe()


Out[18]:
count    195.000000
mean      49.633949
std       18.674589
min       26.920000
25%       37.280000
50%       42.510000
75%       56.470000
max      100.000000
Name: RECRUT_PERCENT, dtype: float64

For half of the job types, the application mode ranked first accounts for less than 43% of the recruitments observed (the median is 42.51%).

Is it more relevant to propose more than one mode, let's say two?
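
The summary gives quantiles but not the exact share of job types whose first-ranked mode stays below 50%. A sketch of how that share could be computed, on a handful of rows copied from the tables of this notebook (so the numbers below describe the sample, not the full dataset):

```python
import pandas as pd

# First-ranked rows copied from this notebook, plus one second-ranked row
# to show that the rank filter matters.
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'D1Z40', 'R0Z61', 'B2Z43'],
    'APPLICATION_TYPE_ORDER': [1, 1, 2, 1, 1, 1],
    'RECRUT_PERCENT': [100.0, 41.76, 29.26, 52.98, 39.04, 51.34],
})

first_rank = sample[sample.APPLICATION_TYPE_ORDER == 1].RECRUT_PERCENT

# Fraction of job types whose leading mode gathers less than half of the cases.
share_below_half = (first_rank < 50).mean()
median_first = first_rank.median()
```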


In [19]:
def compute_top2_sum(fap_modes):
    # fap_modes holds the percentages of one FAP, sorted by rank.
    if len(fap_modes) == 1:
        return 100
    return fap_modes.iloc[0] + fap_modes.iloc[1]

top2_sum = modes.sort_values('APPLICATION_TYPE_ORDER')\
    .groupby('FAP_CODE').RECRUT_PERCENT.apply(compute_top2_sum)
top2_sum.hist();


It may be like pushing at an open door here, but some users may benefit from more than one application mode suggestion (two, for instance). For most job types, the first two modes account for more than 65% of the observations.

What are the jobs for which the first two modes are almost equally relevant?


In [20]:
modes['top2_sum'] = modes.FAP_CODE.map(top2_sum)
modes[(modes.top2_sum > 70) & (modes.top2_diff < 15) & (modes.APPLICATION_TYPE_ORDER < 3)].\
    sort_values(['FAP_CODE', 'top2_sum'], ascending = False)


Out[20]:
APPLICATION_TYPE_CODE APPLICATION_TYPE_NAME APPLICATION_TYPE_ORDER FAP_CODE FAP_NAME RECRUT_PERCENT total_modes top2_diff top2_sum
664 R4 Autres canaux 1 W0Z91 Directeurs d'établissement scolaire et inspecteurs 48.14 3 2.67 93.61
665 R3 Candidature spontanée 2 W0Z91 Directeurs d'établissement scolaire et inspecteurs 45.47 3 2.67 93.61
656 R4 Autres canaux 1 W0Z80 Professeurs des écoles 39.55 4 2.38 76.72
657 R3 Candidature spontanée 2 W0Z80 Professeurs des écoles 37.17 4 2.38 76.72
652 R1 Intermédiaires du placement 1 V5Z84 Surveillants d'établissements scolaires 37.00 4 3.14 70.86
653 R3 Candidature spontanée 2 V5Z84 Surveillants d'établissements scolaires 33.86 4 3.14 70.86
648 R2 Réseau personnel ou professionnel 1 V5Z82 Sportifs et animateurs sportifs 40.63 4 10.75 70.51
649 R3 Candidature spontanée 2 V5Z82 Sportifs et animateurs sportifs 29.88 4 10.75 70.51
604 R3 Candidature spontanée 1 V2Z90 Médecins 49.83 3 10.30 89.36
605 R4 Autres canaux 2 V2Z90 Médecins 39.53 3 10.30 89.36
566 R1 Intermédiaires du placement 1 U0Z91 Cadres et techniciens de la documentation 37.96 4 0.99 74.93
567 R2 Réseau personnel ou professionnel 2 U0Z91 Cadres et techniciens de la documentation 36.97 4 0.99 74.93
553 R2 Réseau personnel ou professionnel 1 T6Z61 Employés des services divers 37.11 4 3.28 70.94
554 R4 Autres canaux 2 T6Z61 Employés des services divers 33.83 4 3.28 70.94
549 R1 Intermédiaires du placement 1 T4Z62 Ouvriers de l'assainissement et du traitement des déchets 40.61 4 10.09 71.13
550 R2 Réseau personnel ou professionnel 2 T4Z62 Ouvriers de l'assainissement et du traitement des déchets 30.52 4 10.09 71.13
545 R3 Candidature spontanée 1 T4Z61 Agents de services hospitaliers 39.97 4 8.65 71.29
546 R2 Réseau personnel ou professionnel 2 T4Z61 Agents de services hospitaliers 31.32 4 8.65 71.29
501 R2 Réseau personnel ou professionnel 1 S2Z61 Serveurs de cafés restaurants 40.09 4 6.61 73.57
502 R3 Candidature spontanée 2 S2Z61 Serveurs de cafés restaurants 33.48 4 6.61 73.57
497 R2 Réseau personnel ou professionnel 1 S2Z60 Employés de l'hôtellerie 37.27 4 2.69 71.85
498 R3 Candidature spontanée 2 S2Z60 Employés de l'hôtellerie 34.58 4 2.69 71.85
422 R3 Candidature spontanée 1 R0Z61 Caissiers 39.04 4 6.29 71.79
423 R2 Réseau personnel ou professionnel 2 R0Z61 Caissiers 32.75 4 6.29 71.79
398 R3 Candidature spontanée 1 P4Z80 Cadres intermédiaires de la police et de l'armée 50.63 2 1.26 100.00
399 R1 Intermédiaires du placement 2 P4Z80 Cadres intermédiaires de la police et de l'armée 49.37 2 1.26 100.00
390 R3 Candidature spontanée 1 P4Z60 Agents de sécurité et de l'ordre public 41.42 4 5.43 77.41
391 R4 Autres canaux 2 P4Z60 Agents de sécurité et de l'ordre public 35.99 4 5.43 77.41
367 R4 Autres canaux 1 N0Z91 Chercheurs (sauf industrie et enseignement supérieur) 44.08 3 13.14 75.02
368 R2 Réseau personnel ou professionnel 2 N0Z91 Chercheurs (sauf industrie et enseignement supérieur) 30.94 3 13.14 75.02
... ... ... ... ... ... ... ... ... ...
267 R2 Réseau personnel ou professionnel 1 J3Z42 Conducteurs et livreurs sur courte distance 39.16 4 7.29 71.03
268 R1 Intermédiaires du placement 2 J3Z42 Conducteurs et livreurs sur courte distance 31.87 4 7.29 71.03
189 R2 Réseau personnel ou professionnel 1 E2Z80 Agents de maîtrise et assimilés des industries de process 43.50 4 1.89 85.11
190 R1 Intermédiaires du placement 2 E2Z80 Agents de maîtrise et assimilés des industries de process 41.61 4 1.89 85.11
185 R1 Intermédiaires du placement 1 E2Z70 Techniciens des industries de process 40.58 4 10.30 70.86
186 R2 Réseau personnel ou professionnel 2 E2Z70 Techniciens des industries de process 30.28 4 10.30 70.86
173 R1 Intermédiaires du placement 1 E1Z42 Autres ouvriers qualifiés des industries agro-alimentaires (hors transformation des viandes) 37.65 3 4.89 70.41
174 R2 Réseau personnel ou professionnel 2 E1Z42 Autres ouvriers qualifiés des industries agro-alimentaires (hors transformation des viandes) 32.76 3 4.89 70.41
127 R1 Intermédiaires du placement 1 D2Z40 Chaudronniers, tôliers, traceurs, serruriers, métalliers, forgerons 42.41 4 1.32 83.50
128 R3 Candidature spontanée 2 D2Z40 Chaudronniers, tôliers, traceurs, serruriers, métalliers, forgerons 41.09 4 1.32 83.50
123 R4 Autres canaux 1 D1Z40 Régleurs 52.98 2 5.96 100.00
124 R1 Intermédiaires du placement 2 D1Z40 Régleurs 47.02 2 5.96 100.00
109 R3 Candidature spontanée 1 C1Z40 Ouvriers qualifiés de l'électricité et de l'électronique 38.84 3 4.60 73.08
110 R2 Réseau personnel ou professionnel 2 C1Z40 Ouvriers qualifiés de l'électricité et de l'électronique 34.24 3 4.60 73.08
75 R4 Autres canaux 1 B4Z44 Ouvriers qualifiés de la peinture et de la finition du bâtiment 39.53 4 6.19 72.87
76 R2 Réseau personnel ou professionnel 2 B4Z44 Ouvriers qualifiés de la peinture et de la finition du bâtiment 33.34 4 6.19 72.87
65 R4 Autres canaux 1 B4Z41 Plombiers, chauffagistes 38.64 3 6.48 70.80
66 R1 Intermédiaires du placement 2 B4Z41 Plombiers, chauffagistes 32.16 3 6.48 70.80
58 R1 Intermédiaires du placement 1 B2Z43 Charpentiers (bois) 51.34 3 13.73 88.95
59 R3 Candidature spontanée 2 B2Z43 Charpentiers (bois) 37.61 3 13.73 88.95
45 R1 Intermédiaires du placement 1 B0Z21 Ouvriers non qualifiés du gros oeuvre du bâtiment 38.51 4 5.81 71.21
46 R2 Réseau personnel ou professionnel 2 B0Z21 Ouvriers non qualifiés du gros oeuvre du bâtiment 32.70 4 5.81 71.21
33 R3 Candidature spontanée 1 A3Z40 Pêcheurs, aquaculteurs salariés 41.72 4 6.44 77.00
34 R1 Intermédiaires du placement 2 A3Z40 Pêcheurs, aquaculteurs salariés 35.28 4 6.44 77.00
28 R4 Autres canaux 1 A2Z70 Techniciens et agents d'encadrement d'exploitations agricoles 36.86 4 0.51 73.21
30 R2 Réseau personnel ou professionnel 2 A2Z70 Techniciens et agents d'encadrement d'exploitations agricoles 36.35 4 0.51 73.21
16 R2 Réseau personnel ou professionnel 1 A1Z40 Maraîchers, horticulteurs salariés 44.37 4 12.87 75.87
17 R1 Intermédiaires du placement 2 A1Z40 Maraîchers, horticulteurs salariés 31.50 4 12.87 75.87
1 R2 Réseau personnel ou professionnel 1 A0Z40 Agriculteurs salariés 41.76 4 12.50 71.02
2 R3 Candidature spontanée 2 A0Z40 Agriculteurs salariés 29.26 4 12.50 71.02

66 rows × 9 columns

33 job types have less than a 15% difference between the first and the second application mode while both modes together account for more than 70% of the observations. They include some public service job types (school principals or primary school teachers) that can be accessed through a "concours" (competitive exam, for a permanent position) or through spontaneous application or placement agencies (fixed-term contract). Users are already aware of these possibilities. But for other jobs like cashiers ("Caissiers"), pushing both pieces of advice (spontaneous application and network) seems like a good strategy!

We know that for 70% of the job types all 4 application modes have been observed. For which job types might the last-ranked mode be interesting? Let's look at the distribution of the percentages of the least observed mode.
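
The selection rule used above (top two modes above 70% together, less than 15% apart) could be turned into a small decision helper. This is only a sketch: `modes_to_recommend` and its parameter names are hypothetical, and the thresholds are the ones explored in this notebook, not validated product settings.

```python
def modes_to_recommend(percentages, min_sum=70, max_diff=15):
    """Return how many top modes to push, given percentages sorted by rank."""
    if len(percentages) < 2:
        return 1
    first, second = percentages[0], percentages[1]
    # Push both modes when they are close and together cover most cases.
    if first + second > min_sum and first - second < max_diff:
        return 2
    return 1


# Caissiers: top two modes at 39.04% and 32.75% (from the table above).
cashiers = modes_to_recommend([39.04, 32.75])
# Concierges: 70.53% for the first mode, 11.95% for the second
# (derived from the 58.58 gap reported elsewhere in this notebook).
concierges = modes_to_recommend([70.53, 11.95])
# Tuyauteurs: a single mode at 100%.
pipefitters = modes_to_recommend([100.0])
```

With these inputs, cashiers get two suggestions while concierges and pipefitters get one.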


In [21]:
last_modes = modes[modes.APPLICATION_TYPE_ORDER == 4]
last_modes.RECRUT_PERCENT.plot(kind ='box');


The maximum percentage observed for the application mode ranked last is 21%.

What are the job types for which the percentage of the last-ranked mode is not that negligible (75th percentile)?


In [22]:
last_modes[last_modes.RECRUT_PERCENT > 16][['APPLICATION_TYPE_NAME', 'FAP_NAME', 'RECRUT_PERCENT']].\
sort_values('RECRUT_PERCENT', ascending = False)


Out[22]:
APPLICATION_TYPE_NAME FAP_NAME RECRUT_PERCENT
635 Candidature spontanée Professionnels de l'orientation 21.16
631 Intermédiaires du placement Psychologues, psychothérapeutes 21.05
455 Intermédiaires du placement Représentants auprès des particuliers 20.94
239 Candidature spontanée Ingénieurs et cadres de fabrication et de la production 19.69
299 Candidature spontanée Ingénieurs et cadres de la logistique, du planning et de l'ordonnancement 19.69
262 Autres canaux Conducteurs de véhicules légers 19.37
353 Autres canaux Techniciens de production, d'exploitation, d'installation, et de maintenance, support et service... 19.29
663 Intermédiaires du placement Professeurs du secondaire 19.08
317 Autres canaux Agents d'accueil et d'information 18.99
403 Réseau personnel ou professionnel Employés de la banque et des assurances 18.97
328 Candidature spontanée Cadres administratifs, comptables et financiers (hors juristes) 18.83
357 Réseau personnel ou professionnel Ingénieurs et cadres d'étude, recherche et développement en informatique, chefs de projets infor... 18.69
388 Réseau personnel ou professionnel Professionnels du droit 18.41
407 Réseau personnel ou professionnel Techniciens de la banque 18.40
467 Intermédiaires du placement Cadres commerciaux, acheteurs et cadres de la mercatique 18.37
313 Autres canaux Employés de la comptabilité 18.08
512 Réseau personnel ou professionnel Maîtrise de l'hôtellerie 17.85
291 Intermédiaires du placement Employés des transports et du tourisme 17.69
12 Autres canaux Bûcherons, sylviculteurs salariés et agents forestiers 17.66
516 Intermédiaires du placement Cadres de l'hôtellerie et de la restauration 17.65
496 Autres canaux Chefs cuisiniers 17.56
366 Intermédiaires du placement Ingénieurs et cadres d'étude, recherche et développement (industrie) 17.30
433 Intermédiaires du placement Vendeurs en ameublement, équipement du foyer, bricolage 17.16
463 Réseau personnel ou professionnel Professions intermédiaires commerciales 17.03
532 Autres canaux Assistantes maternelles 17.03
479 Intermédiaires du placement Agents immobiliers, syndics 16.83
172 Intermédiaires du placement Autres ouvriers qualifiés des industries chimiques et plastiques 16.82
248 Candidature spontanée Ingénieurs des méthodes de production, du contrôle qualité 16.60
321 Autres canaux Agents administratifs divers 16.28

Some of these results seem to contradict our knowledge, e.g. the fact that network is ranked last for law professionals. Thus, we keep our main strategy of promoting the Network. However, it seems relevant to drop the spontaneous application advice when users are interested in jobs for which this mode is almost never observed (here less than 16%).
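
As an illustration of that last rule, here is a minimal sketch of a filter for the spontaneous application advice. The function name, the dictionary input and the default threshold are assumptions made for this example; only the `R3` code and the 16% figure come from this notebook.

```python
SPONTANEOUS_CODE = 'R3'  # 'Candidature spontanée' in this dataset


def show_spontaneous_advice(percent_by_mode, min_percent=16):
    """Return True if spontaneous application is observed often enough.

    `percent_by_mode` maps application type codes to percentages for one
    job type; modes that were not observed are simply absent.
    """
    return percent_by_mode.get(SPONTANEOUS_CODE, 0) >= min_percent


# Agriculteurs salariés (A0Z40), percentages from the head of the dataset.
salaried_farmers = show_spontaneous_advice(
    {'R2': 41.76, 'R3': 29.26, 'R4': 15.22, 'R1': 13.76})
# Carrossiers automobiles: only 'Intermédiaires du placement' (R1) is observed.
body_repairers = show_spontaneous_advice({'R1': 100.0})
```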

2. Network

Here at Bayes, we are convinced that Network is really important. Let's see if and how the application modes reported by newly recruited people support our statement. First, how often is the network reported as the way newly recruited people found their job?


In [23]:
modes[modes.APPLICATION_TYPE_CODE == 'R2'].APPLICATION_TYPE_ORDER.value_counts()


Out[23]:
1    63
2    60
3    36
4    16
Name: APPLICATION_TYPE_ORDER, dtype: int64

There are only 20 job types for which Network has not been reported as an application mode (175 of the 195 job types report it). When it has been reported, it is usually ranked first or second.

Among the jobs for which Network is ranked second, how are the percentages distributed?
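
To list the job types without any Network observation rather than just count them, a set difference on FAP codes is enough. A sketch on rows copied from this notebook (A0Z00, A0Z40 and D2Z41):

```python
import pandas as pd

# Rows copied from the tables of this notebook.
sample = pd.DataFrame({
    'FAP_CODE': ['A0Z00', 'A0Z40', 'A0Z40', 'A0Z40', 'A0Z40', 'D2Z41'],
    'APPLICATION_TYPE_CODE': ['R3', 'R2', 'R3', 'R4', 'R1', 'R3'],
})

all_faps = set(sample.FAP_CODE)
# 'R2' is the code of 'Réseau personnel ou professionnel'.
faps_with_network = set(sample[sample.APPLICATION_TYPE_CODE == 'R2'].FAP_CODE)
faps_without_network = sorted(all_faps - faps_with_network)
```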


In [24]:
network_ranked_second = modes[(modes.APPLICATION_TYPE_ORDER == 2) & (modes.APPLICATION_TYPE_CODE == 'R2')]
network_ranked_second.RECRUT_PERCENT.hist();


There are only 2 cases where the Network is ranked second with at least 40% of the observations. Even if it won't concern that many job types, a threshold at 40% seems reasonable.

What about job types for which the Network mode does not reach this threshold?


In [25]:
network_ranked_second[network_ranked_second.RECRUT_PERCENT < 40].\
sort_values('RECRUT_PERCENT', ascending=False)[['FAP_NAME', 'RECRUT_PERCENT']]


Out[25]:
FAP_NAME RECRUT_PERCENT
567 Cadres et techniciens de la documentation 36.97
30 Techniciens et agents d'encadrement d'exploitations agricoles 36.35
110 Ouvriers qualifiés de l'électricité et de l'électronique 34.24
76 Ouvriers qualifiés de la peinture et de la finition du bâtiment 33.34
174 Autres ouvriers qualifiés des industries agro-alimentaires (hors transformation des viandes) 32.76
423 Caissiers 32.75
46 Ouvriers non qualifiés du gros oeuvre du bâtiment 32.70
42 Ouvriers non qualifiés des travaux publics, du béton et de l'extraction 32.23
141 Monteurs, ajusteurs et autres ouvriers qualifiés de la mécanique 31.96
518 Coiffeurs, esthéticiens 31.81
546 Agents de services hospitaliers 31.32
62 Ouvriers non qualifiés du second oeuvre du bâtiment 31.30
481 Apprentis et ouvriers non qualifiés de l'alimentation (hors industries agro-alimentaires) 31.12
84 Géomètres 31.11
368 Chercheurs (sauf industrie et enseignement supérieur) 30.94
435 Vendeurs en habillement et accessoires, articles de luxe, de sport, de loisirs et culturels 30.76
21 Jardiniers salariés 30.56
550 Ouvriers de l'assainissement et du traitement des déchets 30.52
621 Spécialistes de l'appareillage médical 30.49
583 Graphistes, dessinateurs, stylistes, décorateurs et créateurs de supports de communication visuelle 30.38
186 Techniciens des industries de process 30.28
271 Conducteurs routiers 30.24
364 Ingénieurs et cadres d'étude, recherche et développement (industrie) 29.66
637 Educateurs spécialisés 29.65
221 Ouvriers qualifiés polyvalents d'entretien du bâtiment 29.00
227 Techniciens et agents de maîtrise de la maintenance et de l'environnement 28.92
382 Cadres A de la fonction publique (hors spécialités juridiques) et assimilés 28.64
346 Employés et opérateurs en informatique 28.40
334 Cadres des ressources humaines et du recrutement 27.91
214 Ouvriers qualifiés de la maintenance en électricité et en électronique 27.90
597 Aides-soignants 27.38
98 Architectes 27.33
610 Vétérinaires 26.66
431 Vendeurs en ameublement, équipement du foyer, bricolage 26.32
395 Agents de polices municipales 26.31
633 Professionnels de l'orientation 26.18
116 Dessinateurs en électricité et en électronique 25.95
237 Ingénieurs et cadres de fabrication et de la production 25.90
69 Menuisiers et ouvriers de l'agencement et de l'isolation 25.59
330 Juristes 25.42
514 Cadres de l'hôtellerie et de la restauration 25.25
164 Autres ouvriers non qualifiés de type industriel 25.17
453 Représentants auprès des particuliers 25.05
137 Ouvriers non qualifiés métallerie, serrurerie, montage 24.98
180 Agents qualifiés de laboratoire 24.78
157 Ouvriers non qualifiés en métallurgie, verre, céramique et matériaux de construction 24.77
315 Agents d'accueil et d'information 23.88
149 Ouvriers non qualifiés des industries chimiques et plastiques 23.81
199 Ouvriers non qualifiés du travail du bois et de l'ameublement 23.13
661 Professeurs du secondaire 22.81
120 Ouvriers non qualifiés travaillant par enlèvement ou formage de métal 22.66
250 Ouvriers non qualifiés de l'emballage et manutentionnaires 21.72
297 Ingénieurs et cadres de la logistique, du planning et de l'ordonnancement 21.35
133 Soudeurs 21.02
153 Ouvriers non qualifiés des industries agro-alimentaires 19.19
118 Agents de maîtrise et assimilés en fabrication de matériel électrique, électronique 12.09
235 Agents de maîtrise en entretien 4.22
254 Ouvriers qualifiés du magasinage et de la manutention 2.99

It sounds sensible that these job types do not have network as their top mode of application. We expect the advice to be included in the two-star section. However, it appears that we could improve the phrasing for less qualified workers.

The job types that are at or above the 40% threshold are:


In [26]:
network_ranked_second[network_ranked_second.RECRUT_PERCENT >= 40].\
FAP_NAME.to_frame()


Out[26]:
FAP_NAME
305 Artisans et ouvriers qualifiés divers de type artisanal
565 Cadres de la communication

For these jobs, it appears relevant that the network advice is put in the user priorities.
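
The prioritisation rule discussed in this section can be summarised in a few lines. This is a sketch under the assumptions of this notebook (Network is a priority when ranked first, or when ranked second with at least 40%); `network_is_priority` is a hypothetical name and not an existing function of the product.

```python
def network_is_priority(rank, percent, second_rank_threshold=40):
    """Return True if the Network advice should be a top priority.

    `rank` and `percent` are the rank and percentage of the Network mode
    ('R2') for the user's job type.
    """
    if rank == 1:
        return True
    return rank == 2 and percent >= second_rank_threshold


# Agriculteurs salariés: Network ranked first with 41.76%.
salaried_farmers = network_is_priority(1, 41.76)
# Cadres et techniciens de la documentation: ranked second with 36.97%.
documentation = network_is_priority(2, 36.97)
# Illustrative value only: a job type with Network ranked second at 45%.
illustrative = network_is_priority(2, 45.0)
```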

Another way to personalize the Network advice could be to put an emphasis on the observed percentage for the job types for which the advantage is striking.


In [27]:
network_ranked_first = modes[(modes.APPLICATION_TYPE_ORDER == 1) & (modes.APPLICATION_TYPE_CODE ==  'R2')]
network_ranked_first_ordered = network_ranked_first[['FAP_NAME', 'RECRUT_PERCENT', 'top2_diff', 'total_modes']].\
    sort_values('RECRUT_PERCENT', ascending=False)
network_ranked_first_ordered.head(10)


Out[27]:
FAP_NAME RECRUT_PERCENT top2_diff total_modes
675 Professionnels de la politique 100.00 100.00 1
484 Charcutiers, traiteurs 100.00 100.00 1
590 Écrivains 93.29 86.58 2
370 Agents des impôts et des douanes 82.86 69.21 3
379 Autres cadres B de la fonction publique 77.01 54.02 2
277 Agents d'exploitation des transports 73.86 47.72 2
24 Viticulteurs, arboriculteurs salariés 71.21 53.56 4
533 Concierges 70.53 58.58 4
39 Cadres et maîtres d'équipage de la marine 70.21 40.42 2
570 Journalistes et cadres de l'édition 64.48 48.31 4

As an example, for the housekeepers (concierges), the network not only ranks first (70.53%) but also has a 58% difference with the application mode ranked second.

Conclusion

This dataset allows us to refine our understanding of the importance of application modes. Network is definitely a key point, but some job types have specific recruitment channels, and for others, proposing more than one mode could be beneficial: for example, jobs for which the first and second modes have been observed at similar rates and together cover a large share of the observations. Concerning the Network advice, when Network is ranked second, we can use a 40% threshold to single out users for whom Network can still be considered a higher priority. We can also use the Spontaneous Application rank or percentage to disable this advice for users for whom it would not be relevant.

General Conclusion

The dataset is super clean and ready to use.

Unfortunately, application mode definitions aren't very precise. The "Spontaneous Application" (candidature spontanée) mode might also include creating or taking over companies. Even if there are job types for which one application mode is super successful, most of the time there is less than a 20% difference between the application modes ranked first and second. Furthermore, in fifty percent of the cases the application mode ranked first accounts for less than 43% of the recruitments observed. Thus, percentages may help us get a better coverage of what is working for a given job type. Focusing on the network, it seems reasonable to set a 40% threshold on second-ranked application modes to include a relevant runner-up.

Finally, we should definitely investigate whether switching from FAP to ROME codes affects the consistency of the application mode rankings and percentages.