EventVestor: CEO Changes

In this notebook, we'll take a look at EventVestor's CEO Changes dataset, available on the Quantopian Store. This dataset spans January 01, 2007 through the current day.

Blaze

Before we dig into the data, a word on how you generally access Quantopian Store datasets. They are available through an API service known as Blaze, which gives the Quantopian user a convenient interface to very large datasets.

Blaze matters because some of these datasets run to many millions of records. Bringing that much data directly into Quantopian Research is not viable, so Blaze gives us a simple querying interface and shifts the burden to the server side.

It is common to use Blaze to reduce your dataset in size, convert it over to Pandas and then to use Pandas for further computation, manipulation and visualization.

Once you've limited the size of your Blaze object, you can convert it to a Pandas DataFrame using:

import pandas as pd
from odo import odo
odo(expr, pd.DataFrame)

Free samples and limits

One other key caveat: we limit the number of results returned from any given expression to 10,000 to protect against runaway memory usage. To be clear, you have access to all the data server side. We are limiting the size of the responses back from Blaze.
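
One way to keep each response under the 10,000-row cap is to query in date windows and convert each window separately. The sketch below only builds the windows. `year_windows` is a hypothetical helper, not part of Blaze or Quantopian, and the commented-out loop assumes the `ceo_change`, `odo`, and `pd` names imported later in this notebook. The CEO Changes dataset is small enough that it should not need this; the pattern is aimed at the larger Store datasets.

```python
from datetime import date

def year_windows(start, end):
    """Split [start, end) into consecutive windows that each stop at a year boundary.

    Hypothetical helper for illustration; not part of Blaze or Quantopian.
    """
    windows = []
    lo = start
    while lo < end:
        # Stop at January 1 of the next year, or at `end` if that comes first.
        hi = min(date(lo.year + 1, 1, 1), end)
        windows.append((lo, hi))
        lo = hi
    return windows

windows = year_windows(date(2007, 1, 1), date(2009, 7, 1))
for lo, hi in windows:
    print(lo.isoformat(), hi.isoformat())

# Assumed usage against the Blaze expression (not run here):
# frames = [odo(ceo_change[(ceo_change.asof_date >= str(lo)) &
#                          (ceo_change.asof_date < str(hi))], pd.DataFrame)
#           for lo, hi in windows]
# full = pd.concat(frames)
```

Each window still has to come in under the cap on its own, so a denser dataset may need monthly rather than yearly windows.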

There is a free version of this dataset as well as a paid one. The free one includes about three years of historical data, though not up to the current day.

With preamble in place, let's get started:


In [3]:
# import the dataset
from quantopian.interactive.data.eventvestor import ceo_change
# or if you want to import the free dataset, use:
# from quantopian.interactive.data.eventvestor import ceo_change_free

# import data operations
from odo import odo
# import other libraries we will use
import pandas as pd

In [4]:
# Let's use Blaze's dshape attribute to see the column names and types
ceo_change.dshape


Out[4]:
dshape("""var * {
  event_id: ?float64,
  asof_date: datetime,
  trade_date: ?datetime,
  symbol: ?string,
  event_type: ?string,
  event_headline: ?string,
  change_status: ?string,
  change_scenario: ?string,
  change_type: ?string,
  change_source: ?string,
  change_reason: ?string,
  in_ceoname: ?string,
  in_ceogender: ?string,
  out_ceoname: ?string,
  out_ceogender: ?string,
  effective_date: ?datetime,
  event_rating: ?float64,
  timestamp: datetime,
  sid: ?int64
  }""")

In [5]:
# And how many rows are there?
# N.B. we're using a Blaze function to do this, not len()
ceo_change.count()


Out[5]:
4324

In [6]:
# Let's see what the data looks like. We'll grab the first three rows.
ceo_change[:3]


Out[6]:
event_id asof_date trade_date symbol event_type event_headline change_status change_scenario change_type change_source change_reason in_ceoname in_ceogender out_ceoname out_ceogender effective_date event_rating timestamp sid
0 134628 2007-01-03 2007-01-03 HD CEO Change Home Depot CEO Steps Down Declaration In/Out Permanent Succession Resign Frank Blake, Male Robert Nardelli Male 2007-01-02 1 2007-01-04 3496
1 1133605 2007-01-04 2007-01-04 RAIL CEO Change FreightCar America CEO John E. Carroll to Reti... Proposal In/Out Permanent Outsider Resign Christian Ragot Male John E. Carroll, Jr. Male 2007-04-30 1 2007-01-05 27161
2 950064 2007-01-04 2007-01-04 VIRL CEO Change Virage Logic CEO Adam Kablanian Resigns; Appoi... Declaration In/Out Permanent Succession Out + Retained Dan McCranie Male Adam Kablanian Male 2007-01-04 1 2007-01-05 21957

Let's go over the columns:

  • event_id: the unique identifier for this CEO Change.
  • asof_date: EventVestor's timestamp of event capture.
  • trade_date: for event announcements made before trading ends, trade_date is the same as event_date. For announcements issued after market close, trade_date is the next market open day.
  • symbol: stock ticker symbol of the affected company.
  • event_type: this should always be CEO Change.
  • event_headline: a short description of the event.
  • change_status: indicates whether the change is proposed or confirmed; the sample rows above show the values Proposal and Declaration.
  • change_scenario: indicates if the CEO Change is in, out, or both.
  • change_type: indicates if the incoming CEO is interim or permanent.
  • change_source: is the incoming CEO an internal candidate, or recruited from the outside?
  • change_reason: reason for the CEO transition
  • in_ceoname: name of the incoming CEO
  • in_ceogender: gender of the incoming CEO
  • out_ceoname: name of the outgoing CEO
  • out_ceogender: gender of the outgoing CEO
  • effective_date: date as of which the CEO change is effective.
  • event_rating: this is always 1. The meaning of this is uncertain.
  • timestamp: this is our timestamp on when we registered the data.
  • sid: the equity's unique identifier. Use this instead of the symbol.
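
The trade_date rule described in the column list can be sketched as a small function. This is an illustration only: the 4:00 pm close, the weekday-only calendar (exchange holidays are ignored), the handling of weekend announcements, and the names `MARKET_CLOSE`, `next_weekday`, and `trade_date_for` are all assumptions, not EventVestor's actual logic.

```python
from datetime import datetime, time, timedelta

MARKET_CLOSE = time(16, 0)  # assumption: 4:00 pm close in exchange-local time

def next_weekday(d):
    """Return the first Monday-Friday date strictly after d (holidays ignored)."""
    d += timedelta(days=1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d

def trade_date_for(announced_at):
    """Map an announcement datetime to the first date the market can react to it."""
    d = announced_at.date()
    # Announced on a weekday before the close: the same day counts.
    if d.weekday() < 5 and announced_at.time() < MARKET_CLOSE:
        return d
    # After the close, or on a weekend (assumed): roll forward.
    return next_weekday(d)

during_hours = trade_date_for(datetime(2014, 10, 8, 10, 0))   # Wednesday morning
after_close = trade_date_for(datetime(2014, 10, 8, 17, 30))   # Wednesday evening
friday_late = trade_date_for(datetime(2014, 10, 10, 18, 0))   # Friday evening
print(during_hours, after_close, friday_late)
```

A production version would use a real exchange calendar rather than the weekday test.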

We've done much of the data processing for you. Fields like timestamp and sid are standardized across all our Store Datasets, with one sid per equity everywhere, so the datasets are easy to combine.
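
Because sid and timestamp mean the same thing in every Store dataset, combining two of them after conversion to Pandas is an ordinary merge on those keys. The sketch below uses two rows copied from the output above; the second frame, its `some_signal` column, and its values are hypothetical stand-ins for another dataset.

```python
import pandas as pd

# Two rows taken from the CEO Changes output shown earlier in this notebook.
ceo = pd.DataFrame({
    "sid": [3496, 27161],
    "timestamp": pd.to_datetime(["2007-01-04", "2007-01-05"]),
    "out_ceoname": ["Robert Nardelli", "John E. Carroll, Jr."],
})

# Hypothetical second dataset keyed the same way; the values are made up.
other = pd.DataFrame({
    "sid": [3496, 27161],
    "timestamp": pd.to_datetime(["2007-01-04", "2007-01-05"]),
    "some_signal": [0.1, -0.2],
})

# The shared keys line the rows up without any symbol cleaning.
combined = ceo.merge(other, on=["sid", "timestamp"], how="inner")
print(combined)
```

Joining on sid rather than symbol avoids mismatches when a ticker is reused or renamed.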

We can select columns and rows with ease. Below, we'll fetch all entries for Microsoft. We're really only interested in the CEO coming in, the CEO going out, and the date, so we'll display only those columns.


In [7]:
# get the sid for MSFT
symbols('MSFT')


Out[7]:
Equity(5061, symbol=u'MSFT', asset_name=u'MICROSOFT CORP', exchange=u'NASDAQ GLOBAL SELECT MARKET', start_date=u'Mon, 04 Jan 1993 00:00:00 GMT', end_date=u'Tue, 29 Sep 2015 00:00:00 GMT', first_traded=None)

In [8]:
# knowing that the MSFT sid is 5061:
msft = ceo_change[ceo_change.sid==5061][['timestamp','in_ceoname', 'out_ceoname','change_status']].sort('timestamp')
msft


Out[8]:
   timestamp   in_ceoname     out_ceoname    change_status
0  2013-08-24  NaN            Steve Ballmer  Declaration
1  2014-02-05  Satya Nadella  NaN            Declaration

Note that in_ceoname is NaN in the first row and out_ceoname is NaN in the second because there was a long transition period: Steve Ballmer announced his resignation on 2013-08-24 and formally stepped down on 2014-02-05.
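
When a transition is split across two events like this, the two rows can be collapsed into one record once the data is in Pandas. The sketch below rebuilds the two MSFT rows by hand and assumes the frame holds exactly one transition; the `transition` dictionary and `gap_days` are illustrative names, not part of the dataset.

```python
import pandas as pd

# The two MSFT rows from the output above, with NaN written as None.
msft_df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2013-08-24", "2014-02-05"]),
    "in_ceoname": [None, "Satya Nadella"],
    "out_ceoname": ["Steve Ballmer", None],
    "change_status": ["Declaration", "Declaration"],
})

# dropna() skips the missing side of each partial event.
transition = {
    "out_ceoname": msft_df["out_ceoname"].dropna().iloc[0],
    "in_ceoname": msft_df["in_ceoname"].dropna().iloc[-1],
    "announced": msft_df["timestamp"].min(),
    "completed": msft_df["timestamp"].max(),
}

# Days between the two timestamps in the dataset.
gap_days = (transition["completed"] - transition["announced"]).days
print(transition["out_ceoname"], "->", transition["in_ceoname"], gap_days)
```

For a company with several CEO changes, the same idea would need a grouping step that pairs each out-only row with the next in-only row.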

Let's try another one:


In [9]:
# get the sid for AMD
sid_amd = symbols('AMD').sid
amd = ceo_change[ceo_change.sid==sid_amd][['timestamp','in_ceoname', 'out_ceoname','change_status']].sort('timestamp')
amd


Out[9]:
   timestamp   in_ceoname      out_ceoname  change_status
0  2008-07-18  Dirk Meyer      Hector Ruiz  Declaration
1  2011-01-11  Thomas Seifert  Dirk Meyer   Declaration
2  2011-08-26  Rory P. Read    NaN          Declaration
3  2014-10-09  Lisa Su         Rory Read    Declaration

Now suppose we want to know how many CEO changes there were in the past year in which a female CEO was incoming.


In [10]:
females_in = ceo_change[ceo_change['in_ceogender']=='Female']
# Note that whenever you print a Blaze Data Object here, it will be automatically truncated to ten rows.
females_in = females_in[females_in.asof_date > '2014-09-17']
len(females_in)


Out[10]:
27

Finally, suppose we want this as a DataFrame:


In [11]:
females_in_df = odo(females_in, pd.DataFrame)
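# sort_values replaces DataFrame.sort in pandas 0.17+; on older versions use .sort('symbol', inplace=True)
females_in_df.sort_values('symbol', inplace=True)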
# let's get the first three rows
females_in_df[:3]


Out[11]:
event_id asof_date trade_date symbol event_type event_headline change_status change_scenario change_type change_source change_reason in_ceoname in_ceogender out_ceoname out_ceogender effective_date event_rating timestamp sid
17 1890286 2015-06-01 2015-06-01 AJRD CEO Change Aerojet Rocketdynes Holdings CEO Scott Seymour... Declaration In/Out Permanent Succession Retire Eileen Drake Female Scott Seymour Male NaT 1 2015-06-02 3424
2 1783327 2014-10-08 2014-10-09 AMD CEO Change Advanced Micro Devices CEO Rory Read Steps Dow... Declaration In/Out Permanent Succession Out + Retained Lisa Su Female Rory Read Male 2014-10-08 1 2014-10-09 351
15 1846426 2015-03-04 2015-03-05 AMSF CEO Change AMERISAFE Promotes COO, G. Janelle Frost to CE... Declaration In/Out Permanent Succession Out + Retained G. Janelle Frost Female C. Allen Bradley Jr. Male 2015-04-01 1 2015-03-05 27819
