EventVestor: Issue Equity

In this notebook, we'll take a look at EventVestor's Issue Equity dataset, available on the Quantopian Store. This dataset spans January 01, 2007 through the current day, and documents events and announcements covering secondary equity issues by companies.

Blaze

Before we dig into the data, a note on how Quantopian Store datasets are generally accessed. They are available through an API service known as Blaze, which gives Quantopian users a convenient interface to very large datasets.

Blaze matters because some of these datasets contain many millions of records, and bringing that much data directly into Quantopian Research is not viable. Blaze lets us provide a simple querying interface and shift the burden to the server side.

It is common to use Blaze to reduce your dataset in size, convert it over to Pandas and then to use Pandas for further computation, manipulation and visualization.

Once you've limited the size of your Blaze object, you can convert it to a Pandas DataFrame using:

import pandas as pd
from odo import odo
odo(expr, pd.DataFrame)

Free samples and limits

One other key caveat: we limit the number of results returned from any given expression to 10,000 to protect against runaway memory usage. To be clear, you have access to all the data server side. We are limiting the size of the responses back from Blaze.
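
One way to stay under the response limit is to query in date windows and combine the pieces in Pandas afterwards. The sketch below only builds the windows. `yearly_windows` is a hypothetical helper, not part of Blaze or Quantopian, and it assumes that one calendar year of a dataset stays below 10,000 rows, which you would need to check with a count first.

```python
def yearly_windows(start_year, end_year):
    """Return (start, end) date strings, one pair per calendar year.

    The end date is exclusive, so consecutive windows do not overlap.
    """
    return [('{}-01-01'.format(year), '{}-01-01'.format(year + 1))
            for year in range(start_year, end_year + 1)]


windows = yearly_windows(2007, 2009)
for start, end in windows:
    # Each pair could bound a Blaze filter on asof_date
    # (start <= asof_date < end) before converting with odo.
    print(start, end)
```

Each window would then be queried and converted separately, and the resulting DataFrames concatenated with `pd.concat`.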

There is a free version of this dataset as well as a paid one. The free one includes about three years of historical data, though not up to the current day.

With the preamble in place, let's get started:


In [2]:
# import the dataset
from quantopian.interactive.data.eventvestor import issue_equity
# or if you want to import the free dataset, use:
# from quantopian.interactive.data.eventvestor import issue_equity_free

# import data operations
from odo import odo
# import other libraries we will use
import pandas as pd

In [3]:
# Let's use Blaze's dshape attribute to understand the structure of the data
issue_equity.dshape


Out[3]:
dshape("""var * {
  event_id: ?float64,
  asof_date: datetime,
  trade_date: ?datetime,
  symbol: ?string,
  event_type: ?string,
  event_headline: ?string,
  issue_amount: ?float64,
  issue_units: ?string,
  issue_stage: ?string,
  event_rating: ?float64,
  timestamp: datetime,
  sid: ?int64
  }""")

In [4]:
# And how many rows are there?
# N.B. we're using a Blaze function to do this, not len()
issue_equity.count()


Out[4]:
15985

In [5]:
# Let's see what the data looks like. We'll grab the first three rows.
issue_equity[:3]


Out[5]:
    event_id  asof_date   trade_date  symbol  event_type    event_headline                                     issue_amount  issue_units  issue_stage            event_rating  timestamp   sid
0   131337    2007-01-03  2007-01-03  WM      Issue Equity  Washington Mutual to convert its 4% convertib...   0.0000        NaN          NaN                    1             2007-01-04  19181
1   132940    2007-01-04  2007-01-04  PSA     Issue Equity  Public Storage Prices 20M depositary shares at...  500.0000      $M           NaN                    1             2007-01-05  24962
2   1158828   2007-01-06  2007-01-06  AXTI    Issue Equity  AXT Issues 0.9M Shares to Underwriters             0.8625        MShares      Underwriters Exercise  1             2007-01-07  18661

Let's go over the columns:

  • event_id: the unique identifier for this event.
  • asof_date: EventVestor's timestamp of event capture.
  • trade_date: for event announcements made before trading ends, trade_date is the same as the event date. For announcements issued after market close, trade_date is the next market open day.
  • symbol: stock ticker symbol of the affected company.
  • event_type: this should always be Issue Equity.
  • event_headline: a brief description of the event.
  • issue_amount: value of the equity issued, expressed in issue_units.
  • issue_units: units of the issue_amount, most commonly millions of dollars or millions of shares.
  • issue_stage: phase of the issue process: announcement, closing, pricing, etc. Note: currently, there appear to be unrelated entries in this column. We are speaking with the data vendor to amend this.
  • event_rating: this is always 1. The meaning of this is uncertain.
  • timestamp: this is our timestamp on when we registered the data.
  • sid: the equity's unique identifier. Use this instead of the symbol.

We've done much of the data processing for you. Fields like timestamp and sid are standardized across all our Store datasets, including every equity dataset, so the datasets are easy to combine.
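
As a rough illustration of why a shared sid makes combining easy, here is a minimal Pandas sketch. The `events` rows are copied from the sample output above; the `other` frame and its `some_signal` column are hypothetical stand-ins for a second Store dataset, with made-up values.

```python
import pandas as pd

# Two rows copied from the sample output above.
events = pd.DataFrame({
    'sid': [19181, 24962],
    'asof_date': pd.to_datetime(['2007-01-03', '2007-01-04']),
    'issue_amount': [0.0, 500.0],
})

# Hypothetical second dataset keyed the same way; values are made up.
other = pd.DataFrame({
    'sid': [24962, 19181],
    'asof_date': pd.to_datetime(['2007-01-04', '2007-01-03']),
    'some_signal': [0.7, -0.2],
})

# A left merge keeps every event and attaches matching rows from `other`.
combined = events.merge(other, on=['sid', 'asof_date'], how='left')
print(combined)
```

Joining on sid rather than symbol avoids mismatches when a ticker changes or is reused.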

We can select columns and rows with ease. Below, we'll fetch all 2015 equity issues smaller than $20M.


In [6]:
issues = issue_equity[('2014-12-31' < issue_equity['asof_date']) &
                      (issue_equity['asof_date'] < '2016-01-01') &
                      (issue_equity.issue_amount < 20) &
                      (issue_equity.issue_units == "$M")]
# When displaying a Blaze Data Object, the printout is automatically truncated to the first few rows (eleven here).
issues.sort('asof_date')


Out[6]:
    event_id  asof_date   trade_date  symbol  event_type    event_headline                                     issue_amount  issue_units  issue_stage   event_rating  timestamp   sid
0   1820118   2015-01-05  2015-01-05  STAG    Issue Equity  STAG Industrial Issues $18.5 Stock                 18.500        $M           Announcement  1             2015-01-06  41271
1   1821470   2015-01-08  2015-01-09  CERU    Issue Equity  Cerulean Pharma Issues $1M Common Stock in Pri...  1.000         $M           Announcement  1             2015-01-09  46730
2   1821647   2015-01-09  2015-01-09  ADMP    Issue Equity  Adamis Pharmaceuticals Corp. Prices $10M Commo...  10.000        $M           Pricing       1             2015-01-10  13331
3   1822765   2015-01-13  2015-01-13  ALDX    Issue Equity  Aldeyra Therapeutics to Issue $7.79M Common Stock  7.790         $M           Announcement  1             2015-01-14  46746
4   1823486   2015-01-15  2015-01-15  ALDX    Issue Equity  Aldeyra Therapeutics Completes $7.79M Common S...  7.790         $M           Closure       1             2015-01-16  46746
5   1823866   2015-01-16  2015-01-16  PRKR    Issue Equity  ParkerVision Closes Sale of $1.3M Warrants         1.300         $M           Closure       1             2015-01-17  10485
6   1824453   2015-01-20  2015-01-20  ALDX    Issue Equity  Aldeyra Therapeutics to Issue $2M Common Stock...  2.000         $M           Announcement  1             2015-01-21  46746
7   1824465   2015-01-20  2015-01-20  ANH     Issue Equity  Anworth Mortgage Asset Corp. Prices $7.35M Pre...  7.350         $M           Pricing       1             2015-01-21  18380
8   1825591   2015-01-22  2015-01-22  ALDX    Issue Equity  Aldeyra Therapeutics Completes $2M Common Stoc...  2.000         $M           Closure       1             2015-01-23  46746
9   1826474   2015-01-26  2015-01-26  CTP     Issue Equity  CTPartners Executive Search Announces $12.5M C...  12.500        $M           Announcement  1             2015-01-27  40551
10  1827204   2015-01-27  2015-01-27  WAVX    Issue Equity  Wave Systems Corp. Closes $3.6M Common Stock O...  3.583         $M           Announcement  1             2015-01-28  11869

Now suppose we want a Pandas DataFrame of the Blaze Data Object above, filtered further to announcements only and keeping just the sid, issue_amount, and asof_date columns.


In [7]:
df = odo(issues, pd.DataFrame)
df = df[df.issue_stage == "Announcement"]
df = df[['sid', 'issue_amount', 'asof_date']].dropna()
# When printing a pandas DataFrame, the first 30 and last 30 rows are displayed. The middle is truncated.
df


Out[7]:
sid issue_amount asof_date
1 41271 18.500 2015-01-05
2 46730 1.000 2015-01-08
4 46746 7.790 2015-01-13
7 46746 2.000 2015-01-20
10 40551 12.500 2015-01-26
11 11869 3.583 2015-01-27
15 40551 5.000 2015-01-30
16 46498 10.000 2015-02-03
17 16176 8.000 2015-02-04
18 32415 4.200 2015-02-04
19 22702 4.200 2015-02-04
24 46309 10.000 2015-02-09
27 21423 11.000 2015-02-11
30 35335 15.000 2015-02-11
32 24470 4.900 2015-02-13
33 8732 1.830 2015-02-17
35 32481 2.500 2015-02-20
36 36189 3.500 2015-02-23
41 36318 9.500 2015-02-26
42 41717 1.800 2015-02-26
43 39350 17.900 2015-02-27
45 31163 6.000 2015-03-03
47 39287 13.500 2015-03-04
51 3428 5.560 2015-03-10
55 5264 2.500 2015-03-13
58 17902 0.400 2015-03-20
59 9621 10.000 2015-03-20
60 8732 5.000 2015-03-20
63 34779 3.400 2015-03-30
64 34779 3.400 2015-03-30
... ... ... ...
172 46327 7.500 2015-07-16
173 29117 12.000 2015-07-17
175 23290 16.200 2015-07-23
176 48079 10.000 2015-07-27
177 42536 1.500 2015-07-27
178 28471 8.500 2015-07-28
181 27226 12.000 2015-07-29
182 45218 3.000 2015-07-30
184 41233 4.500 2015-08-03
185 45239 10.000 2015-08-03
186 26855 0.350 2015-08-06
187 15506 0.190 2015-08-07
189 37336 10.000 2015-08-13
190 41717 5.000 2015-08-14
191 39270 2.000 2015-08-17
195 28835 10.000 2015-08-19
197 42536 2.100 2015-08-21
200 45995 17.000 2015-08-24
204 30504 5.000 2015-09-01
205 21254 4.500 2015-09-04
206 46343 15.000 2015-09-08
207 30365 10.000 2015-09-08
209 1144 8.580 2015-09-14
210 48026 10.000 2015-09-14
211 28030 0.500 2015-09-17
213 48545 5.000 2015-09-23
214 9700 5.700 2015-09-24
215 16607 15.700 2015-09-24
216 28030 0.450 2015-09-24
217 31461 3.000 2015-09-25

118 rows × 3 columns
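
From here the usual Pandas tools apply. As a minimal sketch, the snippet below totals the announced amount per security. It rebuilds a small `sample` frame from the first few rows of the output above so that it runs on its own; in the notebook you would presumably call the same `groupby` on `df` directly.

```python
import pandas as pd

# A few rows copied from the output above.
sample = pd.DataFrame({
    'sid': [41271, 46730, 46746, 46746, 40551, 40551],
    'issue_amount': [18.5, 1.0, 7.79, 2.0, 12.5, 5.0],
    'asof_date': pd.to_datetime(['2015-01-05', '2015-01-08', '2015-01-13',
                                 '2015-01-20', '2015-01-26', '2015-01-30']),
})

# Total announced amount ($M) and number of announcements per security.
per_sid = sample.groupby('sid')['issue_amount'].agg(['sum', 'count'])
print(per_sid.sort_values('sum', ascending=False))
```

Grouping on sid treats repeated announcements by the same company, such as the two for sid 46746, as one security.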

