Puzzle H1

Question:

Download the following files:

The daily_prices file contains stock ticker symbols and opening, high, low and closing prices, as well as sales volumes. The dividends file contains stock ticker symbols and dividend amounts. Only a subset of stocks have dividends, so the stocks in the files do not match one-for-one. Line one of each file is a header row that shows which column is which. For every stock that pays a dividend, calculate the Market Capitalization (volume * closing price) for each day at the stock market’s closing bell. Identify which dividend-paying stock had the highest Market Capitalization of them all.

By way of example, based on the samples below, NGI and NAV are dividend-paying stocks, but NEX is not. Calculating the Market Capitalizations (MC) for NGI and NAV, the highest MC is NGI’s with an 200592, thus the answer is NGI

35.05 * 5500 = 192774.9
35.02 * 5400 = 189108.0
41.22 * 3900 = 160758.0
41.64 * 4800 = 199872.0
41.79 * 4800 = 200592.0

SAMPLE prices file:

exchange,symbol,date,opening_price,high_price,low_price,close_price,volume,price_adj_close

AMEX,NAV,2010-04-15,35.00,35.01,35.00,35.05,5500,35.00 AMEX,NAV,2010-04-16,35.00,35.01,35.00,35.02,5400,35.00 AMEX,NEX,2010-03-15,12.00,12.01,12.00,12.00,300,12.00 AMEX,NEX,2010-03-16,12.00,12.01,12.00,12.00,310,12.00 AMEX,NEX,2010-03-17,12.00,12.01,12.00,12.00,3.11,12.00 AMEX,NGI,2011-03-07,41.49,41.49,41.22,41.22,3900,41.22 AMEX,NGI,2011-03-04,41.62,41.64,41.00,41.64,4800,41.64 AMEX,NGI,2011-03-02,40.96,40.96,41.73,41.79,4800,41.79 SAMPLE dividends file: exchange,stock_symbol,date,dividends
AMEX,NAV,2011-01-21,0.08 AMEX,NGI,2010-12-31,0.06


In [1]:
import csv

In [2]:
from itertools import islice

In [3]:
from collections import defaultdict

In [4]:
# Works, but if/else stuff is ugly.

filename = 'AMEX_daily_prices_N.csv'
max_market_caps = {}

with open(filename) as f:
    reader = csv.DictReader(f)
    for row in reader:
        # print(row)
        company = row['stock_symbol']
        volume = float(row['stock_volume'])
        price = float(row['stock_price_close'])
        today_market_cap = volume * price
        # print(company, volume, price, today_market_cap)
        if company in max_market_caps:
            if today_market_cap > max_market_caps[company]:
                max_market_caps[company] = today_market_cap
        else:
            max_market_caps[company] = today_market_cap

# print(max_market_caps)
for company, max_market_cap in sorted(
        max_market_caps.items(),
        key=lambda item: item[1])[-3:]:
    print(company, max_market_cap)
print(max(max_market_caps.values()))


NGS 87652530.0
NG 263577500.0
NVR 308978000.0
308978000.0

In [5]:
# Used .get() method of dictionary to avoid if/else
# and make code more readable.

filename = 'AMEX_daily_prices_N.csv'
max_market_caps = {}

with open(filename) as f:
    reader = csv.DictReader(f)
    for row in reader:
        # print(row)
        company = row['stock_symbol']
        volume = float(row['stock_volume'])
        price = float(row['stock_price_close'])
        today_market_cap = volume * price
        # print(company, volume, price, today_market_cap)
        if today_market_cap > max_market_caps.get(company, 0.):
            max_market_caps[company] = today_market_cap

# print(max_market_caps)
for company, max_market_cap in sorted(
        max_market_caps.items(),
        key=lambda item: item[1])[-3:]:
    print(company, max_market_cap)
print(max(max_market_caps.values()))


NGS 87652530.0
NG 263577500.0
NVR 308978000.0
308978000.0

In [6]:
# Used max() to get rid of the if statement altogether.
# Now the ugliest thing is the .get() method call.

filename = 'AMEX_daily_prices_N.csv'
max_market_caps = {}

with open(filename) as f:
    reader = csv.DictReader(f)
    for row in reader:
        # print(row)
        company = row['stock_symbol']
        volume = float(row['stock_volume'])
        price = float(row['stock_price_close'])
        today_market_cap = volume * price
        # print(company, volume, price, today_market_cap)
        max_market_caps[company] = max(
            max_market_caps.get(company, 0.), today_market_cap)
# print(max_market_caps)
for company, max_market_cap in sorted(
        max_market_caps.items(),
        key=lambda item: item[1])[-3:]:
    print(company, max_market_cap)
print(max(max_market_caps.values()))


NGS 87652530.0
NG 263577500.0
NVR 308978000.0
308978000.0

In [7]:
# Used defaultdict to avoid need for .get() method call.
# That's as readable as I know how to make that part of the code.

filename = 'AMEX_daily_prices_N.csv'
max_market_caps = defaultdict(float)

with open(filename) as f:
    reader = csv.DictReader(f)
    for row in reader:
        # print(row)
        company = row['stock_symbol']
        volume = float(row['stock_volume'])
        price = float(row['stock_price_close'])
        today_market_cap = volume * price
        # print(company, volume, price, today_market_cap)
        max_market_caps[company] = max(
            max_market_caps[company], today_market_cap)
# print(max_market_caps)
for company, max_market_cap in sorted(
        max_market_caps.items(),
        key=lambda item: item[1])[-3:]:
    print(company, max_market_cap)
print(max(max_market_caps.values()))


NGS 87652530.0
NG 263577500.0
NVR 308978000.0
308978000.0

In [8]:
filename = 'AMEX_dividends_N.csv'

earning_companies = set()

with open(filename) as f:
    reader = csv.DictReader(f)
    for row in reader:
        # print(row)
        company = row['stock_symbol']
        dividends = float(row['dividends'])
        if dividends > 0:
            earning_companies |= {company}
    
print('dividend companies:', earning_companies)
print(
    'dividend companies without market cap:',
    earning_companies - set(max_market_caps))
print(
    'non-dividend companies:',
    set(max_market_caps) - earning_companies)


dividend companies: {'NLR', 'NZF', 'NWF', 'NPN', 'NVJ', 'NEN', 'NWI', 'NFC', 'NCB', 'NVX', 'NHC', 'NMB', 'NRB', 'NXE', 'NZX', 'NGK', 'NXK', 'NXI', 'NKG', 'NKR', 'NOX', 'NGO', 'NYF', 'NPG', 'NDD', 'NKL', 'NBO', 'NKX', 'NBJ', 'NFM', 'NLP', 'NXJ', 'NZW', 'NGX', 'NVY', 'NUJ', 'NPS', 'NCU', 'NNB', 'NEA', 'NFZ', 'NYH', 'NBH', 'NBW', 'NZH', 'NMZ', 'NOM', 'NXM', 'NZR', 'NNO', 'NVG', 'NXZ', 'NII', 'NFO', 'NRO', 'NJV', 'NKO', 'NYV', 'NGB', 'NRK', 'NVR'}
dividend companies without market cap: set()
non-dividend companies: {'NEP', 'NBS', 'NGD', 'NSU', 'NWD', 'NOG', 'NTN', 'NIV', 'NG', 'NGS', 'NXG', 'NAK', 'NBY', 'NHR', 'NOW', 'NVD'}

In [9]:
max(
    max_market_caps[company]
    for company in earning_companies & set(max_market_caps))


Out[9]:
308978000.0

In [10]:
best_companies = []
highest_market_cap = 0.
for company, market_cap in max_market_caps.items():
    if company not in earning_companies:
        continue
    if market_cap > highest_market_cap:
        highest_market_cap = market_cap
        best_companies = [company]
    elif market_cap == highest_market_cap:
        best_companies.append(company)

print(best_companies, highest_market_cap)


['NVR'] 308978000.0

This would have been a more interesting problem if one of the non-earning companies had the highest market cap.

If I cared more about really close ties, I would use decimal.Decimal instead of floats. Another approach, would have been to scale prices to integer pennies.