Question:
Download the following files:
The daily_prices file contains stock ticker symbols and opening, high, low and closing prices, as well as sales volumes. The dividends file contains stock ticker symbols and dividend amounts. Only a subset of stocks have dividends, so the stocks in the files do not match one-for-one. Line one of each file is a header row that shows which column is which. For every stock that pays a dividend, calculate the Market Capitalization (volume * closing price) for each day at the stock market’s closing bell. Identify which dividend-paying stock had the highest Market Capitalization of them all.
By way of example, based on the samples below, NGI and NAV are dividend-paying stocks, but NEX is not. Calculating the Market Capitalizations (MC) for NGI and NAV, the highest MC is NGI’s with an 200592, thus the answer is NGI
35.05 * 5500 = 192774.9
35.02 * 5400 = 189108.0
41.22 * 3900 = 160758.0
41.64 * 4800 = 199872.0
41.79 * 4800 = 200592.0
SAMPLE prices file:
exchange,symbol,date,opening_price,high_price,low_price,close_price,volume,price_adj_close
AMEX,NAV,2010-04-15,35.00,35.01,35.00,35.05,5500,35.00
AMEX,NAV,2010-04-16,35.00,35.01,35.00,35.02,5400,35.00
AMEX,NEX,2010-03-15,12.00,12.01,12.00,12.00,300,12.00
AMEX,NEX,2010-03-16,12.00,12.01,12.00,12.00,310,12.00
AMEX,NEX,2010-03-17,12.00,12.01,12.00,12.00,3.11,12.00
AMEX,NGI,2011-03-07,41.49,41.49,41.22,41.22,3900,41.22
AMEX,NGI,2011-03-04,41.62,41.64,41.00,41.64,4800,41.64
AMEX,NGI,2011-03-02,40.96,40.96,41.73,41.79,4800,41.79
SAMPLE dividends file:
exchange,stock_symbol,date,dividends
AMEX,NAV,2011-01-21,0.08
AMEX,NGI,2010-12-31,0.06
In [1]:
import csv
In [2]:
from itertools import islice
In [3]:
from collections import defaultdict
In [4]:
# Works, but if/else stuff is ugly.
filename = 'AMEX_daily_prices_N.csv'
max_market_caps = {}
with open(filename) as f:
reader = csv.DictReader(f)
for row in reader:
# print(row)
company = row['stock_symbol']
volume = float(row['stock_volume'])
price = float(row['stock_price_close'])
today_market_cap = volume * price
# print(company, volume, price, today_market_cap)
if company in max_market_caps:
if today_market_cap > max_market_caps[company]:
max_market_caps[company] = today_market_cap
else:
max_market_caps[company] = today_market_cap
# print(max_market_caps)
for company, max_market_cap in sorted(
max_market_caps.items(),
key=lambda item: item[1])[-3:]:
print(company, max_market_cap)
print(max(max_market_caps.values()))
In [5]:
# Used .get() method of dictionary to avoid if/else
# and make code more readable.
filename = 'AMEX_daily_prices_N.csv'
max_market_caps = {}
with open(filename) as f:
reader = csv.DictReader(f)
for row in reader:
# print(row)
company = row['stock_symbol']
volume = float(row['stock_volume'])
price = float(row['stock_price_close'])
today_market_cap = volume * price
# print(company, volume, price, today_market_cap)
if today_market_cap > max_market_caps.get(company, 0.):
max_market_caps[company] = today_market_cap
# print(max_market_caps)
for company, max_market_cap in sorted(
max_market_caps.items(),
key=lambda item: item[1])[-3:]:
print(company, max_market_cap)
print(max(max_market_caps.values()))
In [6]:
# Used max() to get rid of the if statement altogether.
# Now the ugliest thing is the .get() method call.
filename = 'AMEX_daily_prices_N.csv'
max_market_caps = {}
with open(filename) as f:
reader = csv.DictReader(f)
for row in reader:
# print(row)
company = row['stock_symbol']
volume = float(row['stock_volume'])
price = float(row['stock_price_close'])
today_market_cap = volume * price
# print(company, volume, price, today_market_cap)
max_market_caps[company] = max(
max_market_caps.get(company, 0.), today_market_cap)
# print(max_market_caps)
for company, max_market_cap in sorted(
max_market_caps.items(),
key=lambda item: item[1])[-3:]:
print(company, max_market_cap)
print(max(max_market_caps.values()))
In [7]:
# Used defaultdict to avoid need for .get() method call.
# That's as readable as I know how to make that part of the code.
filename = 'AMEX_daily_prices_N.csv'
max_market_caps = defaultdict(float)
with open(filename) as f:
reader = csv.DictReader(f)
for row in reader:
# print(row)
company = row['stock_symbol']
volume = float(row['stock_volume'])
price = float(row['stock_price_close'])
today_market_cap = volume * price
# print(company, volume, price, today_market_cap)
max_market_caps[company] = max(
max_market_caps[company], today_market_cap)
# print(max_market_caps)
for company, max_market_cap in sorted(
max_market_caps.items(),
key=lambda item: item[1])[-3:]:
print(company, max_market_cap)
print(max(max_market_caps.values()))
In [8]:
filename = 'AMEX_dividends_N.csv'
earning_companies = set()
with open(filename) as f:
reader = csv.DictReader(f)
for row in reader:
# print(row)
company = row['stock_symbol']
dividends = float(row['dividends'])
if dividends > 0:
earning_companies |= {company}
print('dividend companies:', earning_companies)
print(
'dividend companies without market cap:',
earning_companies - set(max_market_caps))
print(
'non-dividend companies:',
set(max_market_caps) - earning_companies)
In [9]:
max(
max_market_caps[company]
for company in earning_companies & set(max_market_caps))
Out[9]:
In [10]:
best_companies = []
highest_market_cap = 0.
for company, market_cap in max_market_caps.items():
if company not in earning_companies:
continue
if market_cap > highest_market_cap:
highest_market_cap = market_cap
best_companies = [company]
elif market_cap == highest_market_cap:
best_companies.append(company)
print(best_companies, highest_market_cap)
This would have been a more interesting problem if one of the non-earning companies had the highest market cap.
If I cared more about really close ties, I would use decimal.Decimal instead of floats. Another approach, would have been to scale prices to integer pennies.