Big 3 statistics

uslisted.txt: Downloaded from Orbis, file containing the following fields:

'Company name'
'Country ISO Code',
'NACE Rev. 2 Core code (4 digits)',
'Operating revenue (Turnover) th USD Last avail. yr',
'Number of employees Last avail. yr',
'GUO - Name',
'Ticker symbol',
'Stock exchange(s) listed',
'Shareholder - BvD ID number',
'Shareholder - Direct %',
'Shareholder - Total %', 'BvD ID number',
'Current market capitalisation th USD',
'Shareholder - Name',
'Total assets (last value) th USD',
'Type of entity'

IMPORTANT: This data is not provided, only the end result. big3_position.csv, that can be used to replicate all figures (step 3-5). big3_position.csv has the following data:

Company_name:
Company_ID: Orbis ID
Big3Share: Share of the big3 together
Position: Position of the big 3 among all shareholders
Revenue: Firm's revenue
Assets: Firm's assets
Employees: Firm's number of employees
MarketCap: Firm's market capitalization
Exchange: Firm's exchange
TypeEnt: Firm's type of entity

We excluded:

Non-US exhanges and private US exchanges. The exchanges that were remaining were: "'NYSE MKT','NYSE ARCA','NASDAQ/NMS (Global Market)','NASDAQ National Market', 'New York Stock Exchange (NYSE)'
Private equity firms and Funds. We added JPMORGAN CHASE & CO (BvD ID US132624428) by hand since Orbis has mistakenly classified this very large US bank as a Private Equity firm.
US041867445 (State Street Bank and Trust Co) because this subsidiary of State Street acts as a custodian, holding the shares for the ultimate owners.
Public ownership (many small ownership combines) and owners whose ID start by ZZ (no people or companies).



In [10]:

    
#Install libraries needed
!pip install --upgrade pip
!pip install pandas
!pip install numpy









    



Requirement already up-to-date: pip in /home/shared/anaconda3/lib/python3.5/site-packages
Requirement already satisfied (use --upgrade to upgrade): pandas in /home/shared/anaconda3/lib/python3.5/site-packages
Requirement already satisfied (use --upgrade to upgrade): python-dateutil>=2 in /home/shared/anaconda3/lib/python3.5/site-packages (from pandas)
Requirement already satisfied (use --upgrade to upgrade): pytz>=2011k in /home/shared/anaconda3/lib/python3.5/site-packages (from pandas)
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7.0 in /home/shared/anaconda3/lib/python3.5/site-packages (from pandas)
Requirement already satisfied (use --upgrade to upgrade): six>=1.5 in /home/shared/anaconda3/lib/python3.5/site-packages (from python-dateutil>=2->pandas)
Requirement already satisfied (use --upgrade to upgrade): numpy in /home/shared/anaconda3/lib/python3.5/site-packages



In [8]:

    
#Import libraries needed
import pandas as pd
import numpy as np
from collections import Counter

Show types of companies



In [111]:

    
df = pd.read_csv("uslisted.txt",encoding="utf-16",sep="\t",na_values = ["-","n.a."],thousands=",")
df = df.loc[df["Stock exchange(s) listed"].
            isin([ 'NYSE MKT','NYSE ARCA','NASDAQ/NMS (Global Market)','NASDAQ National Market',
                  'New York Stock Exchange (NYSE)'])]
df = df.drop_duplicates(subset="BvD ID number")
c= Counter(df["Type of entity"])
c









    Out[111]:





Counter({'Bank': 448,
         'Financial company': 337,
         'Foundation/Research institute': 3,
         'Industrial company': 3002,
         'Insurance company': 96,
         'Mutual and pension fund/Nominee/Trust/Trustee': 932,
         'Private equity firm': 38,
         'Venture capital': 5})

Step 1. Include only publicly listed companies from the US

Keep only larger shareholder percentage between direct and total shareholder percentages
In case of a name code instead of percentage convert as follow: {"NG": 0.01,"MO": 50.01, "WO": 98.01, "GP": 50.01,">50.00":50.01}



In [19]:

    
df = pd.read_csv("uslisted.txt",encoding="utf-16",sep="\t",na_values = ["-","n.a."],thousands=",")
df = df.loc[df["Stock exchange(s) listed"].isin([ 'NYSE MKT','NYSE ARCA','NASDAQ/NMS (Global Market)','NASDAQ National Market','New York Stock Exchange (NYSE)'])]
df = df.loc[df["Type of entity"].isin(['Foundation/Research institute', 'Bank', 'Venture capital', 'Financial company', 
                                'Industrial company', 'Insurance company']),:]
df = df.loc[df["Shareholder - BvD ID number"] != "US041867445",:]
df = df.loc[df["Shareholder - Name"] != "PUBLIC",:]


companies = df.loc[:,["Company name","BvD ID number","Operating revenue (Turnover) th USD Last avail. yr",
                "Total assets (last value) th USD",'Number of employees Last avail. yr',
                "Current market capitalisation th USD","Stock exchange(s) listed","Type of entity"]].drop_duplicates()
ownership = df.loc[:,["Company name","BvD ID number","Shareholder - Name","Shareholder - BvD ID number",
                "Shareholder - Direct %","Shareholder - Total %"]]

        

d = {np.NaN: np.NaN,"NG": 0.01,"MO": 50.01, "WO": 98.01, "GP": 50.01,">50.00":50.01}


#NG:0 MO: 50.01 WO: 98.1 GP"
for i in sorted([_ for _ in set(ownership["Shareholder - Direct %"]) | set(ownership["Shareholder - Total %"]) if isinstance(_,str)]):
    if not d.get(i): 
        try:
            d[i] = float(i)
        except:
            d[i] = float(i[1:])
ownership["Shareholder - Direct %"] = ownership["Shareholder - Direct %"].apply(lambda x: d[x])
ownership["Shareholder - Total %"] = ownership["Shareholder - Total %"].apply(lambda x: d[x])

ownership["max"] = ownership.apply(lambda x: np.nanmax([x["Shareholder - Direct %"],x["Shareholder - Total %"]]),axis=1)









    



/home/shared/anaconda3/lib/python3.5/site-packages/numpy/lib/nanfunctions.py:326: RuntimeWarning: All-NaN slice encountered
  warnings.warn("All-NaN slice encountered", RuntimeWarning)

Step2: Create file with the sum of big3 and the largest shareholder



In [20]:

    
#"US041867445"
with open("big3_position.csv","w+") as f:
    f.write("{}\t{}\t{}\t{}\t{}\t{}\t{}\t{}\t{}\t{}\n".
            format("Company_name","Company_ID","Big3Share","Position","Revenue","Assets","Employees","MarketCap","Exchange","TypeEnt"))
    for id,g in ownership.groupby("BvD ID number"):
        sum_big3 = g.loc[g["Shareholder - BvD ID number"].isin(['US149144472L', 'US320174431', 'US042456637']),"max"].sum()
        t = g.loc[g["Shareholder - BvD ID number"] != "US041867445",:].sort_values(by="max",ascending=False,na_position="last")
        if sum_big3 == 0: position = 100
        else: position = 1
        for i,values in t.iterrows():
            if isinstance(values["Shareholder - BvD ID number"],float): continue
            if values.values[3][:2] != "ZZ":
                if values.values[-1] >=sum_big3: position+=1
                else: break
        r,a,e,m,exchange,typeent = companies.loc[companies["BvD ID number"] == values["BvD ID number"],:].values[0][-6:]
        #print(companies.loc[companies["BvD ID number"] == values["BvD ID number"],:].values[0])
        
        f.write("{}\t{}\t{}\t{}\t{}\t{}\t{}\t{}\t{}\t{}\n".format(values["Company name"],values["BvD ID number"],sum_big3,position,r,a,e,m,exchange,typeent))

Step 3: Print stats to create the figure on ownership A and B



In [21]:

    
df = pd.read_csv("big3_position.csv",sep="\t")
from collections import Counter
c = Counter(df["Position"])

sumc = np.sum([c[_] for _ in range(0,1007)])/100
print("Percentage of companies and percentage of market capitalization")
print(c[1]/sumc,100*np.sum(df.loc[df["Position"] == 1,"MarketCap"]/np.sum(df["MarketCap"])))
print(c[2]/sumc,100*np.sum(df.loc[df["Position"] == 2,"MarketCap"]/np.sum(df["MarketCap"])))
print(c[3]/sumc,100*np.sum(df.loc[df["Position"] == 3,"MarketCap"]/np.sum(df["MarketCap"])))
print((np.sum([c[_] for _ in range(4,1007)])/sumc),
      (100*np.sum(df.loc[~df["Position"].isin([1,2,3]),"MarketCap"]/np.sum(df["MarketCap"]))))
print()
print("Number of companies and capitalization (billions)")
print(c[1],np.sum(df.loc[df["Position"] == 1,"MarketCap"])*1000/1E9)
print(c[2],np.sum(df.loc[df["Position"] == 2,"MarketCap"])*1000/1E9)
print(c[3],np.sum(df.loc[df["Position"] == 3,"MarketCap"])*1000/1E9)
print(sumc*100-c[1]-c[2]-c[3], (np.sum(df["MarketCap"])
      -np.sum(df.loc[df["Position"] == 1,"MarketCap"])
      -np.sum(df.loc[df["Position"] == 2,"MarketCap"])
      -np.sum(df.loc[df["Position"] == 3,"MarketCap"]))*1000/1E9)
print()

c.most_common(10)









    



Percentage of companies and percentage of market capitalization
42.8129829985 77.90619792198156
8.65533230294 10.097049721301994
7.5476558475 6.306088395705048
40.9840288511 5.6906639610114595

Number of companies and capitalization (billions)
1662 17258.42913
336 2236.78246
293 1396.977167
1591.0 1260.643226







    Out[21]:





[(1, 1662),
 (2, 336),
 (3, 293),
 (4, 210),
 (5, 166),
 (6, 117),
 (7, 97),
 (8, 88),
 (10, 75),
 (11, 72)]

Step 4: Create percentages for figure on ownership C



In [24]:

    
print("Ownership of each member of the big three and sum of means")
for i in [1,2,3]:
    df2 = df.loc[df["Position"] == i,:]
    o = pd.merge(ownership,companies,on="BvD ID number")
    o = pd.merge(o,df2,left_on="BvD ID number",right_on="Company_ID")
    o["x"] = o["max"]*o["Current market capitalisation th USD"]

    v = o.loc[o["Shareholder - BvD ID number"] == 'US149144472L',"max"].mean() #V
    b = o.loc[o["Shareholder - BvD ID number"] ==  'US320174431',"max"].mean() #BLK
    s = o.loc[o["Shareholder - BvD ID number"] == 'US042456637',"max"].mean() #SS

    print(1*v,1*b,1*s,1*(v+b+s))

df2 = df.loc[~df["Position"].isin([1,2,3]),:]
o = pd.merge(ownership,companies,on="BvD ID number")
o = pd.merge(o,df2,left_on="BvD ID number",right_on="Company_ID")
o["x"] = o["max"]*o["Current market capitalisation th USD"]

v = o.loc[o["Shareholder - BvD ID number"] == 'US149144472L',"max"].mean() #V
b = o.loc[o["Shareholder - BvD ID number"] ==  'US320174431',"max"].mean() #BLK
s = o.loc[o["Shareholder - BvD ID number"] == 'US042456637',"max"].mean() #SS

print(1*v,1*b,1*s,1*(v+b+s))
print()

print("Mean of sum of Ownership of each member of the big three")
for i in [1,2,3]:
    print(1*df.loc[df["Position"] == i,"Big3Share"].mean())
    
print(1*df.loc[~df["Position"].isin([1,2,3]),"Big3Share"].mean())









    



Ownership of each member of the big three and sum of means
7.018266908212566 7.6068421052631665 3.0280876979293576 17.65319671140509
4.533942028985508 5.1637681159420294 1.8135483870967744 11.511258532024312
3.733272727272727 4.224657534246572 1.4846963562753035 9.442626617794602
1.9575968109339408 2.059178217821779 0.8350976562499998 4.851872685005719

Mean of sum of Ownership of each member of the big three
17.637190132370637
11.630684523809517
8.965767918088739
2.656260213702078

Step 5: Create network of ownership based on position in the network



In [29]:

    
df = pd.read_csv("big3_position.csv",sep="\t")

endogenous_own = ownership.copy()
                                                                                  
endogenous_own = endogenous_own.loc[endogenous_own["max"]>=3.]
endogenous_own = endogenous_own.loc[endogenous_own["BvD ID number"] != endogenous_own["Shareholder - BvD ID number"],:]
edges = endogenous_own.loc[:,["BvD ID number","Shareholder - BvD ID number","max"]]

edges.columns = ["Source","Target","Weight"]
edges["Type"] = "Directed"


e1 = endogenous_own[["BvD ID number","Company name"]]
e1.columns = ["Id","Label"]
e2 = endogenous_own[["Shareholder - BvD ID number","Shareholder - Name"]]
e2.columns = ["Id","Label"]
nodes = pd.concat([e1,e2]).drop_duplicates()
nodes = pd.merge(nodes,df,left_on="Id",right_on="Company_ID")

nodes = nodes[["Id","Label","Position","Exchange","TypeEnt"]]

for i in range(4,1000):
    d[i] = 4
d[1] = 1
d[2] = 2
d[3] = 3
    

nodes["Position"] = nodes["Position"].apply(lambda x: d[x])

nodes.loc[nodes["Id"] == "US320174431","Position"] = 0
nodes.loc[nodes["Id"] == "US042456637","Position"] = 0
#VAnguard and FMR added by hand to have their names in the file
nodes.loc[122312] = ["US149144472L","VANGUARD INC via its funds",0,"None","Bank"]
nodes.loc[122313] = ["US126246544L","FMR LLC",0,"None","Bank"]

nodes.to_csv("nodes_allmarket.csv",sep="\t",index = None)
edges.to_csv("edges_allmarket.csv",sep="\t",index = None)









    



False
False



In [ ]:



In [ ]: