In [1]:

    
import pandas as pd
import matplotlib
import numpy
%matplotlib inline



In [2]:

    
df = pd.read_csv("heights_weights_genders.csv")



In [12]:

    
df.head()



In [20]:

    
df.groupby('Gender').corr()



In [19]:

    
df.groupby('Gender').plot(kind='scatter', x='Height', y='Weight')









    Out[19]:





Gender
Female    Axes(0.125,0.125;0.775x0.775)
Male      Axes(0.125,0.125;0.775x0.775)
dtype: object

For both genders, height and weight are strongly positively correlated. The correlation is slightly higher for men (r = 0.862979) than for women (r = 0.849609).
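To make explicit what the r values measure, here is a minimal sketch of Pearson's correlation coefficient computed by hand. The helper name `pearson_r` and the sample numbers are assumptions for illustration only; they are made-up values, not rows from heights_weights_genders.csv.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative values, NOT taken from heights_weights_genders.csv
heights = [60.0, 62.0, 64.0, 66.0, 68.0]
weights = [110.0, 120.0, 125.0, 140.0, 150.0]

r = pearson_r(heights, weights)
print(round(r, 4))
```

pandas' `corr()` uses the Pearson method by default, so `df.groupby('Gender').corr()` reports this same quantity for each gender. A value near 1 means the points lie close to an upward-sloping line.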

    Out[12]:

	Gender	Height	Weight
0	Male	73.847017	241.893563
1	Male	68.781904	162.310473
2	Male	74.110105	212.740856
3	Male	71.730978	220.042470
4	Male	69.881796	206.349801

    Out[20]:

Gender		Height	Weight
Female	Height	1.000000	0.849609
Female	Weight	0.849609	1.000000
Male	Height	1.000000	0.862979
Male	Weight	0.862979	1.000000