In [156]:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import seaborn as sns
%matplotlib inline
We have a dataset of Pokemon. Each one has several stat values that indicate its strength; Total is simply the sum of these values.
Got this set from https://www.kaggle.com/abcsds/pokemon
Got the swarmplot idea from https://www.kaggle.com/ndrewgele/visualizing-pok-mon-stats-with-seaborn
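The cells here use a DataFrame called `data`, but the step that loads it is not shown. A minimal sketch of how it might be read, assuming the Kaggle CSV was saved locally as `Pokemon.csv` (a hypothetical path) and has the column layout used below. The helper name `load_pokemon` and the three inline rows are illustration only, so the sketch also runs without the file:

```python
import io

import pandas as pd


def load_pokemon(source):
    """Read the Pokemon CSV from a path or a file-like object (hypothetical helper)."""
    return pd.read_csv(source)


# Three illustrative rows standing in for the real file
sample_csv = io.StringIO(
    "#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary\n"
    "1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False\n"
    "4,Charmander,Fire,,309,39,52,43,60,50,65,1,False\n"
    "7,Squirtle,Water,,314,44,48,65,50,64,43,1,False\n"
)
sample = load_pokemon(sample_csv)

# With the real file this would be (path is an assumption):
# data = load_pokemon("Pokemon.csv")
```

Note that an empty Type 2 field is read as a missing value, which is why the heatmap cells further down fill it in first.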
Have a look:
In [157]:
data.head()
Out[157]:
Okay, please try to create the following images:
Use seaborn.distplot, seaborn.swarmplot and seaborn.heatmap.
For the swarmplot you may need pandas.melt, a reshape routine similar to the one in R's reshape2 package.
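If pandas.melt is new to you, here is a small sketch on a made-up two-row table (the names `wide` and `long` and the values are illustration only). It turns one column per stat into one row per (name, stat) pair, which is the long layout a swarmplot can group on:

```python
import pandas as pd

# Wide layout: one column per stat
wide = pd.DataFrame({
    "Name": ["A", "B"],
    "HP": [45, 39],
    "Attack": [49, 52],
})

# Long layout: one row per (Name, Stat) pair; the values land in a column called "value"
long = pd.melt(wide, id_vars=["Name"], value_vars=["HP", "Attack"], var_name="Stat")
print(long)
```

The `id_vars` columns are repeated for every melted row, and `var_name` names the column that holds the former column headers.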
In [142]:
from IPython.display import display, HTML
display(HTML("<h1>Okay, you don't want to do this on your own... so here is how to do it (scroll down)</h1>"))
# Push the solution below the fold with a run of line breaks
for i in range(20):
    display(HTML("<br />"))
Okay, let's go!
In [134]:
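# Pick the integer-typed columns and drop the first one (presumably the id column)
numerical_cols = [col for col in data.columns if data[col].dtype == 'int64']
numerical_cols.pop(0)
# Two plots per row; // keeps the row count and the indices integers on Python 3
f, ax = plt.subplots(len(numerical_cols) // 2 + len(numerical_cols) % 2, 2, figsize=(20,20))
for i, col in enumerate(numerical_cols):
    axx = ax[i // 2, i % 2]
    # distplot is deprecated in newer seaborn; histplot(..., kde=True) is the replacement
    sns.distplot(data[col], ax=axx)
    axx.set_title(col, fontsize=20)
f.suptitle("Distributions of Columns in Pokemon Data Set", fontsize=24)
f.savefig("figures/distributions.svg")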
Okay, nice.
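The grid arithmetic in that cell is easy to get wrong, so here is a sketch of it in isolation. The helper names `grid_rows` and `grid_position` are made up for this example; they compute the same row count and the same (row, column) index as the expressions passed to plt.subplots and used on `ax`:

```python
def grid_rows(n_items, ncols=2):
    """Number of rows needed to place n_items plots, ncols per row."""
    return n_items // ncols + (1 if n_items % ncols else 0)


def grid_position(i, ncols=2):
    """(row, column) of the i-th plot when filling the grid row by row."""
    return i // ncols, i % ncols


# Seven columns need four rows; the last cell of the grid stays empty
print(grid_rows(7))
print([grid_position(i) for i in range(7)])
```

Using `//` rather than `/` matters on Python 3, where `/` returns a float that neither plt.subplots nor array indexing accepts.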
Now a heatmap of the mean Total for each Type 1 / Type 2 combination:
In [117]:
numerical_cols.remove("Generation")
In [136]:
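for col in ['Type 1', 'Type 2']:
    # Assign back instead of using inplace=True on a single column,
    # which may not update data in newer pandas versions
    data[col] = data[col].fillna("Type not set")
# Rows are Type 1, columns are Type 2, cells are the mean Total
mean_power = data.groupby(['Type 1', 'Type 2']).Total.mean().unstack()
f = plt.figure(figsize=(20,10))
with sns.axes_style("white"):
    sns.heatmap(
        mean_power, linewidths=0.5, cmap='coolwarm'
    )
plt.gcf().savefig("figures/example_heatmap.svg")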
Nice, so we should pick Ground & Fire Pokemon for the maximum total power. **_Never choose a Pokemon with the Bug & Ghost combination!_**
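To see what the groupby/unstack step hands to the heatmap, here is a sketch on a made-up four-row table (the name `toy` and all values are invented for illustration, not taken from the data set):

```python
import pandas as pd

toy = pd.DataFrame({
    "Type 1": ["Fire", "Fire", "Bug", "Bug"],
    "Type 2": ["Ground", None, "Ghost", "Ghost"],
    "Total": [600, 400, 236, 240],
})

# Missing secondary types get their own label so they survive the groupby
toy["Type 2"] = toy["Type 2"].fillna("Type not set")

# Mean Total per (Type 1, Type 2) pair, then pivot Type 2 into columns
mean_total = toy.groupby(["Type 1", "Type 2"])["Total"].mean().unstack()
print(mean_total)
```

Combinations that never occur, such as Bug with Ground in this toy table, come out as NaN, and the heatmap leaves those cells blank.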
In [139]:
# One heatmap per stat, two per row; // keeps the row count and indices integers on Python 3
f, ax = plt.subplots(len(numerical_cols) // 2 + len(numerical_cols) % 2, 2, figsize=(20,30))
for i, col in enumerate(numerical_cols):
    axx = ax[i // 2, i % 2]
    with sns.axes_style("white"):
        sns.heatmap(data.groupby(['Type 1', 'Type 2'])[col].mean().unstack(),
                    linewidths=0.5, cmap='coolwarm', ax=axx, square=True)
    axx.set_title(col, fontsize=20)
    axx.set_xticklabels(axx.xaxis.get_majorticklabels(), rotation=45)
    axx.set_xlabel("")
    axx.set_ylabel("")
f.suptitle("Mean of Columns per Type Combination in Pokemon Data Set", fontsize=20)
Out[139]:
Now, let's try the melt feature we all know from R's reshape2 package:
In [110]:
pkmn = pd.melt(data,
               id_vars=["Name", "Type 1", "Type 2"],
               value_vars=['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed'],
               var_name="Stat")
In [111]:
pkmn.sample(20)
Out[111]:
And one fancy, so-called "swarmplot":
In [137]:
plt.figure(figsize=(12,10))
plt.ylim(0, 275)
# dodge=True replaces the older split=True argument in current seaborn
sns.swarmplot(x="Stat", y="value", data=pkmn, hue="Type 1", dodge=True, size=7)
plt.legend(bbox_to_anchor=(1, 1), loc=2, borderaxespad=0.)
plt.gcf().savefig("figures/example_swarmplot.svg")