In [2]:
%matplotlib inline
In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [4]:
sns.set_context('notebook', font_scale=1.5)
1. (40 points) Read in the CSV file pokemon.csv in the local directory (Source: Kaggle). Do the following:
- [...] Type 2 without creating a copy of the data frame, i.e. in-place (5 points)
- [...] Speed in-place (5 points)
- [...] value = 3*HP + 2*Attack + 1*Defense (5 points)
- [...] Forme in the Name column in-place (5 points)
- [...] Attack and Defense attributes of all the Type 1 AND Generation subgroups. For instance, one such group would be (Grass, 1). (10 points)

Note: If you change the data frame, print out the first 3 rows after each change with the head method.
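As a hedged illustration of the in-place operations this question points at, here is a minimal sketch on a small made-up data frame. The rows are hypothetical stand-ins for pokemon.csv, and the reading of each step (drop a column, sort, add a column, drop matching rows, summarise per group with mean and standard deviation) is an assumption, not the reference answer.

```python
import pandas as pd

# Hypothetical toy rows standing in for pokemon.csv; the real data would be
# loaded with pd.read_csv('pokemon.csv').
df = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Charmander',
             'DeoxysAttack Forme', 'Chikorita'],
    'Type 1': ['Grass', 'Grass', 'Fire', 'Psychic', 'Grass'],
    'Type 2': ['Poison', 'Poison', None, None, None],
    'HP': [45, 60, 39, 50, 45],
    'Attack': [49, 62, 52, 180, 49],
    'Defense': [49, 63, 43, 20, 65],
    'Speed': [45, 60, 65, 150, 45],
    'Generation': [1, 1, 1, 3, 2],
})

# Assumed reading: remove the Type 2 column without copying the frame.
df.drop(columns='Type 2', inplace=True)

# Assumed reading: sort by Speed in place (ascending is an arbitrary choice).
df.sort_values('Speed', inplace=True)

# New column computed from the formula given in the question.
df['value'] = 3 * df['HP'] + 2 * df['Attack'] + 1 * df['Defense']

# Assumed reading: remove rows whose Name contains 'Forme', in place.
forme_rows = df[df['Name'].str.contains('Forme')].index
df.drop(index=forme_rows, inplace=True)

# Assumed statistics: mean and standard deviation of Attack and Defense
# for every (Type 1, Generation) subgroup.
stats = (df.groupby(['Type 1', 'Generation'])[['Attack', 'Defense']]
           .agg(['mean', 'std']))

print(df.head(3))
print(stats)
```

Passing inplace=True makes each call modify df directly and return None, so the result should not be assigned back to df.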
In [5]:
# Your answer here
2. (30 points) Using the same Pokemon data frame, do the following:
- [...] Name, Type 1, Generation, Feature, Score, where Name, Type 1 and Generation have the same meaning as in the original data frame, Feature is a column containing one of the strings HP, Attack, Defense, Sp. Atk, Sp. Def, Speed, and Score is the numerical value of the feature. This is known as going from wide to tall format. In R, this operation can be done using the gather function from the tidyr package. (10 points)
- [...] seaborn package, create a grid of box plots where the x-axis shows the Feature values, the y-axis shows the Score, the rows are the Type 1 values, and the columns are the Generation values. (10 points)
- [...] seaborn, make a cluster map showing the mean values of HP, Attack, Defense, Sp. Atk, Sp. Def and Speed for each Type 1 Pokemon. Rotate the Type 1 labels so they are readable. (10 points)
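The wide-to-tall reshaping and the two plots described above could be sketched as follows. The toy rows are hypothetical stand-ins for the Pokemon data frame, the helper names plot_box_grid and plot_cluster_map are made up for illustration, and the choice of DataFrame.melt, sns.catplot and sns.clustermap is one plausible approach rather than the required one.

```python
import pandas as pd

features = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed']

# Hypothetical toy rows standing in for the Pokemon data frame.
df = pd.DataFrame({
    'Name': ['Bulbasaur', 'Charmander', 'Chikorita'],
    'Type 1': ['Grass', 'Fire', 'Grass'],
    'Generation': [1, 1, 2],
    'HP': [45, 39, 45],
    'Attack': [49, 52, 49],
    'Defense': [49, 43, 65],
    'Sp. Atk': [65, 60, 49],
    'Sp. Def': [65, 50, 65],
    'Speed': [45, 65, 45],
})

# Wide to tall: one row per (Pokemon, Feature) pair, similar in spirit
# to tidyr's gather in R.
tall = df.melt(id_vars=['Name', 'Type 1', 'Generation'],
               value_vars=features,
               var_name='Feature', value_name='Score')

# Mean of every feature for each Type 1, the input for the cluster map.
means = df.groupby('Type 1')[features].mean()


def plot_box_grid(tall):
    """Grid of box plots: rows are Type 1, columns are Generation."""
    # Imported here so the reshaping above runs without a plotting backend.
    import seaborn as sns
    return sns.catplot(data=tall, x='Feature', y='Score',
                       row='Type 1', col='Generation', kind='box')


def plot_cluster_map(means):
    """Cluster map of mean feature values with horizontal row labels."""
    import matplotlib.pyplot as plt
    import seaborn as sns
    grid = sns.clustermap(means)
    # The Type 1 labels sit on the heatmap's y-axis; rotate them to read.
    plt.setp(grid.ax_heatmap.get_yticklabels(), rotation=0)
    return grid
```

In the notebook, calling plot_box_grid(tall) and plot_cluster_map(means) in a cell would render the figures inline.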
In [14]:
# Your answer here
3. (30 points) Read in the CSV file pokemonGo.csv in the local directory (Source: Kaggle). Do the following:
- [...] pokemon.csv and pokemonGO.csv files. Drop any row that does not have Name, Type 1 and Type 2 values that are exactly the same in both data frames. (10 points)
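One way to keep only rows whose Name, Type 1 and Type 2 agree in both data frames is an inner merge on those three columns. The sketch below uses made-up rows, assumes the second file uses the same three column names, and adds a hypothetical Max CP column with invented values to show that the other columns of both frames are carried along.

```python
import pandas as pd

# Hypothetical stand-in for pokemon.csv.
pokemon = pd.DataFrame({
    'Name': ['Bulbasaur', 'Pidgey', 'Zubat'],
    'Type 1': ['Grass', 'Normal', 'Poison'],
    'Type 2': ['Poison', 'Flying', 'Flying'],
    'HP': [45, 40, 40],
})

# Hypothetical stand-in for the second file; the Max CP column and its
# values are invented, and Pidgey's Type 2 is altered on purpose so the
# row no longer matches.
pokemon_go = pd.DataFrame({
    'Name': ['Bulbasaur', 'Pidgey', 'Weedle'],
    'Type 1': ['Grass', 'Normal', 'Bug'],
    'Type 2': ['Poison', 'Ground', 'Poison'],
    'Max CP': [1000, 600, 400],
})

# An inner merge keeps a row only when all three key columns are equal
# in both frames; every other row is dropped.
merged = pokemon.merge(pokemon_go, on=['Name', 'Type 1', 'Type 2'],
                       how='inner')
print(merged)
```

With real data, missing Type 2 values deserve a check before merging, since how they are matched affects which rows survive.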
In [10]:
# Your answer here