This is a "closed book" examination - in particular, you are not to use any resources outside of this notebook (except possibly pen and paper). You may consult help from within the notebook using ? but not any online references. You should turn wireless off or set your laptop in "Airplane" mode prior to taking the exam.
You have 2 hours to complete the exam.
In [24]:
%matplotlib inline
Q1 (10 points).
- Read the data/iris.csv data set into a Pandas DataFrame. Display the first 4 lines of the DataFrame. (2 points)
- [...] SepalLength, SepalWidth, PetalLength and PetalWidth for the 3 different types of irises. (4 points)
- Plot SepalLength against PetalLength where each species is assigned a different color. (4 points)
In [ ]:
Q2 (10 points)
Write a function peek(df, n) to display a random selection of $n$ rows of any dataframe (without repetition). Use it to show 5 random rows from the iris data set. The function should take as inputs a dataframe and an integer. Do not use the pandas sample method.
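One possible approach, shown only as a sketch: draw distinct row positions with NumPy and index the frame with `iloc`. The optional `seed` argument and the `demo` frame are assumptions added for illustration; they are not part of the question, and `demo` merely stands in for the iris data.

```python
import numpy as np
import pandas as pd


def peek(df, n, seed=None):
    """Return n distinct rows of df chosen at random, without DataFrame.sample."""
    rng = np.random.default_rng(seed)
    # replace=False guarantees that no row position is drawn twice
    positions = rng.choice(len(df), size=n, replace=False)
    return df.iloc[positions]


# Hypothetical stand-in for the iris DataFrame
demo = pd.DataFrame({"x": range(10), "y": list("abcdefghij")})
picked = peek(demo, 5, seed=0)
print(picked)
```

Because the positions are drawn without replacement, asking for more rows than the frame contains raises a `ValueError` rather than silently repeating rows.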
In [ ]:
Q3 (10 points)
Write a function that when given $m$ vectors of length $k$ and another $n$ vectors of length $k$, returns an $m \times n$ matrix of the cosine distance between each pair of vectors. Take the cosine distance to be $$ \frac{A \cdot B}{\|A\| \|B\|} $$ for any two vectors $A$ and $B$.
Do not use the scipy.spatial.distance.cosine function or any functions from np.linalg or scipy.linalg.
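A minimal sketch of one way to do this with broadcasting, assuming the vectors arrive as the rows of two 2-D arrays. The name `cosine_matrix` and the small test arrays are illustrative assumptions, not part of the question.

```python
import numpy as np


def cosine_matrix(X, Y):
    """Return the (m, n) matrix of A.B / (|A| |B|) for rows A of X and rows B of Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # All pairwise dot products at once: (m, k) @ (k, n) -> (m, n)
    dots = X @ Y.T
    # Row norms computed by hand, since np.linalg is off limits
    x_norm = np.sqrt((X ** 2).sum(axis=1))
    y_norm = np.sqrt((Y ** 2).sum(axis=1))
    # Broadcasting (m, 1) * (1, n) builds the matrix of norm products
    return dots / (x_norm[:, None] * y_norm[None, :])


X = np.array([[1, 0], [0, 1], [1, 1]])
Y = np.array([[1, 0], [1, 1]])
result = cosine_matrix(X, Y)
print(result.shape)
```

A zero vector in either input would produce a division by zero; the sketch does not guard against that case.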
In [ ]:
Q4 (10 points)
Consider the following matrix $A$ with dimensions (4,6), to be interpreted as 4 rows of the measurements of 6 features.
np.array([[5, 5, 2, 6, 2, 0],
[8, 6, 7, 8, 9, 7],
[9, 5, 0, 4, 6, 8],
[8, 7, 9, 3, 6, 1]])
- [...] v = np.array([1,2,3,4]) and broadcasting. (2 points)
- [...] y = np.array([1,2,3,4]).T (2 points)
In [ ]:
Q10 (10 points)
We want to calculate the first 100 Catalan numbers. The $n^\text{th}$ Catalan number is given by $$ C_n = \prod_{k=2}^n \frac{n+k}{k} $$ for $n \ge 0$.
- Use numpy to find the first 100 Catalan numbers - the function should take a single argument $n$ and return an array [Catalan(1), Catalan(2), ..., Catalan(n)] (4 points).
- Use numba to find the first 100 Catalan numbers (starting from 1) fast using JIT compilation (4 points)
- Use cython to find the first 100 Catalan numbers (starting from 1) fast using AOT compilation (4 points)

In each case, code readability and efficiency are important.
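As an illustration of the product formula only, the NumPy variant could be sketched as follows. It evaluates $\prod_{k=2}^n \frac{n+k}{k}$ for every $n$ at once on a 2-D grid. The function name `catalan` is an assumption, the results are floating-point values that are only approximate for large $n$, and the numba and cython variants are not shown.

```python
import numpy as np


def catalan(n):
    """Return [Catalan(1), ..., Catalan(n)] as floats, using prod_{k=2}^{m} (m + k) / k."""
    ms = np.arange(1, n + 1)[:, None]  # column vector: m = 1..n
    ks = np.arange(2, n + 1)[None, :]  # row vector:    k = 2..n
    # Keep the factor (m + k) / k only where k <= m; pad the rest with 1
    terms = np.where(ks <= ms, (ms + ks) / ks, 1.0)
    # The product along each row gives one Catalan number per m
    return terms.prod(axis=1)


first_five = catalan(5)
print(first_five)  # approximately [1, 2, 5, 14, 42]
```

The grid uses memory quadratic in $n$, which is negligible for $n = 100$ but would argue for a running product if $n$ were much larger.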
In [ ]: