Dot Product


Multiplying Numpy Arrays: dot vs __mul__

Data science leans heavily on linear algebra.

When we want a weighted sum, we can put the weights in a row vector and the values they multiply in a column vector. The dot product of the two is the weighted sum.
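
As a minimal sketch of that idea, here is a weighted sum written as a row vector dotted with a column vector. The numbers and the names `weights` and `values` are made up for illustration only.

```python
import numpy as np

# Hypothetical weights and values, chosen only for illustration
weights = np.array([[2, 3, 5]])         # row vector, shape (1, 3)
values = np.array([[10], [20], [30]])   # column vector, shape (3, 1)

# Row dot column: each weight multiplies its value, then the products are summed
weighted_sum = weights.dot(values)      # shape (1, 1)
print(weighted_sum)

# The same sum written out term by term
by_hand = 2 * 10 + 3 * 20 + 5 * 30
print(by_hand)
```

The result comes back as a 1-by-1 array rather than a bare number, because both inputs are two-dimensional.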

Adjusting weights, usually iteratively (by gradient descent, with backpropagation in the case of neural networks), is at the heart of machine learning, from logistic regression to neural networks.

Let's go over the basics of creating row and column vectors, so that dot products become possible.

You will find np.dot(A, B) works the same as A.dot(B) when it comes to numpy arrays.
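
A quick check of that equivalence, using two small matrices invented for the purpose:

```python
import numpy as np

# Two small example matrices (illustrative values only)
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

via_function = np.dot(A, B)   # module-level function
via_method = A.dot(B)         # method on the array

print(via_function)
print(np.array_equal(via_function, via_method))
```

Both spellings call the same underlying routine, so which one to use is mostly a matter of readability.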


In [ ]:
import numpy as np

In [ ]:
one_dimensional = np.array([1,1,1,2,3,3,3,3,3])

In [ ]:
one_dimensional

In [ ]:
one_dimensional.shape  # (9,): a single axis, not yet rows & columns

In [ ]:
one_dimensional.reshape((9,-1)) # let numpy figure out how many columns

In [ ]:
one_dimensional  # still the same: reshape returned a new array object

In [ ]:
one_dimensional.ndim

In [ ]:
two_dimensional = one_dimensional.reshape(1,9)  # bind the 2D result to a new name

In [ ]:
two_dimensional.shape  # is now 2D even if just the one row

In [ ]:
two_dimensional.ndim
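
To recap the reshaping steps above, here is a small sketch that turns one flat array into both a column vector and a row vector. The array `flat` is just an example. Passing -1 asks numpy to work out that dimension from the total number of elements.

```python
import numpy as np

flat = np.arange(6)               # shape (6,), one dimension

as_column = flat.reshape(-1, 1)   # numpy infers 6 rows -> shape (6, 1)
as_row = flat.reshape(1, -1)      # numpy infers 6 columns -> shape (1, 6)

print(flat.shape, flat.ndim)            # the original is untouched
print(as_column.shape, as_column.ndim)
print(as_row.shape, as_row.ndim)
```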

In [ ]:
class M:
    """Symbolic representation of multiply, add"""
    def __init__(self, s):
        self.s = str(s)
    def __mul__(self, other):
        return M(self.s + " * " + other.s)  # join the operand strings
    def __add__(self, other):
        return M(self.s + " + " + other.s)
    def __repr__(self):
        return self.s
    
# Demo
one = M(1)
two = M(2)
print(one * two)

In [ ]:
A,B,C = map(M, ['A','B','C'])  # create three M type objects

In [ ]:
m_array = np.array([A,B,C])    # put them in a numpy array

In [ ]:
m_array.dtype  # dtype('O'): numpy falls back to the generic object type

In [ ]:
m_array = m_array.reshape((-1, len(m_array)))  # make this 2 dimensional

In [ ]:
m_array.shape # (1, 3); transpose only has an effect with more than 1 dimension

In [ ]:
m_array.T   # stand it up (3,1) vs (1,3) shape

In [ ]:
m_array.dot(m_array.T)  # row dot column i.e. self * self.T

In [ ]:
m_array.T[1,0] = M('Z') # transpose is not a copy

In [ ]:
m_array  # original has changed
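
The same behavior can be seen with plain numbers. This is a small sketch with made-up values: `np.shares_memory` reports whether two arrays overlap in memory, and `.copy()` is one way to get an independent transpose.

```python
import numpy as np

row = np.array([[1, 2, 3]])   # shape (1, 3)
column = row.T                # shape (3, 1), a view onto the same data

print(np.shares_memory(row, column))  # the two arrays share one buffer

column[1, 0] = 99             # write through the transposed view
print(row)                    # the middle element of row is now 99

independent = row.T.copy()    # an explicit copy has its own data
independent[0, 0] = -1
print(row)                    # row is unaffected by changes to the copy
```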

In [ ]:
m_array * m_array  # dot versus element-wise
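
With ordinary integers the difference between the two operations is easy to see. The sketch below uses an example row vector of our own choosing and compares `*` with `dot` in both orders.

```python
import numpy as np

row = np.array([[1, 2, 3]])   # shape (1, 3)

elementwise = row * row       # multiplies matching entries, shape stays (1, 3)
inner = row.dot(row.T)        # row dot column: sum of products, shape (1, 1)
outer = row.T.dot(row)        # column dot row: every pairing, shape (3, 3)

print(elementwise)            # [[1 4 9]]
print(inner)                  # [[14]]
print(outer)
```

Element-wise multiplication needs matching (or broadcastable) shapes, while `dot` needs the column count of the left array to equal the row count of the right one.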

LAB CHALLENGE:

Create two arrays of compatible dimensions and form their dot product.

numpy.random.randint is a good source of random numbers (for data).


In [ ]:
from pandas import Series

In [ ]:
A = Series(np.arange(10))

LAB CHALLENGE:

Does the pandas Series have a dot product method? Check it out!