Matrix Decomposition Example for Machine Learning for Computational Linguistics

(C) 2017-2019 by Damir Cavar

Version: 1.1, September 2019

This tutorial accompanies the discussion of matrix decomposition of feature sets in classification tasks in the textbook Machine Learning: The Art and Science of Algorithms that Make Sense of Data by Peter Flach.

This tutorial was developed as part of the material for my course Machine Learning for Computational Linguistics in the Computational Linguistics Program of the Department of Linguistics at Indiana University.

Matrix operations


In [1]:
from numpy import array

ratings = array([
        [1, 0, 1, 0],
        [0, 2, 2, 2],
        [0, 0, 0, 1],
        [1, 2, 3, 2],
        [1, 0, 1, 1],
        [0, 2, 2, 3]])

print(ratings)


[[1 0 1 0]
 [0 2 2 2]
 [0 0 0 1]
 [1 2 3 2]
 [1 0 1 1]
 [0 2 2 3]]
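
Before decomposing the matrix, it can help to look at its shape and rank. The following is a minimal sketch, not part of the original notebook; the names n_rows, n_cols, and rank are introduced here for illustration only, and the rank is computed with NumPy's linalg.matrix_rank:

```python
import numpy as np

ratings = np.array([
    [1, 0, 1, 0],
    [0, 2, 2, 2],
    [0, 0, 0, 1],
    [1, 2, 3, 2],
    [1, 0, 1, 1],
    [0, 2, 2, 3],
])

# Number of rows and columns of the ratings matrix
n_rows, n_cols = ratings.shape

# Number of linearly independent rows (equivalently, columns)
rank = np.linalg.matrix_rank(ratings)

print(n_rows, n_cols, rank)
```

A rank lower than both dimensions suggests that the matrix can be written as a product of smaller matrices whose shared inner dimension equals that rank.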

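The ratings matrix can be decomposed into a product of three smaller matrices: a 6×3 matrix (preferencesGenres), a 3×3 diagonal matrix (importanceGenres), and a 3×4 matrix (filmsGenres). Multiplying them reproduces the original matrix: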


In [2]:
from numpy import dot

# 3 x 4: one row per genre, one column per film
filmsGenres = array([
        [1, 0, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 0, 1]
    ])

# 6 x 3: one row per row of ratings, one column per genre
preferencesGenres = array([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [1, 1, 0],
        [1, 0, 1],
        [0, 1, 1]
    ])

# 3 x 3 diagonal: one weight per genre
importanceGenres = array([
        [1, 0, 0],
        [0, 2, 0],
        [0, 0, 1]
    ])

print(dot(preferencesGenres, dot(importanceGenres, filmsGenres)))


[[1 0 1 0]
 [0 2 2 2]
 [0 0 0 1]
 [1 2 3 2]
 [1 0 1 1]
 [0 2 2 3]]
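
The printed product can also be compared with the original matrix programmatically. The sketch below is an illustration rather than part of the original notebook: it uses the @ operator as an equivalent of dot, np.diag to build the diagonal matrix, and snake_case variable names (reconstructed, weighted_genres) that are assumptions made for this example.

```python
import numpy as np

ratings = np.array([
    [1, 0, 1, 0],
    [0, 2, 2, 2],
    [0, 0, 0, 1],
    [1, 2, 3, 2],
    [1, 0, 1, 1],
    [0, 2, 2, 3],
])

films_genres = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
])

preferences_genres = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
])

# Diagonal matrix with the genre weights 1, 2, 1
importance_genres = np.diag([1, 2, 1])

# The @ operator performs matrix multiplication, like numpy.dot for 2-D arrays
reconstructed = preferences_genres @ importance_genres @ films_genres

# True if every entry of the product matches the original matrix
print(np.array_equal(reconstructed, ratings))

# Each row of ratings is a combination of the weighted genre rows:
# row 3 of preferences_genres is [1, 1, 0], so row 3 of ratings is
# the sum of the first two weighted genre rows.
weighted_genres = importance_genres @ films_genres
print(weighted_genres[0] + weighted_genres[1])
print(ratings[3])
```

The last two printed rows are identical, which illustrates how the 0/1 entries in preferences_genres select the weighted genre rows that add up to a row of ratings.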