Introduction to Python NumPy Arrays


Goals:

  • Learn the basics of Python NumPy arrays

What is NumPy?

  • NumPy is short for "Numerical Python" and is a fundamental Python package for scientific computing.
  • It is built around a high-performance data structure known as the n-dimensional array, or ndarray: a multi-dimensional array object for efficient computation on arrays and matrices.

What is an Array?

  • Python arrays are data structures that store data similar to a list, except the type of objects stored in them is constrained.
  • Elements of an array are all of the same type and are accessed by an integer index starting at 0, as with a list.
  • The Python module array lets you specify the element type at object creation time with a type code, which is a single character. You can read more about each type code here: https://docs.python.org/3/library/array.html?highlight=array#module-array

In [1]:
import array

In [2]:
array_one = array.array('i',[1,2,3,4])
type(array_one)


Out[2]:
array.array

In [3]:
type(array_one[0])


Out[3]:
int
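The type constraint described above can be checked directly. The following is a minimal sketch using only the standard array module; the variable names are illustrative and not part of the original notebook.

```python
import array

# 'i' is the type code for a signed C int; 'd' is the code for a C double.
int_array = array.array('i', [1, 2, 3, 4])
float_array = array.array('d', [1.0, 2.5])

print(int_array.typecode)   # i
print(int_array.itemsize)   # bytes per element; platform dependent

# A value that does not match the type code is rejected.
try:
    int_array.append(1.5)
    rejected = False
except TypeError:
    rejected = True

print(rejected)  # True
```

A plain list would accept the float without complaint, which is the practical difference the type code makes.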

What is a NumPy N-Dimensional Array (ndarray)?

  • It is an efficient multidimensional array providing fast array-oriented arithmetic operations.
  • Like any other array, an ndarray is a container for homogeneous data (elements of the same type).
  • In NumPy, an ndarray is simply referred to as an array.
  • As with other container objects in Python, the contents of an ndarray can be accessed and modified by indexing or slicing operations.
  • For numerical data, NumPy arrays are more efficient for storing and manipulating data than Python's built-in data structures.

In [4]:
import numpy as np
np.__version__


Out[4]:
'1.16.2'

In [5]:
list_one = [1,2,3,4,5]

In [6]:
numpy_array = np.array(list_one)
type(numpy_array)


Out[6]:
numpy.ndarray

In [7]:
numpy_array


Out[7]:
array([1, 2, 3, 4, 5])
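To make the "homogeneous container" point concrete, here is a small sketch that inspects a few standard ndarray attributes (ndim, shape, dtype). The variable names are made up for this example, and the exact integer dtype depends on the platform.

```python
import numpy as np

int_values = np.array([1, 2, 3, 4, 5])
mixed_values = np.array([1, 2.5, 3])  # the ints are upcast so all elements share one type

print(int_values.ndim)     # 1
print(int_values.shape)    # (5,)
print(int_values.dtype)    # an integer dtype; the width depends on the platform
print(mixed_values.dtype)  # float64

# Contents can be modified in place through indexing.
int_values[0] = 10
print(int_values)  # [10  2  3  4  5]
```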

Advantages of NumPy Arrays

Vectorized Operations

  • The key difference between an array and a list is that arrays are designed to handle vectorized operations, while Python lists are not.
  • NumPy operations perform complex computations on entire arrays without the need for Python for loops.
  • In other words, if you apply a function to an array, it is applied to every item in the array rather than to the array object as a whole.
  • With a Python list, you have to loop over the elements yourself.

In [8]:
list_two = [1,2,3,4,5]
# The following will throw an error:
list_two + 2


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-03923fe34c76> in <module>
      1 list_two = [1,2,3,4,5]
      2 # The following will throw an error:
----> 3 list_two + 2

TypeError: can only concatenate list (not "int") to list
  • To add 2 to every integer in the list, you have to loop over it:

In [9]:
for index, item in enumerate(list_two):
    list_two[index] = item + 2
list_two


Out[9]:
[3, 4, 5, 6, 7]
  • With a NumPy array, the same result takes a single expression:

In [10]:
numpy_array


Out[10]:
array([1, 2, 3, 4, 5])

In [11]:
numpy_array + 2


Out[11]:
array([3, 4, 5, 6, 7])
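The same idea extends from operators to functions. As a rough illustration (the input values are arbitrary), the sketch below computes square roots once with an explicit loop over a list and once with a single call to np.sqrt on an array.

```python
import math
import numpy as np

values = [1, 4, 9, 16]

# Plain list: an explicit loop (here a list comprehension) is required.
list_roots = [math.sqrt(v) for v in values]

# NumPy array: the function is applied to every element in one call.
array_roots = np.sqrt(np.array(values))

print(list_roots)   # [1.0, 2.0, 3.0, 4.0]
print(array_roots)  # [1. 2. 3. 4.]
```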
  • Any arithmetic or comparison operation between equal-size arrays is applied element-wise:

In [12]:
numpy_array_one = np.array([1,2])
numpy_array_two = np.array([4,6])

In [13]:
numpy_array_one + numpy_array_two


Out[13]:
array([5, 8])

In [14]:
numpy_array_one > numpy_array_two


Out[14]:
array([False, False])
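A slightly larger sketch of element-wise behaviour, with arbitrary example arrays: multiplication and comparison each produce a new array of the same length, and the boolean result can be summarised with any() and all(). The last part shows what happens here when the sizes do not match.

```python
import numpy as np

first = np.array([1, 2, 3])
second = np.array([3, 2, 1])

product = first * second     # element-wise multiplication
is_smaller = first < second  # element-wise comparison

print(product)           # [3 4 3]
print(is_smaller)        # [ True False False]
print(is_smaller.any())  # True: at least one comparison holds
print(is_smaller.all())  # False: not every comparison holds

# Arrays of length 3 and length 2 cannot be combined element-wise.
try:
    first + np.array([1, 2])
    size_mismatch_rejected = False
except ValueError:
    size_mismatch_rejected = True

print(size_mismatch_rejected)  # True
```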

Memory

  • NumPy internally stores data in a contiguous block of memory, independent of other built-in Python objects.
  • NumPy arrays take significantly less memory than Python lists.

In [15]:
import numpy as np
import sys

In [16]:
python_list = [1,2,3,4,5,6]
# Rough estimate: counts only the int objects, not the list's own pointer storage
python_list_size = sys.getsizeof(1) * len(python_list)
python_list_size


Out[16]:
168

In [17]:
python_numpy_array = np.array([1,2,3,4,5,6])
# itemsize * size is the size of the data buffer in bytes (also available as .nbytes)
python_numpy_array_size = python_numpy_array.itemsize * python_numpy_array.size
python_numpy_array_size


Out[17]:
48
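The comparison above counts only the element data. A somewhat fuller sketch, assuming CPython and a fixed int64 dtype so the item size is predictable, also includes the list's own pointer storage and checks that the array data is contiguous. The names are illustrative, and the list figure varies by platform and Python version.

```python
import sys
import numpy as np

python_list = [1, 2, 3, 4, 5, 6]
int_array = np.array(python_list, dtype=np.int64)  # dtype fixed so each item is 8 bytes

# List: the container's own storage plus every int object it points to.
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(item) for item in python_list)

# Array: nbytes is the size of the data buffer (itemsize * size).
array_bytes = int_array.nbytes

print(list_bytes)                       # platform dependent
print(array_bytes)                      # 48
print(int_array.flags['C_CONTIGUOUS'])  # True
```

Note that the array object also carries a fixed header on top of its data buffer, so the saving matters most for large arrays rather than six-element ones.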

Basic Indexing and Slicing

One Dimensional Array

  • When it comes to indexing and slicing, one-dimensional arrays use the same syntax as Python lists.

In [18]:
numpy_array


Out[18]:
array([1, 2, 3, 4, 5])

In [19]:
numpy_array[1]


Out[19]:
2

In [20]:
numpy_array[1:4]


Out[20]:
array([2, 3, 4])
  • You can slice the array and assign the slice to a variable. Unlike a list slice, a NumPy slice is a view of the original array's data, not a copy.
  • Any change that you make to the array slice is therefore also made to the original array object. Call .copy() on the slice if you need an independent array.

In [21]:
numpy_array_slice = numpy_array[1:4]
numpy_array_slice


Out[21]:
array([2, 3, 4])

In [22]:
numpy_array_slice[1] = 10
numpy_array_slice


Out[22]:
array([ 2, 10,  4])

In [23]:
numpy_array


Out[23]:
array([ 1,  2, 10,  4,  5])
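The behaviour shown above can be contrasted with an explicit copy and with a plain list slice. This is a minimal sketch with made-up variable names; np.shares_memory is used only to confirm which objects share data.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

view = original[1:4]                # a slice shares data with the original
independent = original[1:4].copy()  # an explicit copy has its own data

view[0] = 20
independent[1] = 30

print(original)     # [ 1 20  3  4  5]
print(independent)  # [ 2 30  4]
print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False

# A list slice, by contrast, is always a copy.
plain_list = [1, 2, 3, 4, 5]
list_slice = plain_list[1:4]
list_slice[0] = 20
print(plain_list)   # [1, 2, 3, 4, 5]
```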

Two-Dimensional Array

  • In a two-dimensional array, each element along the first axis is a one-dimensional array.

In [24]:
numpy_two_dimensional_array = np.array([[1,2,3],[4,5,6],[7,8,9]])

In [25]:
numpy_two_dimensional_array


Out[25]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [26]:
numpy_two_dimensional_array[1]


Out[26]:
array([4, 5, 6])
  • Instead of looping over the one-dimensional arrays to access a specific element, you can pass a second index value:

In [27]:
numpy_two_dimensional_array[1][2]


Out[27]:
6

In [28]:
numpy_two_dimensional_array[1,2]


Out[28]:
6
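As a short sketch of two-dimensional indexing (the name grid is illustrative), the example below checks that the chained and comma forms return the same element, and shows that negative indices and assignment use the same syntax.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(grid.ndim)   # 2
print(grid.shape)  # (3, 3): 3 rows, 3 columns

# Both forms select row 1, column 2.
chained = grid[1][2]  # selects the row first, then indexes into it
direct = grid[1, 2]   # a single indexing operation
print(chained, direct)  # 6 6

# Negative indices count from the end, as with lists.
print(grid[-1, -1])  # 9

# Assignment works with the same syntax.
grid[0, 0] = 100
print(grid[0])  # [100   2   3]
```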
  • Slicing two-dimensional arrays is a little different from slicing one-dimensional ones: you can give one slice per axis, separated by a comma.

In [29]:
numpy_two_dimensional_array


Out[29]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [30]:
numpy_two_dimensional_array[:1]


Out[30]:
array([[1, 2, 3]])

In [31]:
numpy_two_dimensional_array[:2]


Out[31]:
array([[1, 2, 3],
       [4, 5, 6]])

In [32]:
numpy_two_dimensional_array[:3]


Out[32]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [33]:
numpy_two_dimensional_array[:2,1:]


Out[33]:
array([[2, 3],
       [5, 6]])

In [34]:
numpy_two_dimensional_array[:2,:1]


Out[34]:
array([[1],
       [4]])

In [35]:
numpy_two_dimensional_array[2][1:]


Out[35]:
array([8, 9])
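The slices above mostly select rows. As a complementary sketch with the same 3x3 values (variable names are illustrative), the following selects columns and a sub-block, and shows how an integer index drops an axis while a slice keeps it.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

middle_column = grid[:, 1]   # an integer index drops the axis: shape (3,)
column_as_2d = grid[:, 1:2]  # a slice keeps the axis: shape (3, 1)
lower_right = grid[1:, 1:]   # rows 1-2 and columns 1-2

print(middle_column)       # [2 5 8]
print(column_as_2d.shape)  # (3, 1)
print(lower_right)
# [[5 6]
#  [8 9]]
```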
