Arrays

NumPy handles arrays very well, thanks to

  • an advanced overload of the __getitem__ operator for indexing, which is handy;
  • overloads of other operators, which give comfortable shortcuts and an intuitive interface;
  • methods and functions implemented in C, which makes them fast;
  • a rich library of functions and methods, which lets you do almost anything you want.
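As a small illustration of the operator shortcuts listed above, here is a sketch that compares an overloaded operator with the equivalent plain Python loop. The names prices and quantities and their values are made up for the example.

```python
import numpy as np

# Sample data, invented only for this illustration
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# The overloaded * multiplies element by element; no explicit loop is needed
totals = prices * quantities

# The same computation written with plain Python lists
totals_plain = [p * q for p, q in zip(prices.tolist(), quantities.tolist())]

print(totals.tolist() == totals_plain)  # True
```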

Creating arrays

Creating one is not quite as direct as writing a list literal, but it is worth it


In [1]:
from numpy import array

arr = array([1, 2, 3])
print(arr)


[1 2 3]

You just have to pass an iterable to the array constructor of the numpy module.
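A short sketch of which iterables work, assuming the usual np alias for the module. One caveat worth knowing: a generator has no length, so it is better passed to np.fromiter than to the array constructor. The variable names are only illustrative.

```python
import numpy as np

from_tuple = np.array((1, 2, 3))   # tuples work like lists
from_range = np.array(range(3))    # so do range objects

# A generator has no known length; np.fromiter consumes it item by item
# and needs the element type up front.
from_generator = np.fromiter((x * x for x in range(4)), dtype=np.int64)

print(from_tuple.tolist())      # [1, 2, 3]
print(from_range.tolist())      # [0, 1, 2]
print(from_generator.tolist())  # [0, 1, 4, 9]
```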

More examples


In [2]:
ten = array(range(10))
matrix = array([[1,2], [3, 4]])
nested_matrix = array([matrix, matrix])
strange_array = array([[1], 2])

print('Range demo:', ten)
print('Matrix demo:', matrix)
print('Array of NumPy arrays:', nested_matrix)
print('Something strange:', strange_array)


Range demo: [0 1 2 3 4 5 6 7 8 9]
Matrix demo: [[1 2]
 [3 4]]
Array of NumPy arrays: [[[1 2]
  [3 4]]

 [[1 2]
  [3 4]]]
Something strange: [[1] 2]

Types

NumPy can be fast because it lets you create arrays whose elements have C language types

Shorthands

Here are the intuitive type names you can use

Data type   Description
bool_       Boolean (True or False) stored as a byte
int_        Default integer type (same as C long; normally either int64 or int32)
intc        Identical to C int (normally int32 or int64)
intp        Integer used for indexing (same as C ssize_t; normally either int32 or int64)
float_      Shorthand for float64.
complex_    Shorthand for complex128.

Note: the underscore suffix can be dropped when the type is passed as a string, for example dtype='int' or dtype='float'
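To see what a type name stands for, you can build a dtype object from it and inspect it. This is a minimal sketch; the names chosen for the loop are just a sample.

```python
import numpy as np

sizes = {}
for name in ('bool', 'int32', 'float64', 'complex128'):
    dt = np.dtype(name)            # build a dtype object from its string name
    sizes[name] = dt.itemsize      # size of one element in bytes
    print('{:<10} kind={} bytes={}'.format(name, dt.kind, dt.itemsize))

# The string 'float' resolves to the same type as float64
print(np.dtype('float') == np.dtype('float64'))  # True
```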

Integers

If you need integers of a specific size

  • use the size in bits as a suffix;
  • add the u prefix to denote an unsigned value.

Data type   Description
int8        Byte (-128 to 127)
int16       Integer (-32768 to 32767)
int32       Integer (-2147483648 to 2147483647)
int64       Integer (-9223372036854775808 to 9223372036854775807)
uint8       Unsigned integer (0 to 255)
uint16      Unsigned integer (0 to 65535)
uint32      Unsigned integer (0 to 4294967295)
uint64      Unsigned integer (0 to 18446744073709551615)
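The ranges in the table can be checked with np.iinfo. The sketch below also shows why the range matters: arithmetic on fixed-size integer arrays wraps around silently instead of growing like a Python int.

```python
import numpy as np

limits = {}
for int_type in (np.int8, np.int16, np.uint8, np.uint16):
    info = np.iinfo(int_type)                      # limits of an integer type
    limits[np.dtype(int_type).name] = (info.min, info.max)
    print('{}: {} to {}'.format(np.dtype(int_type).name, info.min, info.max))

# 255 is the largest uint8 value, so adding 1 wraps around to 0
top = np.array([255], dtype=np.uint8)
one = np.array([1], dtype=np.uint8)
wrapped = top + one
print(wrapped)  # [0]
```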

Floating points and complex

The IEEE 754 standard for floating-point arithmetic defines the formats of half (16 bits), single (32 bits), double (64 bits), quadruple (128 bits) and octuple (256 bits) precision numbers

Standard C has the single precision float, the double precision double and an additional long double, which is at least as accurate as a regular double

Data type    Description
float16      Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32      Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64      Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex64    Complex number, represented by two 32-bit floats (real and imaginary components)
complex128   Complex number, represented by two 64-bit floats (real and imaginary components)
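The bit layouts in the table can be read back from np.finfo, which reports the exponent and mantissa widths of a floating-point type. A minimal sketch:

```python
import numpy as np

layout = {}
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    # bits: total width, nexp: exponent bits, nmant: mantissa bits
    layout[info.bits] = (info.nexp, info.nmant)
    print('float{}: exponent={} mantissa={}'.format(info.bits, info.nexp, info.nmant))

# A complex number is stored as two floats of half its total width
complex_bytes = (np.dtype(np.complex64).itemsize, np.dtype(np.complex128).itemsize)
print(complex_bytes)  # (8, 16)
```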

Specify data type

Every NumPy array has a dtype attribute that stores its data type, and the constructor takes a dtype argument to specify it, for example as a string


In [3]:
int_array = array([1., 2.5, -0.7], dtype='int')
print('You have {0} array of type {0.dtype}'.format(int_array))


You have [1 2 0] array of type int64

Note that the values were cast automatically: the fractional part was simply dropped
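The same cast can be applied to an existing array with astype. The sketch below also shows one way to round before casting, in case truncation is not what you want; np.rint is one of several possible rounding choices.

```python
import numpy as np

values = np.array([1.0, 2.5, -0.7])

# astype drops the fractional part (truncates toward zero)
truncated = values.astype(np.int64)

# np.rint rounds to the nearest integer first (halves go to the even neighbour)
rounded = np.rint(values).astype(np.int64)

print(truncated.tolist())  # [1, 2, 0]
print(rounded.tolist())    # [1, 2, -1]
```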

NumPy will not let you create an array whose data does not fit the requested type


In [4]:
array([[0], 1], dtype='int')



ValueError                                Traceback (most recent call last)
<ipython-input-4-71cbb8304a64> in <module>()
----> 1 array([[0], 1], dtype='int')

ValueError: setting an array element with a sequence.

NumPy assigns a data type automatically if none is specified


In [5]:
arrays = [
    array([1, 2, 3]),
    array(((1, 2), (3, 4.))),
    array([[0], 1]),
    array('Hello world')
]
for a in arrays:
    print('{0.dtype}: {0}'.format(a))


int64: [1 2 3]
float64: [[ 1.  2.]
 [ 3.  4.]]
object: [[0] 1]
<U11: Hello world

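Interesting: we have discovered new types!

The <U11 type is a Unicode string of 11 characters

The object type is used when NumPy cannot say for sure that it has an n-dimensional array of numbers; the elements are then stored as references to arbitrary Python objects. Note that recent NumPy releases refuse to guess for ragged input such as [[0], 1] and require an explicit dtype='object'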
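A minimal sketch of both types, with the object type requested explicitly so that it does not depend on how a particular NumPy version treats ragged input:

```python
import numpy as np

# A Python string becomes a zero-dimensional array of a fixed-width Unicode type
text = np.array('Hello world')
print(text.dtype.kind, text.shape)   # U ()

# Ragged input with an explicit object dtype: each item is an ordinary Python object
ragged = np.array([[0], 1], dtype=object)
print(ragged.shape)                  # (2,)
print(type(ragged[0]).__name__, type(ragged[1]).__name__)  # list int
```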

Operations on arrays

Elementwise operations are applied with the ordinary operators


In [6]:
LENGTH = 4
a, b = array(range(LENGTH)), array(range(LENGTH, LENGTH*2))
print('Arithmetic')
print('{} +  {} = {}'.format(a, b, a + b))
print('{} *  {} = {}'.format(a, b, a * b))
print('{} ** {} = {}'.format(a, b, a ** b))
print('{} /  {} = {}'.format(a, b, a / b))
print('Binary')
print('{} ^  {} = {}'.format(a, b, a ^ b))
print('{} |  {} = {}'.format(a, b, a | b))
print('{} &  {} = {}'.format(a, b, a & b))


Arithmetic
[0 1 2 3] +  [4 5 6 7] = [ 4  6  8 10]
[0 1 2 3] *  [4 5 6 7] = [ 0  5 12 21]
[0 1 2 3] ** [4 5 6 7] = [   0    1   64 2187]
[0 1 2 3] /  [4 5 6 7] = [ 0.          0.2         0.33333333  0.42857143]
Binary
[0 1 2 3] ^  [4 5 6 7] = [4 4 4 4]
[0 1 2 3] |  [4 5 6 7] = [4 5 6 7]
[0 1 2 3] &  [4 5 6 7] = [0 1 2 3]
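The same operators also accept a scalar on one side, which is then applied to every element. A short sketch with the arrays from the cell above, adding floor division, remainder and comparison:

```python
import numpy as np

a = np.array(range(4))       # [0 1 2 3]
b = np.array(range(4, 8))    # [4 5 6 7]

scaled = a * 10              # the scalar is applied to each element
floor_div = b // 3           # integer (floor) division, element by element
remainder = b % 3
is_small = a < 2             # comparisons are elementwise too

print(scaled.tolist())       # [0, 10, 20, 30]
print(floor_div.tolist())    # [1, 1, 2, 2]
print(remainder.tolist())    # [1, 2, 0, 1]
print(is_small.tolist())     # [True, True, False, False]
```

Note that / on integer arrays produces floats, as in the output above, while // keeps the integer type.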

Indexing

Indexing of NumPy arrays is very flexible

Just look at the entities that can be used as an index

  • boolean arrays;
  • integer arrays;
  • numbers;
  • Ellipsis;
  • tuples of them;
  • etc.

Integer array

You can fetch values from an array with an iterable of indices (but not a tuple, which is treated as a multidimensional index)


In [7]:
arr = array(range(10))
indices_list = [
    [1, 5, 8],
    range(1, 6, 2),
    array([8, 2, 0, -1])
]

for indices in indices_list:
    print('Indexed by {:<14}: {}'.format(str(indices), arr[indices]))


Indexed by [1, 5, 8]     : [1 5 8]
Indexed by range(1, 6, 2): [1 3 5]
Indexed by [ 8  2  0 -1] : [8 2 0 9]
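A small sketch of two details of this kind of indexing: the result is a new array rather than a view of the original, and a list of indices behaves differently from a tuple. The two-row matrix is invented for the example.

```python
import numpy as np

arr = np.array(range(10))

# Indices may repeat; the result is a copy, so changing it leaves arr intact
picked = arr[[1, 1, 8]]
picked[0] = 100
print(picked.tolist(), arr[1])   # [100, 1, 8] 1

matrix = np.array([[0, 1, 2],
                   [3, 4, 5]])

rows = matrix[[0, 1]]      # a list selects rows 0 and 1
element = matrix[(0, 1)]   # a tuple means row 0, column 1
print(rows.shape, element) # (2, 3) 1
```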

Boolean array

Boolean arrays can be the result of a comparison, and the syntax for using them is very handy


In [8]:
arr = array(range(5))

print('Items more than 2:', arr > 2)
print(arr[arr>2])


Items more than 2: [False False False  True  True]
[3 4]

This can be read as "Give me the numbers which are greater than two"

What you actually asked for

  • compare the array with a scalar element by element;
  • put the results into a boolean array;
  • fetch the elements of the array that correspond to True.

This means that you can use a mask built from another array to select values from this one


In [9]:
a, b = array(range(0, 5)), array(range(5, 10))

print(a[b>7])


[3 4]

This gives you the elements of a at the positions where the corresponding elements of b are greater than 7
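Masks can also be combined and used on the left side of an assignment. A minimal sketch; note the parentheses around each comparison, which are needed because & binds more tightly than > and ==.

```python
import numpy as np

arr = np.array(range(10))

# Combine two conditions with & (elementwise "and")
mask = (arr > 2) & (arr % 2 == 0)
even_above_two = arr[mask]
print(even_above_two.tolist())   # [4, 6, 8]

# A mask can also select the elements to overwrite
clipped = arr.copy()
clipped[clipped > 7] = 7
print(clipped.tolist())          # [0, 1, 2, 3, 4, 5, 6, 7, 7, 7]
```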

Tuple

Tuples are used to access the elements of n-dimensional arrays


In [10]:
matrix = array([range(3*i, 3*(i+1)) for i in range(3)])

print('We have a matrix of shape', matrix.shape)
print('Regular Python indexing   ', matrix[0][2])
print('Implicit tuple declaration', matrix[0, 2])
print('Explicit tuple declaration', matrix[(0, 2)])


We have a matrix of shape (3, 3)
Regular Python indexing    2
Implicit tuple declaration 2
Explicit tuple declaration 2

Earlier we noted that an index can be a "tuple of them"

This means that a tuple can contain not only numbers but also arrays and slices


In [11]:
print('All elements of the first column', matrix[:, 0])
print('Get elements of the second column', matrix[:, 1])
print('Pick first and last columns', matrix[:, 0:3:2])
print('Get only first row', matrix[0, :])
print('You could do this more easily, but never mind', matrix[0])
print('Get first two elements of the third column', matrix[0:2, 2])


All elements of the first column [0 3 6]
Get elements of the second column [1 4 7]
Pick first and last columns [[0 2]
 [3 5]
 [6 8]]
Get only first row [0 1 2]
You could do this more easily, but never mind [0 1 2]
Get first two elements of the third column [2 5]
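The colon syntax is shorthand for slice objects inside the index tuple. The sketch below writes the tuples out explicitly and also mixes a list of row indices with a slice; the matrix is the same 3 by 3 one as above.

```python
import numpy as np

matrix = np.array([[0, 1, 2],
                   [3, 4, 5],
                   [6, 7, 8]])

# matrix[:, 0] is the same as indexing with the tuple (slice(None), 0)
first_column = matrix[(slice(None), 0)]
print(first_column.tolist())     # [0, 3, 6]

# matrix[0:2, 2] is the same as the tuple (slice(0, 2), 2)
third_column_top = matrix[(slice(0, 2), 2)]
print(third_column_top.tolist()) # [2, 5]

# A tuple can mix a list of indices with a slice: rows 0 and 2, columns from 1 on
corner = matrix[[0, 2], 1:]
print(corner.tolist())           # [[1, 2], [7, 8]]
```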

Ellipsis

Ellipsis is a built-in constant, written as ... (three dots)

It is handy when you index your own container types and want to skip some entries
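To see what a container of your own receives, here is a minimal sketch with a hypothetical ShowIndex class (not part of NumPy) whose __getitem__ simply returns the index object it was given:

```python
class ShowIndex:
    """Hypothetical helper: returns whatever index object it is given."""

    def __getitem__(self, index):
        return index


probe = ShowIndex()

print(probe[...] is Ellipsis)   # True
print(probe[0, ..., 1])         # (0, Ellipsis, 1)
print(probe[1:3])               # slice(1, 3, None)
```

Python passes the ellipsis through untouched, so the type decides what it means. NumPy reads it as "all the dimensions not mentioned".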

In the following example the ellipsis is pointless, because it behaves just like fetching all elements


In [12]:
a = array(range(5))

print(a)
print(a[:])
print(a[...])


[0 1 2 3 4]
[0 1 2 3 4]
[0 1 2 3 4]

It is useful, though, for n-dimensional arrays when you want to skip several dimensions


In [13]:
array3d = array([[range(3*(i+j), 3*(i+j+1)) for i in range(3)] for j in range(3)])
print('Here is array of shape {0.shape}: {0}'.format(array3d))


Here is array of shape (3, 3, 3): [[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]]

 [[ 6  7  8]
  [ 9 10 11]
  [12 13 14]]]

Each element of this array is a matrix, and each element of that matrix is a one-dimensional array


In [14]:
print('Item is a matrix of shape {0.shape}: {0}'.format(array3d[0]))
print('Item of the matrix is an array of shape {0.shape}: {0}'.format(array3d[0][0]))
print('Don`t forget about tuples: {0}'.format(array3d[0, 0]))


Item is a matrix of shape (3, 3): [[0 1 2]
 [3 4 5]
 [6 7 8]]
Item of the matrix is an array of shape (3,): [0 1 2]
Don`t forget about tuples: [0 1 2]

If you want only the last element of each row in this huge thing, you can do the following


In [15]:
array3d[:, :, -1]


Out[15]:
array([[ 2,  5,  8],
       [ 5,  8, 11],
       [ 8, 11, 14]])

You can also avoid these slices and use an ellipsis


In [16]:
array3d[..., -1]


Out[16]:
array([[ 2,  5,  8],
       [ 5,  8, 11],
       [ 8, 11, 14]])

The ellipsis can also be placed in the middle or at the end

It means that you fetch all elements along the dimensions that are not specified


In [17]:
print('First matrix with all elements', array3d[0, ...])
print('First elements of all rows of the second matrix', array3d[1, ..., 0])


First matrix with all elements [[0 1 2]
 [3 4 5]
 [6 7 8]]
First elements of all rows of the second matrix [3 6 9]
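To summarise, each ellipsis expression is equivalent to spelling out the missing dimensions with full slices. The sketch below rebuilds the same 3 by 3 by 3 array and checks this. It also shows that an index may contain only one ellipsis, because otherwise NumPy could not tell how many dimensions each one stands for.

```python
import numpy as np

array3d = np.array([[list(range(3 * (i + j), 3 * (i + j + 1))) for i in range(3)]
                    for j in range(3)])

# Each ellipsis expands to as many full slices as are needed
same_last = np.array_equal(array3d[..., -1], array3d[:, :, -1])
same_first = np.array_equal(array3d[0, ...], array3d[0, :, :])
same_middle = np.array_equal(array3d[1, ..., 0], array3d[1, :, 0])
print(same_last, same_first, same_middle)   # True True True

# Two ellipses in one index are ambiguous, so NumPy rejects them
try:
    array3d[..., 0, ...]
    two_ellipses_allowed = True
except IndexError:
    two_ellipses_allowed = False
print(two_ellipses_allowed)                 # False
```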