Transforming Arrays (Relabeling, Renaming, Reordering, Sorting, ...)

Import the LArray library:


In [ ]:
from larray import *

Import the population array from the demography_eurostat dataset:


In [ ]:
demography_eurostat = load_example_data('demography_eurostat')
population = demography_eurostat.population

# display the 'population' array
population

Manipulating axes

The Array class offers several methods to manipulate the axes and labels of an array:

  • set_labels: to replace all or some labels of one or several axes.
  • rename: to replace one or several axis names.
  • set_axes: to replace one or several axes.
  • transpose: to modify the order of axes.
  • drop: to remove one or several labels.
  • combine_axes: to combine axes.
  • split_axes: to split one or several axes by splitting their labels and names.
  • reindex: to reorder, add and remove labels of one or several axes.
  • insert: to insert a label at a given position.

Relabeling

Replace some labels of an axis:


In [ ]:
# replace only one label of the 'gender' axis by passing a dict
population_new_labels = population.set_labels('gender', {'Male': 'Men'})
population_new_labels

In [ ]:
# set all labels of the 'country' axis to uppercase by passing the str.upper function
population_new_labels = population.set_labels('country', str.upper)
population_new_labels

See set_labels for more details and examples.
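
The two forms of relabeling used above (a dict that maps some old labels to new ones, and a function applied to every label) can be pictured with a small plain-Python model. The `relabel` helper below is hypothetical and only mimics the effect on a list of labels; it is not how LArray implements set_labels, and the label lists are illustrative.

```python
def relabel(labels, mapping_or_func):
    """Return new labels; hypothetical helper, not part of LArray."""
    if callable(mapping_or_func):
        # a function is applied to every label
        return [mapping_or_func(label) for label in labels]
    # a dict replaces only the labels it mentions
    return [mapping_or_func.get(label, label) for label in labels]


gender = ['Male', 'Female']                  # illustrative labels
country = ['Belgium', 'France', 'Germany']   # illustrative labels

print(relabel(gender, {'Male': 'Men'}))
print(relabel(country, str.upper))
```

Labels that the dict does not mention are left untouched, which is why passing a dict is a convenient way to change a single label.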

Renaming axes

Rename one axis:


In [ ]:
# 'rename' returns a copy of the array
population_new_names = population.rename('time', 'year')
population_new_names

Rename several axes at once:


In [ ]:
population_new_names = population.rename({'gender': 'sex', 'time': 'year'})
population_new_names

See rename for more details and examples.

Replacing Axes

Replace one axis:


In [ ]:
new_gender = Axis('sex=Men,Women')
population_new_axis = population.set_axes('gender', new_gender)
population_new_axis

Replace several axes at once:


In [ ]:
new_country = Axis('country_codes=BE,FR,DE') 
population_new_axes = population.set_axes({'country': new_country, 'gender': new_gender})
population_new_axes

Reordering axes

Axes can be reordered using the transpose method. By default, transpose reverses the axes; otherwise, it permutes them according to the list given as argument. Axes that are not mentioned come after those that are mentioned (and keep their relative order). Finally, transpose returns a copy of the array.
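
As a rough illustration of this ordering rule, the sketch below computes only the resulting order of axis names in plain Python. The `transposed_order` function is a hypothetical model written for this tutorial, not LArray code; it also covers the `...` placeholder, which marks where the unmentioned axes should go.

```python
def transposed_order(axes, *args):
    """Axis order after a transpose-like call; hypothetical model, not LArray code."""
    if not args:
        # no argument: reverse all axes
        return list(reversed(axes))
    mentioned = [a for a in args if a is not Ellipsis]
    # axes that are not mentioned keep their relative order
    rest = [a for a in axes if a not in mentioned]
    if Ellipsis in args:
        # `...` marks the position of the unmentioned axes
        pos = args.index(Ellipsis)
        return list(args[:pos]) + rest + list(args[pos + 1:])
    # otherwise the unmentioned axes come after the mentioned ones
    return mentioned + rest


axes = ['country', 'gender', 'time']
print(transposed_order(axes))                 # ['time', 'gender', 'country']
print(transposed_order(axes, 'time'))         # ['time', 'country', 'gender']
print(transposed_order(axes, ..., 'gender'))  # ['country', 'time', 'gender']
```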


In [ ]:
# starting order : country, gender, time
population

In [ ]:
# no argument --> reverse all axes
population_transposed = population.transpose()

# .T is a shortcut for .transpose()
population_transposed = population.T

population_transposed

In [ ]:
# reorder according to list
population_transposed = population.transpose('gender', 'country', 'time')
population_transposed

In [ ]:
# move the 'time' axis to the first position
# axes that are not mentioned come after those that are mentioned (and keep their relative order)
population_transposed = population.transpose('time')
population_transposed

In [ ]:
# move the 'gender' axis to the last position
# axes that are not mentioned come before those that are mentioned (and keep their relative order)
population_transposed = population.transpose(..., 'gender')
population_transposed

See transpose for more details and examples.

Dropping Labels


In [ ]:
# remove the labels 2014 and 2016 (both belong to the 'time' axis)
population_labels_dropped = population.drop([2014, 2016])
population_labels_dropped

See drop for more details and examples.

Combine And Split Axes

Combine two axes:


In [ ]:
population_combined_axes = population.combine_axes(('country', 'gender'))
population_combined_axes

Split an axis:


In [ ]:
population_split_axes = population_combined_axes.split_axes('country_gender')
population_split_axes

See combine_axes and split_axes for more details and examples.
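
To picture what happens to names and labels, here is a plain-Python sketch of combining and then splitting two axes. It assumes that the combined labels are joined with the same '_' separator that appears in the combined axis name 'country_gender'; the label lists are illustrative, and the code is a model rather than LArray's implementation.

```python
from itertools import product

country = ['Belgium', 'France', 'Germany']   # illustrative labels
gender = ['Male', 'Female']                  # illustrative labels

# combining: one label per (country, gender) pair, joined with '_'
# (the '_' separator for labels is an assumption of this sketch)
combined_name = '_'.join(['country', 'gender'])
combined_labels = ['_'.join(pair) for pair in product(country, gender)]
print(combined_name)
print(combined_labels[:3])

# splitting: cut the axis name and each label on the same separator
split_names = combined_name.split('_')
split_pairs = [tuple(label.split('_')) for label in combined_labels]
print(split_names)
print(split_pairs[:3])
```

The combined axis has one label per pair (3 x 2 = 6 here), so splitting it again recovers the original two axes without losing information.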

Reordering, adding and removing labels

The reindex method lets you reorder, add and remove labels along one axis:


In [ ]:
# reverse years + remove 2013 + add 2018 + copy data for 2017 to 2018
population_new_time = population.reindex('time', '2018..2014', fill_value=population[2017])
population_new_time

or several axes:


In [ ]:
population_new = population.reindex({'country': 'country=Luxembourg,Belgium,France,Germany',
                                     'time': 'time=2018..2014'}, fill_value=0)
population_new

See reindex for more details and examples.
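
The effect of reindexing along a single axis can be modeled with a plain dict that maps labels to values. The `reindex_1d` helper below is hypothetical and the numbers are made up; the sketch only shows which labels survive, in which order, and where the fill value ends up.

```python
def reindex_1d(data, new_labels, fill_value):
    """Model of reindexing a {label: value} mapping; hypothetical helper."""
    # keep the value of labels that already exist, fill the new ones,
    # and drop every old label that is not requested
    return {label: data.get(label, fill_value) for label in new_labels}


# made-up values for one country and one gender, years 2013 to 2017
values = {2013: 10, 2014: 11, 2015: 12, 2016: 13, 2017: 14}

# reverse years + remove 2013 + add 2018 filled with the 2017 value
new_values = reindex_1d(values, range(2018, 2013, -1), fill_value=values[2017])
print(new_values)
```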

Another way to insert new labels is to use the insert method:


In [ ]:
# insert a new country before 'France' with all values set to 0
population_new_country = population.insert(0, before='France', label='Luxembourg')
# or equivalently
population_new_country = population.insert(0, after='Belgium', label='Luxembourg')

population_new_country

See insert for more details and examples.
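
The equivalence between the `before` and `after` forms can be checked on a plain list of labels. The `insert_label` helper is hypothetical (it is not the LArray method) and the country list is illustrative.

```python
def insert_label(labels, label, before=None, after=None):
    """Return a new label list with `label` inserted; hypothetical helper."""
    if (before is None) == (after is None):
        raise ValueError("give exactly one of 'before' or 'after'")
    if before is not None:
        pos = labels.index(before)
    else:
        pos = labels.index(after) + 1
    return labels[:pos] + [label] + labels[pos:]


country = ['Belgium', 'France', 'Germany']   # illustrative labels
with_before = insert_label(country, 'Luxembourg', before='France')
with_after = insert_label(country, 'Luxembourg', after='Belgium')
print(with_before)
print(with_before == with_after)
```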

Sorting


In [ ]:
# get a copy of the 'population_benelux' array
population_benelux = demography_eurostat.population_benelux.copy()
population_benelux

Sort an axis (alphabetically if labels are strings):


In [ ]:
population_sorted = population_benelux.sort_axes('gender')
population_sorted

Get the labels that would sort the array along an axis:


In [ ]:
population_benelux.labelsofsorted('country')

Sort according to values (here, the values for 'Male' in 2017):


In [ ]:
population_sorted = population_benelux.sort_values(('Male', 2017))
population_sorted
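
The difference between sorting an axis by its labels and sorting by values can be sketched with plain Python. The numbers below are made up for illustration (they are not the dataset's figures), and the code is a model of the idea, assuming ascending order, rather than LArray's implementation.

```python
# sorting an axis: labels are put in alphabetical order
gender = ['Male', 'Female']
print(sorted(gender))

# sorting by values: countries are ordered by one slice of the data,
# here made-up numbers standing in for ('Male', 2017)
male_2017 = {'Belgium': 5.6, 'Luxembourg': 0.3, 'Netherlands': 8.5}
country_order = sorted(male_2017, key=male_2017.get)
print(country_order)
```

In the second case the list of labels obtained is the kind of result that a "labels that would sort" query returns: it tells you in which order to take the labels to get sorted values.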

Aligning Arrays

The align method aligns two arrays on their axes using a specified join method. In other words, it ensures that all common axes are compatible.
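
What a join method means for the labels of a common axis can be modeled in a few lines. The `join_labels` function is a hypothetical sketch, not LArray code: it assumes that an 'inner' join keeps the labels found on both sides, includes an 'outer' join (labels found on either side) only for contrast, and uses illustrative country lists. The resulting label order is also an assumption.

```python
def join_labels(left, right, join='inner'):
    """Labels of an aligned axis; hypothetical model, not LArray code."""
    if join == 'inner':
        # keep only the labels present on both sides
        return [label for label in left if label in right]
    if join == 'outer':
        # keep every label found on either side
        return left + [label for label in right if label not in left]
    raise ValueError(f"unsupported join: {join!r}")


benelux = ['Belgium', 'Luxembourg', 'Netherlands']
other = ['Belgium', 'France', 'Germany']     # illustrative labels

print(join_labels(benelux, other))            # ['Belgium']
print(join_labels(benelux, other, 'outer'))
```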


In [ ]:
# get a copy of the 'births' array
births = demography_eurostat.births.copy()

# align the two arrays with the 'inner' join method
population_aligned, births_aligned = population_benelux.align(births, join='inner')

In [ ]:
print('population_benelux before align:')
print(population_benelux)
print()
print('population_benelux after align:')
print(population_aligned)

In [ ]:
print('births before align:')
print(births)
print()
print('births after align:')
print(births_aligned)

Aligned arrays can then be used in arithmetic operations:


In [ ]:
population_aligned - births_aligned

See align for more details and examples.