See #212 for more information.
In [1]:
import numpy as np
In [2]:
import zarr
zarr.__version__
Out[2]:
In [3]:
import numcodecs
numcodecs.__version__
Out[3]:
Creating an object array requires the new object_codec argument:
In [4]:
z = zarr.empty(10, chunks=5, dtype=object, object_codec=numcodecs.MsgPack())
z
Out[4]:
To maintain backwards compatibility with previously-created data, the object codec is treated as a filter and inserted as the first filter in the chain:
In [5]:
z.info
Out[5]:
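The ordering matters because a compressor works on bytes, not on Python objects, so the object codec has to run before anything else when encoding. A minimal sketch of that idea in plain Python, assuming json and zlib from the standard library as stand-ins for the object codec and the compressor; the names `encode_chunk` and `decode_chunk` are hypothetical and this is not zarr's actual implementation:

```python
import json
import zlib


def object_encode(items):
    # Stand-in for an object codec such as MsgPack: objects -> bytes.
    return json.dumps(items).encode("utf-8")


def object_decode(buf):
    # Inverse of object_encode: bytes -> objects.
    return json.loads(buf.decode("utf-8"))


def encode_chunk(items, filters):
    # Filters run in order when encoding, so the object codec must come
    # first to turn objects into bytes for the later stages.
    data = items
    for enc, _ in filters:
        data = enc(data)
    return zlib.compress(data)  # stand-in for the compressor


def decode_chunk(buf, filters):
    # Decoding undoes the compressor first, then the filters in reverse.
    data = zlib.decompress(buf)
    for _, dec in reversed(filters):
        data = dec(data)
    return data


filters = [(object_encode, object_decode)]
chunk = ["foo", 1, [2, 4, 6, "baz"], {"a": "b", "c": "d"}]
stored = encode_chunk(chunk, filters)
restored = decode_chunk(stored, filters)
```

With an empty `filters` list, `encode_chunk` would hand a Python list straight to `zlib.compress`, which fails. This is the kind of situation the later runtime checks guard against.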
In [6]:
z[0] = 'foo'
z[1] = b'bar' # msgpack doesn't support bytes objects correctly
z[2] = 1
z[3] = [2, 4, 6, 'baz']
z[4] = {'a': 'b', 'c': 'd'}
a = z[:]
a
Out[6]:
If no object_codec is provided, a ValueError is raised:
In [7]:
z = zarr.empty(10, chunks=5, dtype=object)
For API backward compatibility, if the object codec is provided via filters, a warning is issued but no error is raised:
In [8]:
z = zarr.empty(10, chunks=5, dtype=object, filters=[numcodecs.MsgPack()])
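The three cases above (codec given, codec missing, codec passed via filters) can be summarised in a small decision function. This is only a sketch under stated assumptions: `resolve_filters`, `FakeObjectCodec`, and the `handles_objects` attribute are hypothetical names invented for illustration, and the warning category and messages are assumptions rather than zarr's confirmed behaviour.

```python
import warnings


class FakeObjectCodec:
    # Hypothetical stand-in for a codec that can encode Python objects.
    handles_objects = True


def resolve_filters(is_object_dtype, filters=None, object_codec=None):
    """Return the filter chain, with the object codec first if given."""
    filters = list(filters or [])
    if not is_object_dtype:
        return filters
    if object_codec is not None:
        # Preferred path: the object codec becomes the first filter.
        return [object_codec] + filters
    if any(getattr(f, "handles_objects", False) for f in filters):
        # Legacy path: codec supplied via filters, so warn but do not fail.
        warnings.warn(
            "object codec passed via filters; use object_codec instead",
            FutureWarning,
        )
        return filters
    # No way to encode objects at all: refuse to create the array.
    raise ValueError("missing object_codec for object array")
```

A caller can observe the legacy path with `warnings.catch_warnings(record=True)` and the failure path with an ordinary `try`/`except ValueError`.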
If a user tries to subvert the system and create an object array with no object codec, a runtime check ensures that no object arrays are passed down to the compressor (which could lead to nasty errors and/or segfaults):
In [9]:
z = zarr.empty(10, chunks=5, dtype=object, object_codec=numcodecs.MsgPack())
z._filters = None # try to live dangerously, manually wipe filters
In [10]:
z[0] = 'foo'
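A write-side guard of this kind can be very small. The sketch below assumes a hypothetical `compress_chunk` helper with zlib standing in for the compressor; the error message is illustrative and not necessarily the one zarr emits.

```python
import zlib

import numpy as np


def compress_chunk(chunk):
    # Refuse object arrays: their buffer holds pointers to Python
    # objects, not data that is meaningful to a compressor.
    if chunk.dtype == object:
        raise RuntimeError("cannot write object array without object codec")
    return zlib.compress(chunk.tobytes())


# A numeric chunk passes straight through to the compressor.
ok = compress_chunk(np.arange(5))

# An object chunk is rejected before the compressor ever sees it.
try:
    compress_chunk(np.array(["foo", 1], dtype=object))
    error = None
except RuntimeError as exc:
    error = str(exc)
```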
Here is another way to subvert the system: wiping the filters after storing some data. To cover this case, a runtime check ensures that no object arrays are handled inappropriately during decoding (which could lead to nasty errors and/or segfaults):
In [11]:
from numcodecs.tests.common import greetings
z = zarr.array(greetings, chunks=5, dtype=object, object_codec=numcodecs.MsgPack())
z[:]
Out[11]:
In [12]:
z._filters = [] # try to live dangerously, manually wipe filters
z[:]
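The read-side check can be sketched in the same spirit: after the filters have run, an array declared with an object dtype must already be an object array, because raw bytes cannot safely be reinterpreted as object pointers. `finish_decode` is a hypothetical helper and the error message is illustrative.

```python
import numpy as np


def finish_decode(decoded, expected_dtype):
    # For object arrays the object codec must already have produced an
    # object ndarray; viewing raw bytes as objects would be unsafe.
    if np.dtype(expected_dtype).kind == "O":
        if isinstance(decoded, np.ndarray) and decoded.dtype == object:
            return decoded
        raise RuntimeError("cannot read object array without object codec")
    # Plain dtypes can be viewed directly from the decoded buffer.
    return np.frombuffer(decoded, dtype=expected_dtype)


raw = np.arange(3, dtype="int64").tobytes()
numbers = finish_decode(raw, "int64")

# Filters wiped: the data is still raw bytes, so decoding is refused.
try:
    finish_decode(raw, object)
    error = None
except RuntimeError as exc:
    error = str(exc)
```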