In [1]:
import json
import jsonschema
simple_schema = {
    "type": "object",
    "properties": {
        "foo": {"type": "string"},
        "bar": {"type": "number"}
    }
}
In [2]:
good_instance = {
    "foo": "hello world",
    "bar": 3.141592653,
}
In [3]:
bad_instance = {
    "foo": 42,
    "bar": "string"
}
In [4]:
# Should succeed
jsonschema.validate(good_instance, simple_schema)
In [5]:
# Should fail
try:
    jsonschema.validate(bad_instance, simple_schema)
except jsonschema.ValidationError as err:
    print(err)
OK, now let's write a traitlets class that does the same thing:
In [6]:
import traitlets as T
class SimpleInstance(T.HasTraits):
    foo = T.Unicode()
    bar = T.Float()
In [7]:
# Should succeed
SimpleInstance(**good_instance)
In [8]:
# Should fail
try:
    SimpleInstance(**bad_instance)
except T.TraitError as err:
    print(err)
- Start by recognizing all simple JSON types in the schema ("string", "number", "integer", "boolean", "null").
- Next, recognize objects containing simple types.
- Next, recognize compound simple types (i.e. where "type" is a list of simple types).
- Next, recognize arrays and enums.
- Next, recognize "$ref" definitions.
- Next, recognize "anyOf", "oneOf", and "allOf" definitions. "anyOf" is essentially a traitlets Union, "oneOf" is a Union where exactly one option must match, and "allOf" is essentially a composite object (not sure if traitlets has that...). Note that among these, Vega-Lite only uses "anyOf".
- Catalog all validation keywords, then implement custom traitlets that support the validation keywords for each type. (The keywords are listed in the JSON Schema validation specification.)
- Use hypothesis for testing?
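The first few roadmap steps can be sketched as a small recursive function that turns a schema fragment into traitlets source code. This is only a rough sketch under some assumptions: the name `trait_code` and the `SIMPLE_TYPES` table are hypothetical, "null", "oneOf", "allOf", "$ref" and nested objects are deliberately left unhandled, and no validation keywords are applied.

```python
# Hypothetical helper: maps a JSON Schema fragment to traitlets source code.
SIMPLE_TYPES = {
    "string": "T.Unicode()",
    "number": "T.Float()",
    "integer": "T.Integer()",
    "boolean": "T.Bool()",
}


def trait_code(schema):
    """Return traitlets source code (as a string) for a schema fragment."""
    if "enum" in schema:
        return "T.Enum({!r})".format(list(schema["enum"]))
    if "anyOf" in schema:
        options = ", ".join(trait_code(sub) for sub in schema["anyOf"])
        return "T.Union([{}])".format(options)

    typ = schema.get("type")
    if isinstance(typ, list):
        # Compound simple type, e.g. {"type": ["string", "number"]}
        options = ", ".join(trait_code({"type": t}) for t in typ)
        return "T.Union([{}])".format(options)
    if typ == "array":
        items = schema.get("items")
        if items is None:
            return "T.List()"
        if not isinstance(items, dict):
            # Tuple-style "items" (a list of schemas) is not covered here.
            raise NotImplementedError("tuple-style items")
        return "T.List({})".format(trait_code(items))
    if typ in SIMPLE_TYPES:
        return SIMPLE_TYPES[typ]
    raise NotImplementedError("unsupported schema: {!r}".format(schema))
```

For example, `trait_code({"type": ["string", "number"]})` gives the string `T.Union([T.Unicode(), T.Float()])`. Anything the sketch does not recognize raises `NotImplementedError`, so gaps show up early instead of producing a silently wrong trait.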
- By default, JSON Schema ignores any keys/properties that are not listed in the schema. Traitlets warns about undefined keys/properties, and in the future will raise an error.
- JSON allows undefined values as well as explicit nulls, which map to None. Traitlets treats None as undefined. How to resolve this? One option is an undefined sentinel within the traitlets structure, so that the code knows when to ignore keys and produces dicts that translate directly to the correct JSON.
- Will probably need to define some custom trait types, e.g. Null, and also extend simple trait types to allow for the more extensive validations allowed in JSON Schema.
- What version of the schema should we target? Perhaps try to target multiple versions?
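The sentinel idea from the list above might look roughly like this. It is a sketch using only the standard library, not a traitlets integration: `UndefinedType`, `undefined`, `to_json_dict` and `from_json_dict` are all hypothetical names, and the second helper also mimics JSON Schema's default behaviour of ignoring unknown keys.

```python
import json


class UndefinedType(object):
    """Hypothetical sentinel meaning 'key absent', distinct from None (JSON null)."""
    _instance = None

    def __new__(cls):
        # Keep a single instance so that `is undefined` checks are reliable.
        if cls._instance is None:
            cls._instance = super(UndefinedType, cls).__new__(cls)
        return cls._instance

    def __repr__(self):
        return "undefined"


undefined = UndefinedType()


def to_json_dict(values):
    """Drop keys whose value is `undefined`; keep explicit None as JSON null."""
    return {key: val for key, val in values.items() if val is not undefined}


def from_json_dict(data, known_keys):
    """Fill absent keys with `undefined` and ignore keys not in the schema."""
    return {key: data.get(key, undefined) for key in known_keys}


state = from_json_dict({"foo": None, "extra": 1}, ["foo", "bar"])
print(state)                            # "foo" is None, "bar" is undefined
print(json.dumps(to_json_dict(state)))  # {"foo": null}
```

The point of the design is that an explicit null survives a round trip, while a key that was never set is omitted from the output.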
Generating a T.HasTraits class
In [9]:
import jinja2

# Template for the generated source: imports first, then one class
# with one trait per schema property.
OBJECT_TEMPLATE = """
{%- for import in cls.imports %}
{{ import }}
{%- endfor %}

class {{ cls.classname }}({{ cls.baseclass }}):
    {%- for (name, prop) in cls.properties.items() %}
    {{ name }} = {{ prop.trait_code }}
    {%- endfor %}
"""


class JSONSchema(object):
    """A class to wrap JSON Schema objects and reason about their contents"""
    object_template = OBJECT_TEMPLATE

    def __init__(self, schema, root=None):
        self.schema = schema
        self.root = root or schema

    @property
    def type(self):
        # TODO: should the default type be considered object?
        return self.schema.get('type', 'object')

    @property
    def trait_code(self):
        # Source code for the trait matching a simple JSON type
        type_dict = {'string': 'T.Unicode()',
                     'number': 'T.Float()',
                     'integer': 'T.Integer()',
                     'boolean': 'T.Bool()'}
        if self.type not in type_dict:
            raise NotImplementedError()
        return type_dict[self.type]

    @property
    def classname(self):
        # TODO: deal with non-root schemas somehow...
        if self.schema is self.root:
            return "RootInstance"
        else:
            raise NotImplementedError("Non-root object schema")

    @property
    def baseclass(self):
        return "T.HasTraits"

    @property
    def imports(self):
        return ["import traitlets as T"]

    @property
    def properties(self):
        # Pass the root along so nested schemas are not mistaken for the root
        return {key: JSONSchema(val, root=self.root)
                for key, val in self.schema.get('properties', {}).items()}

    def object_code(self):
        # Render the template into Python source for a HasTraits class
        return jinja2.Template(self.object_template).render(cls=self)
In [10]:
code = JSONSchema(simple_schema).object_code()
print(code)
In [11]:
exec(code) # defines RootInstance
In [12]:
# Good instance should validate correctly
RootInstance(**good_instance)
In [13]:
# Bad instance should raise a TraitError
try:
    RootInstance(**bad_instance)
except T.TraitError as err:
    print(err)
Seems to work 😀
We'll start with something like this in the package, and then build from there.
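One possible next step is the "$ref" item from the roadmap, which is presumably why the wrapper class keeps a reference to the root schema. The sketch below resolves local references against that root. The function name `resolve_ref` is hypothetical, only references of the form "#/..." are handled, and remote references are out of scope.

```python
def resolve_ref(schema, root):
    """Follow local "$ref" pointers such as "#/definitions/Foo" within root."""
    seen = set()
    while "$ref" in schema:
        ref = schema["$ref"]
        if not ref.startswith("#/"):
            raise NotImplementedError("unsupported reference: " + ref)
        if ref in seen:
            raise ValueError("circular reference: " + ref)
        seen.add(ref)
        schema = root
        for part in ref[2:].split("/"):
            # Undo JSON Pointer escaping: "~1" means "/", "~0" means "~"
            part = part.replace("~1", "/").replace("~0", "~")
            schema = schema[part]
    return schema


root = {
    "definitions": {
        "Name": {"type": "string"},
        "Alias": {"$ref": "#/definitions/Name"},
    },
    "properties": {"foo": {"$ref": "#/definitions/Alias"}},
}
print(resolve_ref(root["properties"]["foo"], root))  # {'type': 'string'}
```

A schema without "$ref" is returned unchanged, so a call like this could sit in front of the type lookup without affecting the simple example above.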