Advanced attributes

ReGraph implements a collection of data structures that represent potentially infinite sets of attribute values, together with the standard set operations on them (union, intersection, inclusion test, etc.):

FiniteSet – wrapper for Python finite sets.
RegexSet – a class for possibly infinite sets of strings given by regular expressions.
IntegerSet – a class for possibly infinite sets of integers defined by a set of disjoint intervals.

In [1]:
from math import inf

import regraph.attribute_sets as atsets

Define an infinite integer set:


In [2]:
ints = atsets.IntegerSet({(0, 8), (11, inf)})

Test if an integer value is in the set:


In [3]:
print(ints.contains(5))
print(ints.contains(9))


True
False
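
The membership test above can be pictured with a minimal plain-Python sketch. It assumes the set is stored as closed (lo, hi) intervals; the helper `contains` is hypothetical and is not ReGraph's implementation.

```python
from math import inf


# Hypothetical helper, not part of ReGraph: each interval is a closed (lo, hi) pair.
def contains(intervals, value):
    """Return True if `value` lies inside at least one closed interval."""
    return any(lo <= value <= hi for lo, hi in intervals)


intervals = [(0, 8), (11, inf)]
print(contains(intervals, 5))  # True
print(contains(intervals, 9))  # False
```

Using `math.inf` as an upper bound works because every integer compares as smaller than `inf`.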

Test if another integer set is a subset:


In [4]:
a = atsets.IntegerSet({(0, 3), (20, 30)})
print(a.issubset(ints))

b = atsets.IntegerSet({(0, 10)})
print(b.issubset(ints))


True
False
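
A subset test on interval-based sets can be sketched as follows. This is an illustration only: `is_subset` is a hypothetical helper, and it assumes the intervals of the larger set are disjoint and not adjacent to each other, so every interval of the smaller set has to fit inside a single interval of the larger one.

```python
from math import inf


# Hypothetical helper, not part of ReGraph.
def is_subset(small, big):
    """True if every closed interval of `small` fits inside one interval of `big`."""
    return all(
        any(b_lo <= s_lo and s_hi <= b_hi for b_lo, b_hi in big)
        for s_lo, s_hi in small
    )


ints = [(0, 8), (11, inf)]
print(is_subset([(0, 3), (20, 30)], ints))  # True
print(is_subset([(0, 10)], ints))           # False: 9 and 10 are missing from ints
```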

Find the intersection of two IntegerSet objects:


In [5]:
a_and_ints = a.intersection(ints)
print(a_and_ints)

b_and_ints = b.intersection(ints)
print(b_and_ints)


[0, 3], [20, 30]
[0, 8]
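
The two results above can be reproduced by hand with a small sketch that intersects every pair of intervals and keeps the non-empty overlaps. The function `intersect` is hypothetical and only illustrates the idea; it says nothing about how ReGraph computes intersections internally.

```python
from math import inf


# Hypothetical helper, not part of ReGraph.
def intersect(xs, ys):
    """Intersect two lists of closed (lo, hi) intervals pair by pair."""
    result = []
    for x_lo, x_hi in xs:
        for y_lo, y_hi in ys:
            lo, hi = max(x_lo, y_lo), min(x_hi, y_hi)
            if lo <= hi:  # the two intervals overlap
                result.append((lo, hi))
    return sorted(result)


ints = [(0, 8), (11, inf)]
print(intersect([(0, 3), (20, 30)], ints))  # [(0, 3), (20, 30)]
print(intersect([(0, 10)], ints))           # [(0, 8)]
```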

Find the union of two IntegerSet objects:


In [6]:
a_or_ints = a.union(ints)
print(a_or_ints)

b_or_ints = b.union(ints)
print(b_or_ints)


[0, 8], [11, inf]
[0, inf]
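
The second result collapses to a single interval because, over the integers, [0, 10] and [11, inf] leave no gap between them. A minimal sketch of such a merge, using a hypothetical `union` helper rather than ReGraph's own code, could look like this:

```python
from math import inf


# Hypothetical helper, not part of ReGraph.
def union(xs, ys):
    """Merge two lists of closed integer intervals, joining overlapping or adjacent ones."""
    merged = []
    for lo, hi in sorted(xs + ys):
        # On integers, (a, b) and (b + 1, c) cover a contiguous range.
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


ints = [(0, 8), (11, inf)]
print(union([(0, 3), (20, 30)], ints))  # [(0, 8), (11, inf)]
print(union([(0, 10)], ints))           # [(0, inf)]
```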

We can also find unions and intersections of integer sets with ordinary Python sets, as long as these sets contain integer values:


In [7]:
# union returns a new set; `a` itself is left unchanged, as the output shows
a.union({13, 14})
print(a)


[0, 3], [20, 30]

The following call raises an AttributeSetError exception, because the set contains a non-integer element:


In [8]:
try:
    a.union({13, 14, "a"})
except Exception as e:
    print("Error message: ", e)
    print("Type: ", type(e))


Error message:  Set '{'a', 13, 14}' contains non-integer element 'a'
Type:  <class 'regraph.exceptions.AttributeSetError'>

Now define some RegexSet objects:


In [9]:
words = atsets.RegexSet("[A-Za-z]+")
# raw strings keep "\d" from being read as a (invalid) string escape
integers = atsets.RegexSet(r"\d+")
alphanums = atsets.RegexSet(r"[A-Za-z\d]+")

Test if strings are matched by the regular expressions defining our RegexSet objects:


In [10]:
print(words.match("42"))
print(integers.match("42"))
print(words.match("hello"))
print(integers.match("hello"))


False
True
True
False
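
For comparison, the same four checks can be written with Python's standard `re` module. This sketch assumes that membership means the whole string matches the pattern, which is consistent with the output above; `matches` is a hypothetical helper, not a ReGraph function.

```python
import re


# Hypothetical helper: a string belongs to the set only if the pattern matches all of it.
def matches(pattern, string):
    return re.fullmatch(pattern, string) is not None


print(matches(r"[A-Za-z]+", "42"))     # False
print(matches(r"\d+", "42"))           # True
print(matches(r"[A-Za-z]+", "hello"))  # True
print(matches(r"\d+", "hello"))        # False
```

With `re.fullmatch`, a string such as "42a" is rejected by `\d+`, whereas `re.match` would accept it because it only anchors the start of the string.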

Test if one regex set is a subset of another:


In [11]:
print(integers.issubset(words))
print(integers.issubset(alphanums))


False
True

Find the intersection of two regex sets:


In [12]:
print(integers.intersection(words))
print(integers.intersection(alphanums))


[]
\d+

Find the union of two regex sets:


In [13]:
print(integers.union(words))


\d+|[A-Za-z]+

Subtract a finite set of strings from a regex set:


In [14]:
print(words.difference({"hi", "bye"}))


([A-Zac-gi-z]|b([A-Za-xz]|y([A-Za-df-z]|e[A-Za-z]))|h([A-Za-hj-z]|i[A-Za-z]))[A-Za-z]*|by?|h

The result may not be very readable, but we can test it as follows:


In [15]:
no_hi_bye = words.difference({"hi", "bye"})
print(no_hi_bye.match("hi"))
print(no_hi_bye.match("bye"))
print(no_hi_bye.match("afternoon"))


False
False
True
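
The printed expression can also be checked independently of ReGraph. The sketch below copies the regular expression from the output above and compares it, on a few sample strings, with the direct definition "matches [A-Za-z]+ and is not one of the excluded words". The constant and function names are hypothetical, and full-string matching is an assumption.

```python
import re

# Regular expression copied from the output of words.difference({"hi", "bye"}).
NO_HI_BYE = (
    r"([A-Zac-gi-z]|b([A-Za-xz]|y([A-Za-df-z]|e[A-Za-z]))"
    r"|h([A-Za-hj-z]|i[A-Za-z]))[A-Za-z]*|by?|h"
)
WORDS = r"[A-Za-z]+"
EXCLUDED = {"hi", "bye"}


def in_difference(s):
    """Reference definition: a word that is not in the excluded set."""
    return re.fullmatch(WORDS, s) is not None and s not in EXCLUDED


def matches_printed(s):
    """Membership according to the printed regular expression."""
    return re.fullmatch(NO_HI_BYE, s) is not None


for s in ["hi", "bye", "afternoon", "h", "by", "hit", "byes"]:
    print(s, matches_printed(s), in_difference(s))
```

Both columns agree on every sample: only "hi" and "bye" are rejected, while their prefixes ("h", "by") and extensions ("hit", "byes") are accepted.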

We can also wrap Python set objects in the FiniteSet class provided by ReGraph.


In [16]:
a = atsets.FiniteSet({1, 2, 3})

This lets us apply to them any set operation from the common interface of ReGraph’s attribute sets. For example:


In [17]:
int_regex = atsets.RegexSet(r"\d+")
# the interval starts at 0, so this is strictly the set of non-negative integers
positive_integers = atsets.IntegerSet([(0, inf)])
print(a.issubset(int_regex))
print(a.issubset(positive_integers))


True
True
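
One plausible way to read the first result is that each element of the finite set is compared with the regular expression through its string form. That is an assumption about the behaviour, not something stated above, and `finite_subset_of_regex` is a hypothetical helper written only to illustrate the idea.

```python
import re


# Hypothetical helper, assuming elements are compared by their string representation.
def finite_subset_of_regex(values, pattern):
    """True if str(v) fully matches `pattern` for every element v."""
    return all(re.fullmatch(pattern, str(v)) is not None for v in values)


print(finite_subset_of_regex({1, 2, 3}, r"\d+"))   # True
print(finite_subset_of_regex({1, -2, 3}, r"\d+"))  # False: "-2" is not all digits
```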

ReGraph provides two special classes of attribute sets, UniversalSet and EmptySet, which are in essence static classes. They implement all standard set-theoretic operations and follow the common interface defined in the base class AttributeSet (as do all the attribute set classes presented above). Consider a couple of examples illustrating the behaviour of UniversalSet and EmptySet:


In [18]:
univ = atsets.UniversalSet()
empty = atsets.EmptySet()
print(univ.union(empty))
print(univ.intersection(empty))


UniversalSet
EmptySet

In [19]:
a = atsets.FiniteSet({1, 2, 3})
print(a.issubset(univ))
print(a.issubset(empty))
print(univ.intersection(a))
print(univ.union(a))


True
False
{1, 2, 3}
UniversalSet
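
The algebra behind these outputs is simple enough to sketch with two tiny stand-in classes. `Universal` and `Empty` below are hypothetical and are not ReGraph's UniversalSet and EmptySet; they only show why the universal set absorbs unions and the empty set absorbs intersections.

```python
class Universal:
    """Hypothetical stand-in for a set that contains every value."""

    def union(self, other):
        return self  # everything is already included

    def intersection(self, other):
        return other  # only the elements of `other` survive

    def __repr__(self):
        return "Universal"


class Empty:
    """Hypothetical stand-in for a set that contains nothing."""

    def union(self, other):
        return other  # adds nothing to `other`

    def intersection(self, other):
        return self  # nothing is shared with any set

    def __repr__(self):
        return "Empty"


univ, empty = Universal(), Empty()
print(univ.union(empty))             # Universal
print(univ.intersection(empty))      # Empty
print(univ.intersection({1, 2, 3}))  # {1, 2, 3}
```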