Primitive generators

This notebook contains tests for tohu's primitive generators.


In [1]:
import tohu
from tohu.v6.primitive_generators import *
from tohu.v6.generator_dispatch import *
from tohu.v6.utils import print_generated_sequence

In [2]:
print(f'Tohu version: {tohu.__version__}')


Tohu version: v0.5.2+316.g62e480a

Constant

Constant returns the same value every time.


In [3]:
g = Constant('quux')

In [4]:
print_generated_sequence(g, num=10, seed=12345)


Generated sequence: quux, quux, quux, quux, quux, quux, quux, quux, quux, quux

Boolean

Boolean returns either True or False. The optional argument p sets the probability of returning True (by default both values are equally likely).


In [5]:
g1 = Boolean()
g2 = Boolean(p=0.8)

In [6]:
print_generated_sequence(g1, num=20, seed=12345)
print_generated_sequence(g2, num=20, seed=99999)


Generated sequence: True, True, False, True, True, True, False, True, True, True, False, True, False, True, False, True, False, True, False, True
Generated sequence: True, True, False, True, True, True, True, False, True, False, True, True, True, True, True, True, True, True, False, True
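The seeded behaviour above can be imitated with the standard library alone. This is only a minimal sketch under the assumption that each value is an independent draw that is True with probability p; the helper name boolean_sequence is hypothetical and this is not tohu's implementation, so the values will not match the output above.

```python
import random

def boolean_sequence(p=0.5, seed=None, num=10):
    """Return `num` booleans, each True with probability `p` (hypothetical helper)."""
    rng = random.Random(seed)  # private RNG, so the global random state is untouched
    return [rng.random() < p for _ in range(num)]

same_a = boolean_sequence(p=0.8, seed=99999, num=20)
same_b = boolean_sequence(p=0.8, seed=99999, num=20)
print(same_a == same_b)  # True: the same seed reproduces the same sequence
```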

Incremental

Incremental returns a sequence of numbers that increase in regular steps.


In [7]:
g = Incremental(start=200, step=4)

In [8]:
print_generated_sequence(g, num=20, seed=12345)


Generated sequence: 200, 204, 208, 212, 216, 220, 224, 228, 232, 236, 240, 244, 248, 252, 256, 260, 264, 268, 272, 276
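For comparison, the same arithmetic progression can be produced with itertools.count. This is a sketch of the idea rather than tohu's implementation; note that no seed is involved because nothing here is random.

```python
from itertools import count, islice

def incremental_sketch(start=1, step=1):
    """Yield start, start + step, start + 2 * step, ... (hypothetical helper)."""
    return count(start, step)

values = list(islice(incremental_sketch(start=200, step=4), 5))
print(values)  # [200, 204, 208, 212, 216]
```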

Integer

Integer returns a random integer between low and high (both inclusive).


In [9]:
g = Integer(low=100, high=200)

In [10]:
print_generated_sequence(g, num=20, seed=12345)


Generated sequence: 102, 164, 118, 185, 182, 124, 149, 158, 100, 160, 162, 179, 145, 109, 122, 196, 197, 141, 147, 106
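To illustrate what "both inclusive" means, here is a small standard-library sketch based on random.randint, which also includes both end points. The helper name integer_sequence is hypothetical, and the values differ from tohu's output above because the underlying random number generator is not the same.

```python
import random

def integer_sequence(low, high, seed=None, num=10):
    """Return `num` random integers x with low <= x <= high (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(num)]

sample = integer_sequence(low=1, high=3, seed=12345, num=1000)
# With a range this small, both end points are expected to show up in the sample.
print(min(sample), max(sample))
```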

Float

Float returns a random float between low and high (both inclusive).


In [11]:
g = Float(low=2.3, high=4.2)

In [12]:
print_generated_sequence(g, num=10, sep='\n', fmt='.12f', seed=12345)


Generated sequence:

3.091577757836
2.319321421968
3.867892367582
2.867415724879
2.999982210028
2.667956563186
3.375415520585
2.607206865466
2.536107080139
3.122578909219
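The sep and fmt arguments used above behave like a separator string and a Python format specification. The following sketch shows one way such a helper could combine them; format_sequence is a hypothetical name, and the assumption that fmt is passed to Python's built-in format() is inferred from the output, not taken from tohu's source.

```python
def format_sequence(items, sep=", ", fmt=""):
    """Format each item with the format spec `fmt` and join the results with `sep`."""
    return sep.join(format(item, fmt) for item in items)

print(format_sequence([2.5, 1.0], sep="\n", fmt=".12f"))
# 2.500000000000
# 1.000000000000
```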

CharString

CharString returns random strings of a given length.


In [13]:
g = CharString(length=15)
print_generated_sequence(g, num=5, seed=12345)
print_generated_sequence(g, num=5, seed=99999)


Generated sequence: bFj7lCDM5eUVwz8, QG5ThX0t5TMklKn, Qule67xq5QaV597, SA4TteJc6OZuDxy, HxzQkefvT0jmCgC
Generated sequence: Ylx3SYjPqrPO0vC, udVUmJ5f2xi6RRv, 8ZYmUYrEgjY5INZ, B9cgzt0nNwfbstm, h84ObqDckapVKgd

It is possible to explicitly specify the character set.


In [14]:
g = CharString(length=12, charset="ABCDEFG")
print_generated_sequence(g, num=5, sep='\n', seed=12345)


Generated sequence:

ADBGBDDEGAFF
CCGEDGFAFFCG
FEBBEBECBAGG
CBGEAFGGGFDG
FCAEAGEFCDCC

There are also a few pre-defined character sets.


In [15]:
g1 = CharString(length=12, charset="<lowercase>")
g2 = CharString(length=12, charset="<alphanumeric_uppercase>")
print_generated_sequence(g1, num=5, sep='\n', seed=12345); print()
print_generated_sequence(g2, num=5, sep='\n', seed=12345)


Generated sequence:

andyelmqybtt
jkzrnytduvhy
tqeepfrifbyz
jgyratyzzslx
sibpayqvimjk

Generated sequence:

ASF8GQRW7C11
NO9YS70E24L7
0WGGVHYMGC78
NJ7YA1798ZP6
0LCUB8X4MRNN
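One way to support both explicit and pre-defined character sets is a lookup table that falls back to the literal string. The sketch below assumes that "<lowercase>" means a-z and "<alphanumeric_uppercase>" means A-Z plus digits, which is consistent with the output above but not confirmed against tohu's source; CHARSETS and char_string are hypothetical names.

```python
import random
import string

# Assumed meaning of the two aliases used above.
CHARSETS = {
    "<lowercase>": string.ascii_lowercase,
    "<alphanumeric_uppercase>": string.ascii_uppercase + string.digits,
}

def char_string(length, charset, rng):
    """Return a random string of `length` characters drawn from `charset`."""
    alphabet = CHARSETS.get(charset, charset)  # unknown names are used literally
    return "".join(rng.choice(alphabet) for _ in range(length))

rng = random.Random(12345)
s1 = char_string(12, "<lowercase>", rng)
s2 = char_string(12, "ABCDEFG", rng)
print(len(s1), set(s2) <= set("ABCDEFG"))  # 12 True
```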

DigitString

DigitString is the same as CharString with charset='0123456789'.


In [16]:
g = DigitString(length=15)
print_generated_sequence(g, num=5, seed=12345)
print_generated_sequence(g, num=5, seed=99999)


Generated sequence: 051914469077349, 659717839761152, 631099329607999, 749730509683433, 534610037812414
Generated sequence: 813878162266834, 307715908319673, 988278241189568, 490143826300232, 199602401027500

Sequential

Sequential generates a sequence of sequentially numbered strings with a given prefix. The digits argument sets the width to which the number is zero-padded.


In [17]:
g = Sequential(prefix='Foo_', digits=3)

Calling reset() on the generator makes the numbering start from 1 again.


In [18]:
g.reset()
print_generated_sequence(g, num=5)
print_generated_sequence(g, num=5)
print()
g.reset()
print_generated_sequence(g, num=5)


Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005
Generated sequence: Foo_006, Foo_007, Foo_008, Foo_009, Foo_010

Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005

Note that Sequential.reset() accepts the seed argument for consistency with other generators, but its value is ignored; the generator is simply reset to its initial value. This is illustrated here:


In [19]:
g.reset(seed=12345); print_generated_sequence(g, num=5)
g.reset(seed=99999); print_generated_sequence(g, num=5)


Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005
Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005
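The reset semantics described above can be sketched with a small counter class. SequentialSketch is a hypothetical stand-in written for illustration; it only mirrors the observable behaviour (zero-padded numbering, a reset() that accepts and ignores seed) and is not tohu's code.

```python
class SequentialSketch:
    """Yield prefix + zero-padded counter: Foo_001, Foo_002, ..."""

    def __init__(self, prefix, digits):
        self.prefix = prefix
        self.digits = digits
        self.reset()

    def reset(self, seed=None):
        # `seed` is accepted for interface consistency only and is ignored.
        self.counter = 0
        return self

    def __iter__(self):
        return self

    def __next__(self):
        self.counter += 1
        return f"{self.prefix}{self.counter:0{self.digits}d}"

g = SequentialSketch(prefix="Foo_", digits=3)
first = [next(g) for _ in range(3)]
g.reset(seed=99999)
again = [next(g) for _ in range(3)]
print(first, again == first)  # ['Foo_001', 'Foo_002', 'Foo_003'] True
```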

HashDigest

HashDigest returns hex strings representing hash digest values (or alternatively raw bytes).

HashDigest hex strings (uppercase)


In [20]:
g = HashDigest(length=6)

In [21]:
print_generated_sequence(g, num=10, seed=12345)


Generated sequence: E251FB, E52DE1, 1DFDFD, 810876, A44D15, A9AD2D, FE0F5E, 7E5191, 656D56, 224236

HashDigest hex strings (lowercase)


In [22]:
g = HashDigest(length=6, uppercase=False)

In [23]:
print_generated_sequence(g, num=10, seed=12345)


Generated sequence: e251fb, e52de1, 1dfdfd, 810876, a44d15, a9ad2d, fe0f5e, 7e5191, 656d56, 224236

HashDigest byte strings


In [24]:
g = HashDigest(length=10, as_bytes=True)

In [25]:
print_generated_sequence(g, num=5, seed=12345, sep='\n')


Generated sequence:

b'\xe2Q\xfb\xed\xe5-\xe1\xe3\x1d\xfd'
b'\x81\x08v!\xa4M\x15/\xa9\xad'
b'\xfe\x0f^4~Q\x91\xd3em'
b'"B6\x88\x1d\x9eu\x98\x01\xbb'
b'vl\xea\xf6q\xcd@v;\x9d'
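The three output modes (uppercase hex, lowercase hex, raw bytes) can be sketched by generating random bytes and encoding them differently. The function hash_digest_sketch is hypothetical, and the way it draws bytes is an assumption; tohu's generator may consume its random stream differently, so the values will not match those above.

```python
import random

def hash_digest_sketch(length, rng, uppercase=True, as_bytes=False):
    """Return `length` random bytes, or a hex string of `length` characters."""
    if as_bytes:
        return bytes(rng.getrandbits(8) for _ in range(length))
    n_bytes = (length + 1) // 2  # two hex characters per byte, rounded up
    hex_string = bytes(rng.getrandbits(8) for _ in range(n_bytes)).hex()[:length]
    return hex_string.upper() if uppercase else hex_string

upper = hash_digest_sketch(6, random.Random(12345))
lower = hash_digest_sketch(6, random.Random(12345), uppercase=False)
raw = hash_digest_sketch(10, random.Random(12345), as_bytes=True)
print(upper.lower() == lower, len(raw))  # True 10
```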

NumpyRandomGenerator

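NumpyRandomGenerator produces random numbers using any of the random number generators supported by numpy. The method argument names the numpy method to call, and the remaining keyword arguments are passed on to it.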


In [26]:
g1 = NumpyRandomGenerator(method="normal", loc=3.0, scale=5.0)
g2 = NumpyRandomGenerator(method="poisson", lam=30)
g3 = NumpyRandomGenerator(method="exponential", scale=0.3)

In [27]:
g1.reset(seed=12345); print_generated_sequence(g1, num=4)
g2.reset(seed=12345); print_generated_sequence(g2, num=15)
g3.reset(seed=12345); print_generated_sequence(g3, num=4)


Generated sequence: 1.9764617025764353, 5.394716690287741, 0.40280642471630923, 0.22134847826254989
Generated sequence: 40, 24, 31, 34, 27, 32, 29, 29, 35, 38, 30, 32, 38, 36, 36
Generated sequence: 0.7961371899305246, 0.11410397056571128, 0.060972430042086474, 0.06865806254932436
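Dispatching on a method name can be sketched with getattr. To keep the example free of third-party dependencies it uses Python's random.Random as a stand-in for numpy, so the method and parameter names (normalvariate, expovariate, mu, sigma, lambd) are those of the standard library, not numpy's. RandomMethodSketch is a hypothetical class and not tohu's implementation.

```python
import random

class RandomMethodSketch:
    """Call a named sampling method of random.Random with fixed keyword arguments."""

    def __init__(self, method, **kwargs):
        self.rng = random.Random()
        if not callable(getattr(self.rng, method, None)):
            raise ValueError(f"Unknown sampling method: {method!r}")
        self.method = method
        self.kwargs = kwargs

    def reset(self, seed=None):
        self.rng.seed(seed)
        return self

    def __iter__(self):
        return self

    def __next__(self):
        return getattr(self.rng, self.method)(**self.kwargs)

g1 = RandomMethodSketch("normalvariate", mu=3.0, sigma=5.0)
g3 = RandomMethodSketch("expovariate", lambd=1 / 0.3)  # rate = 1 / scale

g3.reset(seed=12345)
first = [next(g3) for _ in range(4)]
g3.reset(seed=12345)
second = [next(g3) for _ in range(4)]
print(first == second)  # True: resetting with the same seed repeats the values
```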

FakerGenerator

FakerGenerator gives access to any of the methods supported by the faker module. Here are a couple of examples.

Example: random names


In [28]:
g = FakerGenerator(method='name')

In [29]:
print_generated_sequence(g, num=8, seed=12345)


Generated sequence: Adam Bryan, Jacob Lee, Candice Martinez, Justin Thompson, Heather Rubio, William Jenkins, Brittany Ball, Glenn Johnson

Example: random addresses


In [30]:
g = FakerGenerator(method='address')

In [31]:
print_generated_sequence(g, num=8, seed=12345, sep='\n---\n')


Generated sequence:

453 Ryan Islands
Greenstad, FL 97251
---
USS Irwin
FPO AA 66552
---
55075 William Rest
North Elizabeth, NH 38062
---
926 Alexandra Road
Romanberg, HI 99597
---
8202 Michelle Branch
Baileyborough, AL 08481
---
205 William Coves
Alexanderport, WI 72565
---
821 Patricia Hill Apt. 242
Apriltown, MO 24730
---
486 Karen Lodge Apt. 205
West Gregory, MT 33130

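Timestamp

Timestamp returns random datetime.datetime values between start and end. Calling strftime() on the generator yields formatted strings instead.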


In [32]:
g = Timestamp(start="2018-01-01 11:22:33", end="2018-02-13 12:23:34")

In [33]:
type(next(g))


Out[33]:
datetime.datetime

In [34]:
print_generated_sequence(g, num=10, seed=12345, sep='\n')


Generated sequence:

2018-01-02 13:07:20
2018-01-26 00:51:38
2018-02-10 08:57:07
2018-01-08 16:01:04
2018-02-02 22:03:09
2018-02-01 18:16:12
2018-01-10 19:21:16
2018-01-20 02:51:48
2018-01-23 19:07:20
2018-01-01 17:56:48

In [35]:
g = Timestamp(start="2018-01-01 11:22:33", end="2018-02-13 12:23:34").strftime("%-d %b %Y, %H:%M (%a)")

In [36]:
type(next(g))


Out[36]:
str

In [37]:
print_generated_sequence(g, num=10, seed=12345, sep='\n')


Generated sequence:

2 Jan 2018, 13:07 (Tue)
26 Jan 2018, 00:51 (Fri)
10 Feb 2018, 08:57 (Sat)
8 Jan 2018, 16:01 (Mon)
2 Feb 2018, 22:03 (Fri)
1 Feb 2018, 18:16 (Thu)
10 Jan 2018, 19:21 (Wed)
20 Jan 2018, 02:51 (Sat)
23 Jan 2018, 19:07 (Tue)
1 Jan 2018, 17:56 (Mon)
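Sampling a timestamp between two bounds can be sketched by drawing a random offset in whole seconds. The helper random_timestamp is hypothetical and the values will differ from tohu's. The sketch also builds the day number from ts.day rather than "%-d", because "%-d" is a platform-specific strftime extension that may not be available everywhere (Windows, for instance).

```python
import random
from datetime import datetime, timedelta

def random_timestamp(start, end, rng):
    """Return a random datetime between `start` and `end` at one-second resolution."""
    fmt = "%Y-%m-%d %H:%M:%S"
    start_dt = datetime.strptime(start, fmt)
    end_dt = datetime.strptime(end, fmt)
    span = int((end_dt - start_dt).total_seconds())
    return start_dt + timedelta(seconds=rng.randint(0, span))

rng = random.Random(12345)
ts = random_timestamp("2018-01-01 11:22:33", "2018-02-13 12:23:34", rng)

# Portable equivalent of "%-d %b %Y, %H:%M (%a)": day of month without zero padding.
label = f"{ts.day} " + ts.strftime("%b %Y, %H:%M (%a)")
print(label)
```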

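Date

Date returns random datetime.date values between start and end. As with Timestamp, calling strftime() on the generator yields formatted strings instead.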


In [38]:
g = Date(start="2018-01-01", end="2018-02-13")

In [39]:
type(next(g))


Out[39]:
datetime.date

In [40]:
print_generated_sequence(g, num=10, seed=12345, sep='\n')


Generated sequence:

2018-01-02
2018-02-02
2018-01-10
2018-02-12
2018-02-11
2018-01-13
2018-01-25
2018-01-30
2018-01-01
2018-01-31

In [41]:
g = Date(start="2018-01-01", end="2018-02-13").strftime("%-d %b %Y")

In [42]:
type(next(g))


Out[42]:
str

In [43]:
print_generated_sequence(g, num=10, seed=12345, sep='\n')


Generated sequence:

2 Jan 2018
25 Jan 2018
9 Feb 2018
8 Jan 2018
2 Feb 2018
1 Feb 2018
10 Jan 2018
19 Jan 2018
23 Jan 2018
1 Jan 2018
