Domain and auxiliary classes (KV, Option, ConfigAlias) are used to define combinations of parameters to try in Research.
We start with some useful imports and helper definitions.
In [1]:
import sys
import os
import shutil
import matplotlib
%matplotlib inline
In [2]:
sys.path.append('../../..')
from batchflow import NumpySampler as NS
from batchflow.research import KV, Option, Domain
In [3]:
def drop_repetition(config_alias):
    res = []
    for item in config_alias:
        item.pop_alias('repetition')
        res.append(item)
    return res
Option is a class that holds a parameter name and the values that will be used in a Research. The values of an Option can be given as an array or as a Sampler. An Option can be easily transformed into a Domain in order to construct an iterator that produces configs.
In [4]:
domain = Domain(Option('p', ['v1', 'v2']))
Each instance of the Domain class has an iterator attribute: a generator that produces configs from the domain.
In [5]:
list(domain.iterator)
Out[5]:
Each item is a ConfigAlias: a wrapper for Config with the methods config and alias. The config method returns the wrapped Config, and the alias method returns the corresponding dict with str representations of the values.
To set or reset the iterator, use the set_iter method. It also accepts some parameters that will be described below.
If you access the iterator attribute without calling set_iter first, set_iter will be called with default parameters.
In [6]:
domain.set_iter()
config = next(domain.iterator)
config.config(), config.alias()
Out[6]:
An alias is a str representation of a domain value. Aliases are needed because they are used as folder names, and because they give a more readable representation of configs with non-string values.
By default, the alias is the __name__ attribute of the value or, if there is none, its str representation. You can define a custom alias by using the KV class.
In [7]:
domain = Domain(Option('p', [ KV('v1', 'alias'), NS]))
config = next(domain.iterator)
print('alias: {:14} value: {}'.format(config.alias()['p'], config.config()['p']))
config = next(domain.iterator)
print('alias: {:14} value: {}'.format(config.alias()['p'], config.config()['p']))
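The default alias rule described above can be illustrated with a small stand-alone sketch. Note that `make_alias` and `DummySampler` are hypothetical names written for this example and are not part of batchflow; the sketch only mirrors the rule as stated (use `__name__` when the value has one, otherwise fall back to `str`).

```python
def make_alias(value):
    """Return a short string label for a value.

    Hypothetical helper for illustration only, not batchflow API.
    """
    # Classes and functions carry a __name__; plain values fall back to str().
    return getattr(value, '__name__', str(value))


class DummySampler:
    """Stand-in for a class-valued parameter (assumption for this example)."""


print(make_alias(DummySampler))  # DummySampler
print(make_alias('v1'))          # v1
print(make_alias(0.5))           # 0.5
```

A rule like this keeps folder names readable for class-valued parameters, while KV remains the way to override the label explicitly.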
You can set the number of times each item of the domain is produced with the n_reps parameter of set_iter. Each produced ConfigAlias will have a 'repetition' key.
In [8]:
domain.set_iter(n_reps=2)
list(domain.iterator)
Out[8]:
You can also use the n_iters parameter to set the number of configs that will be produced from the Domain. By default it is equal to the actual number of unique elements.
In [9]:
domain.set_iter(n_iters=3, n_reps=2)
list(domain.iterator)
Out[9]:
In [10]:
domain = Option('p1', ['v1', 'v2']) * Option('p2', ['v3', 'v4'])
drop_repetition(domain.iterator)
Out[10]:
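Assuming that `*` between options builds every combination of their values (the usual Cartesian product), the result of the cell above can be pictured with plain Python. The function `cartesian` and the `(name, values)` tuples are assumptions made for this sketch; batchflow's own implementation is not shown here.

```python
from itertools import product


def cartesian(*options):
    """Yield one dict per combination of values.

    `options` are (name, values) pairs; a hypothetical stand-in for Option.
    """
    names = [name for name, _ in options]
    for combo in product(*(values for _, values in options)):
        yield dict(zip(names, combo))


configs = list(cartesian(('p1', ['v1', 'v2']), ('p2', ['v3', 'v4'])))
for config in configs:
    print(config)
```

Two options with two values each give four configs in this sketch.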
In [11]:
domain = Option('p1', ['v1', 'v2']) + Option('p2', ['v3', 'v4'])
drop_repetition(domain.iterator)
Out[11]:
The @ operator gives the scalar product of options: values of the options are paired by position.
In [12]:
op1 = Option('p1', ['v1', 'v2'])
op2 = Option('p2', ['v3', 'v4'])
op3 = Option('p3', ['v5', 'v6'])
domain = op1 @ op2 @ op3
drop_repetition(domain.iterator)
Out[12]:
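The scalar product can be sketched in plain Python with `zip`, under the assumption that it pairs the i-th value of every option. The name `scalar_product` and the `(name, values)` tuples are hypothetical and only serve this illustration.

```python
def scalar_product(*options):
    """Pair values of all options by position.

    `options` are (name, values) pairs; a hypothetical stand-in for Option.
    """
    names = [name for name, _ in options]
    for combo in zip(*(values for _, values in options)):
        yield dict(zip(names, combo))


paired = list(scalar_product(
    ('p1', ['v1', 'v2']),
    ('p2', ['v3', 'v4']),
    ('p3', ['v5', 'v6']),
))
print(paired)
# [{'p1': 'v1', 'p2': 'v3', 'p3': 'v5'}, {'p1': 'v2', 'p2': 'v4', 'p3': 'v6'}]
```

Unlike the Cartesian product, the number of configs here equals the length of a single option, not the product of the lengths.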
You can also combine these operations, because each of them can be applied to the resulting domains.
In [13]:
op1 = Option('p1', ['v1', 'v2'])
op2 = Option('p2', ['v3', 'v4'])
op3 = Option('p3', list(range(2)))
op4 = Option('p4', list(range(3, 5)))
domain = (op1 @ op2 + op3) * op4
drop_repetition(domain.iterator)
Out[13]:
The size attribute returns the size of the resulting domain.
In [14]:
print(domain.size)
Note that size is the total number of produced configs. For example, if you have one Option with three values and call set_iter with n_iters=5 and n_reps=2, then the size will be 10.
In [15]:
domain = Domain(Option('p1', list(range(3))))
domain.set_iter(n_iters=5, n_reps=2)
domain.size
Out[15]:
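The arithmetic behind the size of a finite, array-like domain can be written down as a short sketch. `expected_size` is a hypothetical helper based only on the description above (n_iters defaults to the number of unique elements, and every config is produced n_reps times); it does not call batchflow.

```python
def expected_size(n_values, n_iters=None, n_reps=1):
    """Total number of produced configs for a finite domain.

    Hypothetical helper: assumes n_iters falls back to the number of
    unique elements and that each config is repeated n_reps times.
    """
    if n_iters is None:
        n_iters = n_values
    return n_iters * n_reps


print(expected_size(3, n_iters=5, n_reps=2))  # 10
print(expected_size(2))                       # 2
```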
Instead of array-like values you can use a Sampler instance as the Option value. The iterator will then produce independent samples from the domain.
In [16]:
domain = Domain(Option('p1', NS('n')))
domain.set_iter(n_iters=3)
drop_repetition(domain.iterator)
Out[16]:
If n_reps > 1 then samples will be repeated.
In [17]:
domain.set_iter(n_iters=3, n_reps=2)
list(domain.iterator)
Out[17]:
If set_iter is called with n_iters=None, the resulting iterator will be infinite.
In [18]:
domain.set_iter(n_iters=None)
print('size: ', domain.size)
for _ in range(5):
print(next(domain.iterator))
The repeat_each parameter defines how often elements from an infinite generator are repeated (by default, repeat_each=100).
In [19]:
domain.set_iter(n_iters=None, n_reps=2, repeat_each=2)
print('Domain size: {} \n'.format(domain.size))
for _ in range(8):
print(next(domain.iterator))
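One possible reading of repeat_each is that samples are drawn in blocks of repeat_each items and each block is replayed n_reps times before new samples are drawn. The sketch below implements exactly that reading with a plain generator. The function name, the `draw` callable and the dict layout are assumptions, and the actual ordering in batchflow may differ.

```python
import random
from itertools import islice


def infinite_configs(draw, n_reps=1, repeat_each=100):
    """Endless stream of configs, replayed block by block.

    Hypothetical sketch: `draw` is any zero-argument callable that
    returns a fresh sample.
    """
    while True:
        # Draw a block of fresh samples ...
        block = [draw() for _ in range(repeat_each)]
        # ... and replay the whole block once per repetition.
        for repetition in range(n_reps):
            for value in block:
                yield {'p1': value, 'repetition': repetition}


rng = random.Random(0)
stream = infinite_configs(rng.random, n_reps=2, repeat_each=2)
first8 = list(islice(stream, 8))
for config in first8:
    print(config)
```

With n_reps=2 and repeat_each=2, the first four configs in this sketch are two samples followed by the same two samples with repetition 1, and only then are new samples drawn.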
If you multiply array-like options by sampler options, the resulting iterator will produce combinations of the array-like values, each paired with independent samples from the sampler options.
In [20]:
domain = Option('p1', NS('n')) * Option('p2', NS('u')) * Option('p3', [1, 2, 3])
drop_repetition(domain.iterator)
Out[20]:
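The idea of pairing each array-like value with fresh, independent samples can be sketched with the standard `random` module. This is only an illustration: `mixed_product` is a hypothetical function, and the choice of a standard normal for p1 and a uniform on [0, 1) for p2 is an assumption about what `NS('n')` and `NS('u')` stand for.

```python
import random


def mixed_product(samplers, name, values, rng):
    """Combine each array-like value with independent draws.

    Hypothetical sketch: `samplers` maps a parameter name to a function
    that takes a random.Random instance and returns one sample.
    """
    for value in values:
        # Every config gets its own independent draw from each sampler.
        config = {key: draw(rng) for key, draw in samplers.items()}
        config[name] = value
        yield config


rng = random.Random(42)
samplers = {
    'p1': lambda r: r.gauss(0, 1),  # assumed analogue of NS('n')
    'p2': lambda r: r.random(),     # assumed analogue of NS('u')
}
mixed = list(mixed_product(samplers, 'p3', [1, 2, 3], rng))
for config in mixed:
    print(config)
```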
By default, configs are produced consecutively from the options in a sum, from left to right.
In [21]:
op1 = Option('p1', ['v1', 'v2'])
op2 = Option('p2', ['v3', 'v4'])
op3 = Option('p3', ['v5', 'v6'])
domain = op1 + op2 + op3
drop_repetition(domain.iterator)
Out[21]:
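The left-to-right behaviour of an unweighted sum corresponds to simple concatenation. A minimal sketch, assuming each option contributes configs with only its own parameter; `sum_options` and the `(name, values)` tuples are hypothetical names for this example.

```python
def sum_options(*options):
    """Yield configs option by option, from left to right.

    `options` are (name, values) pairs; a hypothetical stand-in for Option.
    """
    for name, values in options:
        for value in values:
            yield {name: value}


summed = list(sum_options(
    ('p1', ['v1', 'v2']),
    ('p2', ['v3', 'v4']),
    ('p3', ['v5', 'v6']),
))
print(summed)
# [{'p1': 'v1'}, {'p1': 'v2'}, {'p2': 'v3'}, {'p2': 'v4'}, {'p3': 'v5'}, {'p3': 'v6'}]
```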
To sample options from a sum independently with given probabilities, you can multiply the corresponding options by a float.
In [22]:
domain = 0.3 * op1 + 0.2 * op2 + 0.5 * op3
drop_repetition(domain.iterator)
Out[22]:
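A weighted sum can be pictured as repeatedly choosing which option produces the next config. The sketch below assumes that, at each step, one of the not-yet-exhausted options is picked with probability proportional to its weight. That assumption, the function `weighted_sum` and the `(weight, name, values)` tuples are all illustrative, and the procedure batchflow actually uses may differ.

```python
import random


def weighted_sum(weighted_options, rng):
    """Interleave configs from several options at random.

    Hypothetical sketch: `weighted_options` is a list of
    (weight, name, values) tuples, `rng` a random.Random instance.
    """
    pending = [(w, name, list(values)) for w, name, values in weighted_options]
    while pending:
        weights = [w for w, _, _ in pending]
        # Pick the option that produces the next config.
        idx = rng.choices(range(len(pending)), weights=weights)[0]
        _, name, values = pending[idx]
        yield {name: values.pop(0)}
        # Drop the option once all of its values have been produced.
        if not values:
            pending.pop(idx)


sampled = list(weighted_sum(
    [(0.3, 'p1', ['v1', 'v2']),
     (0.2, 'p2', ['v3', 'v4']),
     (0.5, 'p3', ['v5', 'v6'])],
    random.Random(0),
))
for config in sampled:
    print(config)
```

Every value is still produced exactly once in this sketch; the weights only influence the order in which options are visited.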
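If you sum options with and without weights, they are split into consecutive groups of weighted and unweighted options. Configs from unweighted options are produced in order, and configs from each weighted group are sampled as described above.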
In [23]:
domain = op1 + 1.0 * op2 + 1.0 * op3
drop_repetition(domain.iterator)
Out[23]:
Thus, we first get all configs from op1, and then configs sampled with equal probability from op2 and op3. If one weight is much larger than the others, all samples from the corresponding option will most likely be produced first.
In [24]:
domain = op1 + 1.0 * op2 + 100.0 * op3
drop_repetition(domain.iterator)
Out[24]:
Consider a more complicated case. The options will be processed in the following consecutive groups:
options[0]
1.2 * options[1] + 2.3 * options[2]
options[3]
1.7 * options[4] + 3.4 * options[5]
In [25]:
options = [Option('p'+str(i), ['v'+str(i)]) for i in range(6)]
domain = options[0] + 1.2 * options[1] + 2.3 * options[2] + options[3] + 1.7 * options[4] + 3.4 * options[5]
domain.set_iter(12)
drop_repetition(domain.iterator)
Out[25]:
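The split into consecutive groups can be reproduced with `itertools.groupby`. In this sketch a term is a `(weight, label)` pair with `weight=None` for an unweighted option. The representation and the name `group_terms` are assumptions made for illustration and say nothing about how batchflow stores a sum internally.

```python
from itertools import groupby


def group_terms(terms):
    """Split a sum into consecutive weighted / unweighted runs.

    Hypothetical sketch: returns a list of (is_weighted, labels) tuples.
    """
    groups = []
    for is_weighted, run in groupby(terms, key=lambda term: term[0] is not None):
        groups.append((is_weighted, [label for _, label in run]))
    return groups


terms = [
    (None, 'options[0]'),
    (1.2, 'options[1]'),
    (2.3, 'options[2]'),
    (None, 'options[3]'),
    (1.7, 'options[4]'),
    (3.4, 'options[5]'),
]
groups = group_terms(terms)
for is_weighted, labels in groups:
    print('sampled' if is_weighted else 'in order', labels)
```

Under this reading, unweighted runs are iterated in order and each weighted run is sampled on its own before the next group starts.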