In [1]:
import sys
sys.path.append('..')
Vocabularies are found in the submodule ngvoc.
For the moment, "matrix" and "sparse_matrix" are available. Both have a fixed size (M, W); the only difference is how the information is stored (a NumPy matrix versus a SciPy sparse matrix, scipy.sparse.lil_matrix).
To define a vocabulary type, the following things are needed:
You can add a visual method to visualize some aspects of your vocabulary. All those methods can be inherited from sparse_matrix.VocSparseMatrix and matrix.VocMatrix.
If you want to design a new Vocabulary class, just add it to the ngvoc folder, in a *.py file. An example is given in TEST.py.
To call it in your experiments, simply pass as voc_cfg a configuration dict containing the key/value pair 'voc_type':'$pyfile.$classname'. You can use any other keys you need; they are passed directly to the __init__ of your class.
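As an illustration of how a 'voc_type' string of the form '$pyfile.$classname' could be resolved, here is a minimal sketch. It is not the actual code of ngvoc.Vocabulary: build_vocabulary, VocDemo and the in-memory module demo_voc are hypothetical names, introduced only so that the example runs on its own.

```python
import importlib
import sys
import types


def build_vocabulary(voc_type, **voc_cfg):
    """Hypothetical helper: resolve '<pyfile>.<classname>' and instantiate the class.

    Every key other than 'voc_type' is forwarded to the class __init__.
    """
    module_name, class_name = voc_type.rsplit('.', 1)
    module = importlib.import_module(module_name)
    voc_class = getattr(module, class_name)
    return voc_class(**voc_cfg)


class VocDemo:
    """Hypothetical vocabulary class that only stores its configuration."""

    def __init__(self, **voc_cfg):
        self.voc_cfg = voc_cfg


# Stand-in for a file in the ngvoc folder: a module built in memory and
# registered in sys.modules so that import_module can find it.
demo_module = types.ModuleType('demo_voc')
demo_module.VocDemo = VocDemo
sys.modules['demo_voc'] = demo_module

voc = build_vocabulary(voc_type='demo_voc.VocDemo', testkey='Test', number=12)
print(type(voc).__name__, voc.voc_cfg)
```

The point of the pattern is that the configuration dict alone selects the class, so a new vocabulary type can be tried without changing the experiment code.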
You should use the __init__ of the parent class, via super(), so do not forget to give the appropriate arguments.
Let's test the vocabulary class VocTest, in TEST.py. The class is a child of VocMatrix, uses its __init__ with M = number/2 and W = 15 (constant). The test method prints the 'testkey' key-value pair in the voc_cfg object. Feel free to test other modifications.
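The description above can be sketched as follows. This is not the content of TEST.py: VocMatrixStandIn is a hypothetical stand-in for matrix.VocMatrix (whose real signature is not shown here), VocTestSketch is an illustrative name, and integer division is an assumption made so that M stays an integer.

```python
class VocMatrixStandIn:
    """Hypothetical stand-in for matrix.VocMatrix: a fixed-size (M, W) table."""

    def __init__(self, M, W, **voc_cfg):
        self.M = M
        self.W = W
        self.voc_cfg = voc_cfg
        # M rows (meanings) by W columns (words), all initialised to zero.
        self.content = [[0] * W for _ in range(M)]


class VocTestSketch(VocMatrixStandIn):
    """Child class that reuses the parent __init__ through super()."""

    def __init__(self, number, **voc_cfg):
        # M is derived from 'number' (assumed integer division), W is constant.
        super().__init__(M=number // 2, W=15, **voc_cfg)

    def test(self):
        # Print the 'testkey' key-value pair found in the configuration.
        value = self.voc_cfg['testkey']
        print('testkey', value)
        return value


voc_sketch = VocTestSketch(number=12, testkey='Test')
voc_sketch.test()
```

Because the child forwards the remaining keys to the parent, extra configuration entries such as 'testkey' stay available on the instance without the parent having to know about them.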
In [2]:
from lib import ngvoc
In [3]:
voc_cfg = {
'voc_type':'TEST.VocTest',
'testkey':'Test',
'number':12
}
In [4]:
voc_test = ngvoc.Vocabulary(**voc_cfg)
In [5]:
voc_test
Out[5]:
In [6]:
voc_test.test()
In [7]:
print(voc_test)
We need to define different things for this: