Initialization


In [1]:
import monk.core.api as ms
from monk.roles.configuration import default_config

In [2]:
config=default_config()


[2015-03-27 23:25:32,696][19717][monk.roles.configuration][INFO    ][93  ][configuration.py] : configuration done

In [3]:
ms.initialize(default_config())


[2015-03-27 23:25:34,661][19717][monk.roles.configuration][INFO    ][93  ][configuration.py] : configuration done
[2015-03-27 23:25:34,661][19717][monk.api    ][INFO    ][40  ][api.py  ] : ------start up------
[2015-03-27 23:25:34,668][19717][monk.uid    ][INFO    ][32  ][uid.py  ] : initializing uid store
[2015-03-27 23:25:34,669][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing Seattle 
[2015-03-27 23:25:34,671][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing UserStore 
[2015-03-27 23:25:34,675][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing EngineStore 
[2015-03-27 23:25:34,677][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing PandaStore 
[2015-03-27 23:25:34,679][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing MantisStore 
[2015-03-27 23:25:34,679][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing TurtleStore 
[2015-03-27 23:25:34,680][19717][monk.crane  ][INFO    ][61  ][crane.py] : initializing TigressStore 
Out[3]:
True

Convert entities to MONK objects

convert_entities(collectionName=None)

Parameters:

  • collectionName: the name of the collection.

In [4]:
ents = ms.convert_entities()

An entity

  • _raws : stores temporary features for MONK. Keys are strings; values can be any object.
  • _features : stores MONK's internal feature representation.
  • creator, createdTime, lastModified : fields that keep version information.
  • monkType : the type defined in MONK.
  • generic : converts a MONKObject to a JSON blob.
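
To make the field list above concrete, here is a minimal sketch of an object that carries these fields and exposes a generic() method. SketchEntity is a hypothetical stand-in written for illustration; it is not MONK's own class, and the real generic() may do more (for example, recursive conversion).

```python
import datetime


class SketchEntity(object):
    """Hypothetical stand-in for a MONK entity; not the library's class."""

    def __init__(self, creator, **fields):
        now = datetime.datetime.now()
        self._raws = {}        # temporary features: string keys, arbitrary values
        self._features = []    # internal representation, e.g. (index, value) pairs
        self.creator = creator
        self.createdTime = now
        self.lastModified = now
        self.monkType = 'Entity'
        self.__dict__.update(fields)  # user-supplied fields such as title or comment

    def generic(self):
        # Return a shallow dict copy of the attributes, ready for serialization.
        return dict(self.__dict__)


ent = SketchEntity('monk', title='Yogurtland', comment='I love all those options.')
blob = ent.generic()
```

The resulting blob mixes bookkeeping fields (creator, monkType, ...) with the user-supplied ones, which is the same shape as the output of ents[0].generic() in this notebook.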

In [5]:
ents[0].generic()


Out[5]:
{'_features': [],
 '_raws': {},
 u'address': u'\n\t\t\t1620 BroadwaySeattle, WA 98122\n\t\t',
 u'category_str_list': [u'Ice Cream & Frozen Yogurt'],
 u'comment': u'I love all those options for yogurt flavors and toppings.',
 'createdTime': datetime.datetime(2015, 3, 21, 6, 32, 51, 571000),
 'creator': u'monk',
 u'desc': u'Yogurtland',
 u'google_geometry': [{u'address_components': [{u'long_name': u'1620',
     u'short_name': u'1620',
     u'types': [u'street_number']},
    {u'long_name': u'Broadway',
     u'short_name': u'Broadway',
     u'types': [u'route']},
    {u'long_name': u'Capitol Hill',
     u'short_name': u'Capitol Hill',
     u'types': [u'neighborhood', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'locality', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'administrative_area_level_3', u'political']},
    {u'long_name': u'King',
     u'short_name': u'King',
     u'types': [u'administrative_area_level_2', u'political']},
    {u'long_name': u'Washington',
     u'short_name': u'WA',
     u'types': [u'administrative_area_level_1', u'political']},
    {u'long_name': u'United States',
     u'short_name': u'US',
     u'types': [u'country', u'political']},
    {u'long_name': u'98122',
     u'short_name': u'98122',
     u'types': [u'postal_code']}],
   u'formatted_address': u'1620 Broadway, Seattle, WA 98122, USA',
   u'geometry': {u'location': {u'lat': 47.6162747, u'lng': -122.3205492},
    u'location_type': u'ROOFTOP',
    u'viewport': {u'northeast': {u'lat': 47.6176236802915,
      u'lng': -122.3192002197085},
     u'southwest': {u'lat': 47.6149257197085, u'lng': -122.3218981802915}}},
   u'types': [u'street_address']}],
 'lastModified': datetime.datetime(2015, 3, 21, 6, 34, 41, 806844),
 u'link': u'/biz/yogurtland-seattle',
 'monkType': 'Entity',
 'name': u'',
 u'price_range': 1,
 u'rating_string': u'4.5',
 u'review_count': u'275',
 u'title': u'Yogurtland'}

MONK Types

Turtle

A Turtle is a collection of base models, i.e., Pandas.


In [6]:
ms.find_type('Turtle')


Out[6]:
['UniGramTurtle',
 'StemTurtle',
 'POSTurtle',
 'SPNTurtle',
 'DictionaryTurtle',
 'Turtle',
 'RankingTurtle',
 'SingleTurtle',
 'MultiLabelTurtle']

Panda

Panda is the base model in MONK. Currently, it is a linear SVC.
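
As a rough illustration of what a linear SVC does, the sketch below trains a linear classifier on sparse features with a hinge-loss sub-gradient step. The function names, learning rate, penalty value and toy data are all assumptions made for this example; MONK's actual training procedure may differ.

```python
def score(weights, features):
    # Linear decision value w . x over sparse {name: value} features.
    return sum(weights.get(k, 0.0) * v for k, v in features.items())


def hinge_step(weights, features, label, rate=0.1, penalty=0.01):
    # One sub-gradient step on penalty/2 * |w|^2 + max(0, 1 - y * w.x).
    margin = label * score(weights, features)
    # The L2 term shrinks every weight a little on each step.
    updated = dict((k, w * (1.0 - rate * penalty)) for k, w in weights.items())
    if margin < 1.0:
        # Inside the margin: move the weights toward the correct side.
        for k, v in features.items():
            updated[k] = updated.get(k, 0.0) + rate * label * v
    return updated


# Toy data: label +1 for a food-like entity, -1 for a hotel-like one.
data = [({'yogurt': 1.0, 'flavor': 1.0}, 1),
        ({'hotel': 1.0, 'floor': 1.0}, -1)]
weights = {}
for _ in range(20):
    for features, label in data:
        weights = hinge_step(weights, features, label)
```

After training, score(weights, x) is positive for the first example and negative for the second; the sign of the score is the predicted class.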


In [7]:
ms.find_type('Panda')


Out[7]:
['RegexPanda', 'Panda', 'LinearPanda', 'ExistPanda']

Tigress

Tigress is the supervisor that trains the models (Pandas). Its major functions include:

  • Extract ground truth from data
  • Create training examples, i.e., (input, output, weight), for each panda
  • Measure the accuracy of the turtle's predictions
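
The three responsibilities listed above can be sketched in a few lines. Everything here is hypothetical: the helper names, the 'tags' and 'features' keys, and the unit weight are assumptions for illustration, not MONK's API.

```python
def ground_truth(entity, target_field):
    # Assumption: the labels live in a list-valued field of the entity.
    return set(entity.get(target_field, []))


def training_examples(entities, panda_names, target_field):
    # For each panda (one per label), emit (input, output, weight) triples:
    # output is +1 when the entity carries the label and -1 otherwise.
    examples = {}
    for name in panda_names:
        examples[name] = []
        for ent in entities:
            y = 1 if name in ground_truth(ent, target_field) else -1
            examples[name].append((ent['features'], y, 1.0))
    return examples


def accuracy(predicted, entities, target_field):
    # Fraction of entities whose predicted label set matches the truth exactly.
    hits = sum(1 for p, ent in zip(predicted, entities)
               if set(p) == ground_truth(ent, target_field))
    return float(hits) / len(entities)


entities = [
    {'features': {'yogurt': 1}, 'tags': ['Food']},
    {'features': {'hotel': 1}, 'tags': ['Hotels']},
]
examples = training_examples(entities, ['Food', 'Hotels'], 'tags')
result = accuracy([['Food'], ['Food']], entities, 'tags')
```

Here each panda receives one triple per entity, and the exact-match accuracy is only one possible measure; a per-label measure would be an equally reasonable choice.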

In [8]:
ms.find_type('Tigress')


Out[8]:
['Tigress',
 'MultiLabelTigress',
 'SPNTigress',
 'PatternTigress',
 'SelfTigress',
 'LexiconTigress',
 'CoTigress']

Create a Unigram Turtle

A unigram turtle extracts word-frequency features from the specified fields.
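
The idea of word-frequency extraction over selected fields can be sketched as follows. This is a minimal illustration, not the UniGramTurtle implementation: the tokenizer is a simple regular expression and no stop-word filtering is applied, both of which are assumptions.

```python
from collections import Counter
import re


def unigram_features(entity, fields):
    # Count lowercase word tokens across the requested fields only.
    counts = Counter()
    for field in fields:
        text = entity.get(field, '')
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return dict(counts)


entity = {'title': 'Yogurtland',
          'comment': 'I love all those options for yogurt flavors and toppings.',
          'link': '/biz/yogurtland-seattle'}
features = unigram_features(entity, ['title', 'comment'])
```

Fields that are not listed (here, link) contribute nothing to the counts.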


In [4]:
unigramTS = ms.yaml2json('turtle_scripts/turtle_unigram.yml')

In [5]:
unigramTS


Out[5]:
{'description': 'extract unigram features',
 'monkType': 'UniGramTurtle',
 'name': 'travel_unigram'}

In [7]:
unigramT = ms.create_turtle(unigramTS)

Create a Stemming Turtle

A stemming turtle extracts word-frequency features from the specified fields, but stems each word before counting.
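
The effect of stemming on the counts can be seen in the small sketch below. toy_stem is a deliberately crude suffix stripper invented for this example; it is not the stemmer MONK uses and it will produce different stems from a real algorithm such as Porter's.

```python
from collections import Counter
import re

# Crude suffix list; a stand-in for a real stemmer, chosen only for illustration.
SUFFIXES = ('ings', 'ing', 'es', 's')


def toy_stem(word):
    # Strip the first matching suffix, keeping at least three characters.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def stem_features(entity, fields):
    # Tokenize the requested fields, stem each token, then count the stems.
    counts = Counter()
    for field in fields:
        tokens = re.findall(r"[a-z0-9']+", entity.get(field, '').lower())
        counts.update(toy_stem(t) for t in tokens)
    return dict(counts)


features = stem_features({'comment': 'flavor flavors options'}, ['comment'])
```

Because "flavor" and "flavors" share a stem, they are counted as one feature, which is the point of stemming before counting.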


In [8]:
stemTS = ms.yaml2json('turtle_scripts/turtle_stem.yml')

In [9]:
stemT = ms.create_turtle(stemTS)

Feature Extraction


In [26]:
stemT = ms.load_turtle('travel_stem','monk')
stemT.generic()


Out[26]:
{'createdTime': datetime.datetime(2015, 3, 21, 7, 23, 10, 22000),
 'creator': u'monk',
 'description': u'extract stemmed features',
 'entityCollectionName': u'',
 'followers': [],
 'lastModified': datetime.datetime(2015, 3, 27, 23, 32, 29, 207966),
 'leader': None,
 'mapping': {},
 'monkType': u'StemTurtle',
 'name': u'travel_stem',
 'pEPS': 1e-06,
 'pMaxInferenceSteps': 1000,
 'pMaxPathLength': 1,
 'pMergeClock': 0,
 'pPartialBarrier': 50,
 'pPenalty': 1.0,
 'pTrainClock': 0,
 'pandas': [{'creator': 'monk', 'name': u'yogurtland'},
  {'creator': 'monk', 'name': u'yogurt'},
  {'creator': 'monk', 'name': u'flavor'},
  {'creator': 'monk', 'name': u'top'},
  {'creator': 'monk', 'name': u'love'},
  {'creator': 'monk', 'name': u'option'},
  {'creator': 'monk', 'name': u'list'},
  {'creator': 'monk', 'name': u'black'},
  {'creator': 'monk', 'name': u'cream'},
  {'creator': 'monk', 'name': u'gnocchi'},
  {'creator': 'monk', 'name': u'truffl'},
  {'creator': 'monk', 'name': u'lock'},
  {'creator': 'monk', 'name': u'chittenden'},
  {'creator': 'monk', 'name': u'hiram'},
  {'creator': 'monk', 'name': u'sure'},
  {'creator': 'monk', 'name': u'fish'},
  {'creator': 'monk', 'name': u'ladder'},
  {'creator': 'monk', 'name': u'salmon'},
  {'creator': 'monk', 'name': u'check'},
  {'creator': 'monk', 'name': u'make'},
  {'creator': 'monk', 'name': u'laura'},
  {'creator': 'monk', 'name': u'design'},
  {'creator': 'monk', 'name': u'bee'},
  {'creator': 'monk', 'name': u'purs'},
  {'creator': 'monk', 'name': u'jewelri'},
  {'creator': 'monk', 'name': u'great'},
  {'creator': 'monk', 'name': u'assort'},
  {'creator': 'monk', 'name': u'uniqu'},
  {'creator': 'monk', 'name': u'hair'},
  {'creator': 'monk', 'name': u'radar'},
  {'creator': 'monk', 'name': u'record'},
  {'creator': 'monk', 'name': u'&'},
  {'creator': 'monk', 'name': u'cut'},
  {'creator': 'monk', 'name': u','},
  {'creator': 'monk', 'name': u'martin'},
  {'creator': 'monk', 'name': u'need'},
  {'creator': 'monk', 'name': u'guy'},
  {'creator': 'monk', 'name': u'notch'},
  {'creator': 'monk', 'name': u'hotel'},
  {'creator': 'monk', 'name': u'1000'},
  {'creator': 'monk', 'name': u'floor'},
  {'creator': 'monk', 'name': u'ceil'},
  {'creator': 'monk', 'name': u'bay'},
  {'creator': 'monk', 'name': u'window'},
  {'creator': 'monk', 'name': u'stadium'},
  {'creator': 'monk', 'name': u'face'},
  {'creator': 'monk', 'name': u'gyro'},
  {'creator': 'monk', 'name': u'tri'},
  {'creator': 'monk', 'name': u'good'},
  {'creator': 'monk', 'name': u'anyon'},
  {'creator': 'monk', 'name': u'must'},
  {'creator': 'monk', 'name': u'glass'},
  {'creator': 'monk', 'name': u'chihuli'},
  {'creator': 'monk', 'name': u'garden'},
  {'creator': 'monk', 'name': u'right'},
  {'creator': 'monk', 'name': u"'s"},
  {'creator': 'monk', 'name': u'area'},
  {'creator': 'monk', 'name': u'museum'},
  {'creator': 'monk', 'name': u'space'},
  {'creator': 'monk', 'name': u'near'},
  {'creator': 'monk', 'name': u'emp'},
  {'creator': 'monk', 'name': u'needl'},
  {'creator': 'monk', 'name': u'cure'},
  {'creator': 'monk', 'name': u'meat'},
  {'creator': 'monk', 'name': u'salumi'},
  {'creator': 'monk', 'name': u'artisan'},
  {'creator': 'monk', 'name': u'enjoy'},
  {'creator': 'monk', 'name': u'porchetta'},
  {'creator': 'monk', 'name': u'fave'},
  {'creator': 'monk', 'name': u'hand'},
  {'creator': 'monk', 'name': u'theo'},
  {'creator': 'monk', 'name': u'chocol'},
  {'creator': 'monk', 'name': u'thought'},
  {'creator': 'monk', 'name': u'inform'},
  {'creator': 'monk', 'name': u'tour'},
  {'creator': 'monk', 'name': u'interest'},
  {'creator': 'monk', 'name': u'babeland'},
  {'creator': 'monk', 'name': u'help'},
  {'creator': 'monk', 'name': u'excel'},
  {'creator': 'monk', 'name': u'friendli'},
  {'creator': 'monk', 'name': u'posit'},
  {'creator': 'monk', 'name': u'sex'},
  {'creator': 'monk', 'name': u'select'},
  {'creator': 'monk', 'name': u'staff'},
  {'creator': 'monk', 'name': u'pike'},
  {'creator': 'monk', 'name': u'chowder'},
  {'creator': 'monk', 'name': u'place'},
  {'creator': 'monk', 'name': u'quit'},
  {'creator': 'monk', 'name': u'style'},
  {'creator': 'monk', 'name': u'clam'},
  {'creator': 'monk', 'name': u'england'},
  {'creator': 'monk', 'name': u"'ve"},
  {'creator': 'monk', 'name': u'easili'},
  {'creator': 'monk', 'name': u'new'},
  {'creator': 'monk', 'name': u'best'},
  {'creator': 'monk', 'name': u'doughnut'},
  {'creator': 'monk', 'name': u'compani'},
  {'creator': 'monk', 'name': u'daili'},
  {'creator': 'monk', 'name': u'dozen'},
  {'creator': 'monk', 'name': u'jose'},
  {'creator': 'monk', 'name': u'san'},
  {'creator': 'monk', 'name': u'wish'},
  {'creator': 'monk', 'name': u'powder'},
  {'creator': 'monk', 'name': u'wonder'},
  {'creator': 'monk', 'name': u'donut'},
  {'creator': 'monk', 'name': u'mighty-o'},
  {'creator': 'monk', 'name': u'vanilla'},
  {'creator': 'monk', 'name': u'raspberri'},
  {'creator': 'monk', 'name': u'glaze'},
  {'creator': 'monk', 'name': u'cake'},
  {'creator': 'monk', 'name': u'tea'},
  {'creator': 'monk', 'name': u'remedi'},
  {'creator': 'monk', 'name': u'organ'},
  {'creator': 'monk', 'name': u'el'},
  {'creator': 'monk', 'name': u'asadero'},
  {'creator': 'monk', 'name': u'taco'},
  {'creator': 'monk', 'name': u'bu'},
  {'creator': 'monk', 'name': u'abso-effing-lut'},
  {'creator': 'monk', 'name': u'bottlehous'},
  {'creator': 'monk', 'name': u'henri'},
  {'creator': 'monk', 'name': u'also'},
  {'creator': 'monk', 'name': u'owner'},
  {'creator': 'monk', 'name': u'meet'},
  {'creator': 'monk', 'name': u'pleasur'},
  {'creator': 'monk', 'name': u'art'},
  {'creator': 'monk', 'name': u'tabl'},
  {'creator': 'monk', 'name': u'us'},
  {'creator': 'monk', 'name': u'chef'},
  {'creator': 'monk', 'name': u'chat'},
  {'creator': 'monk', 'name': u'dustin'},
  {'creator': 'monk', 'name': u'chanc'},
  {'creator': 'monk', 'name': u'gave'},
  {'creator': 'monk', 'name': u'espresso'},
  {'creator': 'monk', 'name': u'monorail'},
  {'creator': 'monk', 'name': u'even'},
  {'creator': 'monk', 'name': u'latt'},
  {'creator': 'monk', 'name': u'sip'},
  {'creator': 'monk', 'name': u'nog'},
  {'creator': 'monk', 'name': u'egg'},
  {'creator': 'monk', 'name': u'amaz'},
  {'creator': 'monk', 'name': u'elliott'},
  {'creator': 'monk', 'name': u'book'},
  {'creator': 'monk', 'name': u'town'},
  {'creator': 'monk', 'name': u'simpli'},
  {'creator': 'monk', 'name': u'classic'},
  {'creator': 'monk', 'name': u'seattl'},
  {'creator': 'monk', 'name': u'bookstor'},
  {'creator': 'monk', 'name': u"'n"},
  {'creator': 'monk', 'name': u'jill'},
  {'creator': 'monk', 'name': u'super'},
  {'creator': 'monk', 'name': u'jock'},
  {'creator': 'monk', 'name': u'run'},
  {'creator': 'monk', 'name': u'back'},
  {'creator': 'monk', 'name': u'next'},
  {'creator': 'monk', 'name': u'shoe'},
  {'creator': 'monk', 'name': u'pair'},
  {'creator': 'monk', 'name': u"'ll"},
  {'creator': 'monk', 'name': u'come'},
  {'creator': 'monk', 'name': u'shop'},
  {'creator': 'monk', 'name': u'crumpet'},
  {'creator': 'monk', 'name': u'littl'},
  {'creator': 'monk', 'name': u'market'},
  {'creator': 'monk', 'name': u'outsid'},
  {'creator': 'monk', 'name': u'gem'},
  {'creator': 'monk', 'name': u'uwajimaya'},
  {'creator': 'monk', 'name': u'court'},
  {'creator': 'monk', 'name': u'ad'},
  {'creator': 'monk', 'name': u'incred'},
  {'creator': 'monk', 'name': u'food'},
  {'creator': 'monk', 'name': u'bonu'},
  {'creator': 'monk', 'name': u'delici'},
  {'creator': 'monk', 'name': u'gelato'},
  {'creator': 'monk', 'name': u"d'ambrosio"},
  {'creator': 'monk', 'name': u'lavend'},
  {'creator': 'monk', 'name': u'pistachio'},
  {'creator': 'monk', 'name': u'friend'},
  {'creator': 'monk', 'name': u'panier'},
  {'creator': 'monk', 'name': u'bakeri'},
  {'creator': 'monk', 'name': u'le'},
  {'creator': 'monk', 'name': u'french'},
  {'creator': 'monk', 'name': u'fruit'},
  {'creator': 'monk', 'name': u'equal'},
  {'creator': 'monk', 'name': u'mini-pastri'},
  {'creator': 'monk', 'name': u'tart'},
  {'creator': 'monk', 'name': u'handmad'},
  {'creator': 'monk', 'name': u'beecher'},
  {'creator': 'monk', 'name': u'chees'},
  {'creator': 'monk', 'name': u'ca'},
  {'creator': 'monk', 'name': u"n't"},
  {'creator': 'monk', 'name': u'mac'},
  {'creator': 'monk', 'name': u'wait'},
  {'creator': 'monk', 'name': u'mcphee'},
  {'creator': 'monk', 'name': u'archi'},
  {'creator': 'monk', 'name': u'set'},
  {'creator': 'monk', 'name': u'figur'},
  {'creator': 'monk', 'name': u'ladi'},
  {'creator': 'monk', 'name': u'cat'},
  {'creator': 'monk', 'name': u'crazi'},
  {'creator': 'monk', 'name': u'air'},
  {'creator': 'monk', 'name': u'action'},
  {'creator': 'monk', 'name': u'freshen'},
  {'creator': 'monk', 'name': u'corndog'},
  {'creator': 'monk', 'name': u'sun'},
  {'creator': 'monk', 'name': u'liquor'},
  {'creator': 'monk', 'name': u'old'},
  {'creator': 'monk', 'name': u'timey'},
  {'creator': 'monk', 'name': u'perform'},
  {'creator': 'monk', 'name': u'juicer'},
  {'creator': 'monk', 'name': u'squeez'},
  {'creator': 'monk', 'name': u'fresh'},
  {'creator': 'monk', 'name': u'juic'},
  {'creator': 'monk', 'name': u'flower'},
  {'creator': 'monk', 'name': u'mention'},
  {'creator': 'monk', 'name': u'neat'},
  {'creator': 'monk', 'name': u'stand'},
  {'creator': 'monk', 'name': u'bucket'},
  {'creator': 'monk', 'name': u'use'},
  {'creator': 'monk', 'name': u'everi'},
  {'creator': 'monk', 'name': u'michael'},
  {'creator': 'monk', 'name': u'job'},
  {'creator': 'monk', 'name': u'done'},
  {'creator': 'monk', 'name': u'time'},
  {'creator': 'monk', 'name': u'video'},
  {'creator': 'monk', 'name': u'scarecrow'},
  {'creator': 'monk', 'name': u'like'},
  {'creator': 'monk', 'name': u'movi'},
  {'creator': 'monk', 'name': u'well'},
  {'creator': 'monk', 'name': u'visit'},
  {'creator': 'monk', 'name': u'obscur'},
  {'creator': 'monk', 'name': u'worth'},
  {'creator': 'monk', 'name': u'brew'},
  {'creator': 'monk', 'name': u'georgetown'},
  {'creator': 'monk', 'name': u'manni'},
  {'creator': 'monk', 'name': u'9llb'},
  {'creator': 'monk', 'name': u'realli'},
  {'creator': 'monk', 'name': u'beer'},
  {'creator': 'monk', 'name': u'porter'},
  {'creator': 'monk', 'name': u'tat'},
  {'creator': 'monk', 'name': u'delicatessen'},
  {'creator': 'monk', 'name': u'pastrami'},
  {'creator': 'monk', 'name': u'show'},
  {'creator': 'monk', 'name': u'flight'},
  {'creator': 'monk', 'name': u'one'},
  {'creator': 'monk', 'name': u'concord'},
  {'creator': 'monk', 'name': u'forc'},
  {'creator': 'monk', 'name': u'get'},
  {'creator': 'monk', 'name': u'walk'},
  {'creator': 'monk', 'name': u'paseo'},
  {'creator': 'monk', 'name': u'cuban'},
  {'creator': 'monk', 'name': u'sandwich'},
  {'creator': 'monk', 'name': u'stop'},
  {'creator': 'monk', 'name': u'midnight'},
  {'creator': 'monk', 'name': u'press'},
  {'creator': 'monk', 'name': u'think'},
  {'creator': 'monk', 'name': u'junction'},
  {'creator': 'monk', 'name': u'morgan'},
  {'creator': 'monk', 'name': u'nouveau'},
  {'creator': 'monk', 'name': u'twice'},
  {'creator': 'monk', 'name': u'bake'},
  {'creator': 'monk', 'name': u'almond'},
  {'creator': 'monk', 'name': u'croissant'},
  {'creator': 'monk', 'name': u'golden'},
  {'creator': 'monk', 'name': u'summer'},
  {'creator': 'monk', 'name': u'spot'},
  {'creator': 'monk', 'name': u'go'},
  {'creator': 'monk', 'name': u'bar'},
  {'creator': 'monk', 'name': u'vivac'},
  {'creator': 'monk', 'name': u'sidewalk'},
  {'creator': 'monk', 'name': u'velvet'},
  {'creator': 'monk', 'name': u'mocha'},
  {'creator': 'monk', 'name': u'white'},
  {'creator': 'monk', 'name': u'ballard'},
  {'creator': 'monk', 'name': u'sunday'},
  {'creator': 'monk', 'name': u'farmer'},
  {'creator': 'monk', 'name': u'year'},
  {'creator': 'monk', 'name': u'round'},
  {'creator': 'monk', 'name': u'piroshki'},
  {'creator': 'monk', 'name': u'beef'},
  {'creator': 'monk', 'name': u'slave'},
  {'creator': 'monk', 'name': u'produc'},
  {'creator': 'monk', 'name': u'macpherson'},
  {'creator': 'monk', 'name': u'easi'},
  {'creator': 'monk', 'name': u'price'},
  {'creator': 'monk', 'name': u'park'},
  {'creator': 'monk', 'name': u'much'},
  {'creator': 'monk', 'name': u'local'},
  {'creator': 'monk', 'name': u'kingdom'},
  {'creator': 'monk', 'name': u'card'},
  {'creator': 'monk', 'name': u'mox'},
  {'creator': 'monk', 'name': u'everyth'},
  {'creator': 'monk', 'name': u'cafe'},
  {'creator': 'monk', 'name': u'haven'},
  {'creator': 'monk', 'name': u'vegan'},
  {'creator': 'monk', 'name': u'@'},
  {'creator': 'monk', 'name': u'still'},
  {'creator': 'monk', 'name': u'%'},
  {'creator': 'monk', 'name': u'*-n'},
  {'creator': 'monk', 'name': u'awesom'},
  {'creator': 'monk', 'name': u'total'},
  {'creator': 'monk', 'name': u'sidecar'},
  {'creator': 'monk', 'name': u'carpent'},
  {'creator': 'monk', 'name': u'walru'},
  {'creator': 'monk', 'name': u'return'},
  {'creator': 'monk', 'name': u'oyster'},
  {'creator': 'monk', 'name': u'grill'},
  {'creator': 'monk', 'name': u'metropolitan'},
  {'creator': 'monk', 'name': u'wagyu'},
  {'creator': 'monk', 'name': u'pricey'},
  {'creator': 'monk', 'name': u'rare'},
  {'creator': 'monk', 'name': u'treat'},
  {'creator': 'monk', 'name': u'honor'},
  {'creator': 'monk', 'name': u'amann'},
  {'creator': 'monk', 'name': u'definit'},
  {'creator': 'monk', 'name': u'crave'},
  {'creator': 'monk', 'name': u'kouign'},
  {'creator': 'monk', 'name': u'suppli'},
  {'creator': 'monk', 'name': u'craftsman'},
  {'creator': 'monk', 'name': u'artist'},
  {'creator': 'monk', 'name': u'canvas'},
  {'creator': 'monk', 'name': u'lot'},
  {'creator': 'monk', 'name': u'size'},
  {'creator': 'monk', 'name': u'shiro'},
  {'creator': 'monk', 'name': u'omakas'},
  {'creator': 'monk', 'name': u'24'},
  {'creator': 'monk', 'name': u'alley'},
  {'creator': 'monk', 'name': u'finish'},
  {'creator': 'monk', 'name': u'smooth'},
  {'creator': 'monk', 'name': u'start'},
  {'creator': 'monk', 'name': u'calozzi'},
  {'creator': 'monk', 'name': u'onion'},
  {'creator': 'monk', 'name': u'w/'},
  {'creator': 'monk', 'name': u'whiz'},
  {'creator': 'monk', 'name': u'tini'},
  {'creator': 'monk', 'name': u'confect'},
  {'creator': 'monk', 'name': u'creami'},
  {'creator': 'monk', 'name': u'cheesecak'},
  {'creator': 'monk', 'name': u'sooo'},
  {'creator': 'monk', 'name': u'heavi'},
  {'creator': 'monk', 'name': u'rich'},
  {'creator': 'monk', 'name': u'yet'},
  {'creator': 'monk', 'name': u'marketspic'},
  {'creator': 'monk', 'name': u'die'},
  {'creator': 'monk', 'name': u'orang'},
  {'creator': 'monk', 'name': u'other'},
  {'creator': 'monk', 'name': u'cinnamon'},
  {'creator': 'monk', 'name': u'blend'},
  {'creator': 'monk', 'name': u'map'},
  {'creator': 'monk', 'name': u'metsker'},
  {'creator': 'monk', 'name': u'everywher'},
  {'creator': 'monk', 'name': u'anoth'},
  {'creator': 'monk', 'name': u'globe'},
  {'creator': 'monk', 'name': u'turn'},
  {'creator': 'monk', 'name': u'qualiti'},
  {'creator': 'monk', 'name': u'bob'},
  {'creator': 'monk', 'name': u'sausag'},
  {'creator': 'monk', 'name': u'rabbit'},
  {'creator': 'monk', 'name': u'duck'},
  {'creator': 'monk', 'name': u'saw'},
  {'creator': 'monk', 'name': u'goat'},
  {'creator': 'monk', 'name': u'nishino'},
  {'creator': 'monk', 'name': u'alway'},
  {'creator': 'monk', 'name': u'spinass'},
  {'creator': 'monk', 'name': u'cascina'},
  {'creator': 'monk', 'name': u'butter'},
  {'creator': 'monk', 'name': u'tajarin'},
  {'creator': 'monk', 'name': u'sage'},
  {'creator': 'monk', 'name': u'-'},
  {'creator': 'monk', 'name': u'sauc'},
  {'creator': 'monk', 'name': u'sutra'},
  {'creator': 'monk', 'name': u'menu'},
  {'creator': 'monk', 'name': u'pare'},
  {'creator': 'monk', 'name': u'non-alcohol'},
  {'creator': 'monk', 'name': u'work'},
  {'creator': 'monk', 'name': u'coffe'},
  {'creator': 'monk', 'name': u'slow'},
  {'creator': 'monk', 'name': u'citi'},
  {'creator': 'monk', 'name': u'fairli'},
  {'creator': 'monk', 'name': u'experi'},
  {'creator': 'monk', 'name': u'vine'},
  {'creator': 'monk', 'name': u'harvest'},
  {'creator': 'monk', 'name': u'intim'},
  {'creator': 'monk', 'name': u'seat'},
  {'creator': 'monk', 'name': u'downstair'},
  {'creator': 'monk', 'name': u'romant'},
  {'creator': 'monk', 'name': u'corvo'},
  {'creator': 'monk', 'name': u'pasta'},
  {'creator': 'monk', 'name': u'il'},
  {'creator': 'monk', 'name': u'slight'},
  {'creator': 'monk', 'name': u'prepar'},
  {'creator': 'monk', 'name': u'perfectli'},
  {'creator': 'monk', 'name': u'al'},
  {'creator': 'monk', 'name': u'dent'},
  {'creator': 'monk', 'name': u'troll'},
  {'creator': 'monk', 'name': u'fremont'},
  {'creator': 'monk', 'name': u'live'},
  {'creator': 'monk', 'name': u'must-se'},
  {'creator': 'monk', 'name': u'tilt'},
  {'creator': 'monk', 'name': u'full'},
  {'creator': 'monk', 'name': u'ice'},
  {'creator': 'monk', 'name': u'made'},
  {'creator': 'monk', 'name': u'+'},
  {'creator': 'monk', 'name': u'game'},
  {'creator': 'monk', 'name': u'arcad'},
  {'creator': 'monk', 'name': u'home'},
  {'creator': 'monk', 'name': u'vintag'},
  {'creator': 'monk', 'name': u'blue'},
  {'creator': 'monk', 'name': u'highway'},
  {'creator': 'monk', 'name': u'personnel'},
  {'creator': 'monk', 'name': u'favorit'},
  {'creator': 'monk', 'name': u'board'},
  {'creator': 'monk', 'name': u'store'},
  {'creator': 'monk', 'name': u'gaucho'},
  {'creator': 'monk', 'name': u'foster'},
  {'creator': 'monk', 'name': u'banana'},
  {'creator': 'monk', 'name': u'hyatt'},
  {'creator': 'monk', 'name': u'grand'},
  {'creator': 'monk', 'name': u'north'},
  {'creator': 'monk', 'name': u'directli'},
  {'creator': 'monk', 'name': u'six'},
  {'creator': 'monk', 'name': u'block'},
  {'creator': 'monk', 'name': u'lick'},
  {'creator': 'monk', 'name': u'pure'},
  {'creator': 'monk', 'name': u'let'},
  {'creator': 'monk', 'name': u'test'},
  {'creator': 'monk', 'name': u'deli'},
  {'creator': 'monk', 'name': u'saigon'},
  {'creator': 'monk', 'name': u'pork'},
  {'creator': 'monk', 'name': u'mi'},
  {'creator': 'monk', 'name': u'ga'},
  {'creator': 'monk', 'name': u'chicken'},
  {'creator': 'monk', 'name': u'heo'},
  {'creator': 'monk', 'name': u'banh'},
  {'creator': 'monk', 'name': u'moor'},
  {'creator': 'monk', 'name': u'mexican'},
  {'creator': 'monk', 'name': u'chuki'},
  {'creator': 'monk', 'name': u'solid'},
  {'creator': 'monk', 'name': u'asada'},
  {'creator': 'monk', 'name': u'winner'},
  {'creator': 'monk', 'name': u'carn'},
  {'creator': 'monk', 'name': u'pollo'},
  {'creator': 'monk', 'name': u'bistro'},
  {'creator': 'monk', 'name': u'mediterranean'},
  {'creator': 'monk', 'name': u'petra'},
  {'creator': 'monk', 'name': u'choic'},
  {'creator': 'monk', 'name': u'hummu'},
  {'creator': 'monk', 'name': u'appet'},
  {'creator': 'monk', 'name': u'meal'},
  {'creator': 'monk', 'name': u'seriou'},
  {'creator': 'monk', 'name': u'pie'},
  {'creator': 'monk', 'name': u'mushroom'},
  {'creator': 'monk', 'name': u'chanterel'},
  {'creator': 'monk', 'name': u'roast'},
  {'creator': 'monk', 'name': u':'},
  {'creator': 'monk', 'name': u'leaf'},
  {'creator': 'monk', 'name': u'green'},
  {'creator': 'monk', 'name': u'restaur'},
  {'creator': 'monk', 'name': u'vietnames'},
  {'creator': 'monk', 'name': u'forget'},
  {'creator': 'monk', 'name': u'oh'},
  {'creator': 'monk', 'name': u'spring'},
  {'creator': 'monk', 'name': u'shrimp'},
  {'creator': 'monk', 'name': u'roll'},
  {'creator': 'monk', 'name': u'dont'},
  {'creator': 'monk', 'name': u'calf'},
  {'creator': 'monk', 'name': u'kid'},
  {'creator': 'monk', 'name': u'picnic'},
  {'creator': 'monk', 'name': u'sake'},
  {'creator': 'monk', 'name': u'umi'},
  {'creator': 'monk', 'name': u'hous'},
  {'creator': 'monk', 'name': u'boy'},
  {'creator': 'monk', 'name': u'casanova'},
  {'creator': 'monk', 'name': u'bad'},
  {'creator': 'monk', 'name': u'recommend'},
  {'creator': 'monk', 'name': u'highli'},
  {'creator': 'monk', 'name': u'shorti'},
  {'creator': 'monk', 'name': u'pinbal'},
  {'creator': 'monk', 'name': u'hall'},
  {'creator': 'monk', 'name': u'famou'},
  {'creator': 'monk', 'name': u'uli'},
  {'creator': 'monk', 'name': u'huge'},
  {'creator': 'monk', 'name': u'frye'},
  {'creator': 'monk', 'name': u'exhibit'},
  {'creator': 'monk', 'name': u'school'},
  {'creator': 'monk', 'name': u'genr'},
  {'creator': 'monk', 'name': u'paint'},
  {'creator': 'monk', 'name': u'varieti'},
  {'creator': 'monk', 'name': u'cool'},
  {'creator': 'monk', 'name': u'faint'},
  {'creator': 'monk', 'name': u'salt'},
  {'creator': 'monk', 'name': u'caramel'},
  {'creator': 'monk', 'name': u'nutella'},
  {'creator': 'monk', 'name': u'rachel'},
  {'creator': 'monk', 'name': u'pig'},
  {'creator': 'monk', 'name': u'piggi'},
  {'creator': 'monk', 'name': u'coolest'},
  {'creator': 'monk', 'name': u'world'},
  {'creator': 'monk', 'name': u'bank'},
  {'creator': 'monk', 'name': u'altay'},
  {'creator': 'monk', 'name': u'ethiopian'},
  {'creator': 'monk', 'name': u'order'},
  {'creator': 'monk', 'name': u'injera'},
  {'creator': 'monk', 'name': u'day'},
  {'creator': 'monk', 'name': u'altura'},
  {'creator': 'monk', 'name': u'tast'},
  {'creator': 'monk', 'name': u'flexibl'},
  {'creator': 'monk', 'name': u'fact'},
  {'creator': 'monk', 'name': u'falafel'},
  {'creator': 'monk', 'name': u'corner'},
  {'creator': 'monk', 'name': u'aladdin'},
  {'creator': 'monk', 'name': u'late'},
  {'creator': 'monk', 'name': u'open'},
  {'creator': 'monk', 'name': u'deck'},
  {'creator': 'monk', 'name': u'columbia'},
  {'creator': 'monk', 'name': u'center'},
  {'creator': 'monk', 'name': u'observ'},
  {'creator': 'monk', 'name': u'compar'},
  {'creator': 'monk', 'name': u'bean'},
  {'creator': 'monk', 'name': u'street'},
  {'creator': 'monk', 'name': u'consist'},
  {'creator': 'monk', 'name': u'drink'},
  {'creator': 'monk', 'name': u'caus'},
  {'creator': 'monk', 'name': u'quinn'},
  {'creator': 'monk', 'name': u'``'},
  {'creator': 'monk', 'name': u'crispi'},
  {'creator': 'monk', 'name': u'heard'},
  {'creator': 'monk', 'name': u'leav'},
  {'creator': 'monk', 'name': u'joe'},
  {'creator': 'monk', 'name': u'wild'},
  {'creator': 'monk', 'name': u'ever'},
  {'creator': 'monk', 'name': u'boar'},
  {'creator': 'monk', 'name': u'sloppi'},
  {'creator': 'monk', 'name': u'envi'},
  {'creator': 'monk', 'name': u'skin'},
  {'creator': 'monk', 'name': u'boutiqu'},
  {'creator': 'monk', 'name': u'alki'},
  {'creator': 'monk', 'name': u'far'},
  {'creator': 'monk', 'name': u'!'},
  {'creator': 'monk', 'name': u'noth'},
  {'creator': 'monk', 'name': u'farm'},
  {'creator': 'monk', 'name': u'taylor'},
  {'creator': 'monk', 'name': u'shellfish'},
  {'creator': 'monk', 'name': u'would'},
  {'creator': 'monk', 'name': u'level'},
  {'creator': 'monk', 'name': u'geoduck'},
  {'creator': 'monk', 'name': u'ponzu'},
  {'creator': 'monk', 'name': u'eat'},
  {'creator': 'monk', 'name': u'pink'},
  {'creator': 'monk', 'name': u'door'},
  {'creator': 'monk', 'name': u'zani'},
  {'creator': 'monk', 'name': u'enough'},
  {'creator': 'monk', 'name': u'trapez'},
  {'creator': 'monk', 'name': u'dahlia'},
  {'creator': 'monk', 'name': u'coconut'},
  {'creator': 'monk', 'name': u'palac'},
  {'creator': 'monk', 'name': u'kitchen'},
  {'creator': 'monk', 'name': u'bavarian'},
  {'creator': 'monk', 'name': u'lola'},
  {'creator': 'monk', 'name': u'eye'},
  {'creator': 'monk', 'name': u'greek'},
  {'creator': 'monk', 'name': u'tom'},
  {'creator': 'monk', 'name': u'dougla'},
  {'creator': 'monk', 'name': u'trophi'},
  {'creator': 'monk', 'name': u'cupcak'},
  {'creator': 'monk', 'name': u'parti'},
  {'creator': 'monk', 'name': u'michou'},
  {'creator': 'monk', 'name': u'perfect'},
  {'creator': 'monk', 'name': u'salad'},
  {'creator': 'monk', 'name': u'quinoa'},
  {'creator': 'monk', 'name': u'sierra'},
  {'creator': 'monk', 'name': u'chuck'},
  {'creator': 'monk', 'name': u'hop'},
  {'creator': 'monk', 'name': u'atmospher'},
  {'creator': 'monk', 'name': u'craft'},
  {'creator': 'monk', 'name': u'conveni'},
  {'creator': 'monk', 'name': u'gelatiamo'},
  {'creator': 'monk', 'name': u'got'},
  {'creator': 'monk', 'name': u'puff'},
  {'creator': 'monk', 'name': u'buona'},
  {'creator': 'monk', 'name': u'tavola'},
  {'creator': 'monk', 'name': u'la'},
  {'creator': 'monk', 'name': u'oil'},
  {'creator': 'monk', 'name': u'galor'},
  {'creator': 'monk', 'name': u'spread'},
  {'creator': 'monk', 'name': u'empir'},
  {'creator': 'monk', 'name': u'waffl'},
  {'creator': 'monk', 'name': u'ham'},
  {'creator': 'monk', 'name': u'station'},
  {'creator': 'monk', 'name': u'light'},
  {'creator': 'monk', 'name': u'locat'},
  {'creator': 'monk', 'name': u'rail'}],
 'requires': {}}

In [27]:
for ent in ents: 
    stemT.predict(ent,  fields)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-27-beb518e60d47> in <module>()
      1 for ent in ents:
----> 2     stemT.predict(ent,  fields)

NameError: name 'fields' is not defined

The call fails because fields has not been defined at this point; the field list is assigned in a later cell (fields=['title', 'comment', 'desc']).

Load some data

load_entities(query={}, skip=0, num=100, collectionName=None)

Parameters:

  • query: MongoDB style query spec, e.g., {'tag':{'$in':['Shopping','Wine']}}.
  • skip: the index at which retrieval starts.
  • num: the number of documents to retrieve.
  • collectionName: the name of the collection.
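
To show how query, skip and num interact, the sketch below emulates them over an in-memory list. It is a simplified stand-in, not MONK's or MongoDB's implementation: only plain equality and the $in operator are handled, and the sample documents are made up.

```python
def matches(doc, query):
    # Supports two MongoDB-style forms: equality and {'$in': [...]}.
    # For list-valued fields, $in matches when any element is listed.
    for key, cond in query.items():
        value = doc.get(key)
        if isinstance(cond, dict) and '$in' in cond:
            values = value if isinstance(value, list) else [value]
            if not any(v in cond['$in'] for v in values):
                return False
        elif value != cond:
            return False
    return True


def load_entities_sketch(docs, query=None, skip=0, num=100):
    # Filter first, then skip `skip` matches and return at most `num`.
    selected = [d for d in docs if matches(d, query or {})]
    return selected[skip:skip + num]


docs = [{'title': 'A', 'tag': ['Shopping']},
        {'title': 'B', 'tag': ['Food']},
        {'title': 'C', 'tag': ['Wine', 'Food']},
        {'title': 'D', 'tag': ['Shopping', 'Wine']}]
query = {'tag': {'$in': ['Shopping', 'Wine']}}
first_page = load_entities_sketch(docs, query, skip=0, num=2)
second_page = load_entities_sketch(docs, query, skip=2, num=2)
```

Advancing skip by num on each call pages through the matching documents in batches.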

In [4]:
ents = ms.load_entities()
print len(ents)
ents[0].generic()


100
Out[4]:
{'_features': [(4097, 1.0),
  (4098, 1.0),
  (4099, 1.0),
  (4100, 1.0),
  (4101, 1.0),
  (4102, 1.0)],
 '_raws': {u'flavor': 1,
  u'love': 1,
  u'option': 1,
  u'top': 1,
  u'yogurt': 1,
  u'yogurtland': 1},
 u'address': u'\n\t\t\t1620 BroadwaySeattle, WA 98122\n\t\t',
 u'category_str_list': [u'Ice Cream & Frozen Yogurt'],
 u'comment': u'I love all those options for yogurt flavors and toppings.',
 'createdTime': datetime.datetime(2015, 3, 21, 6, 32, 51, 571000),
 'creator': u'monk',
 u'desc': u'Yogurtland',
 u'google_geometry': [{u'address_components': [{u'long_name': u'1620',
     u'short_name': u'1620',
     u'types': [u'street_number']},
    {u'long_name': u'Broadway',
     u'short_name': u'Broadway',
     u'types': [u'route']},
    {u'long_name': u'Capitol Hill',
     u'short_name': u'Capitol Hill',
     u'types': [u'neighborhood', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'locality', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'administrative_area_level_3', u'political']},
    {u'long_name': u'King',
     u'short_name': u'King',
     u'types': [u'administrative_area_level_2', u'political']},
    {u'long_name': u'Washington',
     u'short_name': u'WA',
     u'types': [u'administrative_area_level_1', u'political']},
    {u'long_name': u'United States',
     u'short_name': u'US',
     u'types': [u'country', u'political']},
    {u'long_name': u'98122',
     u'short_name': u'98122',
     u'types': [u'postal_code']}],
   u'formatted_address': u'1620 Broadway, Seattle, WA 98122, USA',
   u'geometry': {u'location': {u'lat': 47.6162747, u'lng': -122.3205492},
    u'location_type': u'ROOFTOP',
    u'viewport': {u'northeast': {u'lat': 47.6176236802915,
      u'lng': -122.3192002197085},
     u'southwest': {u'lat': 47.6149257197085, u'lng': -122.3218981802915}}},
   u'types': [u'street_address']}],
 'lastModified': datetime.datetime(2015, 3, 27, 23, 26, 5, 374027),
 u'link': u'/biz/yogurtland-seattle',
 'monkType': u'Entity',
 'name': u'',
 u'price_range': 1,
 u'rating_string': u'4.5',
 u'review_count': u'275',
 u'title': u'Yogurtland'}

In [7]:
fields=['title', 'comment', 'desc']

In [12]:
stemT.save()

Create new turtle

Turtle provides a solution for a specific problem.

Parameters:

  • monkType: Turtle type. Example: MultiLabelTurtle
  • name: A unique string. Example: flydragon_tagger
  • description: A detailed project description that helps others understand the turtle. Example: Used to predict first level tags for activities stored in flydragon
  • mapping: An encoder that encodes the targets into internal pandas structures. It is not useful for a multi-label turtle, but it is useful for a multi-class turtle. Encoding strategies can be tricky because they affect the final accuracy. For example, an Error Correcting Output Code can be a good choice, while a random binary coding can be a bad one. For a Sum-Product-Network turtle, the coding is learned from data, which relieves scientists of the coding burden. Example: {'dinning':[0000001000], 'hotel':[100000000]}
  • entityCollectionName: The name of the data collection to work on, assuming the database is given in MONK's configuration. Example: activities
  • requires: Defines the turtle's dependencies, that is, which features to use. It can list uids or turtle ids. When turtle ids are used, the uids of all pandas in those turtles become features for this turtle. Example: {'requires': {'turtleIds':[<id1>, <id2>]}}
  • pMaxPathLength: Inference uses the beam search algorithm; this is the maximum length of a search path before the search stops. Example: 1
  • pMaxInferenceSteps: The maximum number of inferences before it gives up. Example: 1
  • tigress: A tigress that acts as the supervisor to train the turtle and measure its performance. Example: see below
  • pandas: A list of pandas that are employed to solve the problem. Example: see below
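
The mapping row mentions Error Correcting Output Codes. As a rough sketch of why such a coding helps, assume each bit of a code word is produced by one binary panda, and the predicted bits are decoded to the target with the nearest code word in Hamming distance. The code words below and the helpers hamming and decode are hypothetical and not part of MONK.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def decode(bits, mapping):
    """Return the target whose code word is closest to bits in Hamming distance."""
    # sorted() makes tie-breaking deterministic (alphabetical by target name).
    return min(sorted(mapping), key=lambda target: hamming(bits, mapping[target]))


# Hypothetical code words; any two differ in at least 3 positions,
# so a single wrong bit can still be decoded to the right target.
mapping = {
    'dinning': [1, 1, 1, 0, 0, 0],
    'hotel':   [0, 0, 0, 1, 1, 1],
    'boating': [1, 1, 0, 1, 1, 0],
}

clean = decode([1, 1, 1, 0, 0, 0], mapping)
one_bit_wrong = decode([1, 0, 1, 0, 0, 0], mapping)
print(clean, one_bit_wrong)
```

With a random binary coding, two code words may differ in only one position, in which case a single misclassified bit changes the decoded target.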

Tigress provides functionality to supervise a turtle and measure its performance.

Parameters:

  • monkType: Tigress type. Example: MultiLabelTigress
  • name: A string. Example: flydragon_tagger
  • description: A detailed description. Example: Equal weighted multilabel classifiers
  • costs: The cost of each tag/label being incorrectly predicted; if not defined, defaultCost is used. Example: {'dinning':1.0, 'boating':0.2}
  • defaultCost: Used when the cost of a tag is not specified. If defaultCost is not defined, it is the smallest value in costs. Example: 1.0
  • displayTextFields: The fields of the entity to display during interactive learning. Example: ['title','reviews']
  • displayImageField: If possible, display an image (url). Example: photo_url
  • activeBatchSize: The number of uncertain examples to scan through in each active learning stage; defaults to 100. Example: 100
  • pCuriosity: The active learning factor that trades off exploitation against exploration. 0.0 means no exploitation. Example: 0.0
  • patterns: For PatternTigress and its children, each target has a pattern to search for. If the pattern matches, the tag is on; otherwise the tag is off. Example: {'dinning':'dinning'}
  • fields: The fields the tigress is supposed to search through. Example: ['title', 'description', ...]
  • mutualExclusive: True if only one target exists in the fields, False otherwise. Example: False
  • defaulting: True to use the default tag when nothing is found in the fields; False to ignore the example. Example: False
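
The interaction of patterns, fields, and defaulting can be sketched as follows. This is an illustration under assumptions, not MONK's PatternTigress code: label_entity is a hypothetical helper, entities are plain dicts, the searched text is the space-joined content of the listed fields, and the "default tag" is taken to be the negative label -1.

```python
import re


def label_entity(entity, patterns, fields, defaulting=False):
    """Return {target: +1 or -1}, or None when the example should be ignored."""
    # Join the configured fields into one string to search; missing fields count as empty.
    combined = ' '.join(str(entity.get(f, '')) for f in fields)
    labels = {}
    for target, pattern in patterns.items():
        labels[target] = 1 if re.search(pattern, combined) else -1
    nothing_found = all(value == -1 for value in labels.values())
    if nothing_found and not defaulting:
        return None  # defaulting is False: skip this example
    return labels


patterns = {'likeTravel': 'Y'}
fields = ['likeTravel']

labeled = label_entity({'title': 'Yogurtland', 'likeTravel': 'Y'}, patterns, fields)
unlabeled = {'title': 'Yogurtland'}
print(labeled)
print(label_entity(unlabeled, patterns, fields, defaulting=True))
print(label_entity(unlabeled, patterns, fields, defaulting=False))
```

Only the fields listed in fields are searched, so the 'Y' in the title above does not switch the tag on.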

Panda is a basic classifier/regressor.

Parameters:

  • monkType: Panda type. Example: LinearPanda
  • name: A unique string. Example: dinning
  • mantis: A learning algorithm. Example: see below

Mantis is a basic learning algorithm.

Parameters:

  • monkType: Mantis type. Example: Mantis
  • maxNumIters: Maximum number of optimization iterations to perform. Example: 100
  • maxNumInstances: Maximum number of instances for each user. Example: 1000
  • eps: Convergence tolerance. Example: 1e-4
  • lam: Lambda, which controls the regularization strength. Example: 1
  • rho: Personalization strength; the smaller the value, the stronger the personalization. Example: 1
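
One way to read these parameters is as the knobs of an iterative solver. The sketch below is not MONK's solver (the training logs refer to a dual SVM solver); it swaps in plain subgradient descent on an assumed objective, lam/2*|w|^2 + rho/2*|w - q|^2 + mean hinge loss, where q stands for a shared model and w for the personal one. The function train_linear, the step size, and the toy data are all hypothetical.

```python
def train_linear(data, q, lam=1.0, rho=1.0, max_num_iters=100, eps=1e-4, step=0.1):
    """Subgradient descent on lam/2*|w|^2 + rho/2*|w - q|^2 + mean hinge loss.

    data is a list of (x, y) pairs with x a list of floats and y in {+1, -1}.
    """
    dim = len(q)
    w = list(q)  # start from the shared model
    for _ in range(max_num_iters):
        # Gradient of the regularizer and of the pull toward q.
        grad = [lam * w[j] + rho * (w[j] - q[j]) for j in range(dim)]
        # Subgradient of the hinge loss for examples inside the margin.
        for x, y in data:
            margin = y * sum(w[j] * x[j] for j in range(dim))
            if margin < 1.0:
                for j in range(dim):
                    grad[j] -= y * x[j] / len(data)
        new_w = [w[j] - step * grad[j] for j in range(dim)]
        change = max(abs(new_w[j] - w[j]) for j in range(dim))
        w = new_w
        if change < eps:  # converged: weights barely moved
            break
    return w


# Toy data: the user likes feature 0 and dislikes feature 1,
# while the shared model q says the opposite.
data = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
q = [-1.0, 1.0]

personal = train_linear(data, q, lam=0.1, rho=0.0)  # free to leave q
shared = train_linear(data, q, lam=0.1, rho=5.0)    # held close to q
print(personal, shared)
```

Under this assumed objective, a small rho lets the personal weights follow the user's own labels, while a large rho keeps them near the shared model, which matches the description of rho above.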

In [5]:
likeTS = ms.yaml2json('turtle_scripts/turtle_like.yml')
print likeTS


{'tigress': {'patterns': {'likeTravel': 'Y'}, 'name': 'likeTravel', 'fields': ['likeTravel'], 'monkType': 'PatternTigress', 'defaulting': True, 'description': 'binary like or dislike classifier'}, 'name': 'likeTravel', 'monkType': 'SingleTurtle', 'description': 'predicting if one likes the item or not', 'requires': {'turtles': ['travel_stem']}, 'mapping': {'likeTravel': [1]}, 'pandas': [{'monkType': 'LinearPanda', 'name': 'likeTravel', 'mantis': {'monkType': 'Mantis'}}]}

In [6]:
likeT = ms.create_turtle(likeTS)

In [7]:
likeT.save()

In [9]:
ent = ents[0]

In [32]:
ents[0].generic()


Out[32]:
{'_features': [(4097, 1.0),
  (4098, 1.0),
  (4099, 1.0),
  (4100, 1.0),
  (4101, 1.0),
  (4102, 1.0)],
 '_raws': {u'flavor': 1,
  u'love': 1,
  u'option': 1,
  u'top': 1,
  u'yogurt': 1,
  u'yogurtland': 1},
 u'address': u'\n\t\t\t1620 BroadwaySeattle, WA 98122\n\t\t',
 u'category_str_list': [u'Ice Cream & Frozen Yogurt'],
 u'comment': u'I love all those options for yogurt flavors and toppings.',
 'createdTime': datetime.datetime(2015, 3, 21, 6, 32, 51, 571000),
 'creator': u'monk',
 u'desc': u'Yogurtland',
 u'google_geometry': [{u'address_components': [{u'long_name': u'1620',
     u'short_name': u'1620',
     u'types': [u'street_number']},
    {u'long_name': u'Broadway',
     u'short_name': u'Broadway',
     u'types': [u'route']},
    {u'long_name': u'Capitol Hill',
     u'short_name': u'Capitol Hill',
     u'types': [u'neighborhood', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'locality', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'administrative_area_level_3', u'political']},
    {u'long_name': u'King',
     u'short_name': u'King',
     u'types': [u'administrative_area_level_2', u'political']},
    {u'long_name': u'Washington',
     u'short_name': u'WA',
     u'types': [u'administrative_area_level_1', u'political']},
    {u'long_name': u'United States',
     u'short_name': u'US',
     u'types': [u'country', u'political']},
    {u'long_name': u'98122',
     u'short_name': u'98122',
     u'types': [u'postal_code']}],
   u'formatted_address': u'1620 Broadway, Seattle, WA 98122, USA',
   u'geometry': {u'location': {u'lat': 47.6162747, u'lng': -122.3205492},
    u'location_type': u'ROOFTOP',
    u'viewport': {u'northeast': {u'lat': 47.6176236802915,
      u'lng': -122.3192002197085},
     u'southwest': {u'lat': 47.6149257197085, u'lng': -122.3218981802915}}},
   u'types': [u'street_address']}],
 'lastModified': datetime.datetime(2015, 3, 21, 8, 25, 55, 493265),
 u'link': u'/biz/yogurtland-seattle',
 'monkType': u'Entity',
 'name': u'',
 u'price_range': 1,
 u'rating_string': u'4.5',
 u'review_count': u'275',
 u'title': u'Yogurtland'}

Add label

Flask API: add_label(ent_id, field, value)


In [11]:
ent._setattr('likeTravel', 'Y')
ms.crane.entityStore.save_one(ent)

In [12]:
ent.generic()


Out[12]:
{'_features': [(4097, 1.0),
  (4098, 1.0),
  (4099, 1.0),
  (4100, 1.0),
  (4101, 1.0),
  (4102, 1.0)],
 '_raws': {u'flavor': 1,
  u'love': 1,
  u'option': 1,
  u'top': 1,
  u'yogurt': 1,
  u'yogurtland': 1},
 u'address': u'\n\t\t\t1620 BroadwaySeattle, WA 98122\n\t\t',
 u'category_str_list': [u'Ice Cream & Frozen Yogurt'],
 u'comment': u'I love all those options for yogurt flavors and toppings.',
 'createdTime': datetime.datetime(2015, 3, 21, 6, 32, 51, 571000),
 'creator': u'monk',
 u'desc': u'Yogurtland',
 u'google_geometry': [{u'address_components': [{u'long_name': u'1620',
     u'short_name': u'1620',
     u'types': [u'street_number']},
    {u'long_name': u'Broadway',
     u'short_name': u'Broadway',
     u'types': [u'route']},
    {u'long_name': u'Capitol Hill',
     u'short_name': u'Capitol Hill',
     u'types': [u'neighborhood', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'locality', u'political']},
    {u'long_name': u'Seattle',
     u'short_name': u'Seattle',
     u'types': [u'administrative_area_level_3', u'political']},
    {u'long_name': u'King',
     u'short_name': u'King',
     u'types': [u'administrative_area_level_2', u'political']},
    {u'long_name': u'Washington',
     u'short_name': u'WA',
     u'types': [u'administrative_area_level_1', u'political']},
    {u'long_name': u'United States',
     u'short_name': u'US',
     u'types': [u'country', u'political']},
    {u'long_name': u'98122',
     u'short_name': u'98122',
     u'types': [u'postal_code']}],
   u'formatted_address': u'1620 Broadway, Seattle, WA 98122, USA',
   u'geometry': {u'location': {u'lat': 47.6162747, u'lng': -122.3205492},
    u'location_type': u'ROOFTOP',
    u'viewport': {u'northeast': {u'lat': 47.6176236802915,
      u'lng': -122.3192002197085},
     u'southwest': {u'lat': 47.6149257197085, u'lng': -122.3218981802915}}},
   u'types': [u'street_address']}],
 'lastModified': datetime.datetime(2015, 3, 27, 23, 29, 17, 554751),
 'likeTravel': 'Y',
 u'link': u'/biz/yogurtland-seattle',
 'monkType': u'Entity',
 'name': u'',
 u'price_range': 1,
 u'rating_string': u'4.5',
 u'review_count': u'275',
 u'title': u'Yogurtland'}

In [10]:
likeT.pandas[0].mantis


Out[10]:
<monk.core.mantis.Mantis at 0x7f4c4d85dfd0>

Add data to turtle

Flask API: add_data_to_model(turtleName, creator, ent_id)

  • ent_id is a string

  • creator defaults to 'monk'


In [13]:
ms.add_data('likeTravel', 'monk', str(ent._id))


[2015-03-27 23:29:41,714][19717][monk.tigress][DEBUG   ][109 ][tigress.py] : combinedField Y
Out[13]:
True

In [14]:
likeT.tigress.p


Out[14]:
{re.compile(r'Y'): 'likeTravel'}

In [15]:
likeT.pandas[0].mantis.data


Out[15]:
{ObjectId('524c04c4e291973e1136496c'): (0, 1, 1.0)}

In [16]:
ms.add_data('likeTravel', 'monk', str(ents[1]._id))


[2015-03-27 23:29:54,326][19717][monk.tigress][DEBUG   ][109 ][tigress.py] : combinedField 
Out[16]:
True

In [17]:
likeT.tigress.defaulting=True

In [18]:
likeT.pandas[0].mantis.data


Out[18]:
{ObjectId('524c04c4e291973e1136496c'): (0, 1, 1.0),
 ObjectId('524c04c4e291973e1136496d'): (1, -1, 1.0)}

Training

Flask API:

  • train(turtleName, creator)

In [21]:
likeT = ms.load_turtle('likeTravel','monk')

In [22]:
likeT.train()


[2015-03-27 23:30:56,534][19717][monk.mantis ][DEBUG   ][102 ][mantis.py] : gamma in mantis 1
[2015-03-27 23:30:56,535][19717][monk.svm_solver_dual][DEBUG   ][138 ][mantis.py] : rho0 in svm_solver_dual.trainModel 0.25
[2015-03-27 23:30:56,535][19717][monk.svm_solver_dual][DEBUG   ][138 ][mantis.py] : num_instances 2
[2015-03-27 23:30:56,536][19717][monk.mantis ][DEBUG   ][155 ][mantis.py] : q = * * * 4097:0.1309 4098:0.1309 4099:0.1309 4100:0.1309 ...
[4103:-0.0595660060644,4097:0.13090929389]
[2015-03-27 23:30:56,536][19717][monk.mantis ][DEBUG   ][156 ][mantis.py] : w = * * * 4097:0.1929 4098:0.1929 4099:0.1929 4100:0.1929 ...
[4103:-0.188097715378,4097:0.192854225636]
[2015-03-27 23:30:56,539][19717][monk.turtle ][DEBUG   ][230 ][turtle.py] : training clock 1

In [40]:
from monk.math.cmath import sign0

Prediction

Flask API:

  • predict(turtleName, creator, ent_id)
  • creator defaults to 'monk'
  • ent_id is a string
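
For a linear panda, the returned score can be thought of as a sparse dot product between the entity's _features list of (uid, value) pairs and a weight vector keyed by uid. The sketch below illustrates that idea only: sparse_score and to_label are hypothetical helpers, the weights are made up, and thresholding at zero is an assumption rather than MONK's documented behavior.

```python
def sparse_score(features, weights):
    """Dot product of a sparse feature list with a sparse weight dict."""
    # Features with no learned weight contribute nothing.
    return sum(value * weights.get(uid, 0.0) for uid, value in features)


def to_label(score, threshold=0.0):
    """Map a real-valued score to +1 (tag on) or -1 (tag off)."""
    return 1 if score > threshold else -1


# Made-up features and weights for illustration.
features = [(4097, 1.0), (4098, 1.0), (4103, 2.0)]
weights = {4097: 0.25, 4098: 0.5, 4103: -0.125}

score = sparse_score(features, weights)
print(score, to_label(score))
```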

In [ ]:
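def predict(turtleName, creator, ent_id):
    # Load the turtle and the entity, then score the entity with the first panda
    likeT = ms.load_turtle(turtleName, creator)
    ent = ms.load_entity(ent_id)
    return likeT.pandas[0].predict(ent)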

In [25]:
likeT.pandas[0].predict(ents[0])


Out[25]:
0.874138355255127
