In [1]:
import numpy as np
import theano
import theano.tensor as T
import lasagne
import os

Generate names

  • Struggling to find a name for a variable? Imagine having to pick one for your son or daughter. No human can claim real expertise in what makes a good child's name, so let us train a neural network instead.
  • The dataset contains ~8k human names from different cultures (in Latin transcription).
  • Objective (toy problem): learn a generative model over names.

In [2]:
start_token = " "

with open("names") as f:
    names = f.read()[:-1].split('\n') #[:-1] drops the trailing newline
    names = [start_token+name for name in names]

In [3]:
print 'n samples = ',len(names)
for x in names[::1000]:
    print x

n samples =  7944

Text processing

In [4]:
#all unique characters go here
token_set = set()
for name in names:
    for letter in name:
        token_set.add(letter)

tokens = list(token_set)

print 'n_tokens = ',len(tokens)

n_tokens =  55

In [5]:
#!token_to_id = <dictionary of symbol -> its identifier (index in tokens list)>
token_to_id = {t:i for i,t in enumerate(tokens) }

#!id_to_token = < dictionary of symbol identifier -> symbol itself>
id_to_token = {i:t for i,t in enumerate(tokens)}
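
The two dictionaries above are inverses of each other, so encoding a name and decoding it again should return the original string. Below is a minimal sketch of that round trip on a toy dataset; the names `toy_names`, `toy_tokens`, `encode` and `decode` are hypothetical and not part of the notebook. The sketch sorts the character set so that the ids are reproducible between runs, which a plain `list(set(...))` does not guarantee.

```python
# Toy stand-in for the notebook's `names` list (hypothetical data).
toy_names = [" Anna", " Bob"]

# Sorting makes the symbol -> id assignment deterministic.
toy_tokens = sorted(set("".join(toy_names)))
toy_token_to_id = {t: i for i, t in enumerate(toy_tokens)}
toy_id_to_token = {i: t for i, t in enumerate(toy_tokens)}


def encode(name):
    """Map a string to a list of integer ids."""
    return [toy_token_to_id[ch] for ch in name]


def decode(ids):
    """Map a list of integer ids back to a string."""
    return "".join(toy_id_to_token[i] for i in ids)


encoded = encode(" Anna")
print(encoded)
print(decode(encoded))
```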

In [6]:
import matplotlib.pyplot as plt
%matplotlib inline

# truncate names longer than MAX_LEN characters. 
MAX_LEN = min([60,max(list(map(len,names)))])

Cast everything from symbols into identifiers

In [7]:
names_ix = list(map(lambda name: list(map(token_to_id.get,name)),names))

#crop long names and pad short ones
for i in range(len(names_ix)):
    names_ix[i] = names_ix[i][:MAX_LEN] #crop too long
    if len(names_ix[i]) < MAX_LEN:
        names_ix[i] += [token_to_id[" "]]*(MAX_LEN - len(names_ix[i])) #pad too short
assert len(set(map(len,names_ix)))==1

names_ix = np.array(names_ix)
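
The crop-and-pad loop above can also be written as a small pure function, which is easier to check in isolation. This is only a sketch on made-up id lists; `pad_or_crop` and its parameters are hypothetical names, and the pad id is assumed to be the id of the space token, as in the cell above.

```python
def pad_or_crop(ids, max_len, pad_id):
    """Return exactly max_len ids: crop long inputs, right-pad short ones."""
    cropped = ids[:max_len]
    return cropped + [pad_id] * (max_len - len(cropped))


# Three made-up rows: exact length, too short, too long.
rows = [[0, 1, 5, 5, 3], [0, 2], [0, 1, 2, 3, 4, 5, 6]]
fixed = [pad_or_crop(row, max_len=5, pad_id=0) for row in rows]

for row in fixed:
    print(row)
```

Because every row now has the same length, the list of lists can be converted into a rectangular array.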

Input variables

In [8]:
from agentnet import Recurrence
from lasagne.layers import *
from agentnet.memory import *
from agentnet.resolver import ProbabilisticResolver

In [9]:
sequence = T.matrix('token sequence','int64')

inputs = sequence[:,:-1]
targets = sequence[:,1:]

l_input_sequence = InputLayer(shape=(None, None),input_var=inputs)
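
The slices `sequence[:,:-1]` and `sequence[:,1:]` shift the same matrix by one position, so the target at tick t is simply the input at tick t+1. A minimal sketch with plain Python lists (the ids are made up, and `toy_batch`, `toy_inputs` and `toy_targets` are hypothetical names):

```python
# Two made-up rows of token ids, already padded to the same length.
toy_batch = [[0, 1, 5, 5, 3],
             [0, 2, 6, 4, 0]]

toy_inputs = [row[:-1] for row in toy_batch]   # everything except the last token
toy_targets = [row[1:] for row in toy_batch]   # everything except the first token

for inp_row, tgt_row in zip(toy_inputs, toy_targets):
    print(inp_row, "->", tgt_row)
```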

Build NN

You will build a model that takes a token sequence and, at each tick, predicts the next token.

This is essentially the RNN step described in the lecture.

In [10]:
###One step of rnn
class step:
    inp = InputLayer((None,),name='current character')
    h_prev = InputLayer((None,10),name='previous rnn state')
    #recurrent part
    emb = EmbeddingLayer(inp, len(tokens), 30,name='emb')
    h_new = RNNCell(h_prev,emb,name="rnn") #just concat -> denselayer
    next_token_probas = DenseLayer(h_new,len(tokens),nonlinearity=T.nnet.softmax)
    #pick next token from predicted probas
    next_token = ProbabilisticResolver(next_token_probas)
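
To make the step above concrete, here is a sketch of a single vanilla RNN tick in plain Python: embedding lookup, a new hidden state computed from the previous state and the embedding, then a softmax over tokens. It is not the AgentNet implementation. The tanh nonlinearity, the random initialisation, the tiny layer sizes and all names (`rnn_step`, `init_params`, `W_xh`, `W_hh`, `W_hy`) are assumptions made for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(n_tokens, emb_size, hid_size, seed=0):
    """Small random weights; the sizes here are illustrative only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"emb": mat(n_tokens, emb_size),
            "W_xh": mat(hid_size, emb_size),
            "W_hh": mat(hid_size, hid_size),
            "b_h": [0.0] * hid_size,
            "W_hy": mat(n_tokens, hid_size),
            "b_y": [0.0] * n_tokens}


def rnn_step(token_id, h_prev, params):
    """One tick: (current token, previous state) -> (new state, next-token probabilities)."""
    x = params["emb"][token_id]  # embedding lookup
    pre = [a + b + c for a, b, c in zip(matvec(params["W_xh"], x),
                                        matvec(params["W_hh"], h_prev),
                                        params["b_h"])]
    h_new = [math.tanh(v) for v in pre]  # assumed nonlinearity
    logits = [a + b for a, b in zip(matvec(params["W_hy"], h_new), params["b_y"])]
    return h_new, softmax(logits)


params = init_params(n_tokens=7, emb_size=4, hid_size=3)
h, probas = rnn_step(token_id=1, h_prev=[0.0] * 3, params=params)
print(len(h), len(probas))
```

The function returns both the new state and the probabilities, mirroring the way the class above exposes `h_new` and `next_token_probas` as separate layers.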

In [11]:
training_loop = Recurrence(

/home/jheuristic/Downloads/AgentNet/agentnet/agent/ UserWarning: You are giving Recurrence an input sequence of undefined length (None).
Make sure it is always above <unspecified>(n_steps) you specified for recurrence
  "Make sure it is always above {}(n_steps) you specified for recurrence".format(n_steps or "<unspecified>"))

In [12]:
# Model weights
weights = lasagne.layers.get_all_params(training_loop,trainable=True)
print weights

[rnn.hid_to_hid.W, rnn.hid_to_hid.b, emb.W, rnn.input0_to_hid.W, W, b]

In [13]:
predicted_probabilities = lasagne.layers.get_output(training_loop[step.next_token_probas])
#If you use dropout do not forget to create deterministic version for evaluation

In [15]:
#<Loss function - a simple categorical crossentropy will do, maybe add some regularizer>
loss = lasagne.objectives.categorical_crossentropy(predicted_probabilities.reshape((-1,len(tokens))),
                                                   targets.reshape((-1,))).mean()

updates = lasagne.updates.adam(loss,weights)
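
For intuition about the numbers this loss produces, here is a plain-Python sketch of categorical cross-entropy averaged over positions: the negative log of the probability assigned to the correct token. The helper name `mean_crossentropy` and the toy distributions are hypothetical. A model that guesses uniformly over k tokens scores log(k), which for the 55 tokens of this dataset is roughly 4.0, a useful baseline when reading the training log.

```python
import math


def mean_crossentropy(probas, target_ids):
    """Average of -log p[target] over all positions."""
    losses = [-math.log(p[t]) for p, t in zip(probas, target_ids)]
    return sum(losses) / len(losses)


# Uniform guess over 4 tokens at 3 positions: loss equals log(4).
uniform = [[0.25, 0.25, 0.25, 0.25]] * 3
# A more confident (and correct) prediction gives a lower loss.
confident = [[0.7, 0.1, 0.1, 0.1]] * 3

print(mean_crossentropy(uniform, [0, 2, 3]))
print(mean_crossentropy(confident, [0, 0, 0]))
print(math.log(55))  # uniform baseline for a 55-token vocabulary
```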

Compiling it

In [20]:
train_step = theano.function([sequence], loss,


Here we re-wire the recurrent network so that its output is fed back into its input.

In [21]:
n_steps = T.scalar(dtype='int32')
feedback_loop = Recurrence(

In [22]:
generated_tokens = get_output(feedback_loop[step.next_token])

In [23]:
generate_sample = theano.function([n_steps],generated_tokens,updates=feedback_loop.get_automatic_updates())

In [24]:
def generate_string(length=MAX_LEN):
    output_indices = generate_sample(length)[0]
    return ''.join(tokens[i] for i in output_indices)
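
The feedback idea can be sketched without Theano: sample a token from the predicted distribution, then use that token as the next input. The code below is an illustration only. The "model" is a hypothetical fixed transition table rather than a trained network, and `sample_from`, `generate_ids`, `toy_model` and `toy_table` are made-up names.

```python
import random


def sample_from(probas, rng):
    """Draw index i with probability probas[i]."""
    threshold = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probas):
        cumulative += p
        if threshold < cumulative:
            return index
    return len(probas) - 1  # guard against floating-point round-off


def generate_ids(next_token_probas, start_id, length, rng):
    """Feed each sampled token back in as the next input."""
    generated = []
    state, current = None, start_id
    for _ in range(length):
        state, probas = next_token_probas(current, state)
        current = sample_from(probas, rng)
        generated.append(current)
    return generated


# Hypothetical stand-in model: next-token probabilities depend only on the current token.
toy_tokens = [" ", "a", "n"]
toy_table = [[0.0, 0.7, 0.3],   # after " "
             [0.2, 0.1, 0.7],   # after "a"
             [0.3, 0.6, 0.1]]   # after "n"


def toy_model(token_id, state):
    return state, toy_table[token_id]


ids = generate_ids(toy_model, start_id=0, length=8, rng=random.Random(0))
print("".join(toy_tokens[i] for i in ids))
```

Passing the random generator explicitly keeps the sketch reproducible: the same seed yields the same string.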

In [25]:


Model training

Here you can tweak the parameters or plug in your own generation function.

Once something word-like starts to appear, try increasing seq_length.

In [26]:
def sample_batch(data, batch_size):
    rows = data[np.random.randint(0,len(data),size=batch_size)]
    return rows

In [27]:
print("Training ...")

#total number of epochs (the log below covers epochs 0-99)
n_epochs = 100

# how many minibatches are there in the epoch 
batches_per_epoch = 500

#how many training sequences are processed in a single function call

for epoch in xrange(n_epochs):

    avg_cost = 0
    for _ in range(batches_per_epoch):
        avg_cost += train_step(sample_batch(names_ix,batch_size))
    print("\n\nEpoch {} average loss = {}".format(epoch, avg_cost / batches_per_epoch))

    print "Generated names"
    for i in range(10):
        print generate_string(),

Training ...

Epoch 0 average loss = 2.459113338
Generated names
 CTiiM daI  ViHs Tnay JeclGrA   n rAysserjHe          aVlolrCnagio  v   Sj     Nl'    Lih      Gaq    QAuik   gyGeixh  q  h       q     y         l   H       dtn   e   a 

Epoch 1 average loss = 1.48151442409
Generated names
Ama        SlaT  Tiari             un  EriaF     J                    Ne   a    n    Eronne              -rearaKdiUi             Jnidue Oeaa             U at         b   

Epoch 2 average loss = 1.32031585945
Generated names
Lrntlznuureta    Vj               SnaTeLivdni      meana          l                  urhf             Amoazara  Zanlub Amienidn     qes  Hsha                n            

Epoch 3 average loss = 1.2446654422
Generated names
X                T                ClievisG   Gi    Bermuo       Bao qGyUbo           TyOd             Shio             Mssine           Tda              Krac             

Epoch 4 average loss = 1.21228895439
Generated names
Gie              Casa             Lhel             Dirtilia         SyPioi           Veve             Lan              Huriwlenieea                      Kaalo            

Epoch 5 average loss = 1.16934621868
Generated names
Sinfa            Styyt             Ross            Cdeer            Brarse           Sicane           Alari          B Tidattae         PQrguky          Gugylid          

Epoch 6 average loss = 1.15463410643
Generated names
Laido            Aopak            Hyline           Eby              Whanecont        tosenn           Sesie            LeNerey          Onnie            Milareh          

Epoch 7 average loss = 1.13878075207
Generated names
Avuvdena         Heora            Jida             Jubey            Kennyw           Nizha         t  Blaeyte          Aened                             Sotsa            

Epoch 8 average loss = 1.13747142478
Generated names
Hillecns         Miece            Brei             E Taleldsama     Lameh            Roryrdy          Anarpes          Gafey            Delace           Kaartz           

Epoch 9 average loss = 1.12452646033
Generated names
Vanern           Laulror          Adddyl            li              Balam            Derinte          GarJasb          Losye            Leesal           Tuoly            

Epoch 10 average loss = 1.11760642508
Generated names
Lajlar           Jons             Flandy           Joaslenly        Ecaria           Dore             Eu               Jinane           Mhuy             Jorig            

Epoch 11 average loss = 1.11338808855
Generated names
Klinod           Gyndca           Cmisn            Harbfr           PGannde          Jora             Elwin            Siede            Jasoo            Wiima            

Epoch 12 average loss = 1.09521686695
Generated names
Keun             Jadarst          Glara            Donthor          Srarariet        Zilaea           Tilbie           Hirada           Jerirke   l      Dwera            

Epoch 13 average loss = 1.09180384213
Generated names
Bcannva          Bbanga           Ruednce          Nei              Cralline         Horla            Cytlk            GIlrrel          Relbintia        Canyf            

Epoch 14 average loss = 1.09008883149
Generated names
Nedne            Deverrin         Irnennintin      Acareceiiit      Athuelria        Tanla             Fymo            Nolilt           Cadi             Shud             

Epoch 15 average loss = 1.08665773279
Generated names
Carbava          Gonea            Rufrethe         Battsie          Cotidilre        Jarte            Lrygso           Boonas           Bal              A Fafusa         

Epoch 16 average loss = 1.08192835774
Generated names
Alad             Worle            Fusriwie         Wicie            Werhan           Matie            Mile             Balsod           Piene            Jranttola        

Epoch 17 average loss = 1.08181843138
Generated names
Meunhy           Karion           Sucar  d         Tsherdy          Tooabi           Winel            Shaber           Teicm            Bycaenne         Frini            

Epoch 18 average loss = 1.07160790063
Generated names
Gediatix         Jadlin           Brorsaa          Celin            Anlera           Ario             Canary           Dabelliti        Shannera         Shonia           

Epoch 19 average loss = 1.07553662761
Generated names
Amou             Keati            Gglomans         Leucan           Baronaw          Gen              Evi              Jowlaa           Diid             Folie            

Epoch 20 average loss = 1.07269653507
Generated names
Mirenca          Teerse           Kinchamy         Meetotta         Segthine         Fale             Dollad           Aiyngy           Livio            QTmdarsi         

Epoch 21 average loss = 1.07357783023
Generated names
Avhllal          Thdiit           Nallir           Vene             Jatex            Wanne            Gavina           Phonphiy         Kestaros         Citar            

Epoch 22 average loss = 1.07578047529
Generated names
Serara           Elaon            Ardana           Onrudlan         Cariinttelie     Kesc             Arilley          Smi              Aririe           Re               

Epoch 23 average loss = 1.05811868155
Generated names
Cirice           Ezoshan          Anna             Oldina           Narlastis        Zrinhon          Rurtelyt         Cyne             Shona            Sarevie          

Epoch 24 average loss = 1.06192015984
Generated names
Harannian        Osuan            Beonia           Kirye            Eripon           Netmencie        Sharlie          Belrlell         Fanina           Amlary           

Epoch 25 average loss = 1.06127356798
Generated names
Amon             Orinsa           Bhenla           Fema             Arien            Raoluse          Frypibitel       Orke             Anntan           Cadm             

Epoch 26 average loss = 1.06629730905
Generated names
Sauche           Asticna          Therig  a        Bele             Betiot           Shenntta         Senex            Arl              Vourrajey        Elldah           

Epoch 27 average loss = 1.05841800009
Generated names
Nedel            Chorlia          Albhaty          Cinton           Bemerlia         Moomia           Dech             Covey            Pirta            Golaly           

Epoch 28 average loss = 1.06199989354
Generated names
Gelin            Poncar           Jarinle          Miv              Shenislia        Murle            Nidee            Viordenn         Theskotn         A                

Epoch 29 average loss = 1.06542257645
Generated names
Mosrhine         Heena            Cyryn            Relcie           Ada              Auxonc           War              Diessa           Icefynie         A-ualis          

Epoch 30 average loss = 1.06065505525
Generated names
Larvely          Ent              Adca             Kirata           Nesen            Bfrarmme         Adicti           Walinsiey        Chorsie          Fian             

Epoch 31 average loss = 1.06292972122
Generated names
Sullanna         Vordes           Villion          Line             Jeris            Meldone          Drice            Meretrirfelle    Ron              Nefrke           

Epoch 32 average loss = 1.06257407306
Generated names
Jossati          Lobastca         Auselly          Luceligte        Manseeh          Shedntie         Hanna            Cmomba           Blekie           Annta            

Epoch 33 average loss = 1.05382351835
Generated names
Ganno            Ancte            Omntka           Holye            Galyn            Larta            Tis              Sharie           Gleli            Kesladt          

Epoch 34 average loss = 1.06353209275
Generated names
Dyllene          Derne            Hilannh          Iszia            Za               Dorin            Marte            Atiatsa          Jove             Trerele          

Epoch 35 average loss = 1.06220656465
Generated names
Lorrele          Celia            Nulia            S Illo           Alywy            L Eg             Ennca            Awdlay           Damman           Widile           

Epoch 36 average loss = 1.06133988675
Generated names
Mart             Donsi            Clartho          Salul            Dosko            Lane             uanl             Alrnorlie        Bbamer           Kalon            

Epoch 37 average loss = 1.06299217346
Generated names
Ponarine         Chairee          Telde            Harske           Junidan          Pobia            Deldis           Shanebe          Colanamiy        Rol              

Epoch 38 average loss = 1.06255422991
Generated names
Anna             Ciraica          Robix            Evyes            Gumonod          Begete           Dolia            Hery             Cenanie          Dardeut          

Epoch 39 average loss = 1.06007436512
Generated names
Beted            Aminna           Feishen          Am               Blets            Leree            Mwaragcs         Arbidi           Anrier           Kabin            

Epoch 40 average loss = 1.06165317529
Generated names
Doorana          Sediy            Lon              Jelophesta       Am               Lilana           Mopyne           Cannevlit-Mane   Aces             Lecha            

Epoch 41 average loss = 1.05168978429
Generated names
Sittok           Kliyt            Crer             Bawrley          Rogalina         Fancis           Enine            Kaned            Skasha           Bolla            

Epoch 42 average loss = 1.05243606738
Generated names
Lakk             Ennney           Carnarenne       Yagorr           Wifrald          Malne            Baly             Edmevy           Aldan            Berdele          

Epoch 43 average loss = 1.05857227095
Generated names
Ge               Tabya            Frirrelle        Tonarzia         Vangie           Arca             Jotiana          Pebeolle         Beninil          Canur            

Epoch 44 average loss = 1.05710756636
Generated names
Denarde          Phunande         Daria            Kafrlie          Alik             Cajerserta       Jonelly          Losonde          Tilgtel          Kichonata        

Epoch 45 average loss = 1.04989719715
Generated names
Pelelel          Olta             Orina            Bensia           Narranda         Jana             Hauey            Vinna            Anl              Rell             

Epoch 46 average loss = 1.05932706753
Generated names
Dazp             Ayge             Caldore          Endalan          Frdottan         Jaminne          Martannkie       Lise             Juocsina         Ennd             

Epoch 47 average loss = 1.05501585459
Generated names
Bordetcte        Laree            Vonatmel         Re-urelie        Dosa             Atily            Bayne            Ragdfralen       Jangina          Jery             

Epoch 48 average loss = 1.05377634225
Generated names
Elrlien          Judoth           Daune            Venta            Andie            De               Wegd             Pahllig          Harynel          Pid              

Epoch 49 average loss = 1.05219750884
Generated names
Nalatka          Ori              Eskacie          Eman             Menahis          Tuluellr         Swerbere         Shenlalys        Girria           Borre            

Epoch 50 average loss = 1.04643714534
Generated names
Lurido           Traby            Sinzesso         Hatis            Kin              Lmemsuly         Canga            Jotie            Genna            Emelel           

Epoch 51 average loss = 1.05003975992
Generated names
Caref            Jaruche          Alaollo          Hyr              Tolleen          Muin             Icy              Milats           Vars             Lesta            

Epoch 52 average loss = 1.049178787
Generated names
Caryete          Marbyn           Silia            Garabetsi        Deten            Gucheickese      Adabe            Malyse           Sharets          Olessisa         

Epoch 53 average loss = 1.05067507985
Generated names
Nevie            Parrli           Isak             Hisa             Helellme         Dara             Sarlne           Desaliqy         Falevi           Annese           

Epoch 54 average loss = 1.05353349935
Generated names
Bamman           Conto            Nevelia          Molkik           Jaulle           Gugn             Hocharver        Thanie           Xistind          Corina           

Epoch 55 average loss = 1.04445467334
Generated names
Bonnac           Tobiquy          Radie            Ettrell          Rosvis           Adede            Laricie          Bysslatie        Mignia           Losda            

Epoch 56 average loss = 1.047765695
Generated names
Karanlah         Encista          Erne             Kalen            Gainey           Meci             Mhanns           Irele            Roquli           Zonen            

Epoch 57 average loss = 1.05174707139
Generated names
Bbinde           Shisore          Distha           Matholie         Rulela           Kiorty           Zawley           Gegiquz          Torcia           Dathely          

Epoch 58 average loss = 1.04758747194
Generated names
Ckina            Wysh             Brardil          Sezam a          Malid            Dele             Heuda            Calieba          Ssarid           Girosta          

Epoch 59 average loss = 1.04894773406
Generated names
Chet             Iashela          Sinchel          Shinnelin        Aden             Dorf             Rugcary          Valie            Galthele         Garlena          

Epoch 60 average loss = 1.05018867118
Generated names
Nelota           Serydta          Kkyn             Inle             Auckern          Haphenelen       Brurenza         Danef            Shrianna         Andee            

Epoch 61 average loss = 1.04507420435
Generated names
Kistra           Lurtie           Derin            Gunnend          Laly             Bradond          Hajaw            Annelk           Tamane           Jorlica          

Epoch 62 average loss = 1.04726273395
Generated names
Aundy            Kichiy           Urnexeste        Sangit           Luflmer          Ewbe             Malaneri         Kighore          Phrolile         Nanie            

Epoch 63 average loss = 1.04885520548
Generated names
Tuendoiny        Lilill           Ell              Itramy           Izeledic         Sferel           Vord             Sejara           Xamada           Shanna           

Epoch 64 average loss = 1.05141810007
Generated names
Daliely          Rilayle          Ferica           Rolley           Cirden           Viasca           Ance             Corik            Shauadbina       Kerli            

Epoch 65 average loss = 1.04861521878
Generated names
Revine           Gerily           Lic              Gallely          Gotia            Iryw             Kecet            Michiek          Monna            Squkenee         

Epoch 66 average loss = 1.04267561351
Generated names
Porande          Eden             Taeece           Malora           Whridokha        Mavoie           Ku               Ronce            Lelmari          Pefenen          

Epoch 67 average loss = 1.04723004852
Generated names
Honduke          Ponna            Nerob            Kathyn           Klanie           Fuggit           Rriwen           Gala             Rablie           Lalie            

Epoch 68 average loss = 1.04282407689
Generated names
Jobis            Malhodola        Shrirela         Palan            Aryskan          Eca              Madyssesi        Tharne           Schelllo         Domiga           

Epoch 69 average loss = 1.04716842873
Generated names
Davolbel         Rorita           Ige              Ceana            Korolin          Ancer            Merws            Addupemfri       Herseriz         Mishita          

Epoch 70 average loss = 1.04304361275
Generated names
Styn             Staron           Kivieand         Kandachen        Mare             Mattidr          Barchanna        Karinis          Sarele           Himesta          

Epoch 71 average loss = 1.04361887917
Generated names
unamar           Kene             Alamis           Pharkte          Alilol           Malarle          Doo              Kaleri           Ttherday         Nyesin           

Epoch 72 average loss = 1.03822326145
Generated names
Winalyne         Ryetce           Aukay            Gerana           Stenina          Larta            Dorahnile        Auskas           Erno             Darine           

Epoch 73 average loss = 1.04598050544
Generated names
Panelbia         Eciania          Sporan           Gesse            Tierde           Kara             Pie              Jymlan           Roshare          Pailien          

Epoch 74 average loss = 1.0445951882
Generated names
Sphesry          Rremine          Chmia            Maglity          Harhelarl        Mowy             Defrah           Dana             Beni             Getin            

Epoch 75 average loss = 1.04135694475
Generated names
Jeina            Saselil          Lotphaly         Sulsisie         Kisa             Ramwe            Wira             Feerse           Hedlce           Thily            

Epoch 76 average loss = 1.03891499177
Generated names
Tendela          Dioti            Launaly          Talan            Jesene           Stenn            Makica           Tramavo          Meina            Kerreor          

Epoch 77 average loss = 1.04586444348
Generated names
Jerie            Jymadu           Virdos           Clendites        Brejay           Allale           Speliin          Maine            Yorna            Ma               

Epoch 78 average loss = 1.03815485907
Generated names
Luctorele        Tishida          Darder           Gamrely          Fiidy            Shavitie         Icieste          Elilutte         Doleri           Tan              

Epoch 79 average loss = 1.04607423584
Generated names
Millez           Fetry            Mary             Erna             Belelet          Sog              Stista           Alde             Routa            Gindi            

Epoch 80 average loss = 1.04191821797
Generated names
Dorlan           Miswaneine       Panel            Deliannes        Stiish           Rore             Nelee            Orllery          Araciride        Chimanna         

Epoch 81 average loss = 1.03834778344
Generated names
Sal              Helwele          Dery             Efboee           Cosetta          Wynayn           Avik             Celle            Halithert        Daloste          

Epoch 82 average loss = 1.04862534783
Generated names
Kidati           Mocamina         Jinse            Nenalett         Julina           Thenla           Fornatsad        Nathetma         Osie             Ausria           

Epoch 83 average loss = 1.03976895537
Generated names
Fo               Thriello         Trliloce         Annzie           Donnenne         Dasieo           Tardanna         Allana           Dely             Ciraelil         

Epoch 84 average loss = 1.04871116389
Generated names
Shignis          Carilyn          Chosonle         Nenne            Jarigita         Sdina            Ranncortona      Frastina         Thumriz          Perrauel         

Epoch 85 average loss = 1.03117208916
Generated names
RaAlchine        Edaletta         Erky             Allie            Heonch           Galnin           Braborer         Shacgie          Marmellyne       Silyn            

Epoch 86 average loss = 1.03876071298
Generated names
Gogossy          Wia              Cengah           Bescey           Robaphenn        Brerupa          Waril            Kyrtgie          Styann           Criur            

Epoch 87 average loss = 1.04527916443
Generated names
Beethain         Fanky            Brarlid          Denck            Mamben           Tlulanie         Mathama          Celte            Kerlmette        Senariy          

Epoch 88 average loss = 1.03494162979
Generated names
Cattorh          Shanne           Vinlet           Tesrah           Deica            Ads              Casse            Sorelde          Erlly            Sygnele          

Epoch 89 average loss = 1.03647793138
Generated names
Koranne          Sicenty          Carbry           Thophaneyn       Evwaris          Daith            Sullan           Gnammy           Getkie           Agnagsa          

Epoch 90 average loss = 1.04362710309
Generated names
Chryfie          Alotrid          Matatta          Amene            Trerie           Parwia           Sahonne          Rostesa          Kilrorian        Jhanena          

Epoch 91 average loss = 1.03989352741
Generated names
Jat              Relle            Jorla            Ctye             Hurrdsiy         Yemoto           Rofby            Anania           Rorertere        Maugny           

Epoch 92 average loss = 1.03404652436
Generated names
S-Armiste        Jmyanc           Ontoynne         Nara             Geedoth          Githona-Merora   Awnnle           Sydria           Kienshei         Pofra            

Epoch 93 average loss = 1.03516443445
Generated names
Haddyx           Dulra            Dasrabbe         Eryssa           Erria            Terra            Gatelel          Macik            Lungulyn         Ezagneri         

Epoch 94 average loss = 1.04478919768
Generated names
Incha            Morne            Atarla           Eie              Jweren           Ryvely           Edartel          Jyas             Wad              Silisty          

Epoch 95 average loss = 1.03923666168
Generated names
Emeldia          Gannine          Celephonia       Ganegol          Mithe            Lonne            Foeri            Emfba            Monne            Aude             

Epoch 96 average loss = 1.03841721218
Generated names
Kily             Bin-Alrita       Mattory          Meniy            Lob              Sena             Nickiel          Shelda           Anchely          Resaden          

Epoch 97 average loss = 1.03972609979
Generated names
Perhes           Ethorya          Raliney          Guttoy           Cyq              Matlette         Mabyen           Sy               Trenkya          Korrie           

Epoch 98 average loss = 1.03814580433
Generated names
Jundide          Chiy             Shylla           Keldmyan         Maell            Bonie            Lidie            Vorand           Sil              Malavan          

Epoch 99 average loss = 1.03470052032
Generated names
Mey              Uimn             Wabbati          Dyarolla         Jarry            Misy             Doroty           Kuan             Marie            Derrorti        

In [ ]:

And now,

  • try LSTM/GRU
  • try several layers
  • try MTG cards
  • try your own dataset of any kind

In [ ]: