In [ ]:

    
from fastai.text import *   # Quick access to NLP functionality

Text example

An example of creating a language model and then transferring it to a classifier.



In [ ]:

    
path = untar_data(URLs.IMDB_SAMPLE)
path









    Out[ ]:





PosixPath('/home/sgugger/.fastai/data/imdb_sample')

Load the CSV and view the independent variable (text) and the dependent variable (label):



In [ ]:

    
df = pd.read_csv(path/'texts.csv')
df.head()









    Out[ ]:







  
    
      
	label	text	is_valid
0	negative	Un-bleeping-believable! Meg Ryan doesn't even ...	False
1	positive	This is a extremely well-made film. The acting...	False
2	negative	Every once in a long while a movie will come a...	False
3	positive	Name just says it all. I watched this movie wi...	False
4	negative	This movie succeeds at being one of the most u...	False

Create one DataBunch for the language model and one for the classifier. The classifier reuses the language model's vocabulary:



In [ ]:

    
data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
data_clas = TextClasDataBunch.from_csv(path, 'texts.csv', vocab=data_lm.train_ds.vocab, bs=42)
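Passing the language model's vocabulary to the classifier matters because the encoder's embedding rows are indexed by token id. The following is a minimal plain-Python sketch of that idea, not fastai's implementation: `build_vocab` and `numericalize` are hypothetical helpers, whitespace splitting stands in for real tokenization, and `xxunk` is used here only as a placeholder name for unknown tokens.

```python
from collections import Counter

def build_vocab(texts, min_freq=1):
    """Map each token to an integer id; id 0 is reserved for unknown tokens."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    itos = ["xxunk"] + sorted(tok for tok, c in counts.items() if c >= min_freq)
    return {tok: i for i, tok in enumerate(itos)}

def numericalize(text, stoi):
    """Convert a text to ids using an existing vocabulary (unknown tokens map to 0)."""
    return [stoi.get(tok, 0) for tok in text.lower().split()]

# Vocabulary built once, from the language model's texts (hypothetical examples).
lm_texts = ["this movie was great", "this movie was awful"]
stoi = build_vocab(lm_texts)

# The classifier must numericalize with the same mapping, so that each id
# points at the embedding row the language model trained for that token.
ids = numericalize("this film was great", stoi)
print(ids)
```

A word the language model never saw ("film" above) falls back to the unknown id instead of shifting every other id, which is what would happen if the classifier built its own vocabulary.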

We'll fine-tune the language model. fast.ai has a pre-trained English model available for download; we just have to specify the architecture (AWD_LSTM) when creating the learner. First, set the momentum range used during training:



In [ ]:

    
moms = (0.8,0.7)



In [ ]:

    
learn = language_model_learner(data_lm, AWD_LSTM)
learn.unfreeze()
learn.fit_one_cycle(4, slice(1e-2), moms=moms)
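To make the role of `moms` more concrete, here is a simplified sketch of a one-cycle schedule: the learning rate warms up to its maximum and then decays, while momentum moves the opposite way, from 0.8 down to 0.7 and back. This is an illustration under stated assumptions rather than fastai's code: `one_cycle` and `cos_anneal` are hypothetical names, and the warm-up fraction (0.3) and division factors (25 and 1e4) are assumed defaults.

```python
import math

def cos_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle(step, total_steps, max_lr, moms=(0.8, 0.7),
              pct_start=0.3, div_factor=25.0):
    """Return (lr, momentum) at a given step of a simplified one-cycle schedule."""
    warm = max(1, int(total_steps * pct_start))  # number of warm-up steps
    if step < warm:
        # Phase 1: learning rate rises, momentum falls.
        pct = step / warm
        return (cos_anneal(max_lr / div_factor, max_lr, pct),
                cos_anneal(moms[0], moms[1], pct))
    # Phase 2: learning rate decays to a very small value, momentum rises back.
    pct = (step - warm) / max(1, total_steps - warm)
    return (cos_anneal(max_lr, max_lr / (div_factor * 1e4), pct),
            cos_anneal(moms[1], moms[0], pct))

for step in (0, 30, 100):
    lr, mom = one_cycle(step, total_steps=100, max_lr=1e-2)
    print(f"step {step:3d}: lr={lr:.2e} momentum={mom:.3f}")
```

Lower momentum while the learning rate is high lets the optimizer react quickly to new gradients; higher momentum at the low-rate ends of the cycle smooths the updates.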

Save our language model's encoder:



In [ ]:

    
learn.save_encoder('enc')
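Only the encoder is saved because the language model's next-word prediction head is of no use to a classifier, which gets its own classification head. The sketch below illustrates that split with plain dictionaries standing in for model weights; the parameter names and the helpers `extract_encoder` and `load_into` are hypothetical and do not reflect how fastai stores weights.

```python
def extract_encoder(model):
    """Keep only the parameters whose names start with 'encoder.'."""
    return {name: w for name, w in model.items() if name.startswith("encoder.")}

def load_into(model, encoder_state):
    """Return a copy of model with its encoder parameters overwritten."""
    updated = dict(model)
    updated.update(encoder_state)
    return updated

# Fine-tuned language model: encoder plus a next-word decoder (toy weights).
language_model = {
    "encoder.embedding": [0.1, 0.2],
    "encoder.rnn": [0.3],
    "decoder.next_word": [0.4],
}

# Fresh classifier: same encoder shape, but a different head.
classifier = {
    "encoder.embedding": [0.0, 0.0],
    "encoder.rnn": [0.0],
    "head.linear": [0.9],
}

enc = extract_encoder(language_model)
classifier = load_into(classifier, enc)
print(sorted(classifier))
```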

Fine-tune it to create a classifier:



In [ ]:

    
learn = text_classifier_learner(data_clas, AWD_LSTM)
learn.load_encoder('enc')            # reuse the fine-tuned language model encoder
learn.fit_one_cycle(4, moms=moms)    # first pass, before unfreezing the whole model



In [ ]:

    
learn.save('stage1-clas')



In [ ]:

    
learn.unfreeze()
learn.fit_one_cycle(8, slice(1e-5,1e-3), moms=moms)
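The `slice(1e-5,1e-3)` argument requests discriminative learning rates: early layer groups get the small rate and later groups progressively larger ones. The sketch below shows one way such a slice could be expanded into per-group rates. It is an approximation of the behaviour as I understand it, not the library's code: `lr_per_group` is a hypothetical helper, and the number of layer groups (3 here) is chosen only for illustration.

```python
def lr_per_group(lr, n_groups):
    """Expand a float or slice into one learning rate per layer group."""
    if not isinstance(lr, slice):
        return [lr] * n_groups
    if n_groups == 1:
        return [lr.stop]
    if lr.start is None:
        # slice(stop): earlier groups train at stop/10, the last group at stop.
        return [lr.stop / 10] * (n_groups - 1) + [lr.stop]
    # slice(start, stop): rates spaced geometrically from start to stop.
    ratio = (lr.stop / lr.start) ** (1 / (n_groups - 1))
    return [lr.start * ratio ** i for i in range(n_groups)]

print(lr_per_group(slice(1e-5, 1e-3), 3))  # unfrozen classifier
print(lr_per_group(slice(1e-2), 3))        # language model call earlier
```

Small rates for the early groups protect the general language features learned during pre-training, while the later, more task-specific groups are allowed to change faster.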



In [ ]:

    
learn.predict("I really liked this movie!")









    Out[ ]:





(Category tensor(1), tensor(1), tensor([0.0666, 0.9334]))
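The returned tuple holds the predicted category, its class index, and the per-class probabilities. A short sketch of how the three relate, using the probabilities shown above; the class order `["negative", "positive"]` is an assumption (alphabetical ordering of the labels in the CSV).

```python
classes = ["negative", "positive"]  # assumed class order
probs = [0.0666, 0.9334]            # probabilities from the output above

# The predicted index is the position of the largest probability.
pred_idx = max(range(len(probs)), key=probs.__getitem__)
label = classes[pred_idx]

print(label, probs[pred_idx])
```

Under that assumption, index 1 corresponds to a positive review, which matches the sentiment of the input sentence.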



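Training output of the three fit_one_cycle calls above, in order: language model fine-tuning, first classifier pass, classifier after unfreezing.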

epoch	train_loss	valid_loss	accuracy	time
0	4.414052	3.939605	0.279167	00:05
1	4.152833	3.875656	0.284345	00:05
2	3.832567	3.848873	0.286280	00:05
3	3.561787	3.856220	0.286399	00:05

epoch	train_loss	valid_loss	accuracy	time
0	0.659827	0.600592	0.766169	00:04
1	0.599001	0.520201	0.756219	00:05
2	0.564309	0.494556	0.796020	00:04
3	0.520831	0.495697	0.776119	00:04

epoch	train_loss	valid_loss	accuracy	time
0	0.470689	0.488138	0.786070	00:08
1	0.455899	0.468737	0.786070	00:07
2	0.474349	0.498394	0.771144	00:08
3	0.466920	0.477338	0.766169	00:08
4	0.459592	0.462194	0.805970	00:08
5	0.431064	0.472223	0.786070	00:08
6	0.427589	0.466315	0.796020	00:09
7	0.417917	0.461701	0.786070	00:08