NLP Interpret


In [ ]:
from fastai.gen_doc.nbdoc import *
from fastai.text import * 
from fastai.text.interpret import *

text.interpret is the module that implements custom Interpretation classes for the different NLP tasks; each of them inherits from the base ClassificationInterpretation class.


In [ ]:
from fastai.gen_doc.nbdoc import *
from fastai.text import *

In [ ]:
show_doc(TextClassificationInterpretation)


class TextClassificationInterpretation[source][test]

TextClassificationInterpretation(learn:Learner, preds:Tensor, y_true:Tensor, losses:Tensor, ds_type:DatasetType=<DatasetType.Valid: 2>) :: ClassificationInterpretation

No tests found for TextClassificationInterpretation. To contribute a test please refer to this guide and this discussion.

Provides an interpretation of classification based on input sensitivity. For the moment this is designed for the AWD-LSTM only, since the Transformer already has its own attention model.
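
In practice you rarely call the constructor directly: the usual entry point is the from_learner factory inherited from ClassificationInterpretation. A minimal sketch, assuming learn is an already trained AWD-LSTM text classifier (a full walkthrough follows at the end of this page):

In [ ]:
# Build the interpretation object from a trained text classifier.
# `learn` is assumed to be a trained AWD-LSTM classifier (see the "train" section below).
interp = TextClassificationInterpretation.from_learner(learn)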


In [ ]:
show_doc(TextClassificationInterpretation.intrinsic_attention)


intrinsic_attention[source][test]

intrinsic_attention(text:str, class_id:int=None)

No tests found for intrinsic_attention. To contribute a test please refer to this guide and this discussion.

Calculate the intrinsic attention of the input with respect to an output class_id, or to the classification given by the model if None. For reference, see the Sequential Jacobian section of https://www.cs.toronto.edu/~graves/preprint.pdf
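
As a rough sketch of calling it directly (assuming interp is a TextClassificationInterpretation and that the method returns the tokenized text together with a tensor of per-token scores):

In [ ]:
# Sketch: compute raw attention scores for one review.
# Assumes `interp` was built with `from_learner` as shown at the end of this page.
tokens, attn = interp.intrinsic_attention("I really like this movie, it is amazing!")
print(tokens.text)  # the review as tokenized by the model (xxbos, xxmaj, ...)
print(attn)         # one score per token; higher means more influence on the prediction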


In [ ]:
show_doc(TextClassificationInterpretation.html_intrinsic_attention)


html_intrinsic_attention[source][test]

html_intrinsic_attention(text:str, class_id:int=None, **kwargs) → str

No tests found for html_intrinsic_attention. To contribute a test please refer to this guide and this discussion.
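
html_intrinsic_attention produces the same token-level visualization as show_intrinsic_attention, but returns it as a raw HTML string (as the → str annotation suggests), so you can render or save it yourself. A hedged sketch, assuming interp is a TextClassificationInterpretation:

In [ ]:
# Sketch: get the attention visualization as raw HTML and render it manually.
from IPython.display import HTML, display

html = interp.html_intrinsic_attention("I really like this movie, it is amazing!")
display(HTML(html))  # or write the string to a file instead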


In [ ]:
show_doc(TextClassificationInterpretation.show_intrinsic_attention)


show_intrinsic_attention[source][test]

show_intrinsic_attention(text:str, class_id:int=None, **kwargs)

No tests found for show_intrinsic_attention. To contribute a test please refer to this guide and this discussion.


In [ ]:
show_doc(TextClassificationInterpretation.show_top_losses)


show_top_losses[source][test]

show_top_losses(k:int, max_len:int=70)

No tests found for show_top_losses. To contribute a test please refer to this guide and this discussion.

Create a tabulation showing the first k texts in top_losses along with their prediction, actual, loss, and the probability of the actual class. max_len is the maximum number of tokens displayed.
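
For example, the following sketch (assuming interp is a TextClassificationInterpretation built as shown below) would tabulate the five reviews with the highest loss, each truncated to 70 tokens:

In [ ]:
# Sketch: show the 5 worst-classified texts, truncated to 70 tokens each.
interp.show_top_losses(5, max_len=70)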

Let's show how TextClassificationInterpretation can be used once we have trained a text classification model.

train


In [ ]:
imdb = untar_data(URLs.IMDB_SAMPLE)

In [ ]:
data_lm = (TextList.from_csv(imdb, 'texts.csv', cols='text')
                   .split_by_rand_pct()
                   .label_for_lm()
                   .databunch())
data_lm.save()
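
As a side note, the saved DataBunch can be reloaded later without redoing the preprocessing. A hedged sketch, assuming this fastai version exposes load_data and that save() above wrote the default data_save.pkl file:

In [ ]:
# Sketch: reload the preprocessed language-model data in a later session.
# Assumes `save()` used the default 'data_save.pkl' file name under `imdb`.
data_lm = load_data(imdb, 'data_save.pkl')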

In [ ]:
data_lm.show_batch()


idx text
0 ! ! ! xxmaj finally this was directed by the guy who did xxmaj big xxmaj xxunk ? xxmaj must be a replay of xxmaj jonestown - hollywood style . xxmaj xxunk ! xxbos xxmaj this is a extremely well - made film . xxmaj the acting , script and camera - work are all first - rate . xxmaj the music is good , too , though it is
1 ) . xxmaj all in all , we were very disappointed at this xxmaj spike xxmaj lee effort ! ! xxbos a really great movie and true story . xxmaj dan xxmaj jansen the xxmaj greatest xxunk ever . a touching and beautiful movie the whole family can enjoy . xxmaj the story of xxmaj jane xxmaj xxunk battle with cancer and xxmaj dan xxmaj jansen love for his sister
2 just typical folks ) in everyday settings in order to create xxunk involving and realistic films . \n \n xxmaj in this case , the film is about xxmaj french and xxmaj german coal miners , so appropriately , the people in the roles seem like miners -- not actors . xxmaj the central conflict as the film begins is that there is a huge mine xxunk on the
3 here that xxunk banning ... which is a shame because i never would have sat through it where it not for the fact that it 's on ' the xxunk list ' . xxmaj the plot actually gives the film a decent base - or at least more of a decent base than most xxunk films - and it follows an actress who is kidnapped and dragged off into the
4 xxmaj at the same time , the xxmaj john xxmaj holmes character shows a very clever hustler who is able to pass through the xxunk and xxunk situations almost xxunk . xxmaj the movie deserves being watched more than once . xxmaj the seventies ambiance xxunk and full of drugs is amazing . xxbos xxmaj if you loved xxmaj long xxmaj way xxmaj round you will enjoy this nearly as

In [ ]:
learn = language_model_learner(data_lm, AWD_LSTM)
learn.fit_one_cycle(2, 1e-2)
learn.save('mini_train_lm')
learn.save_encoder('mini_train_encoder')


epoch train_loss valid_loss accuracy time
0 4.650112 3.822781 0.290729 00:21
1 4.378561 3.766616 0.295357 00:21
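
Before moving on to classification, the language model can be sanity-checked by generating a few words. A sketch, assuming the language-model learner's predict accepts an n_words argument as in recent fastai v1 releases; the output will be rough since this mini model only trained for two epochs on the IMDB sample:

In [ ]:
# Sketch: quick sanity check of the language model by generating a few words.
learn.predict("This movie is", n_words=10, temperature=0.75)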

In [ ]:
data_clas = (TextList.from_csv(imdb, 'texts.csv', cols='text', vocab=data_lm.vocab)
                   .split_from_df(col='is_valid')
                   .label_from_df(cols='label')
                   .databunch(bs=42))

In [ ]:
learn = text_classifier_learner(data_clas, AWD_LSTM)
learn.load_encoder('mini_train_encoder')
learn.fit_one_cycle(2, slice(1e-3,1e-2))
learn.save('mini_train_clas')


epoch train_loss valid_loss accuracy time
0 0.666474 0.666000 0.605000 00:16
1 0.666053 0.646565 0.615000 00:18
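
The classifier can also be probed directly with predict, which in fastai v1 returns the predicted category, its index and the tensor of class probabilities. A sketch:

In [ ]:
# Sketch: classify a single review.
pred_class, pred_idx, probs = learn.predict("I really like this movie, it is amazing!")
pred_class, probs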

interpret


In [ ]:
interp = TextClassificationInterpretation.from_learner(learn)

In [ ]:
interp.show_intrinsic_attention("I really like this movie, it is amazing!")


xxbos i really like this movie , it is amazing !

Undocumented Methods - Methods moved below this line will intentionally be hidden

New Methods - Please document or move to the undocumented section