This walkthrough is based on this spaCy tutorial.
Train a convolutional neural network text classifier on the
IMDB dataset, using the TextCategorizer component. The dataset will be loaded
automatically via Thinc's built-in dataset loader. The model is added to
spacy.pipeline, and predictions are available via doc.cats.
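As a rough illustration of reading predictions: `doc.cats` is a dictionary that maps each category label to a score. The sketch below picks the highest-scoring label from such a dictionary. The helper name `top_category` is hypothetical (it is not part of spaCy), and the scores shown are made up for illustration rather than real model output.

```python
def top_category(cats):
    """Return the (label, score) pair with the highest score.

    `cats` is assumed to have the same shape as spaCy's `doc.cats`:
    a dict mapping each category label to a float score.
    """
    if not cats:
        raise ValueError("cats is empty; was a text categorizer run on this doc?")
    label = max(cats, key=cats.get)
    return label, cats[label]


# Illustrative scores only, not real model output.
example_cats = {"POSITIVE": 0.91, "NEGATIVE": 0.09}
label, score = top_category(example_cats)
print(label, score)
```

With a loaded pipeline, the same helper could be called as `top_category(doc.cats)`.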
This notebook has been tested with the following package versions:
(you may need to change pip to pip3, depending on your own Python environment)
In [1]:
# Python >3.5
!pip install verta
!pip install spacy==2.1.6
!python -m spacy download en
In [2]:
HOST = 'app.verta.ai'
PROJECT_NAME = 'Film Review Classification'
EXPERIMENT_NAME = 'spaCy CNN'
In [3]:
# import os
# os.environ['VERTA_EMAIL'] =
# os.environ['VERTA_DEV_KEY'] =
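If the two variables in the commented-out cell above are left unset, creating the client is likely to fail or prompt for credentials. A minimal sketch of checking for them first, assuming the client reads `VERTA_EMAIL` and `VERTA_DEV_KEY` from the environment as that cell suggests; `missing_credentials` is a hypothetical helper, not part of the Verta package.

```python
import os

# Variable names taken from the commented-out cell above.
REQUIRED_VARS = ("VERTA_EMAIL", "VERTA_DEV_KEY")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_credentials()
if missing:
    print("Set these before creating the client:", ", ".join(missing))
```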
In [4]:
from verta import Client
from verta.utils import ModelAPI
client = Client(HOST, use_git=False)
proj = client.set_project(PROJECT_NAME)
expt = client.set_experiment(EXPERIMENT_NAME)
run = client.set_experiment_run()
In [5]:
from __future__ import print_function
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
import random
import six
import numpy as np
import thinc.extra.datasets
import spacy
from spacy.util import minibatch, compounding
In [6]:
run_id = ""  # placeholder: paste the ID of the Experiment Run that holds the logged model
In [7]:
run = expt.expt_runs.find("id == '{}'".format(run_id))[0]
In [8]:
# test the logged model
print("Loading from verta..")
nlp2 = run.get_model()
In [9]:
test_text = "I would definitely watch this again!"
doc2 = nlp2(test_text)
print(test_text)
print(doc2.cats)
In [10]:
run.log_metric("val_metric_direct", 0.5)
Click the link above to view your Experiment Run in the Verta Web App, and deploy it.
Once it's ready, you can make predictions against the deployed model.
In [11]:
from verta._demo_utils import DeployedModel
deployed_model = DeployedModel(HOST, run.id)
In [12]:
deployed_model.predict(["I would definitely watch this again!"])
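Because a deployment can take a while to become ready, the first prediction calls may fail. One way to handle that is a small retry wrapper, sketched below. `predict_with_retry` is a hypothetical helper; it catches every `Exception` because the specific error a not-yet-ready deployment raises is not stated in this notebook, so narrow it once the actual error type is known.

```python
import time


def predict_with_retry(predict, inputs, attempts=5, delay=2.0):
    """Call `predict(inputs)`, retrying up to `attempts` times on failure.

    Sleeps `delay` seconds between attempts and raises RuntimeError
    if every attempt fails.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return predict(inputs)
        except Exception as err:  # assumption: any error means "not ready yet"
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)
    raise RuntimeError(
        "model not ready after {} attempts".format(attempts)
    ) from last_error
```

In this notebook it could be called as `predict_with_retry(deployed_model.predict, ["I would definitely watch this again!"])`.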
In [13]:
train_data, _ = thinc.extra.datasets.imdb()
In [14]:
import time
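ctr = 0          # number of reviews scored so far
live_metric = 0  # number of correct predictions
for row in train_data:
    # stop before the 11th review so ctr matches the number of scored rows
    if ctr >= 10:
        break
    # each row is a (text, label) pair; preview the first 100 characters of the text
    print(row[0][:100])
    prediction = deployed_model.predict([row[0]])
    print("prediction:", prediction)
    time.sleep(0.5)
    ctr += 1
    if ((row[1] == 0) and (prediction == "NEGATIVE")) or ((row[1] == 1) and (prediction == "POSITIVE")):
        live_metric += 1
# log the fraction of correct predictions once, after the loop
run.log_metric("val_metric_deployed", live_metric * 1.0 / ctr)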
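The accuracy calculation in the loop above can also be written as a standalone function, which makes it easier to check without a live deployment. This is a sketch under assumptions: `live_accuracy` and `fake_predict` are hypothetical names, the sample rows are invented, and the predictor is assumed to return the string "NEGATIVE" or "POSITIVE", as the comparison in the loop implies.

```python
def live_accuracy(rows, predict, limit=10):
    """Fraction of the first `limit` rows whose predicted label matches.

    `rows` yields (text, label) pairs with label 0 or 1; `predict` takes a
    list holding one text and is assumed to return "NEGATIVE" or "POSITIVE".
    """
    label_names = {0: "NEGATIVE", 1: "POSITIVE"}
    correct = 0
    total = 0
    for text, label in rows:
        if total >= limit:
            break
        total += 1
        if predict([text]) == label_names.get(label):
            correct += 1
    if total == 0:
        raise ValueError("no rows to score")
    return correct / total


# Stand-in predictor and data, for illustration only.
def fake_predict(texts):
    return "POSITIVE" if "good" in texts[0] else "NEGATIVE"


sample = [("good film", 1), ("dull film", 0), ("good grief, awful", 0)]
print(live_accuracy(sample, fake_predict))
```

Against the deployed model, the call would look like `live_accuracy(train_data, deployed_model.predict)`, with the result passed to `run.log_metric`.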