Serving TensorFlow models

import warnings

%matplotlib inline
%pylab inline
import matplotlib.pyplot as plt

Populating the interactive namespace from numpy and matplotlib

!ls -l tf/1 tf/1/variables

total 312
-rw-r--r-- 1 olive 197609 315372 Aug  9 16:15 saved_model.pb
drwxr-xr-x 1 olive 197609      0 Aug  9 16:15 variables

total 140
-rw-r--r-- 1 olive 197609 136100 Aug  9 16:15
-rw-r--r-- 1 olive 197609   1480 Aug  9 16:15 variables.index

!saved_model_cli show --dir tf/1

The given SavedModel contains the following tag-sets:

!saved_model_cli show --dir tf/1 --tag_set serve

The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "serving_default"

!saved_model_cli show --dir tf/1 --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['inputs'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 3)
      name: hidden1_input:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 3)
      name: softmax/Softmax:0
Method name is: tensorflow/serving/predict
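The signature above reports a float input of shape (-1, 3), so every instance sent to the model should be a row of exactly three numbers. A minimal sketch of a client-side check follows; `validate_instances` and `EXPECTED_FEATURES` are hypothetical names introduced here for illustration and are not part of TensorFlow or the original notebook.

```python
from numbers import Real

# Assumption: 3 features per instance, read off shape (-1, 3) in the signature.
EXPECTED_FEATURES = 3


def validate_instances(instances):
    """Return the instances as rows of floats, or raise ValueError."""
    rows = []
    for i, row in enumerate(instances):
        if len(row) != EXPECTED_FEATURES:
            raise ValueError(
                "instance {} has {} values, expected {}".format(
                    i, len(row), EXPECTED_FEATURES))
        for value in row:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError(
                    "instance {} contains non-numeric value {!r}".format(
                        i, value))
        rows.append([float(v) for v in row])
    return rows


print(validate_instances([[100, 47, 10]]))  # [[100.0, 47.0, 10.0]]
```

Catching a malformed row locally tends to give a clearer error message than the one returned by the serving stack.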

# 0: red
# 1: green
# 2: yellow

!saved_model_cli run --dir tf/1 --tag_set serve --signature_def serving_default --input_exprs inputs=[[100.0,47.0,10.0]]

Result for output key scores:
[[0.0027608  0.8720881  0.12515119]]
2018-08-09 16:31:30.155791: I T:\src\github\tensorflow\tensorflow\core\platform\] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-08-09 16:31:30.435764: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.569
pciBusID: 0000:02:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-08-09 16:31:30.436283: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\] Adding visible gpu devices: 0
2018-08-09 16:31:31.199522: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-09 16:31:31.199753: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\]      0 
2018-08-09 16:31:31.199900: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\] 0:   N 
2018-08-09 16:31:31.200150: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4730 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:02:00.0, compute capability: 6.1)
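The model returns one softmax score per class, in the order given by the class indices listed earlier (0: red, 1: green, 2: yellow). A small sketch of turning such a score vector into a label follows; `LABELS` and `decode_scores` are illustrative names, not part of the notebook.

```python
# Index -> class name, following the comment cell above.
LABELS = ["red", "green", "yellow"]


def decode_scores(scores):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best], scores[best]


# Scores copied from the saved_model_cli run above.
print(decode_scores([0.0027608, 0.8720881, 0.12515119]))  # ('green', 0.8720881)
```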

# first we need to create a bucket on Google Cloud Storage and upload our model to it

!gsutil mb gs://manning_bucket
!gsutil cp -R tf/1 gs://manning_bucket

'gsutil' is not recognized as an internal or external command,
operable program or batch file.

!gcloud ml-engine models create "manning_insurance_1"
!gcloud ml-engine versions create "v1" --model "manning_insurance_1" --origin gs://manning_bucket/1    
!gcloud ml-engine versions describe "v1" --model "manning_insurance_1"

# one of each category
!cat sample_insurance.json

{"inputs": [ 160,  18,  100]}
{"inputs": [ 100,  47,  10]}
{"inputs": [ 90,  20,  20]}

# 0: red
# 1: green
# 2: yellow

!gcloud ml-engine predict --model "manning_insurance_1" --version "v1" --json-instances ./sample_insurance.json

# [0.8658562898635864, 7.318668918511809e-14, 0.13414366543293]
# [0.002760800765827298, 0.8720880746841431, 0.12515118718147278]
# [5.452934419736266e-05, 0.005952719133347273, 0.9939927458763123]

import googleapiclient.discovery

def predict_json(project, model, instances, version=None):
    """Send json data to a deployed model for prediction.

    Args:
        project (str): project where the Cloud ML Engine Model is deployed.
        model (str): model name.
        instances ([Mapping[str: Any]]): Keys should be the names of Tensors
            your deployed model expects as inputs. Values should be datatypes
            convertible to Tensors, or (potentially nested) lists of datatypes
            convertible to tensors.
        version: str, version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction results defined by the
            model.
    """
    # Create the ML Engine service object.
    # To authenticate set the environment variable
    # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    # Without an explicit version, the model's default version is used.
    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']

instances = [{"inputs": [ 160,  18,  100]}, {"inputs": [ 100,  47,  10]}, {"inputs": [ 90,  20,  20]}]
predict_json("sandboxolli", "manning_insurance_1", instances=instances)

[{'scores': [0.8658562898635864, 7.318668918511809e-14, 0.13414366543293]},
 {'scores': [0.002760800765827298, 0.8720880746841431, 0.12515118718147278]},
 {'scores': [5.452934419736266e-05, 0.005952719133347273, 0.9939927458763123]}]

Running on a dedicated Linux Server

From here on, you will need a Linux server with a working installation of TensorFlow and TensorFlow Serving (the tensorflow_model_server binary).

!tensorflow_model_server --port=9000 --model_name=manning_insurance_1 --model_base_path=$(pwd)/tf

!tensorflow_model_server --rest_api_port=8501 \
   --model_name=manning_insurance_1 \
   --model_base_path=$(pwd)/tf

!curl -d '{ "instances": [{"inputs": [ 100.0,  47.0,  10.0]}]}' -X POST http://localhost:8501/v1/models/manning_insurance_1:predict
# {
#     "predictions": [[0.0027608, 0.872088, 0.125151]
#     ]
# }

{ "error": "JSON Parse error: Invalid value. at offset: 0" }
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: instances
curl: (3) [globbing] bad range specification in column 2
curl: (3) [globbing] bad range specification in column 2
curl: (6) Could not resolve host: 100.0,
curl: (6) Could not resolve host: 47.0,
curl: (3) [globbing] unmatched close brace/bracket in column 5

100    62  100    60  100     2  60000   2000 --:--:-- --:--:-- --:--:-- 62000
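The curl errors above ("Could not resolve host: instances" and the globbing complaints) look like the result of the shell splitting the single-quoted JSON body into separate arguments, which is what the Windows command prompt typically does. Sending the request from Python sidesteps shell quoting altogether. The following is a sketch using only the standard library; the helper names `build_predict_request` and `predict_rest` are introduced here for illustration, and the host and port are assumed to match the server started above.

```python
import json
import urllib.request


def build_predict_request(model, instances, host="localhost", port=8501):
    """Build a POST request for the REST predict endpoint used above."""
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model)
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST")


def predict_rest(model, instances, **kwargs):
    """Send the request and return the 'predictions' field of the reply."""
    request = build_predict_request(model, instances, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"]


request = build_predict_request(
    "manning_insurance_1", [{"inputs": [100.0, 47.0, 10.0]}])
print(request.full_url)
# http://localhost:8501/v1/models/manning_insurance_1:predict
```

With the model server running, `predict_rest("manning_insurance_1", [{"inputs": [100.0, 47.0, 10.0]}])` would send the same body as the curl call.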