NEXUS utilizes Apache Spark running on Apache Mesos for its analytical functions. Now that the infrastructure has been started, we can start up the analysis cluster.
The analysis cluster consists of an Apache Mesos cluster and the NEXUS webapp Tornado server. The Mesos cluster we will be bringing up has one master node and three agent nodes. Apache Spark is already installed and configured on the three agent nodes, which will act as Spark executors for the NEXUS analytic functions.
We can use docker-compose again to start our containers.
Navigate to the directory containing the docker-compose.yml file for the analysis cluster:
$ cd ~/nexus/esip-workshop/docker/analysis
Use docker-compose to bring up the containers in the analysis cluster:
$ docker-compose up -d
Now that the cluster has started we can use various commands to ensure that it is operational and monitor its status.
List all running docker containers.
$ docker ps
The output should look similar to this:
CONTAINER ID   IMAGE                         COMMAND                  CREATED         STATUS         PORTS                                            NAMES
e5589456a78a   nexusjpl/nexus-webapp         "/tmp/docker-entry..."   5 seconds ago   Up 5 seconds   0.0.0.0:4040->4040/tcp, 0.0.0.0:8083->8083/tcp   nexus-webapp
18e682b9af0e   nexusjpl/spark-mesos-agent    "/tmp/docker-entry..."   7 seconds ago   Up 5 seconds                                                    mesos-agent1
8951841d1da6   nexusjpl/spark-mesos-agent    "/tmp/docker-entry..."   7 seconds ago   Up 6 seconds                                                    mesos-agent3
c0240926a4a2   nexusjpl/spark-mesos-agent    "/tmp/docker-entry..."   7 seconds ago   Up 6 seconds                                                    mesos-agent2
c97ad268833f   nexusjpl/spark-mesos-master   "/bin/bash -c './b..."   7 seconds ago   Up 7 seconds   0.0.0.0:5050->5050/tcp                           mesos-master
90d370eb3a4e   nexusjpl/jupyter              "tini -- start-not..."   2 days ago      Up 2 days      0.0.0.0:8000->8888/tcp                           jupyter
cd0f47fe303d   nexusjpl/nexus-solr           "docker-entrypoint..."   2 days ago      Up 2 days      8983/tcp                                         solr2
8c0f5c8eeb45   nexusjpl/nexus-solr           "docker-entrypoint..."   2 days ago      Up 2 days      8983/tcp                                         solr3
27e34d14c16e   nexusjpl/nexus-solr           "docker-entrypoint..."   2 days ago      Up 2 days      8983/tcp                                         solr1
247f807cb5ec   cassandra:2.2.8               "/docker-entrypoin..."   2 days ago      Up 2 days      7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp      cassandra3
09cc86a27321   zookeeper                     "/docker-entrypoin..."   2 days ago      Up 2 days      2181/tcp, 2888/tcp, 3888/tcp                     zk1
33e9d9b1b745   zookeeper                     "/docker-entrypoin..."   2 days ago      Up 2 days      2181/tcp, 2888/tcp, 3888/tcp                     zk3
dd29e4d09124   cassandra:2.2.8               "/docker-entrypoin..."   2 days ago      Up 2 days      7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp      cassandra2
11e57e0c972f   zookeeper                     "/docker-entrypoin..."   2 days ago      Up 2 days      2181/tcp, 2888/tcp, 3888/tcp                     zk2
2292803d942d   cassandra:2.2.8               "/docker-entrypoin..."   2 days ago      Up 2 days      7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp      cassandra1
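Rather than scanning the listing by eye, we could check it with a small script. The sketch below is not part of the workshop material: the helper names are hypothetical, and the expected container names are simply copied from the NAMES column of the sample output above. It assumes it is run on the machine where the docker command is available, not inside the Jupyter container.

```python
import subprocess

# Names of the analysis-cluster containers, copied from the NAMES column of
# the sample `docker ps` output (an assumption; adjust if your names differ).
EXPECTED_ANALYSIS_CONTAINERS = {
    "mesos-master",
    "mesos-agent1",
    "mesos-agent2",
    "mesos-agent3",
    "nexus-webapp",
}


def missing_containers(running_names, expected=EXPECTED_ANALYSIS_CONTAINERS):
    """Return the expected container names that are not running, sorted."""
    return sorted(set(expected) - set(running_names))


def running_container_names():
    """Ask the Docker CLI for the names of all running containers."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()
```

Calling missing_containers(running_container_names()) should return an empty list once all five analysis containers are up.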
List the available Mesos slaves by running the cell below.
In [ ]:
# TODO Run this cell to see the status of the Mesos slaves. You should see 3 slaves connected.
import requests
import json
response = requests.get('http://mesos-master:5050/state.json')
print(json.dumps(response.json()['slaves'], indent=2))
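The full JSON dump can be long, so a condensed check may be easier to read. The following is a minimal sketch with hypothetical helper names. It assumes each entry in the 'slaves' list carries a 'hostname' string and an 'active' flag, as Mesos state output normally does, and it takes the already parsed state dictionary as input so it can be tried without a running cluster.

```python
def summarize_agents(state):
    """Return a list of (hostname, active) pairs from a parsed Mesos state dict."""
    return [
        (agent.get("hostname"), bool(agent.get("active", False)))
        for agent in state.get("slaves", [])
    ]


def all_agents_active(state, expected_count=3):
    """True if exactly expected_count agents are registered and all are active."""
    agents = summarize_agents(state)
    return len(agents) == expected_count and all(active for _, active in agents)


# Hand-written sample for illustration only; real state output has many more fields.
sample_state = {
    "slaves": [
        {"hostname": "mesos-agent1", "active": True},
        {"hostname": "mesos-agent2", "active": True},
        {"hostname": "mesos-agent3", "active": False},
    ]
}
print(summarize_agents(sample_state))
print(all_agents_active(sample_state))
```

With the live cluster, pass response.json() from the cell above in place of sample_state; all_agents_active should then return True when the three agents are connected.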
In [ ]:
import nexuscli
nexuscli.set_target("http://nexus-webapp:8083")
nexuscli.dataset_list()
In [ ]:
# TODO Run this cell to produce a Time Series plot using AVHRR data.
%matplotlib inline
import matplotlib.pyplot as plt
import time
import nexuscli
from datetime import datetime
from shapely.geometry import box
bbox = box(-150, 40, -120, 55)
datasets = ["AVHRR_OI_L4_GHRSST_NCEI"]
start_time = datetime(2013, 1, 1)
end_time = datetime(2013, 12, 31)
start = time.perf_counter()
ts, = nexuscli.time_series(datasets, bbox, start_time, end_time, spark=True)
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))
plt.figure(figsize=(10,5), dpi=100)
plt.plot(ts.time, ts.mean, 'b-', marker='|', markersize=2.0, mfc='b')
plt.grid(True, which='major', color='k', linestyle='-')
plt.xlabel("Time")
plt.ylabel("Sea Surface Temperature (C)")
plt.show()
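Beyond the plot, it can be useful to pull out the coolest and warmest points of the series. The helper below is a hypothetical sketch, not part of nexuscli. It only assumes two equally long sequences, such as ts.time and ts.mean from the cell above.

```python
def extremes(times, means):
    """Return ((time_of_min, min_mean), (time_of_max, max_mean)).

    times and means are parallel sequences; on ties the first occurrence wins.
    """
    pairs = list(zip(times, means))
    if not pairs:
        raise ValueError("time series is empty")
    coolest = min(pairs, key=lambda pair: pair[1])
    warmest = max(pairs, key=lambda pair: pair[1])
    return coolest, warmest
```

For example, coolest, warmest = extremes(ts.time, ts.mean) gives the timestamps and area-averaged values at the two ends of the range.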
In [ ]:
# TODO Run this cell. You should see at least one successful Time Series Spark job.
import requests
response = requests.get('http://nexus-webapp:4040/api/v1/applications')
appId = response.json()[0]['id']
response = requests.get("http://nexus-webapp:4040/api/v1/applications/%s/jobs" % appId)
for job in response.json():
    print(job['name'])
    print('\t' + job['status'])
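When many jobs are listed, a tally by status is quicker to read than one line per job. The sketch below uses hypothetical helper names and works on the parsed job list. It assumes each job entry has 'name' and 'status' keys, as used in the cell above, and that a finished job reports the status SUCCEEDED.

```python
from collections import Counter


def job_status_counts(jobs):
    """Count jobs per status, e.g. Counter({'SUCCEEDED': 2, 'FAILED': 1})."""
    return Counter(job["status"] for job in jobs)


def has_successful_job(jobs, name_fragment=""):
    """True if any job succeeded and its name contains name_fragment."""
    return any(
        job["status"] == "SUCCEEDED" and name_fragment in job.get("name", "")
        for job in jobs
    )


# Hand-written sample for illustration only; real entries have more fields
# and the job names will differ.
sample_jobs = [
    {"name": "example job A", "status": "SUCCEEDED"},
    {"name": "example job B", "status": "FAILED"},
    {"name": "example job C", "status": "SUCCEEDED"},
]
print(job_status_counts(sample_jobs))
print(has_successful_job(sample_jobs))
```

With the live cluster, pass response.json() from the jobs request in place of sample_jobs.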