These will look something like
AKIAIOSFODNN7EXAMPLE
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
The key pair has a name that you give it (mine is cliburn-2016)
and an associated PEM file (I keep mine at ~/AWS/cliburn-2016.pem).
Set the correct permissions on the PEM file.
chmod 400 xxx.pem
Warning: You will be charged for this.
aws emr create-cluster --name "<<NAME-FOR-CLUSTER>>" --release-label emr-4.5.0 --applications Name=Spark Name=Zeppelin-Sandbox --ec2-attributes KeyName=<<Your key-pair>> --instance-type m3.xlarge --instance-count 3 --use-default-roles
For example, I start mine with
aws emr create-cluster --name "spak-2016-d" --release-label emr-4.5.0 --applications Name=Spark Name=Zeppelin-Sandbox --ec2-attributes KeyName="cliburn-2016" --instance-type m3.xlarge --instance-count 3 --use-default-roles
A cluster-id should be returned
{
"ClusterId": "j-XXXXXXXXXXXXXXX"
}
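Since the later commands all need the cluster id, it can help to capture it in a shell variable instead of copying it by hand. The following is a minimal sketch, not part of the original walkthrough: extract_cluster_id is a hypothetical helper name, and the sample JSON reuses the placeholder id shown above.

```shell
# Hypothetical helper: read create-cluster's JSON response on stdin
# and print only the value of the "ClusterId" field.
extract_cluster_id() {
    sed -n 's/.*"ClusterId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response in the shape shown above (placeholder id, not a real cluster).
sample='{
    "ClusterId": "j-XXXXXXXXXXXXXXX"
}'

CLUSTER_ID=$(printf '%s\n' "$sample" | extract_cluster_id)
echo "$CLUSTER_ID"
```

In practice you would pipe the output of the real create-cluster command into extract_cluster_id rather than the sample string. The AWS CLI's global --query and --output text options should give a similar result without sed, if you prefer that route.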
Zeppelin notebook
Create an SSH tunnel to port 8890
ssh -i xxx.pem -L 8890:ec2-xx-xx-xx-xx.compute-1.amazonaws.com:8890 hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com -N -v
Fill in the xxx with the location of your PEM file and the appropriate IP address.
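To avoid typing the host name twice, the tunnel command can be assembled from its two inputs. This is only a sketch under assumptions: tunnel_command is a hypothetical helper, the port defaults to 8890 as used in this walkthrough, and the function only prints the command so you can inspect it before running it.

```shell
# Hypothetical helper: print the ssh tunnel command for a PEM file and master host.
# Arguments: 1 = path to PEM file, 2 = master node host name, 3 = port (default 8890).
tunnel_command() {
    pem_file=$1
    master_host=$2
    port=${3:-8890}
    echo "ssh -i $pem_file -L $port:$master_host:$port hadoop@$master_host -N -v"
}

# Placeholder host name, exactly as in the text above; substitute your own.
tunnel_command xxx.pem ec2-xx-xx-xx-xx.compute-1.amazonaws.com
```

Once the printed command looks right, run it directly or wrap the call in eval "$(tunnel_command ...)".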
Open a browser to http://localhost:8890/ - if it worked, you should see the Zeppelin notebook page.
When you are done, remember to terminate the cluster!
aws emr terminate-clusters --cluster-ids j-XXXXXXXXXXXXXXX
and confirm that it is terminating
aws emr describe-cluster --cluster-id j-XXXXXXXXXXXXXXX | grep \"State\"
You should see
"State": "TERMINATING"
"State": "TERMINATING"
"State": "TERMINATING"
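If you would rather not read the grep output by eye, the same check can be scripted. The sketch below is an illustration only: all_terminating is a hypothetical helper, and it assumes, based on the output shown above, that every "State" line should contain TERMINATING or TERMINATED once shutdown has begun.

```shell
# Hypothetical helper: read describe-cluster output on stdin and succeed only if
# there is at least one "State" line and every one reports TERMINATING/TERMINATED.
all_terminating() {
    states=$(grep '"State"')
    [ -n "$states" ] || return 1
    # grep -v finds any State line that is NOT terminating; negate its result.
    ! printf '%s\n' "$states" | grep -v 'TERMINAT' >/dev/null
}

# Sample input copied from the expected output above.
sample='"State": "TERMINATING"
"State": "TERMINATING"
"State": "TERMINATING"'

if printf '%s\n' "$sample" | all_terminating; then
    echo "cluster is shutting down"
else
    echo "cluster is NOT terminating - check the console"
fi
```

With a live cluster, pipe the describe-cluster command into all_terminating instead of the sample. The AWS CLI also appears to provide a waiter, aws emr wait cluster-terminated, which blocks until the cluster has actually terminated.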
If you are paranoid, log into the AWS Management Console and click on Services | EMR
and check the status of your cluster.