Jupyter notebooks support the execution of Linux commands inside notebook cells. This is done by adding ! to the beginning of the command line. It should be noted that each command beginning with ! creates a new bash shell, which is closed once the execution is done, so shell state such as the current working directory does not carry over from one command to the next:
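For example, a minimal cell illustrating this behavior (using only the standard pwd and cd commands):

!pwd        # prints the notebook's current working directory
!cd /tmp    # changes directory only inside its own short-lived shell
!pwd        # prints the original directory again; the change did not persist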
Download gutenberg-shakespeare.txt from the text directory of the course repository and upload it into the newly created intro-to-hadoop directory on HDFS:
$ wget https://raw.githubusercontent.com/linhbngo/Distributed-and-Cluster-Computing/master/text/gutenberg-shakespeare.txt
$ hdfs dfs -put gutenberg-shakespeare.txt intro-to-hadoop/
$ hdfs dfs -ls intro-to-hadoop
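The same steps can be executed directly from a notebook cell by prefixing each command with !, as described above. This is simply the workflow restated in notebook form and assumes the intro-to-hadoop HDFS directory has already been created:

!wget https://raw.githubusercontent.com/linhbngo/Distributed-and-Cluster-Computing/master/text/gutenberg-shakespeare.txt
!hdfs dfs -put gutenberg-shakespeare.txt intro-to-hadoop/
!hdfs dfs -ls intro-to-hadoop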
18/11/01 17:03:38 INFO client.RMProxy: Connecting to ResourceManager at clnode188.clemson.cloudlab.us/130.127.133.197:8050
18/11/01 17:03:39 INFO client.AHSProxy: Connecting to Application History server at clnode195.clemson.cloudlab.us/130.127.133.204:10200
18/11/01 17:03:39 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/lngo/.staging/job_1541104508981_0008
18/11/01 17:03:39 INFO input.FileInputFormat: Total input files to process : 1
18/11/01 17:03:39 INFO mapreduce.JobSubmitter: number of splits:1
18/11/01 17:03:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1541104508981_0008
18/11/01 17:03:39 INFO mapreduce.JobSubmitter: Executing with tokens: []
18/11/01 17:03:39 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.1.0-187/0/resource-types.xml
18/11/01 17:03:40 INFO impl.YarnClientImpl: Submitted application application_1541104508981_0008
18/11/01 17:03:40 INFO mapreduce.Job: The url to track the job: http://clnode188.clemson.cloudlab.us:8088/proxy/application_1541104508981_0008/
18/11/01 17:03:40 INFO mapreduce.Job: Running job: job_1541104508981_0008
18/11/01 17:03:44 INFO mapreduce.Job: Job job_1541104508981_0008 running in uber mode : false
18/11/01 17:03:44 INFO mapreduce.Job: map 0% reduce 0%
18/11/01 17:03:50 INFO mapreduce.Job: map 100% reduce 0%
18/11/01 17:03:54 INFO mapreduce.Job: map 100% reduce 100%
18/11/01 17:03:54 INFO mapreduce.Job: Job job_1541104508981_0008 completed successfully
18/11/01 17:03:54 INFO mapreduce.Job: Counters: 53
    File System Counters
        FILE: Number of bytes read=973082
        FILE: Number of bytes written=2409015
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=5447902
        HDFS: Number of bytes written=713504
        HDFS: Number of read operations=8
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=282360
        Total time spent by all reduces in occupied slots (ms)=266760
        Total time spent by all map tasks (ms)=3620
        Total time spent by all reduce tasks (ms)=1710
        Total vcore-milliseconds taken by all map tasks=3620
        Total vcore-milliseconds taken by all reduce tasks=1710
        Total megabyte-milliseconds taken by all map tasks=289136640
        Total megabyte-milliseconds taken by all reduce tasks=273162240
    Map-Reduce Framework
        Map input records=124213
        Map output records=899681
        Map output bytes=8529629
        Map output materialized bytes=973082
        Input split bytes=158
        Combine input records=899681
        Combine output records=67109
        Reduce input groups=67109
        Reduce shuffle bytes=973082
        Reduce input records=67109
        Reduce output records=67109
        Spilled Records=134218
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=705
        CPU time spent (ms)=28770
        Physical memory (bytes) snapshot=3245010944
        Virtual memory (bytes) snapshot=210530725888
        Total committed heap usage (bytes)=3762814976
        Peak Map Physical memory (bytes)=2774241280
        Peak Map Virtual memory (bytes)=70509170688
        Peak Reduce Physical memory (bytes)=470769664
        Peak Reduce Virtual memory (bytes)=140021555200
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=5447744
    File Output Format Counters
        Bytes Written=713504