1. Install Java

(a) Installation steps omitted

See here for the procedure; the version covered there is slightly older.

(b) Check the installed version

$ java -version

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
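
Although the installation itself is skipped, Hadoop and Hive both expect JAVA_HOME to be set. A minimal .bashrc sketch, assuming the Oracle JDK 8 location that the Derby sysinfo output in section 6 reports (/usr/lib/jvm/java-8-oracle); adjust the path to your own install:

$ vi .bashrc

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export PATH=$PATH:$JAVA_HOME/bin

$ source .bashrc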

2. Install Hadoop

(a) Installation steps omitted

(b) Check the installed version

$ hadoop version

Hadoop 2.8.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 91f2b7a13d1e97be65db92ddabc627cc29ac0009
Compiled by jdu on 2017-03-17T04:12Z
Compiled with protoc 2.5.0
From source with checksum 60125541c2b3e266cbf3becc5bda666
This command was run using /home/ubuntu/Download/hadoop-2.8.0/share/hadoop/common/hadoop-common-2.8.0.jar
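
The start scripts below rely on HADOOP_HOME being set. A minimal .bashrc sketch, assuming Hadoop is installed under /usr/local/hadoop (the path used in hive-env.sh later; adjust if yours lives elsewhere, for example the hadoop-2.8.0 directory shown in the output above):

$ vi .bashrc

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

$ source .bashrc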

(c) Start HDFS and the YARN cluster

$ $HADOOP_HOME/sbin/start-dfs.sh
$ $HADOOP_HOME/sbin/start-yarn.sh

Alternatively, $HADOOP_HOME/sbin/start-all.sh (deprecated in Hadoop 2.x) runs both of the above in one step; there is no need to run it in addition to the two scripts.
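
A quick way to confirm the daemons came up is jps; on a single-node setup you would expect NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager (plus Jps itself), though the exact list depends on your cluster layout:

$ jps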

3. Install Spark

(a) Download the Spark source code (omitted)

(b) Build

$ cd spark-2.1.1/
$ ./build/mvn -Phadoop-provided -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package

[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [ 13.663 s]
[INFO] Spark Project Tags ................................. SUCCESS [ 17.955 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  6.808 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 11.276 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  5.935 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [  9.713 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 14.214 s]
[INFO] Spark Project Core ................................. SUCCESS [02:28 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 11.149 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 13.799 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 32.115 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [01:20 min]
[INFO] Spark Project SQL .................................. SUCCESS [01:45 min]
[INFO] Spark Project ML Library ........................... SUCCESS [01:15 min]
[INFO] Spark Project Tools ................................ SUCCESS [  1.924 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:00 min]
[INFO] Spark Project REPL ................................. SUCCESS [  5.636 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [  8.612 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 15.541 s]
[INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 29.356 s]
[INFO] Spark Project Assembly ............................. SUCCESS [ 10.835 s]
[INFO] Spark Project External Flume Sink .................. SUCCESS [  7.775 s]
[INFO] Spark Project External Flume ....................... SUCCESS [  9.472 s]
[INFO] Spark Project External Flume Assembly .............. SUCCESS [  2.272 s]
[INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [ 23.306 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 18.441 s]
[INFO] Spark Project External Kafka Assembly .............. SUCCESS [  4.008 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 11.553 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  3.748 s]
[INFO] Kafka 0.10 Source for Structured Streaming ......... SUCCESS [  9.846 s]
[INFO] Spark Project Java 8 Tests ......................... SUCCESS [  5.179 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:59 min
[INFO] Finished at: 2017-06-13T06:11:22+00:00
[INFO] Final Memory: 96M/1352M
[INFO] ------------------------------------------------------------------------
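
Because the build uses -Phadoop-provided, the resulting Spark does not bundle the Hadoop jars and has to pick them up at runtime; likewise, Spark only sees the Hive metastore if hive-site.xml is on its configuration path. A sketch of the extra wiring, assuming the standard Spark conventions (conf/spark-env.sh) and the Hive configuration from section 4 below:

$ cd spark-2.1.1/
$ cp /usr/local/hive/conf/hive-site.xml conf/
$ vi conf/spark-env.sh

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_DIST_CLASSPATH=$(hadoop classpath)

$ ./bin/spark-shell --master yarn --deploy-mode client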

4. Install Hive

(a) Install Hive

$ cd ~
$ wget https://archive.apache.org/dist/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
$ tar zxvf apache-hive-1.2.1-bin.tar.gz
$ ls
apache-hive-1.2.1-bin  apache-hive-1.2.1-bin.tar.gz
$ sudo mv apache-hive-1.2.1-bin /usr/local/hive

Pay attention to the version: the Hive release has to match the Spark version, otherwise errors will occur. Spark 2.1.x builds against Hive 1.2.1, which is why 1.2.1 is used here.

(b) Configure the Hive environment

$ cd ~
$ vi .bashrc

export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/usr/local/Hadoop/lib/*:.
export CLASSPATH=$CLASSPATH:/usr/local/hive/lib/*:.

$ source .bashrc

(c) Configure Hive

$ cd $HIVE_HOME/conf
$ cp hive-env.sh.template hive-env.sh
$ vi hive-env.sh

export HADOOP_HOME=/usr/local/hadoop

$ cp hive-default.xml.template hive-site.xml
$ vi hive-site.xml

Point the metastore at the Derby network server port; an introduction is given here.

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.ClientDriver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>

Add the following properties at the beginning of the hive-site.xml configuration; the reason is explained here: the copied template refers to variables such as ${system:java.io.tmpdir} that are not defined at runtime, and Hive fails to start unless they are set explicitly.

<property>
    <name>system:java.io.tmpdir</name>
    <value>/tmp/hive/java</value>
</property>
<property>
    <name>system:user.name</name>
    <value>${user.name}</value>
</property>

$ vi jpox.properties

javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
org.jpox.autoCreateSchema=false
org.jpox.validateTables=false
org.jpox.validateColumns=false
org.jpox.validateConstraints=false
org.jpox.storeManagerType=rdbms
org.jpox.autoCreateSchema=true
org.jpox.autoStartMechanismMode=checked
org.jpox.transactionIsolation=read_committed
javax.jdo.option.DetachAllOnCommit=true
javax.jdo.option.NontransactionalRead=true
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL=jdbc:derby://localhost:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName=APP
javax.jdo.option.ConnectionPassword=mine

5. Install Derby

(a) Install Derby

$ cd ~
$ wget http://archive.apache.org/dist/db/derby/db-derby-10.10.2.0/db-derby-10.10.2.0-bin.tar.gz
(keep the Derby version consistent with the one Hive ships; Hive 1.2.1 bundles Derby 10.10.2.0)
$ tar zxvf db-derby-10.10.2.0-bin.tar.gz
$ ls
db-derby-10.10.2.0-bin db-derby-10.10.2.0-bin.tar.gz
$ sudo mv db-derby-10.10.2.0-bin /usr/local/derby

(b) Configure the Derby environment

$ cd ~
$ vi .bashrc
export DERBY_HOME=/usr/local/derby
export PATH=$PATH:$DERBY_HOME/bin
export CLASSPATH=$CLASSPATH:$DERBY_HOME/lib/derby.jar:$DERBY_HOME/lib/derbytools.jar
$ source .bashrc
$ cp $DERBY_HOME/lib/derbyclient.jar $HIVE_HOME/lib/
$ cp $DERBY_HOME/lib/derbytools.jar $HIVE_HOME/lib/
$ mkdir $DERBY_HOME/data
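
The data directory only matters if the network server actually puts its databases there: Derby resolves relative database names against the directory the server is started from (its derby.system.home). So one option, which the steps below do not spell out, is to launch the server from that directory:

$ cd $DERBY_HOME/data
$ startNetworkServer &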

6. Verify Hive

(a) Verify Derby

$ java org.apache.derby.tools.sysinfo

------------------ Java Information ------------------
Java Version:    1.8.0_131
Java Vendor:     Oracle Corporation
Java home:       /usr/lib/jvm/java-8-oracle/jre
Java classpath:  :/usr/local/Hadoop/lib/*:.:/usr/local/hive/lib/calcite-core-1.2.0-incubating.jar::/usr/local/derby/lib/derbytools.jar
OS name:         Linux
OS architecture: amd64
OS version:      4.4.0-1018-aws
Java user name:  ubuntu
Java user home:  /home/ubuntu
Java user dir:   /home/ubuntu
java.specification.name: Java Platform API Specification
java.specification.version: 1.8
java.runtime.version: 1.8.0_131-b11
--------- Derby Information --------
[/usr/local/hive/lib/derby-10.10.2.0.jar] 10.10.2.0 - (1582446)
[/usr/local/hive/lib/derbytools.jar] 10.10.2.0 - (1582446)
[/usr/local/hive/lib/derbyclient.jar] 10.10.2.0 - (1582446)
[/usr/local/hive/lib/hive-jdbc-1.2.1-standalone.jar] 10.10.2.0 - (1582446)
[/usr/local/derby/lib/derby.jar] 10.10.2.0 - (1582446)
[/usr/local/derby/lib/derbytools.jar] 10.10.2.0 - (1582446)
------------------------------------------------------
----------------- Locale Information -----------------
Current Locale :  [English/United States [en_US]]
Found support for locale: [cs]
     version: 10.10.2.0 - (1582446)
Found support for locale: [de_DE]
     version: 10.10.2.0 - (1582446)
Found support for locale: [es]
     version: 10.10.2.0 - (1582446)
Found support for locale: [fr]
     version: 10.10.2.0 - (1582446)
Found support for locale: [hu]
     version: 10.10.2.0 - (1582446)
Found support for locale: [it]
     version: 10.10.2.0 - (1582446)
Found support for locale: [ja_JP]
     version: 10.10.2.0 - (1582446)
Found support for locale: [ko_KR]
     version: 10.10.2.0 - (1582446)
Found support for locale: [pl]
     version: 10.10.2.0 - (1582446)
Found support for locale: [pt_BR]
     version: 10.10.2.0 - (1582446)
Found support for locale: [ru]
     version: 10.10.2.0 - (1582446)
Found support for locale: [zh_CN]
     version: 10.10.2.0 - (1582446)
Found support for locale: [zh_TW]
     version: 10.10.2.0 - (1582446)
------------------------------------------------------
------------------------------------------------------

(b) Start HDFS and the YARN cluster (skip if the daemons are still running from step 2(c))

$ $HADOOP_HOME/sbin/start-dfs.sh
$ $HADOOP_HOME/sbin/start-yarn.sh
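
Hive also expects its scratch and warehouse directories to exist in HDFS with group write permission. If they have not been created yet, the usual preparation looks like this (the paths are the Hive defaults; adjust if you changed hive.metastore.warehouse.dir):

$ hdfs dfs -mkdir -p /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse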

(c) Start the Derby network server

$ startNetworkServer
Tue Jun 13 06:51:35 UTC 2017 : Security manager installed using the Basic server security policy.
Tue Jun 13 06:51:35 UTC 2017 : Apache Derby Network Server - 10.10.2.0 - (1582446) started and ready to accept connections on port 1527
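
Optionally, connectivity to the network server can be verified with Derby's ij tool before involving Hive; the connection URL is the same one configured in hive-site.xml:

$ ij
ij> connect 'jdbc:derby://localhost:1527/metastore_db;create=true';
ij> exit;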

(d) Start Hive

$ hive
Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive>
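
From here a short smoke test confirms that the metastore connection works end to end (test_tbl is just a throwaway example name):

hive> show databases;
hive> create table test_tbl (id int, name string);
hive> show tables;
hive> drop table test_tbl;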