Sunday, May 24, 2015

[Study] Apache Spark 1.3.1 + Hadoop 2.7.0 Installation


2015-05-23

Official website
https://spark.apache.org/

Required packages
CentOS
JDK >=6
Maven >= 3.0.4
Hadoop
Spark
Hive
MySQL-Server

Installation reference
https://spark.apache.org/docs/latest/building-spark.html

Installation

# To keep things simple and avoid surprises, disable SELinux (Security-Enhanced Linux) and the firewall (iptables/firewalld)

# Disable SELinux immediately
setenforce 0

# Keep SELinux disabled after reboot
# (equivalent to editing /etc/selinux/config and changing SELINUX=... to SELINUX=disabled)
sed -i -e "s@SELINUX=enforcing@#SELINUX=enforcing@"   /etc/selinux/config
sed -i -e "s@SELINUX=permissive@#SELINUX=permissive@"   /etc/selinux/config
sed -i -e "/SELINUX=/aSELINUX=disabled"   /etc/selinux/config
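The three sed edits above can be rehearsed on a throwaway copy before touching the real /etc/selinux/config. A minimal sketch (the sample file content is assumed; the third sed uses GNU sed's one-line `a` append syntax):

```shell
# Rehearse the edits on a temporary file instead of /etc/selinux/config
cfg=$(mktemp)
echo 'SELINUX=enforcing' > "$cfg"

# Same edits as above: comment out the old setting, then append SELINUX=disabled
sed -i -e "s@SELINUX=enforcing@#SELINUX=enforcing@" "$cfg"
sed -i -e "s@SELINUX=permissive@#SELINUX=permissive@" "$cfg"
sed -i -e "/SELINUX=/aSELINUX=disabled" "$cfg"

cat "$cfg"
# #SELINUX=enforcing
# SELINUX=disabled
```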

# Stop the firewall immediately
# (CentOS 6: service iptables stop; service ip6tables stop)
systemctl stop firewalld

# Keep the firewall disabled after reboot
# (CentOS 6: chkconfig iptables off; chkconfig ip6tables off)
systemctl disable firewalld
cd /usr/local


# Set up the Java environment
#find / -name java
echo 'export JAVA_HOME=/usr/lib/jvm/java' >> /etc/profile
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile
echo 'export CLASSPATH=$JAVA_HOME/lib/ext:$JAVA_HOME/lib/tools.jar' >> /etc/profile
source /etc/profile


# Install Apache Hadoop
cd /usr/local
wget http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz
tar zxvf hadoop-2.7.0.tar.gz
echo 'export HADOOP_HOME=/usr/local/hadoop-2.7.0' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/bin' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/sbin' >> /etc/profile
echo 'export HADOOP_PREFIX=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_COMMON_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_MAPRED_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop' >> /etc/profile
echo 'export HADOOP_HDFS_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_YARN_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export YARN_CONF_DIR=$HADOOP_CONF_DIR' >> /etc/profile
source /etc/profile
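One detail worth noting above: the single quotes keep $HADOOP_HOME from expanding at echo time, so it lands in /etc/profile literally and only expands when the file is sourced, which is why HADOOP_HOME must be exported before the lines that reference it. A small sketch against a temporary file:

```shell
# Append the same style of exports to a temporary file instead of /etc/profile
profile=$(mktemp)
echo 'export HADOOP_HOME=/usr/local/hadoop-2.7.0' >> "$profile"
echo 'export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop' >> "$profile"

# The file stores $HADOOP_HOME literally; it expands only now, at source time
. "$profile"
echo "$HADOOP_CONF_DIR"
# /usr/local/hadoop-2.7.0/etc/hadoop
```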

yum -y install java java-devel maven mariadb-server mysql

# Install Apache Hive (currently the latest release, 1.2.0)
cd /usr/local
wget http://apache.stu.edu.tw/hive/stable/apache-hive-1.2.0-bin.tar.gz
tar xzvf apache-hive-1.2.0-bin.tar.gz
export HIVE_HOME=/usr/local/apache-hive-1.2.0-bin
export PATH=$HIVE_HOME/bin:$PATH

# Install Apache Spark
# The official docs do not guarantee that Hadoop >= 2.5 works; worth a try. The mvn build takes about 1 hour on a Core i5 with an SSD
# https://spark.apache.org/docs/latest/building-spark.html
# build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

# wget http://www.apache.org/dyn/closer.cgi/spark/spark-1.3.1/spark-1.3.1.tgz
# (this download from the Apache site is broken; do not use it)

cd /usr/local
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.3.1.tgz
tar zxvf spark-1.3.1.tgz
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
cd spark-1.3.1
mvn -Pyarn -Dyarn.version=2.7.0 -Phadoop-2.7 -Dhadoop.version=2.7.0 -Phive -Phive-1.2.0 -Phive-thriftserver -DskipTests clean package

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM .......................... SUCCESS [9:35.831s]
[INFO] Spark Project Networking .......................... SUCCESS [4:32.864s]
[INFO] Spark Project Shuffle Streaming Service ........... SUCCESS [10.769s]
[INFO] Spark Project Core ................................ SUCCESS [10:06.961s]
[INFO] Spark Project Bagel ............................... SUCCESS [36.339s]
[INFO] Spark Project GraphX .............................. SUCCESS [1:51.768s]
[INFO] Spark Project Streaming ........................... SUCCESS [2:02.945s]
[INFO] Spark Project Catalyst ............................ SUCCESS [2:23.229s]
[INFO] Spark Project SQL ................................. SUCCESS [3:07.234s]
[INFO] Spark Project ML Library .......................... SUCCESS [6:28.304s]
[INFO] Spark Project Tools ............................... SUCCESS [19.707s]
[INFO] Spark Project Hive ................................ SUCCESS [4:16.616s]
[INFO] Spark Project REPL ................................ SUCCESS [1:10.996s]
[INFO] Spark Project YARN ................................ SUCCESS [1:33.756s]
[INFO] Spark Project Hive Thrift Server .................. SUCCESS [1:25.043s]
[INFO] Spark Project Assembly ............................ SUCCESS [1:53.869s]
[INFO] Spark Project External Twitter .................... SUCCESS [36.507s]
[INFO] Spark Project External Flume Sink ................. SUCCESS [1:12.932s]
[INFO] Spark Project External Flume ...................... SUCCESS [46.367s]
[INFO] Spark Project External MQTT ....................... SUCCESS [1:31.214s]
[INFO] Spark Project External ZeroMQ ..................... SUCCESS [28.890s]
[INFO] Spark Project External Kafka ...................... SUCCESS [1:07.236s]
[INFO] Spark Project Examples ............................ SUCCESS [4:02.523s]
[INFO] Spark Project YARN Shuffle Service ................ SUCCESS [11.288s]
[INFO] Spark Project External Kafka Assembly ............. SUCCESS [25.212s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:02:01.402s
[INFO] Finished at: Sat May 23 21:47:26 CST 2015
[INFO] Final Memory: 84M/727M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop-2.7" could not be activated because it does not exist.
[WARNING] The requested profile "hive-1.2.0" could not be activated because it does not exist.
[root@localhost spark-1.3.1]#

The last two WARNING lines say the hadoop-2.7 and hive-1.2.0 profiles could not be found; Spark 1.3.1 does not define them.
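You can check which profiles a Spark release actually defines before building by listing the &lt;id&gt; elements in its top-level pom.xml. A sketch against a stand-in file (the real pom.xml in the unpacked spark-1.3.1 directory defines many more ids):

```shell
# A tiny stand-in for Spark's top-level pom.xml
pom=$(mktemp)
cat > "$pom" <<'EOF'
<profiles>
  <profile><id>yarn</id></profile>
  <profile><id>hadoop-2.4</id></profile>
  <profile><id>hive</id></profile>
</profiles>
EOF

# Pull out the profile ids; against the real file this shows that
# hadoop-2.4 exists in Spark 1.3.1 but hadoop-2.7 does not
profiles=$(grep -o '<id>[^<]*</id>' "$pom" | sed -e 's@<[^>]*>@@g')
echo "$profiles"
# yarn
# hadoop-2.4
# hive
```

Alternatively, `mvn help:all-profiles` run in the source directory should list the same information from Maven itself.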

***************************************************************************

Run this instead:
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 56:59.050s
[INFO] Finished at: Sun May 24 10:06:46 CST 2015
[INFO] Final Memory: 72M/540M
[INFO] ------------------------------------------------------------------------
[root@localhost spark-1.3.1]#

It appears to have succeeded.
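A quick way to confirm the build actually produced something usable is to look for the assembly jar. A hedged sketch (the assembly/target/scala-*/ path pattern matches Spark 1.x layouts from memory; demonstrated here against a fake directory tree):

```shell
# Succeeds if a spark-assembly jar exists under the given Spark build tree
check_spark_assembly() {
  ls "$1"/assembly/target/scala-*/spark-assembly-*.jar >/dev/null 2>&1
}

# Demonstrate against a fake tree; the real call would be
#   check_spark_assembly /usr/local/spark-1.3.1
fake=$(mktemp -d)
mkdir -p "$fake/assembly/target/scala-2.10"
touch "$fake/assembly/target/scala-2.10/spark-assembly-1.3.1-hadoop2.4.0.jar"
check_spark_assembly "$fake" && echo "assembly jar found"
# assembly jar found
```

After that, running ./bin/spark-shell from the Spark directory is the usual smoke test.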

(End)

Related articles

[Study] Apache Maven 3.0.5 (yum) Installation (CentOS 7.1 x86_64)
http://shaurong.blogspot.com/2015/05/apache-maven-305-yum-centos-71-x6486.html

[Study] Hadoop 2.7.0 Single Cluster Installation (CentOS 7.1 x86_64)
http://shaurong.blogspot.com/2015/05/hadoop-270-single-cluster-centos-71.html

Integrating Spark 1.3.1 with Hive for query analysis
http://wrox.cn/article/1036050/
