Thursday, November 7, 2013

[Research] Compiling Hadoop 2.2.0 (CentOS 6.4 x64)

2013-11-06

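Hadoop is a framework for building cloud systems. It is modeled on the Google File System, is written in Java, and provides HDFS and the MapReduce API.

Official website
http://hadoop.apache.org/

Build reference
http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0/BUILDING.txt

Why go to the trouble of compiling it yourself? That is explained further down.

Requirements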

* Unix System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
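
As a rough pre-flight check against the list above, the sketch below reports which of the tools are on PATH and shows one way to compare version strings. `version_ge` is a hypothetical helper name (not part of Hadoop or Maven), and it assumes GNU `sort -V` is available, as it is in CentOS 6's coreutils.

```shell
# version_ge A B: succeeds when version string A is >= version string B.
# Assumption: GNU sort with -V (version sort) is installed.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report which of the required build tools are on PATH.
for tool in java mvn cmake protoc; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# Example: Maven 3.1.1 against the "3.0 or later" requirement.
if version_ge "3.1.1" "3.0"; then
    echo "Maven 3.1.1 satisfies the 3.0+ requirement"
fi
```

The version strings in the example are typed in by hand; on a real machine they would come from `mvn -version`, `cmake --version`, and so on.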

# Install the base packages
[root@localhost ~]# su root

[root@localhost ~]# yum -y install gcc  gcc-c++  svn  cmake git zlib zlib-devel openssl openssl-devel rsync

[root@localhost ~]# yum -y groupinstall 'Development Tools'
[root@localhost ~]# yum -y install cmake zlib-devel openssl openssl-devel rsync

# Install the JDK (Java Development Kit 7 Update 45); Maven requires it
# Download jdk-7u45-linux-x64.rpm manually from:
# http://www.oracle.com/technetwork/java/javase/downloads/index-jsp-138363.html#javasejdk

[root@localhost ~]# rpm -ivh  jdk-7u45-linux-x64.rpm

# The default is currently OpenJDK; I switch the default to Oracle JDK

[root@localhost ~]# java -version
java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (rhel-2.3.4.1.el6_3-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)

[root@localhost ~]# alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_45/jre/bin/java 100

[root@localhost ~]# alternatives --config java

There are 3 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java
   2           /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java
   3           /usr/java/jdk1.7.0_45/jre/bin/java

Enter to keep the current selection[+], or type selection number: 3

[root@localhost ~]# export JAVA_HOME=/usr/java/jdk1.7.0_45

[root@localhost ~]# java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
[root@localhost ~]#
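
The `export JAVA_HOME=...` above only lasts for the current shell session. As a sketch, JAVA_HOME can also be derived from whichever binary `alternatives` currently points at. `java_home_from_path` is a hypothetical helper, and it assumes the layouts seen above (`.../jre/bin/java` or `.../bin/java`); for a JRE-only install the result is the JRE directory, not a JDK.

```shell
# java_home_from_path JAVA_BIN: strip the trailing /bin/java, and an optional
# /jre before it, from the path of a java binary.
java_home_from_path() {
    home="${1%/bin/java}"
    echo "${home%/jre}"
}

# With the Oracle JDK path registered above:
java_home_from_path /usr/java/jdk1.7.0_45/jre/bin/java
# prints /usr/java/jdk1.7.0_45

# On a live system, resolve the alternatives symlink chain first
# (assumes GNU readlink with -f).
if command -v java >/dev/null 2>&1; then
    JAVA_HOME="$(java_home_from_path "$(readlink -f "$(command -v java)")")"
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```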

# Install Apache Maven 3.1.1
cd  /usr/local/src
wget http://ftp.tc.edu.tw/pub/Apache/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.tar.gz
tar zxvf apache-maven-3.1.1-bin.tar.gz -C /usr/local
ln  -s  /usr/local/apache-maven-3.1.1/bin/mvn  /usr/bin/mvn

# Install FindBugs
cd  /usr/local/src
wget -O findbugs-2.0.2.tar.gz "http://prdownloads.sourceforge.net/findbugs/findbugs-2.0.2.tar.gz?download"
tar zxvf findbugs-2.0.2.tar.gz -C /usr/local/
ln -s /usr/local/findbugs-2.0.2/bin/findbugs  /usr/bin/findbugs

# Install Protoc 2.5.0
cd  /usr/local/src
wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
tar zxvf protobuf-2.5.0.tar.gz -C /usr/local/src
cd /usr/local/src/protobuf-2.5.0
./configure
make
make install
ln -s /usr/local/bin/protoc /usr/bin/protoc
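
Since the requirements list names ProtocolBuffer 2.5.0 specifically, it may be worth confirming what actually ended up on PATH before starting the long Maven build. A minimal sketch follows; `protoc_version_ok` is a hypothetical helper name, and it relies on `protoc --version` printing a line of the form `libprotoc X.Y.Z`.

```shell
# protoc_version_ok LINE: succeeds only when LINE (the output of
# `protoc --version`) reports exactly version 2.5.0.
protoc_version_ok() {
    [ "$1" = "libprotoc 2.5.0" ]
}

if command -v protoc >/dev/null 2>&1; then
    if protoc_version_ok "$(protoc --version 2>/dev/null)"; then
        echo "protoc 2.5.0 found"
    else
        echo "unexpected protoc version: $(protoc --version 2>&1)"
    fi
else
    echo "protoc is not on PATH"
fi
```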

# Compile Hadoop 2.2.0

cd  /usr/local/src
wget http://ftp.mirror.tw/pub/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz
tar zxvf hadoop-2.2.0-src.tar.gz -C /usr/local/src
cd  /usr/local/src/hadoop-2.2.0-src/

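[root@localhost hadoop-2.2.0-src]# mvn package -Pdist,native -DskipTests -Dtar
...(output omitted; how long mvn package takes depends on network bandwidth and machine speed. I have seen anything from ten-odd minutes to more than half an hour)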
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 31:08.714s
[INFO] Finished at: Wed Nov 06 16:32:30 CST 2013
[INFO] Final Memory: 104M/367M
[INFO] ------------------------------------------------------------------------
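
Because the first build fetches every dependency over the network, a dropped or truncated download can fail the build partway through. One option is simply to rerun the command; the sketch below wraps that in a small retry loop. `retry` is a hypothetical helper, not a Maven feature, and the one-second pause is an arbitrary choice.

```shell
# retry MAX CMD...: run CMD until it succeeds, at most MAX times.
retry() {
    max="$1"
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep 1
    done
}

# Intended use, from /usr/local/src/hadoop-2.2.0-src (left commented out here):
# retry 3 mvn package -Pdist,native -DskipTests -Dtar
```

If the same artifact keeps failing, deleting its directory under the local Maven repository (by default `~/.m2/repository`) before retrying forces a fresh download.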


*****************************************************************************

What is the point of compiling Hadoop yourself?
The hadoop-2.2.0.tar.gz offered for download on the official site is already compiled.

Here is the problem:

[root@localhost src]# cd /usr/local/src
[root@localhost src]# wget http://ftp.mirror.tw/pub/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
[root@localhost src]# tar zxvf hadoop-2.2.0.tar.gz -C /usr/local
[root@localhost src]# cd /usr/local/hadoop-2.2.0/lib/native
[root@localhost native]# file *
libhadoop.a:        current ar archive
libhadooppipes.a:   current ar archive
libhadoop.so:       symbolic link to `libhadoop.so.1.0.0'
libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
libhadooputils.a:   current ar archive
libhdfs.a:          current ar archive
libhdfs.so:         symbolic link to `libhdfs.so.0.0.0'
libhdfs.so.0.0.0:   ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped

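Some of the libraries shipped in hadoop-2.2.0.tar.gz are 32-bit. A 64-bit JVM cannot load a 32-bit shared object, so running them on an x86_64 operating system produces: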

Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

That said, the message is only at the "WARN" level, not ERROR, FAIL, or FAILURE.
Whether it really causes no serious problems, I do not know....

Now look at the libraries built on an x86_64 operating system:

[root@localhost native]# cd /usr/local/src/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native
[root@hadoop1 native]# file *
libhadoop.a:        current ar archive
libhadooppipes.a:   current ar archive
libhadoop.so:       symbolic link to `libhadoop.so.1.0.0'
libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
libhadooputils.a:   current ar archive
libhdfs.a:          current ar archive
libhdfs.so:         symbolic link to `libhdfs.so.0.0.0'
libhdfs.so.0.0.0:   ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
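
The 32-bit versus 64-bit distinction that `file` reports comes from the EI_CLASS byte of the ELF header (offset 4: 1 means 32-bit, 2 means 64-bit). On a machine without `file`, the same check can be sketched with `od`. `elf_class` is a hypothetical helper name, and the directory in the loop is the install path used in this post.

```shell
# elf_class FILE: print 32 or 64 according to the EI_CLASS byte of an ELF
# file, or "unknown" when FILE does not start with the ELF magic number.
elf_class() {
    if [ "$(od -An -c -N4 "$1" | tr -d ' \n')" != "177ELF" ]; then
        echo unknown
        return
    fi
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Check the shared objects of an installed tree, if it exists.
dir=/usr/local/hadoop-2.2.0/lib/native
for f in "$dir"/*.so.*; do
    if [ -f "$f" ]; then
        echo "$f: $(elf_class "$f")-bit"
    fi
done
```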

If you run an x86_64 operating system, consider backing up the x86 files and copying the x86_64 versions over them:

[root@localhost src]# cd /usr/local/hadoop-2.2.0/lib/native
[root@localhost native]# cp libhadoop.so.1.0.0  libhadoop.so.1.0.0.x86
[root@localhost native]# cp libhdfs.so.0.0.0  libhdfs.so.0.0.0.x86

[root@localhost native]# cd /usr/local/src/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native

[root@localhost native]# cp libhadoop.so.1.0.0  /usr/local/hadoop-2.2.0/lib/native
cp: overwrite `/usr/local/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0'? y

[root@localhost native]# cp -f  libhdfs.so.0.0.0    /usr/local/hadoop-2.2.0/lib/native
cp: overwrite `/usr/local/hadoop-2.2.0/lib/native/libhdfs.so.0.0.0'? y
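
The backup-then-overwrite steps above can be sketched as one function, which is handy when the same swap has to be repeated on several nodes. `replace_native_lib` is a hypothetical helper; it keeps the `.x86` suffix used above and never overwrites an existing backup, so running it twice does not lose the original 32-bit file. In a script `cp` is not aliased to `cp -i`, so no overwrite prompt appears.

```shell
# replace_native_lib SRC DEST_DIR: save the existing copy in DEST_DIR as
# NAME.x86 (only when no backup exists yet), then copy SRC over it.
replace_native_lib() {
    src="$1"
    dest="$2/$(basename "$1")"
    if [ -f "$dest" ] && [ ! -e "$dest.x86" ]; then
        cp -p "$dest" "$dest.x86"
    fi
    cp -f "$src" "$dest"
}

# Paths as used in this post; skipped when either directory is absent.
built=/usr/local/src/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native
target=/usr/local/hadoop-2.2.0/lib/native
if [ -d "$built" ] && [ -d "$target" ]; then
    replace_native_lib "$built/libhadoop.so.1.0.0" "$target"
    replace_native_lib "$built/libhdfs.so.0.0.0" "$target"
fi
```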

To save everyone time, I packed the whole x86_64 native directory into hadoop-2.2.0-native_x86_64.tar.gz and uploaded it to free hosting that requires no registration:

http://www.sendspace.com/file/5rm0j1

https://folders.io/get/98YgDc

************************

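Later, based on
[Research] Which files in hadoop-2.2.0.tar.gz are 32-bit only
http://shaurong.blogspot.tw/2013/11/hadoop-220targz-32-bit.html

I also made hadoop-2.2.0-x86-x86_64.tar.gz.
It is primarily 64-bit but keeps the four x86 files, with .x86 appended to their names for identification, so they can be copied back over when needed.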

(End)


Related

[Research] Compiling Hadoop 2.2.0 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-centos-64-x64.html

[Research] Hadoop 2.2.0 Single Cluster installation (2) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64_7.html

[Research] Hadoop 2.2.0 Single Cluster installation (1) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64.html

[Research] Hadoop 1.2.1 (rpm) installation (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/10/hadoop-121-rpm-centos-64-x64.html

[Research] Hadoop 1.2.1 (bin) installation (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/07/hadoop-112-centos-64-x64.html

[Research] Hadoop 1.2.1 installation (CentOS 6.4 x64)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=80035

[Research] Cloud software Hadoop 1.0.0 installation (CentOS 6.2 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=21166

[Research] Cloud software Hadoop 0.20.2 installation (CentOS 5.5 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=18513

[Research] Cloud software Hadoop 0.20.2 installation (CentOS 5.4 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=17974

12 comments:

  1. Thanks so much. I have been looking for the 64bit native libs for days. It really really helps.

    1. Thanks for the feedback :) This post is more convenient:

      [Research] Quick build script for hadoop-2.2.0.tar.gz (CentOS 6.4 x64)
      http://shaurong.blogspot.tw/2013/11/hadoop-220targz-centos-64-x64.html

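    2. Thank you for your guidance. Because I had already installed Oracle JDK, the first run reported a wrong JAVA_HOME directory setting. The other issue is that the current OpenJDK version is java-1.7.0-openjdk-1.7.0.51.x86_64 (note: CentOS is now at 6.5). After running Hadoop-2.2.0_Compile.sh the following error appeared (same as last time). I still need your help, thanks!

      The current environment settings are as follows: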
      [root@master01 jvm]# pwd
      /usr/lib/jvm
      [root@master01 jvm]# ls -l
      total 4
      lrwxrwxrwx. 1 root root 26 Feb 6 18:52 java -> /etc/alternatives/java_sdk
      lrwxrwxrwx. 1 root root 32 Feb 6 18:52 java-1.7.0 -> /etc/alternatives/java_sdk_1.7.0
      drwxr-xr-x. 7 root root 4096 Feb 6 18:52 java-1.7.0-openjdk-1.7.0.51.x86_64
      lrwxrwxrwx. 1 root root 34 Feb 6 18:52 java-1.7.0-openjdk.x86_64 -> java-1.7.0-openjdk-1.7.0.51.x86_64
      lrwxrwxrwx. 1 root root 34 Feb 6 18:52 java-openjdk -> /etc/alternatives/java_sdk_openjdk
      lrwxrwxrwx. 1 root root 21 Feb 6 18:55 jre -> /etc/alternatives/jre
      lrwxrwxrwx. 1 root root 27 Feb 6 18:52 jre-1.7.0 -> /etc/alternatives/jre_1.7.0
      lrwxrwxrwx. 1 root root 38 Feb 6 18:52 jre-1.7.0-openjdk.x86_64 -> java-1.7.0-openjdk-1.7.0.51.x86_64/jre
      lrwxrwxrwx. 1 root root 29 Feb 6 18:52 jre-openjdk -> /etc/alternatives/jre_openjdk
      [root@master01 jvm]#

      [root@master01 jvm]# java -version
      java version "1.7.0_51"
      OpenJDK Runtime Environment (rhel-2.4.4.1.el6_5-x86_64 u51-b02)
      OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)


      alternatives --install /usr/bin/java java /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64/jre/bin/java 100

      [root@master01 bin]# alternatives --config java

      There are 3 programs which provide 'java'.

        Selection    Command
      -----------------------------------------------
         1           /usr/java/jdk1.7.0_45/bin/java
      *  2           /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java
       + 3           /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64/jre/bin/java

      Enter to keep the current selection[+], or type selection number: 3

      [root@master01 bin]# export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64
      [root@master01 bin]# java -version
      java version "1.7.0_51"
      OpenJDK Runtime Environment (rhel-2.4.4.1.el6_5-x86_64 u51-b02)
      OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
      [root@master01 bin]#
      [root@master01 bin]# echo $JAVA_HOME
      /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64

      I changed the following line in Hadoop-2.2.0_Compile.sh:
      export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64

      But running it still produces the following error (same as last time). I still need your advice, thanks!
      GET request of: org/beanshell/bsh/2.0b4/bsh-2.0b4.jar from central failed: Premature end of Content-Length delimited message body (expected: 281694; received: 280504 -> [Help 1]

    3. Because of the comment length limit I am posting in two parts; please bear with me.
      The output is as follows:
      [INFO] Apache Hadoop Distribution ........................ SKIPPED
      [INFO] Apache Hadoop Client .............................. SKIPPED
      [INFO] Apache Hadoop Mini-Cluster ........................ SKIPPED
      [INFO] ------------------------------------------------------------------------
      [INFO] BUILD FAILURE
      [INFO] ------------------------------------------------------------------------
      [INFO] Total time: 6.011s
      [INFO] Finished at: Thu Feb 06 19:32:10 PST 2014
      [INFO] Final Memory: 23M/56M
      [INFO] ------------------------------------------------------------------------
      [ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce (default) on project hadoop-main: Execution default of goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce failed: Plugin org.apache.maven.plugins:maven-enforcer-plugin:1.3.1 or one of its dependencies could not be resolved: Could not transfer artifact org.beanshell:bsh:jar:2.0b4 from/to central (http://repo.maven.apache.org/maven2): GET request of: org/beanshell/bsh/2.0b4/bsh-2.0b4.jar from central failed: Premature end of Content-Length delimited message body (expected: 281694; received: 280504 -> [Help 1]
      [ERROR]
      [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
      [ERROR] Re-run Maven using the -X switch to enable full debug logging.
      [ERROR]
      [ERROR] For more information about the errors and possible solutions, please read the following articles:
      [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
      [root@master01 tmp]#

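    4. Strange, it failed when I tested just now as well. From what I found, it is said to be a bug in hadoop-2.2.0-src.tar.gz (I have my doubts?)

      According to this issue, the build works as long as the jetty jars in the Maven cache are the older version:
      https://issues.apache.org/jira/browse/HADOOP-10110

      Workaround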
      wget https://issues.apache.org/jira/secure/attachment/12614482/HADOOP-10110.patch
      patch -p0 /usr/local/src/hadoop-2.2.0-src/pom.xml < HADOOP-10110.patch
      mvn clean
      mvn package -Pdist,native -DskipTests -Dtar

      But in my own test the build still fails. My guess is that it is caused by changes in the files that mvn package -Pdist,native -DskipTests -Dtar downloads. Still investigating...

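    5. If you are in a hurry, you can download the official release
      http://apache.cdpa.nsysu.edu.tw/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

      together with the archive I provide,
      hadoop-2.2.0-native_x86_64.tar.gz
      http://www.sendspace.com/file/5rm0j1

      and overwrite the libraries as described in this post: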
      http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64.html

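    6. Thank you for your help, I am very grateful!
      I personally think the problem is the bsh-2.0b4.jar file: the length transferred differs from the length originally declared. That is how I read the message: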

      GET request of: org/beanshell/bsh/2.0b4/bsh-2.0b4.jar from central failed: Premature end of Content-Length delimited message body (expected: 281694; received: 280504 -> [Help 1]

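      I will first try installing it the way you suggested, and I hope I can finish this lab.
      Once again, thank you.
      This is not my area of expertise; what a headache~~~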

    7. The test results are in:

      [Research] Quick build-and-install script for hadoop-2.2.0-src.tar.gz (2) (CentOS 6.5 x86_64)
      http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664_8080.html

      [Research] Quick build-and-install script for hadoop-2.2.0-src.tar.gz (CentOS 6.5 x86_64)
      http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664_7.html

      [Research] Studying the hadoop-2.2.0-src.tar.gz build (CentOS 6.5 x86_64)
      http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664.html

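    8. Following your post
      [Research] Quick build-and-install script for hadoop-2.2.0-src.tar.gz (2) (CentOS 6.5 x86_64)
      http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664_8080.html
      it now runs normally. I will read it carefully tomorrow and finish the remaining steps.
      Thank you very much!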

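    9. My first Hadoop experience is finally complete. Only the "MapReduce web management interface" still has a problem; everything else looks normal. Thank you.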

  2. I followed your command (mvn package -Pdist,native -DskipTests -Dtar)
    and ran into a problem. I am not familiar with Linux, so any pointers would be appreciated. Thanks.
    [INFO] Apache Hadoop Distribution ........................ SKIPPED
    [INFO] Apache Hadoop Client .............................. SKIPPED
    [INFO] Apache Hadoop Mini-Cluster ........................ SKIPPED
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD FAILURE
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 8.474s
    [INFO] Finished at: Wed Jan 29 11:56:28 CST 2014
    [INFO] Final Memory: 21M/72M
    [INFO] ------------------------------------------------------------------------
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce (default) on project hadoop-main: Execution default of goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce failed: Plugin org.apache.maven.plugins:maven-enforcer-plugin:1.3.1 or one of its dependencies could not be resolved: Could not transfer artifact org.beanshell:bsh:jar:2.0b4 from/to central (http://repo.maven.apache.org/maven2): GET request of: org/beanshell/bsh/2.0b4/bsh-2.0b4.jar from central failed: Premature end of Content-Length delimited message body (expected: 281694; received: 275009 -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
    [root@name1 hadoop-2.2.0-src]#

    1. Try this post:

      [Research] Quick build script for hadoop-2.2.0.tar.gz (CentOS 6.4 x64)
      http://shaurong.blogspot.tw/2013/11/hadoop-220targz-centos-64-x64.html
