Wednesday, March 25, 2015

[Study] Hadoop 2.6.0 Single Cluster Installation (CentOS 7.0 x86_64)


2015-03-25

This post is both a learning exercise and a share; it may be incomplete, and not 100% correct.

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.
Hadoop was inspired by the Google File System, is developed in Java, and provides HDFS and the MapReduce API.

Official website
http://hadoop.apache.org/

Installation references
http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html

http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/ClusterSetup.html

Download
http://www.apache.org/dyn/closer.cgi/hadoop/common/

Installation

# To keep things simple and avoid surprises, disable SELinux (Security-Enhanced Linux) and the firewall

# Disable SELinux immediately
setenforce 0 

# To disable SELinux automatically after reboot:
#vi  /etc/selinux/config
# find
#SELINUX=
# and set it to
#SELINUX=disabled

sed -i -e "s@SELINUX=enforcing@#SELINUX=enforcing@"   /etc/selinux/config
sed -i -e "s@SELINUX=permissive@#SELINUX=permissive@"   /etc/selinux/config
sed -i -e "/SELINUX=/aSELINUX=disabled"   /etc/selinux/config


# Stop the firewall immediately
#service iptables stop  
#service ip6tables stop 
systemctl   stop  firewalld 

# Disable the firewall automatically after reboot
#chkconfig iptables off  
#chkconfig ip6tables off  
systemctl   disable  firewalld 
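To verify the firewall is stopped now and stays off after a reboot (again, just a check):

firewall-cmd --state              # should print: not running
systemctl is-enabled firewalld    # should print: disabled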

yum -y install  java
# or
#yum -y install java-1.7.0-openjdk

yum -y install  java-1.7.0-openjdk-devel
#find / -name java
#echo 'export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.65-2.5.1.2.el7_0.x86_64' >> /etc/profile
echo 'export JAVA_HOME=/usr/lib/jvm/java' >> /etc/profile
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile

echo 'export CLASSPATH=$JAVA_HOME/lib/ext:$JAVA_HOME/lib/tools.jar' >> /etc/profile
source /etc/profile
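A quick check that Java and JAVA_HOME are usable before going further (assumes the OpenJDK packages above installed cleanly):

java -version
echo $JAVA_HOME                  # should print /usr/lib/jvm/java
$JAVA_HOME/bin/java -version     # should print the same version as above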

#---------------------------
#wget http://apache.cdpa.nsysu.edu.tw/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
cd /usr/local
wget http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar zxvf hadoop-2.6.0.tar.gz

echo 'export HADOOP_HOME=/usr/local/hadoop-2.6.0' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/bin' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/sbin' >> /etc/profile
echo 'export HADOOP_PREFIX=$HADOOP_HOME' >> /etc/profile

echo 'export HADOOP_COMMON_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_MAPRED_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop' >> /etc/profile
echo 'export HADOOP_HDFS_HOME=$HADOOP_HOME' >> /etc/profile
echo 'export HADOOP_YARN_HOME=$HADOOP_HOME' >> /etc/profile

echo 'export YARN_CONF_DIR=$HADOOP_CONF_DIR' >> /etc/profile
source /etc/profile
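The hadoop command should now be on the PATH; a one-line check:

hadoop version    # should report Hadoop 2.6.0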

Environment

192.168.128.10  (CentOS 7.0 x64)
Hostname: localhost and localhost.localdomain

******************************************************************************

Standalone Operation mode test
(run the following from the Hadoop root directory)

[root@localhost local]# cd hadoop-2.6.0/
[root@localhost hadoop-2.6.0]# mkdir input
[root@localhost hadoop-2.6.0]# cp etc/hadoop/*.xml input
[root@localhost hadoop-2.6.0]# $HADOOP_HOME/bin/hadoop jar  /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar  grep input output 'dfs[a-z.]+'
[root@localhost hadoop-2.6.0]#  cat output/*
1       dfsadmin
[root@localhost hadoop-2.6.0]#
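Note: if you rerun the example, the grep job will refuse to start while the local output directory still exists (standard MapReduce output checking), so remove it first:

rm -rf output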

******************************************************************************

Pseudo-Distributed Operation mode test

Configure the runtime environment

# Set JAVA_HOME in hadoop-env.sh, httpfs-env.sh, mapred-env.sh, and yarn-env.sh
# If you have already run export JAVA_HOME and added it to .bashrc, you can skip this part; what follows is an alternative method, for reference only

#1. Configure hadoop-env.sh
[root@localhost ~]# vi /usr/local/hadoop-2.6.0/etc/hadoop/hadoop-env.sh

Find
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}

and add a line below it:
export JAVA_HOME=/usr/lib/jvm/java

#2. Configure httpfs-env.sh
[root@localhost ~]# vi /usr/local/hadoop-2.6.0/etc/hadoop/httpfs-env.sh
Add this line anywhere in the file:
export JAVA_HOME=/usr/lib/jvm/java

#3. Configure mapred-env.sh
[root@localhost ~]# vi /usr/local/hadoop-2.6.0/etc/hadoop/mapred-env.sh
Find
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
and add a line below it:
export JAVA_HOME=/usr/lib/jvm/java

#4. Configure yarn-env.sh
[root@localhost ~]# vi /usr/local/hadoop-2.6.0/etc/hadoop/yarn-env.sh
Find
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
and add a line below it:
export JAVA_HOME=/usr/lib/jvm/java

PS: export JAVA_HOME=${JAVA_HOME} has no effect here, most likely because the daemons are launched over ssh and do not inherit the login environment from /etc/profile, so the variable expands to an empty string; JAVA_HOME therefore has to be set explicitly in these files.

The same edits can be scripted from a Bash shell:
sed -i -e "/JAVA_HOME=/aexport JAVA_HOME=\/usr\/lib\/jvm\/java"   $HADOOP_HOME/etc/hadoop/hadoop-env.sh
cat  $HADOOP_HOME/etc/hadoop/hadoop-env.sh  | grep "JAVA_HOME"


echo "export JAVA_HOME=/usr/lib/jvm/java" >>  $HADOOP_HOME/etc/hadoop/httpfs-env.sh
cat   $HADOOP_HOME/etc/hadoop/httpfs-env.sh  | grep "JAVA_HOME"

sed -i -e "/JAVA_HOME=/aexport JAVA_HOME=\/usr\/lib\/jvm\/java"   $HADOOP_HOME/etc/hadoop/mapred-env.sh
cat  $HADOOP_HOME/etc/hadoop/mapred-env.sh  | grep "JAVA_HOME"


sed -i -e "/JAVA_HOME=/aexport JAVA_HOME=\/usr\/lib\/jvm\/java"   $HADOOP_HOME/etc/hadoop/yarn-env.sh
cat  $HADOOP_HOME/etc/hadoop/yarn-env.sh  | grep "JAVA_HOME"
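The three sed edits above can also be collapsed into one loop (same effect, just shorter):

for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  sed -i -e "/JAVA_HOME=/aexport JAVA_HOME=\/usr\/lib\/jvm\/java"   $HADOOP_HOME/etc/hadoop/$f
  grep "JAVA_HOME" $HADOOP_HOME/etc/hadoop/$f
done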


Set up passwordless ssh login


[root@localhost ~]# cd   $HADOOP_HOME

[root@localhost hadoop-2.6.0]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
95:29:12:34:44:74:d1:90:1c:b8:5e:0d:ac:b1:3f:c6 root@localhost.localdomain
The key's randomart image is:
+--[ DSA 1024]----+
|     =B+=*       |
|      o+= .o     |
|      .=.o+      |
|      +..o.      |
|     . +S        |
|      . E        |
|       . .       |
|                 |
|                 |
+-----------------+
[root@localhost hadoop-2.6.0]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[root@localhost hadoop-2.6.0]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is e2:05:04:2a:35:33:d0:e6:cf:df:2f:5b:8d:fc:43:d7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Last login: Wed Mar 25 03:41:33 2015 from 192.168.128.1
[root@localhost ~]# exit
logout
Connection to localhost closed.
[root@localhost hadoop-2.6.0]#
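If ssh localhost still prompts for a password after this, the usual cause is permissions on the key files; sshd expects:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys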


Edit the configuration files

[root@localhost ~]# cd  $HADOOP_HOME
[root@localhost hadoop-2.6.0]# vi etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

[root@localhost hadoop-2.6.0]# vi etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
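After saving both files, Hadoop should report the new filesystem URI (hdfs getconf ships with the 2.6.0 distribution):

bin/hdfs getconf -confKey fs.defaultFS    # should print hdfs://localhost:9000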

Run the test

1. Format the filesystem
[root@localhost hadoop-2.6.0]#  bin/hdfs namenode -format

2. Start the NameNode daemon and the DataNode daemon:

The Hadoop daemon logs are written to $HADOOP_LOG_DIR (default: $HADOOP_HOME/logs)

[root@localhost hadoop-2.6.0]# sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-datanode-localhost.localdomain.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is e2:05:04:2a:35:33:d0:e6:cf:df:2f:5b:8d:fc:43:d7.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-localhost.localdomain.out
[root@localhost hadoop-2.6.0]#


3. Browse the NameNode web interface
http://localhost:50070/

4. Create the HDFS directories required to run MapReduce jobs
Syntax:
  $ bin/hdfs dfs -mkdir /user
  $ bin/hdfs dfs -mkdir /user/<username>

where username can be found with the whoami command:
[root@localhost hadoop-2.6.0]# whoami
root

Run:
cd  $HADOOP_HOME
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/root
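Alternatively, -mkdir -p creates both levels in one command, and -ls verifies the result:

bin/hdfs dfs -mkdir -p /user/$(whoami)
bin/hdfs dfs -ls /user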


5. Copy the input files into the distributed filesystem
bin/hdfs dfs -put etc/hadoop input

6. Run the example programs
bin/hadoop  jar  /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'

$HADOOP_HOME/bin/hadoop jar /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar   randomwriter   out

7. Check the output
  $ bin/hdfs dfs -get output output
  $ cat output/*

[root@localhost hadoop-2.6.0]# cat output/*
cat: output/output: Is a directory
1       dfsadmin


  $ bin/hdfs dfs -cat output/*

8. Stop Hadoop
  $ sbin/stop-dfs.sh

The actual session:

[root@localhost hadoop-2.6.0]# sbin/stop-dfs.sh
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
[root@localhost hadoop-2.6.0]#

# *******************************************************************************

# Edit the xml configuration files (only a few of them are changed here)

# capacity-scheduler.xml  hadoop-policy.xml  httpfs-site.xml  yarn-site.xml  core-site.xml  hdfs-site.xml mapred-site.xml
# References
# http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html

Note: the host values in the .xml files must be the name reported by the hostname -f command; an IP address cannot be used
[root@localhost hadoop-2.6.0]# hostname
localhost.localdomain

[root@localhost hadoop-2.6.0]# hostname  -f  
localhost

If you want to change the hostname (for example, to hadoop01 and hadoop01.hadoopcluster):

Set the hostname (takes effect immediately)
[root@localhost hadoop-2.6.0]# hostname  hadoop01.hadoopcluster

Verify:
[root@localhost hadoop-2.6.0]# hostname
hadoop01.hadoopcluster

Edit /etc/sysconfig/network to set the hostname used after a reboot
[root@localhost hadoop-2.6.0]# vi   /etc/sysconfig/network
Find
HOSTNAME=localhost.localdomain
and change it to
HOSTNAME=hadoop01.hadoopcluster
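Note: on CentOS 7 the HOSTNAME line in /etc/sysconfig/network is no longer honored at boot; hostnamectl is the supported way to make the change persistent (a CentOS 7 correction to the step above):

hostnamectl set-hostname hadoop01.hadoopcluster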

Edit the hosts file
[root@localhost hadoop-2.6.0]# vi   /etc/hosts
and add:
192.168.128.101    hadoop01   hadoop01.hadoopcluster

Since this post covers only a single-machine installation, I kept localhost and localhost.localdomain and did not change them.

******************************

YARN on Single Node mode

1. Edit the configuration

# Configure mapred-site.xml

cp  $HADOOP_HOME/etc/hadoop/mapred-site.xml.template  $HADOOP_HOME/etc/hadoop/mapred-site.xml

vi  $HADOOP_HOME/etc/hadoop/mapred-site.xml


Between
<configuration>
</configuration>
add:

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>


  <property>
    <name>mapreduce.cluster.temp.dir</name>
    <value></value>
    <description>No description</description>
    <final>true</final>
  </property>

  <property>
    <name>mapreduce.cluster.local.dir</name>
    <value></value>
    <description>No description</description>
    <final>true</final>
  </property>


# Configure yarn-site.xml

vi  $HADOOP_HOME/etc/hadoop/yarn-site.xml


Between
<configuration>
</configuration>
add:
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
(Note: adjust host:port to your environment, using the name reported by the hostname -f command)
(These ports can be changed to others if you prefer)



  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:9001</value>
    <description>host is the hostname of the resource manager and 
    port is the port on which the NodeManagers contact the Resource Manager.
    </description>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:9002</value>
    <description>host is the hostname of the resourcemanager and port is the port
    on which the Applications in the cluster talk to the Resource Manager.
    </description>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    <description>In case you do not want to use the default scheduler</description>
  </property>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:9003</value>
    <description>the host is the hostname of the ResourceManager and the port is the port on
    which the clients can talk to the Resource Manager. </description>
  </property>

  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value></value>
    <description>the local directories used by the nodemanager</description>
  </property>

  <property>
    <name>yarn.nodemanager.address</name>
    <value>localhost:9004</value>
    <description>the nodemanagers bind to this port</description>
  </property>  

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>10240</value>
    <description>the amount of memory on the NodeManager, in MB</description>
  </property>

  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
    <description>directory on hdfs where the application logs are moved to </description>
  </property>

   <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value></value>
    <description>the directories used by Nodemanagers as log directories</description>
  </property>


# Configure capacity-scheduler.xml

vi   $HADOOP_HOME/etc/hadoop/capacity-scheduler.xml

Find the root.queues properties

Change (search for the keyword default):
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default</value>
    <description>
      The queues at the this level (root is the root queue).
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>100</value>
    <description>Default queue target capacity.</description>
  </property>

to:

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>unfunded,default</value>
  </property>
  
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>100</value>
  </property>
  
  <property>
    <name>yarn.scheduler.capacity.root.unfunded.capacity</name>
    <value>50</value>
  </property>
  
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>50</value>
  </property>
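If the ResourceManager is already running when the queue definitions change, the new queues can be loaded without restarting it (a standard YARN admin command):

yarn rmadmin -refreshQueues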


# **************************************************************************

2. Start

[root@localhost hadoop-2.6.0]# sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-root-resourcemanager-localhost.localdomain.out
localhost: starting nodemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-root-nodemanager-localhost.localdomain.out
[root@localhost hadoop-2.6.0]#



3. Browse http://localhost:8088/

4. Stop

[root@localhost hadoop-2.6.0]# sbin/stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
no proxyserver to stop
[root@localhost hadoop-2.6.0]#



------------------------------

If you need to start the ResourceManager and the NodeManager individually
(this part is not on the 2.6.0 documentation page, so it is not surprising that testing ran into problems)

[root@localhost hadoop]# cd $HADOOP_MAPRED_HOME

# Start the ResourceManager

[root@localhost hadoop-2.6.0]# sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-root-resourcemanager-localhost.localdomain.out
[root@localhost hadoop-2.6.0]#


# Always verify with ps aux | grep resourcemanager: a failed start does not necessarily print an error message, but if the verification finds no process, the daemon did not start successfully

[root@localhost hadoop-2.6.0]#  ps aux | grep resourcemanager
root      18084 23.7  5.2 1843804 98672 pts/2   Sl   04:17   0:03 /usr/lib/jvm/java/bin/java -Dproc_resourcemanager -Xmx1000m -Dhadoop.log.dir=/usr/local/hadoop-2.6.0/logs -Dyarn.log.dir=/usr/local/hadoop-2.6.0/logs -Dhadoop.log.file=yarn-root-resourcemanager-localhost.localdomain.log -Dyarn.log.file=yarn-root-resourcemanager-localhost.localdomain.log -Dyarn.home.dir= -Dyarn.id.str=root -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop-2.6.0/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/usr/local/hadoop-2.6.0/logs -Dyarn.log.dir=/usr/local/hadoop-2.6.0/logs -Dhadoop.log.file=yarn-root-resourcemanager-localhost.localdomain.log -Dyarn.log.file=yarn-root-resourcemanager-localhost.localdomain.log -Dyarn.home.dir=/usr/local/hadoop-2.6.0 -Dhadoop.home.dir=/usr/local/hadoop-2.6.0 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop-2.6.0/lib/native -classpath /usr/local/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/common/*:/usr/local/hadoop-2.6.0/share/hadoop/hdfs:/usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/hdfs/*:/usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/yarn/*:/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/*:/usr/local/hadoop-2.6.0/contrib/capacity-scheduler/*.jar:/usr/local/hadoop-2.6.0/contrib/capacity-scheduler/*.jar:/usr/local/hadoop-2.6.0/share/hadoop/yarn/*:/usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.6.0/etc/hadoop/rm-config/log4j.properties org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
root      18306  0.0  0.0 112640   980 pts/2    R+   04:17   0:00 grep --color=auto resourcemanager
[root@localhost hadoop-2.6.0]#


# Start the NodeManager

[root@localhost hadoop-2.6.0]# sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-root-nodemanager-localhost.localdomain.out
[root@localhost hadoop-2.6.0]#



Always verify with ps aux | grep nodemanager: a failed start does not necessarily print an error message, but if the verification finds no process, the daemon did not start successfully

[root@localhost hadoop-2.6.0]# ps aux | grep nodemanager
root      18336 17.6  5.0 1702136 94836 pts/2   Sl   04:18   0:03 /usr/lib/jvm/java/bin/java -Dproc_nodemanager -Xmx1000m -Dhadoop.log.dir=/usr/local/hadoop-2.6.0/logs -Dyarn.log.dir=/usr/local/hadoop-2.6.0/logs -Dhadoop.log.file=yarn-root nodemanager-localhost.localdomain.log -Dyarn.log.file=yarn-root-nodemanager-localhost.localdomain.log -Dyarn.home.dir= -Dyarn.id.str=root -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop-2.6.0/lib/native -Dyarn.policy.file=hadoop-policy.xml -server -Dhadoop.log.dir=/usr/local/hadoop-2.6.0/logs -Dyarn.log.dir=/usr/local/hadoop-2.6.0/logs -Dhadoop.log.file=yarn-root-nodemanager-localhost.localdomain.log -Dyarn.log.file=yarn-root-nodemanager-localhost.localdomain.log -Dyarn.home.dir=/usr/local/hadoop-2.6.0 -Dhadoop.home.dir=/usr/local/hadoop-2.6.0 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop-2.6.0/lib/native -classpath /usr/local/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/common/*:/usr/local/hadoop-2.6.0/share/hadoop/hdfs:/usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/hdfs/*:/usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/yarn/*:/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/*:/usr/local/hadoop-2.6.0/contrib/capacity-scheduler/*.jar:/usr/local/hadoop-2.6.0/contrib/capacity-scheduler/*.jar:/usr/local/hadoop-2.6.0/share/hadoop/yarn/*:/usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.6.0/etc/hadoop/nm-config/log4j.properties org.apache.hadoop.yarn.server.nodemanager.NodeManager
root      18454  0.0  0.0 112640   984 pts/2    R+   04:18   0:00 grep --color=auto nodemanager
[root@localhost hadoop-2.6.0]#
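The JDK's jps tool (installed earlier as part of java-1.7.0-openjdk-devel) gives a terser view of the same checks; each running daemon should appear by class name:

jps    # look for ResourceManager and NodeManager in the output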


# ***************

# Test the example program


# Change to the working directory

[root@localhost hadoop-2.6.0]# cd $HADOOP_COMMON_HOME

[root@localhost hadoop-2.6.0]# $HADOOP_HOME/bin/hadoop   jar   /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar   randomwriter   out


15/03/25 04:20:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Running 10 maps.
Job started: Wed Mar 25 04:20:08 EDT 2015
15/03/25 04:20:08 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy13.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1988)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:562)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
        at org.apache.hadoop.examples.RandomWriter.run(RandomWriter.java:283)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.RandomWriter.main(RandomWriter.java:294)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 43 more
[root@localhost hadoop-2.6.0]#


The run appears to have a problem; further investigation needed.
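As a reader explains in the comments at the end of this post, the ConnectException just means nothing is listening on localhost:9000, i.e. HDFS was not running; starting it first (and removing any leftover out directory) should let the job run:

sbin/start-dfs.sh
bin/hdfs dfs -rm -r out    # only needed if an earlier attempt left the directory behind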

# **************

# Stopping (use the stop argument)

# Stop the NodeManager

[root@localhost hadoop-2.6.0]# sbin/yarn-daemon.sh stop nodemanager
no nodemanager to stop

# The message above means the NodeManager never started successfully, so there was no NodeManager to stop
# If the NodeManager was running, stop prints the following instead:

[root@localhost hadoop-2.6.0]# sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager

# Stop the ResourceManager

[root@localhost hadoop-2.6.0]# sbin/yarn-daemon.sh stop resourcemanager
no resourcemanager to stop

# The message above means the ResourceManager never started successfully, so there was no ResourceManager to stop
# If the ResourceManager was running, stop prints the following instead:

[root@localhost hadoop-2.6.0]# sbin/yarn-daemon.sh stop resourcemanager
stopping resourcemanager

End of the individual ResourceManager and NodeManager start/stop test

# **************************************************************************

# Addendum: about the firewall (CentOS 7.x)

After testing succeeds, if you would rather not leave the firewall disabled, you can add a few rules instead

#Start the firewall immediately
systemctl start firewalld

#To open certain ports permanently, run:
firewall-cmd   --permanent   --add-port=22/tcp
firewall-cmd   --permanent   --add-port=8080/tcp
firewall-cmd   --permanent   --add-port=50030/tcp
firewall-cmd   --permanent   --add-port=50070/tcp
firewall-cmd   --permanent   --add-port=9000/tcp
firewall-cmd   --permanent   --add-port=9001/tcp
firewall-cmd   --permanent   --add-port=9002/tcp
firewall-cmd   --permanent   --add-port=9003/tcp
firewall-cmd   --permanent   --add-port=9004/tcp
systemctl restart firewalld
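The nine commands above can also be written as a loop, and firewall-cmd --reload applies the permanent rules without restarting the service:

for p in 22 8080 50030 50070 9000 9001 9002 9003 9004; do
  firewall-cmd --permanent --add-port=$p/tcp
done
firewall-cmd --reload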



Other common firewall-cmd commands

# firewall-cmd --state
# firewall-cmd --list-all
# firewall-cmd --list-interfaces
# firewall-cmd --get-service
# firewall-cmd --query-service service_name
# firewall-cmd --add-port=8080/tcp


To open the http port temporarily, run:
firewall-cmd --add-service=http

To open the http port permanently, run:
firewall-cmd --permanent --add-service=http
# systemctl restart firewalld

#Set httpd to start with the operating system (a general firewall note; httpd itself is not needed for Hadoop)
systemctl enable  httpd

To stop the firewall again:
systemctl stop firewalld

# **************************************************************************

# Addendum: about the firewall (CentOS 6.x)

After testing succeeds, if you would rather not leave the firewall disabled, you can add a few rules instead.
First start iptables and ip6tables, then save the rules
(service iptables save writes them to /etc/sysconfig/iptables, and service ip6tables save to /etc/sysconfig/ip6tables)

[root@localhost ~]# service iptables start

[root@localhost ~]# service ip6tables start

[root@localhost ~]# service iptables save

[root@localhost ~]# service ip6tables save

Edit the iptables firewall rules

[root@localhost ~]# vi /etc/sysconfig/iptables
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

Change it to (add ports according to your own configuration):
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 50070 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 50030 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9000 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9001 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9002 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9003 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9004 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

[root@localhost ~]# vi /etc/sysconfig/ip6tables
Modify the ip6tables rules in the same way

Restart iptables (this reloads all of the rules)

[root@localhost ~]# service iptables restart
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
iptables: Applying firewall rules:                         [  OK  ]

[root@localhost ~]# service ip6tables restart
ip6tables: Flushing firewall rules:                        [  OK  ]
ip6tables: Setting chains to policy ACCEPT: filter         [  OK  ]
ip6tables: Unloading modules:                              [  OK  ]
ip6tables: Applying firewall rules:                        [  OK  ]
[root@localhost ~]#

(End)

Related

[Study] Hadoop 2.6.0 Installation (CentOS 7.0 x86_64)
http://shaurong.blogspot.com/2015/03/hadoop-260-single-cluster-centos-70.html

[Study] Hadoop 2.5.1 Installation (CentOS 7.0 x86_64)
http://shaurong.blogspot.tw/2014/09/hadoop-251-centos-70-x8664.html

[Study] Hadoop 2.5.0 Installation (CentOS 7.0 x86_64)
http://shaurong.blogspot.com/2014/08/hadoop-250-centos-70-x8664.html
http://forum.icst.org.tw/phpbb/viewtopic.php?f=26&t=81014
http://download.ithome.com.tw/article/index/id/2721

[Study] Hadoop 2.4.1 Installation (CentOS 7.0 x86_64)
http://shaurong.blogspot.com/2014/08/hadoop-241-centos-70-x8664.html

[Study] hadoop-2.4.1-src.tar.gz quick build-and-install script (CentOS 7.0 x86_64)
http://shaurong.blogspot.com/2014/08/hadoop-241-srctargz-centos-70-x8664.html
http://download.ithome.com.tw/article/index/id/2375

[Study] hadoop-2.2.0-src.tar.gz quick build-and-install script (2) (CentOS 6.5 x86_64)
http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664_8080.html

[Study] hadoop-2.2.0-src.tar.gz quick build-and-install script (1) (CentOS 6.5 x86_64)
http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664_7.html

[Study] hadoop-2.2.0-src.tar.gz build study (CentOS 6.5 x86_64)
http://shaurong.blogspot.com/2014/02/hadoop-220-srctargz-centos-65-x8664.html

[Study] Hadoop 2.2.0 build (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-centos-64-x64.html

[Study] Hadoop 2.2.0 Single Cluster Installation (2) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64_7.html

[Study] Hadoop 2.2.0 Single Cluster Installation (1) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64.html

[Study] Hadoop 1.2.1 (rpm) Installation (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/10/hadoop-121-rpm-centos-64-x64.html

[Study] Hadoop 1.2.1 (bin) Installation (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/07/hadoop-112-centos-64-x64.html

[Study] Hadoop 1.2.1 Installation (CentOS 6.4 x64)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=80035

[Study] Cloud software Hadoop 1.0.0 Installation (CentOS 6.2 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=21166

[Study] Cloud software Hadoop 0.20.2 Installation (CentOS 5.5 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=18513

[Study] Cloud software Hadoop 0.20.2 Installation (CentOS 5.4 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=17974

2 comments:

  1. Thanks for sharing. Following your article, I tested an installation of 2.6.1 with no problems. The error you ran into,

    java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to localhost:9000 failed on connection

    occurs because port 9000 cannot be reached: the service was not started. Running sbin/start-dfs.sh fixes it.
    If the test then complains that the out directory already exists, just delete the out directory and the test will succeed.
