
mmicky's Blog

Chasing a fleeting moment of wonder

 
 
 


 
 

Setting Up a Hadoop 2.2.0 Test Environment

2013-07-20 13:15:11 | Category: hadoop2

    The stable release of Hadoop 2.2.0 came out a few days ago, so I downloaded it right away and set up a test environment.
1: Planning
Hadoop 2.2.0 on CentOS 6.4, with Java 7 update 21
192.168.100.171 hadoop1 (namenode)
192.168.100.172 hadoop2 (reserved as a future namenode)
192.168.100.173 hadoop3 (datanode)
192.168.100.174 hadoop4 (datanode)
192.168.100.175 hadoop5 (datanode)

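2: Create the template virtual machine (either VMware or VirtualBox works)
a: Install a CentOS 6.4 virtual machine named hadoop1, enable the ssh service, and disable the iptables service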
[root@hadoop1 ~]# chkconfig sshd on
[root@hadoop1 ~]# chkconfig iptables off
[root@hadoop1 ~]# chkconfig ip6tables off
[root@hadoop1 ~]# chkconfig postfix off

b: Edit /etc/sysconfig/selinux
SELINUX=disabled
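
The edit in step b can also be scripted. The following is a minimal sketch, not part of the original setup: the function name disable_selinux_in is made up for illustration, and it takes the file path as an argument so it can be tried on a copy first. It writes the result back with cat rather than replacing the file, so a symlinked config path keeps pointing at the same file. The new setting is read at boot, so it only takes effect after a restart.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file.
# Try it on a copy first, then pass /etc/sysconfig/selinux as root.
disable_selinux_in() {
    conf_file="$1"
    tmp_file="$(mktemp)" || return 1
    # Rewrite only the SELINUX= line; comments and SELINUXTYPE= are untouched.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf_file" > "$tmp_file" &&
        cat "$tmp_file" > "$conf_file"
    rc=$?
    rm -f "$tmp_file"
    return "$rc"
}
```

Usage, once it behaves as expected on a copy: disable_selinux_in /etc/sysconfig/selinux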

c: Edit the ssh configuration file /etc/ssh/sshd_config and uncomment these lines:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
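
As a sketch of how step c could be automated, the helper below strips the leading # from exactly these three directives and leaves every other line alone. The function name is hypothetical, and it assumes the directives appear in the file as commented-out lines, as they do in a stock sshd_config.

```shell
# Hypothetical helper: uncomment the three public-key directives in an
# sshd_config-style file. Try it on a copy before touching /etc/ssh/sshd_config.
enable_pubkey_directives_in() {
    conf_file="$1"
    tmp_file="$(mktemp)" || return 1
    sed -e 's/^#[[:space:]]*\(RSAAuthentication[[:space:]]\)/\1/' \
        -e 's/^#[[:space:]]*\(PubkeyAuthentication[[:space:]]\)/\1/' \
        -e 's/^#[[:space:]]*\(AuthorizedKeysFile[[:space:]]\)/\1/' \
        "$conf_file" > "$tmp_file" &&
        cat "$tmp_file" > "$conf_file"
    rc=$?
    rm -f "$tmp_file"
    return "$rc"
}
```

After changing the real file, sshd has to be restarted (service sshd restart on CentOS 6) for the directives to take effect.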

d: Edit /etc/hosts and add:
192.168.100.171   hadoop1
192.168.100.172   hadoop2
192.168.100.173   hadoop3
192.168.100.174   hadoop4
192.168.100.175   hadoop5
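
Because the addresses follow a simple pattern (192.168.100.171 to .175 for hadoop1 to hadoop5), the entries can be generated instead of typed. This is only a sketch; both function names are invented here, and the second one appends an entry only if that exact line is not already in the file, so it can be run more than once.

```shell
# Print the five hosts entries from the plan: 192.168.100.171-175 -> hadoop1-5.
print_cluster_hosts() {
    i=1
    while [ "$i" -le 5 ]; do
        printf '192.168.100.%d   hadoop%d\n' "$((170 + i))" "$i"
        i=$((i + 1))
    done
}

# Append only the entries that are not already present in the given file.
add_cluster_hosts_to() {
    hosts_file="$1"
    print_cluster_hosts | while IFS= read -r entry; do
        grep -qxF "$entry" "$hosts_file" || printf '%s\n' "$entry" >> "$hosts_file"
    done
}
```

Usage as root: add_cluster_hosts_to /etc/hosts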

e: Install Java, then append the following to the end of the environment file /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_21
export JRE_HOME=/usr/java/jdk1.7.0_21/jre
export HADOOP_PREFIX=/app/hadoop/hadoop220
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin:$PATH
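
After logging in again (or sourcing /etc/profile), it is worth confirming that the variables actually reached the shell. A small sketch of such a check follows; check_env_vars is a name chosen for this example, and it simply reports any of the given variable names that are unset or empty.

```shell
# Report which of the named environment variables are unset or empty.
# Returns non-zero if any are missing.
check_env_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_env_vars JAVA_HOME JRE_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME HADOOP_MAPRED_HOME YARN_HOME. Running java -version afterwards should report the JDK installed in this step.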

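f: Add a hadoop group and a hadoop user, and set the hadoop user's password. Then extract the installation archive to /app/hadoop/hadoop220, give ownership of the whole /app/hadoop directory to hadoop:hadoop, and create a mydata directory under /app/hadoop/hadoop220 to hold the data.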
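
Step f is described in words only, so here is one possible command sequence for it. Treat it as a sketch under assumptions: the archive path in TARBALL is a guess at where the download was saved, the archive is assumed to contain a single top-level directory (which --strip-components=1 drops), and the run wrapper is a made-up helper that only prints the commands unless DRY_RUN is set to 0.

```shell
# With DRY_RUN=1 (the default here) commands are only printed;
# run as root with DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# TARBALL is an assumed download location; adjust it to the real file.
setup_hadoop_account_and_dirs() {
    tarball="${TARBALL:-/tmp/hadoop-2.2.0.tar.gz}"
    run groupadd hadoop
    run useradd -g hadoop hadoop
    run passwd hadoop
    run mkdir -p /app/hadoop/hadoop220
    # Drop the archive's top-level directory so files land directly in hadoop220.
    run tar -xzf "$tarball" -C /app/hadoop/hadoop220 --strip-components=1
    run mkdir -p /app/hadoop/hadoop220/mydata
    # Ownership is set last so that everything created above is covered.
    run chown -R hadoop:hadoop /app/hadoop
}
```

Calling setup_hadoop_account_and_dirs in dry-run mode prints the seven commands for review before anything is changed.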

g: Edit the Hadoop configuration files:
[hadoop@hadoop1 hadoop220]$ cd etc/hadoop
[hadoop@hadoop1 hadoop]$ vi core-site.xml
******************************************************************************
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.100.171:8000/</value>
</property>

<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
******************************************************************************

[hadoop@hadoop1 hadoop]$ vi hdfs-site.xml
******************************************************************************
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/app/hadoop/hadoop220/mydata/name</value>
<description>If several comma-separated paths are listed, each holds a redundant copy.</description>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/app/hadoop/hadoop220/mydata/data</value>
</property>

<property>
<name>dfs.blocksize</name>
<value>67108864</value>
</property>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
******************************************************************************
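
The size values in these files are plain byte counts, which are easy to mistype. As a quick arithmetic check, the two helpers below (names invented for this example) convert from MiB and KiB: dfs.blocksize is 64 MiB and io.file.buffer.size in core-site.xml is 128 KiB.

```shell
# Convert mebibytes / kibibytes to the byte counts used in the config files.
mib_to_bytes() { echo "$(( $1 * 1024 * 1024 ))"; }
kib_to_bytes() { echo "$(( $1 * 1024 ))"; }

mib_to_bytes 64    # dfs.blocksize        -> 67108864
kib_to_bytes 128   # io.file.buffer.size  -> 131072
```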

[hadoop@hadoop1 hadoop]$ vi yarn-site.xml
******************************************************************************
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>192.168.100.171:8080</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.100.171:8081</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.100.171:8082</value>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>When configuring the ShuffleHandler service on the NodeManager, the property value must be "mapreduce_shuffle", not the earlier "mapreduce.shuffle".</description>
</property>
</configuration>
******************************************************************************

[hadoop@hadoop1 hadoop]$ vi mapred-site.xml
******************************************************************************
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.job.tracker</name>
<value>hdfs://192.168.100.171:8001</value>
<final>true</final>
</property>

<property>
<name>mapreduce.map.memory.mb</name>
<value>1536</value>
</property>

<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024M</value>
</property>

<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3072</value>
</property>

<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2560M</value>
</property>

<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>

<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>

<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>50</value>
</property>

<property>
<name>mapred.system.dir</name>
<value>file:/app/hadoop/hadoop220/mydata/sysmapred</value>
<final>true</final>
</property>

<property>
<name>mapred.local.dir</name>
<value>file:/app/hadoop/hadoop220/mydata/localmapred</value>
<final>true</final>
</property>
</configuration>
******************************************************************************
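
In the memory settings above, each JVM heap (-Xmx) is smaller than the container it runs in (mapreduce.map.memory.mb and mapreduce.reduce.memory.mb), which leaves room for the JVM's non-heap memory. The sketch below checks that relationship for a pair of values; the function names are made up, and only the M suffix is parsed because that is all this configuration uses.

```shell
# Extract the heap size in MB from a JVM option such as -Xmx1024M.
# Only the M/m suffix is handled, which is all this sketch needs.
xmx_to_mb() {
    printf '%s\n' "$1" | sed -n 's/^-Xmx\([0-9][0-9]*\)[Mm]$/\1/p'
}

# Print the headroom in MB and succeed only if the heap fits in the container.
heap_headroom_mb() {
    container_mb="$1"
    heap_mb="$(xmx_to_mb "$2")"
    [ -n "$heap_mb" ] || return 2
    echo "$(( container_mb - heap_mb ))"
    [ "$heap_mb" -lt "$container_mb" ]
}

heap_headroom_mb 1536 -Xmx1024M   # map settings:    prints 512
heap_headroom_mb 3072 -Xmx2560M   # reduce settings: prints 512
```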

[hadoop@hadoop1 hadoop]$ vi hadoop-env.sh
******************************************************************************
export JAVA_HOME=/usr/java/jdk1.7.0_21
export HADOOP_PREFIX=/app/hadoop/hadoop220
export PATH=$PATH:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
******************************************************************************

[hadoop@hadoop1 hadoop]$ vi yarn-env.sh
******************************************************************************
export JAVA_HOME=/usr/java/jdk1.7.0_21
export HADOOP_PREFIX=/app/hadoop/hadoop220
export PATH=$PATH:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
******************************************************************************

[hadoop@hadoop1 hadoop]$ vi slaves
******************************************************************************
hadoop3
hadoop4
hadoop5
******************************************************************************

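3: Configure ssh
a: Shut down the template machine and clone it as hadoop2, hadoop3, hadoop4, and hadoop5:
Change the displayname in each clone's VMware Workstation configuration file;
Update the machine-specific settings (hostname, IP address, MAC address) in the following files on each virtual machine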
/etc/udev/rules.d/70-persistent-net.rules
/etc/sysconfig/network
/etc/sysconfig/network-scripts/ifcfg-eth0

b: Start hadoop1, hadoop2, hadoop3, hadoop4, and hadoop5, and make sure they can all ping one another.

c: Configure passwordless ssh login
Log in to each node as the hadoop user and generate a key pair on each node.
[hadoop@hadoop1 ~]$ ssh-keygen -t rsa
[hadoop@hadoop2 ~]$ ssh-keygen -t rsa
[hadoop@hadoop3 ~]$ ssh-keygen -t rsa
[hadoop@hadoop4 ~]$ ssh-keygen -t rsa
[hadoop@hadoop5 ~]$ ssh-keygen -t rsa

Switch to hadoop1 and merge the public keys of all nodes into one file
[hadoop@hadoop1 .ssh]$ ssh hadoop1 cat /home/hadoop/.ssh/id_rsa.pub>>authorized_keys
[hadoop@hadoop1 .ssh]$ ssh hadoop2 cat /home/hadoop/.ssh/id_rsa.pub>>authorized_keys
[hadoop@hadoop1 .ssh]$ ssh hadoop3 cat /home/hadoop/.ssh/id_rsa.pub>>authorized_keys
[hadoop@hadoop1 .ssh]$ ssh hadoop4 cat /home/hadoop/.ssh/id_rsa.pub>>authorized_keys
[hadoop@hadoop1 .ssh]$ ssh hadoop5 cat /home/hadoop/.ssh/id_rsa.pub>>authorized_keys

Be sure to fix the permissions on authorized_keys (the .ssh directory must be 700 and the authorized_keys file 600; use chmod), otherwise ssh will still ask for a password at login.
[hadoop@hadoop1 .ssh]$ chmod 600 authorized_keys

Distribute the merged key file to the other nodes
[hadoop@hadoop1 .ssh]$ scp authorized_keys hadoop@hadoop2:/home/hadoop/.ssh/authorized_keys
[hadoop@hadoop1 .ssh]$ scp authorized_keys hadoop@hadoop3:/home/hadoop/.ssh/authorized_keys
[hadoop@hadoop1 .ssh]$ scp authorized_keys hadoop@hadoop4:/home/hadoop/.ssh/authorized_keys
[hadoop@hadoop1 .ssh]$ scp authorized_keys hadoop@hadoop5:/home/hadoop/.ssh/authorized_keys

Confirm passwordless access: run the following commands on every node and make sure ssh no longer asks for a password
[hadoop@hadoop1 .ssh]$ ssh hadoop1 date
[hadoop@hadoop1 .ssh]$ ssh hadoop2 date
[hadoop@hadoop1 .ssh]$ ssh hadoop3 date
[hadoop@hadoop1 .ssh]$ ssh hadoop4 date
[hadoop@hadoop1 .ssh]$ ssh hadoop5 date
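
The same check can be run as a loop that fails loudly instead of waiting at a password prompt. This is a sketch, not part of the original procedure: check_passwordless_ssh is a hypothetical name, and the SSH variable exists only so the command can be swapped out for a dry run. With BatchMode=yes, ssh never prompts, so a host whose key has not been accepted yet is also reported as FAILED until it has been logged in to once by hand.

```shell
# Check passwordless ssh to each host given as an argument.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Usage on each node: check_passwordless_ssh hadoop1 hadoop2 hadoop3 hadoop4 hadoop5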

4: Initialize Hadoop (format the namenode)
[hadoop@hadoop1 hadoop220]$ hdfs namenode -format

5: Start Hadoop
[hadoop@hadoop1 hadoop]$ start-dfs.sh
[hadoop@hadoop1 hadoop]$ start-yarn.sh
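
One way to confirm that the daemons came up is to look at the output of jps, the JDK tool that lists running Java processes. The helper below is a sketch with an invented name; it reads jps output on standard input and prints any expected daemon that is missing. Which daemons to expect on which node is an assumption based on this layout: NameNode, SecondaryNameNode and ResourceManager on hadoop1, DataNode and NodeManager on hadoop3 to hadoop5.

```shell
# Read `jps` output ("<pid> <name>" per line) on stdin and print every
# expected daemon that is not listed. Returns non-zero if any is missing.
missing_daemons() {
    jps_output="$(cat)"
    missing=0
    for daemon in "$@"; do
        # Match the whole name so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qxF "$daemon"; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}
```

On hadoop1: jps | missing_daemons NameNode SecondaryNameNode ResourceManager. On a datanode: jps | missing_daemons DataNode NodeManager.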

6: Web interfaces
NameNode http://192.168.100.171:50070/
 
ResourceManager http://192.168.100.171:8088/
 
7: Testing
Upload some files, then run wordcount. One thing worth noting: unlike Hadoop 1.x, Hadoop 2.2 did not let me work with files in the default HDFS directory, so every path has to carry the hdfs: prefix (it can apparently be configured so that the prefix is not needed, but I have not yet found out how). See the official documentation: All FS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the Local FS the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that your configuration is set to point to hdfs://namenodehost).

[hadoop@hadoop1 hadoop220]$ hdfs dfs -mkdir  hdfs://192.168.100.171:8000/input
[hadoop@hadoop1 hadoop220]$ hdfs dfs -put  ./etc/hadoop/slaves hdfs://192.168.100.171:8000/input/slaves
[hadoop@hadoop1 hadoop220]$ hdfs dfs -put  ./etc/hadoop/masters hdfs://192.168.100.171:8000/input/masters
[hadoop@hadoop1 hadoop220]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount hdfs://192.168.100.171:8000/input hdfs://192.168.100.171:8000/output
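
To sanity-check the job's result, the same count can be reproduced locally on the two small input files. The function below is a rough local stand-in, not the Hadoop example itself: it assumes words are separated by whitespace and that the job writes one "word<TAB>count" line per word, sorted by word.

```shell
# Count whitespace-separated words on stdin and print "word<TAB>count",
# sorted by word in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{print $2 "\t" $1}'
}
```

For example, cat etc/hadoop/slaves etc/hadoop/masters | local_wordcount can be compared with the reducer output, which would normally be found under hdfs://192.168.100.171:8000/output (for instance with hdfs dfs -cat on the part-r-00000 file there, assuming the default output file naming).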
 
Finally done!


