First, get the Hadoop installation package. You can download it from http://hadoop.apache.org, or from my Baidu share:
Link: http://pan.baidu.com/s/1qWt446w        Extraction code: dmd9

 
 

Make sure a Java environment is already installed on your server: run echo $JAVA_HOME to see whether it prints the JDK installation directory, or run java -version to check the installed version.
If Java is not installed, you can also use my one-click JDK package (it contains both the install script and the JDK); Baidu share link:
Link: http://pan.baidu.com/s/1kTmTfLH         Extraction code: va3j
----------------------------------------------------------------------------------------------------------------------
After downloading the package, upload it to your server, extract hadoop-1.2.1.tar.gz, and put the resulting folder under /opt (or any other directory you prefer to keep it in).
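A quick sketch of that step, assuming the tarball was uploaded to /root:
cd /root
tar -zxvf hadoop-1.2.1.tar.gz             # unpack the release
mv hadoop-1.2.1 /opt/                     # keep it under /opt, as the rest of this article assumes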

 
 

Next, let's set up the Hadoop environment. On my test machine the path is /opt/hadoop-1.2.1/.

 
 

1. Go into the conf/ directory, run vim hadoop-env.sh, and add the following:
export JAVA_HOME=/usr/lib/java/jdk1.6.0_45
export PATH=$PATH:/opt/hadoop-1.2.1/bin

The second line above adds Hadoop to your PATH so that you can run hadoop commands directly. If you only add it here, you will have to source this file again after every reboot.
To make it survive reboots, add that same PATH line to the end of /etc/bash.bashrc instead.
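For example, a quick sketch of making it persistent (assuming /etc/bash.bashrc is the file your login shell actually reads; on some distributions ~/.bashrc or /etc/profile is the right place):
echo 'export PATH=$PATH:/opt/hadoop-1.2.1/bin' >> /etc/bash.bashrc
source /etc/bash.bashrc                   # apply it to the current shell as well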

 
 

2. Run source hadoop-env.sh so the environment variables you just set take effect.

 
 

3. Run hadoop version to check whether the Hadoop environment variables are configured correctly:
root@sambafs:/opt/hadoop-1.2.1/conf# hadoop version

Hadoop 1.2.1
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152
Compiled by mattf on Mon Jul 22 15:23:09 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /opt/hadoop-1.2.1/hadoop-core-1.2.1.jar


 
 

4. Give Hadoop a quick test.
Command: hadoop jar /opt/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount /input/conf output2
I ran the command above from the root directory: / .
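Note that /input/conf here is just a local directory holding a copy of the Hadoop configuration files (HDFS is not running yet, so this job reads the local file system). If you want to reproduce the same input, something along these lines should do it (the paths are an assumption based on the log below):
mkdir -p /input
cp -r /opt/hadoop-1.2.1/conf /input/      # gives wordcount the same 17 small files seen in the log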
root@sambafs:/# hadoop jar /opt/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount /input/conf output2
15/02/07 11:17:47 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/02/07 11:17:48 INFO input.FileInputFormat: Total input paths to process : 17
15/02/07 11:17:48 WARN snappy.LoadSnappy: Snappy native library not loaded
15/02/07 11:17:48 INFO mapred.JobClient: Running job: job_local956513444_0001
15/02/07 11:17:48 INFO mapred.LocalJobRunner: Waiting for map tasks
15/02/07 11:17:48 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000000_0
15/02/07 11:17:48 INFO util.ProcessTree: setsid exited with exit code 0
15/02/07 11:17:48 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3e018c74
15/02/07 11:17:48 INFO mapred.MapTask: Processing split: file:/input/conf/capacity-scheduler.xml:0+7457
15/02/07 11:17:48 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:49 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:49 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:49 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:49 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:49 INFO mapred.Task: Task:attempt_local956513444_0001_m_000000_0 is done. And is in the process of commiting
15/02/07 11:17:49 INFO mapred.LocalJobRunner:
15/02/07 11:17:49 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000000_0' done.
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000000_0
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000001_0
15/02/07 11:17:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@722b9406
15/02/07 11:17:49 INFO mapred.MapTask: Processing split: file:/input/conf/log4j.properties:0+5018
15/02/07 11:17:49 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:49 INFO mapred.JobClient:  map 5% reduce 0%
15/02/07 11:17:49 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:49 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:49 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:49 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:49 INFO mapred.Task: Task:attempt_local956513444_0001_m_000001_0 is done. And is in the process of commiting
15/02/07 11:17:49 INFO mapred.LocalJobRunner:
15/02/07 11:17:49 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000001_0' done.
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000001_0
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000002_0
15/02/07 11:17:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@195ed659
15/02/07 11:17:49 INFO mapred.MapTask: Processing split: file:/input/conf/hadoop-policy.xml:0+4644
15/02/07 11:17:49 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:49 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:49 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:49 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:49 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:49 INFO mapred.Task: Task:attempt_local956513444_0001_m_000002_0 is done. And is in the process of commiting
15/02/07 11:17:49 INFO mapred.LocalJobRunner:
15/02/07 11:17:49 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000002_0' done.
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000002_0
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000003_0
15/02/07 11:17:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7174d93a
15/02/07 11:17:49 INFO mapred.MapTask: Processing split: file:/input/conf/task-log4j.properties:0+3890
15/02/07 11:17:49 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:49 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:49 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:49 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:49 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:49 INFO mapred.Task: Task:attempt_local956513444_0001_m_000003_0 is done. And is in the process of commiting
15/02/07 11:17:49 INFO mapred.LocalJobRunner:
15/02/07 11:17:49 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000003_0' done.
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000003_0
15/02/07 11:17:49 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000004_0
15/02/07 11:17:49 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6a5eb489
15/02/07 11:17:49 INFO mapred.MapTask: Processing split: file:/input/conf/hadoop-env.sh:0+2497
15/02/07 11:17:49 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000004_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000004_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000004_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000005_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7f5e2075
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/hadoop-metrics2.properties:0+2052
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000005_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000005_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000005_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000006_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6cd24e3f
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/ssl-client.xml.example:0+2042
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.JobClient:  map 35% reduce 0%
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000006_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000006_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000006_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000007_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@54c01e99
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/mapred-queue-acls.xml:0+2033
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000007_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000007_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000007_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000008_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@67ecd78
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/ssl-server.xml.example:0+1994
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000008_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000008_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000008_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000009_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4816ef71
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/configuration.xsl:0+1095
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000009_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000009_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000009_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000010_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@39697b67
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/taskcontroller.cfg:0+382
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:50 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:50 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:50 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:50 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:50 INFO mapred.Task: Task:attempt_local956513444_0001_m_000010_0 is done. And is in the process of commiting
15/02/07 11:17:50 INFO mapred.LocalJobRunner:
15/02/07 11:17:50 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000010_0' done.
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000010_0
15/02/07 11:17:50 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000011_0
15/02/07 11:17:50 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1fc4f0f8
15/02/07 11:17:50 INFO mapred.MapTask: Processing split: file:/input/conf/fair-scheduler.xml:0+327
15/02/07 11:17:50 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:51 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:51 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:51 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:51 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:51 INFO mapred.Task: Task:attempt_local956513444_0001_m_000011_0 is done. And is in the process of commiting
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000011_0' done.
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000011_0
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000012_0
15/02/07 11:17:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@66922804
15/02/07 11:17:51 INFO mapred.MapTask: Processing split: file:/input/conf/core-site.xml:0+178
15/02/07 11:17:51 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:51 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:51 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:51 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:51 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:51 INFO mapred.Task: Task:attempt_local956513444_0001_m_000012_0 is done. And is in the process of commiting
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000012_0' done.
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000012_0
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000013_0
15/02/07 11:17:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2e716cb7
15/02/07 11:17:51 INFO mapred.MapTask: Processing split: file:/input/conf/hdfs-site.xml:0+178
15/02/07 11:17:51 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:51 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:51 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:51 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:51 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:51 INFO mapred.Task: Task:attempt_local956513444_0001_m_000013_0 is done. And is in the process of commiting
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000013_0' done.
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000013_0
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000014_0
15/02/07 11:17:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1dd49247
15/02/07 11:17:51 INFO mapred.MapTask: Processing split: file:/input/conf/mapred-site.xml:0+178
15/02/07 11:17:51 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:51 INFO mapred.JobClient:  map 82% reduce 0%
15/02/07 11:17:51 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:51 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:51 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:51 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:51 INFO mapred.Task: Task:attempt_local956513444_0001_m_000014_0 is done. And is in the process of commiting
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000014_0' done.
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000014_0
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000015_0
15/02/07 11:17:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1f57ea4a
15/02/07 11:17:51 INFO mapred.MapTask: Processing split: file:/input/conf/masters:0+10
15/02/07 11:17:51 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:51 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:51 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:51 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:51 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:51 INFO mapred.Task: Task:attempt_local956513444_0001_m_000015_0 is done. And is in the process of commiting
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000015_0' done.
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000015_0
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Starting task: attempt_local956513444_0001_m_000016_0
15/02/07 11:17:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5470be88
15/02/07 11:17:51 INFO mapred.MapTask: Processing split: file:/input/conf/slaves:0+10
15/02/07 11:17:51 INFO mapred.MapTask: io.sort.mb = 100
15/02/07 11:17:51 INFO mapred.MapTask: data buffer = 79691776/99614720
15/02/07 11:17:51 INFO mapred.MapTask: record buffer = 262144/327680
15/02/07 11:17:51 INFO mapred.MapTask: Starting flush of map output
15/02/07 11:17:51 INFO mapred.MapTask: Finished spill 0
15/02/07 11:17:51 INFO mapred.Task: Task:attempt_local956513444_0001_m_000016_0 is done. And is in the process of commiting
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Task: Task 'attempt_local956513444_0001_m_000016_0' done.
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local956513444_0001_m_000016_0
15/02/07 11:17:51 INFO mapred.LocalJobRunner: Map task executor complete.
15/02/07 11:17:51 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@19c8ef56
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:51 INFO mapred.Merger: Merging 17 sorted segments
15/02/07 11:17:51 INFO mapred.Merger: Merging 8 intermediate segments out of a total of 17
15/02/07 11:17:51 INFO mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 31626 bytes
15/02/07 11:17:51 INFO mapred.LocalJobRunner:
15/02/07 11:17:52 INFO mapred.Task: Task:attempt_local956513444_0001_r_000000_0 is done. And is in the process of commiting
15/02/07 11:17:52 INFO mapred.LocalJobRunner:
15/02/07 11:17:52 INFO mapred.Task: Task attempt_local956513444_0001_r_000000_0 is allowed to commit now
15/02/07 11:17:52 INFO output.FileOutputCommitter: Saved output of task 'attempt_local956513444_0001_r_000000_0' to output2
15/02/07 11:17:52 INFO mapred.LocalJobRunner: reduce > reduce
15/02/07 11:17:52 INFO mapred.Task: Task 'attempt_local956513444_0001_r_000000_0' done.
15/02/07 11:17:52 INFO mapred.JobClient:  map 100% reduce 100%
15/02/07 11:17:52 INFO mapred.JobClient: Job complete: job_local956513444_0001
15/02/07 11:17:52 INFO mapred.JobClient: Counters: 20
15/02/07 11:17:52 INFO mapred.JobClient:   File Output Format Counters
15/02/07 11:17:52 INFO mapred.JobClient:     Bytes Written=16198
15/02/07 11:17:52 INFO mapred.JobClient:   File Input Format Counters
15/02/07 11:17:52 INFO mapred.JobClient:     Bytes Read=33985
15/02/07 11:17:52 INFO mapred.JobClient:   FileSystemCounters
15/02/07 11:17:52 INFO mapred.JobClient:     FILE_BYTES_READ=3340783
15/02/07 11:17:52 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=4021597
15/02/07 11:17:52 INFO mapred.JobClient:   Map-Reduce Framework
15/02/07 11:17:52 INFO mapred.JobClient:     Map output materialized bytes=31708
15/02/07 11:17:52 INFO mapred.JobClient:     Map input records=945
15/02/07 11:17:52 INFO mapred.JobClient:     Reduce shuffle bytes=0
15/02/07 11:17:52 INFO mapred.JobClient:     Spilled Records=3474
15/02/07 11:17:52 INFO mapred.JobClient:     Map output bytes=46060
15/02/07 11:17:52 INFO mapred.JobClient:     CPU time spent (ms)=0
15/02/07 11:17:52 INFO mapred.JobClient:     Total committed heap usage (bytes)=4318724096
15/02/07 11:17:52 INFO mapred.JobClient:     Combine input records=3418
15/02/07 11:17:52 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1681
15/02/07 11:17:52 INFO mapred.JobClient:     Reduce input records=1631
15/02/07 11:17:52 INFO mapred.JobClient:     Reduce input groups=857
15/02/07 11:17:52 INFO mapred.JobClient:     Combine output records=1631
15/02/07 11:17:52 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
15/02/07 11:17:52 INFO mapred.JobClient:     Reduce output records=857
15/02/07 11:17:52 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
15/02/07 11:17:52 INFO mapred.JobClient:     Map output records=3418
root@sambafs:/#
OK, the installation works and the test passes; let's keep going.

 
 

Before going any further, create a few directories under the Hadoop installation directory. From /opt/hadoop-1.2.1, create the folders that the configuration files below will point to; see the sketch right after this paragraph.
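A minimal sketch, assuming the directory names used in core-site.xml and hdfs-site.xml below:
cd /opt/hadoop-1.2.1
mkdir -p tmp hdfs/name hdfs/data          # hadoop.tmp.dir, dfs.name.dir and dfs.data.dir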

 
 

5. In the conf directory, run vim core-site.xml and add the following between <configuration> and </configuration>:
<property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-1.2.1/tmp</value>
</property>
 
 

6. In the conf directory, run vim hdfs-site.xml and add the following between <configuration> and </configuration>:
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<property>
    <name>dfs.name.dir</name>
    <value>/opt/hadoop-1.2.1/hdfs/name</value>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop-1.2.1/hdfs/data</value>
</property>

 
 

7. In the conf directory, run vim mapred-site.xml and add the following between <configuration> and </configuration>:
<property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
</property>

 
 

8. Format the NameNode. Do not run the format more than once; a single run is enough. Command: hadoop namenode -format
root@sambafs:/opt/hadoop-1.2.1/conf# hadoop namenode -format
15/02/07 11:39:17 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = sambafs/10.122.129.160
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.6.0_45
************************************************************/
Re-format filesystem in /opt/hadoop-1.2.1/hdfs/name ? (Y or N) Y
15/02/07 11:39:26 INFO util.GSet: Computing capacity for map BlocksMap
15/02/07 11:39:26 INFO util.GSet: VM type       = 64-bit
15/02/07 11:39:26 INFO util.GSet: 2.0% max memory = 1013645312
15/02/07 11:39:26 INFO util.GSet: capacity      = 2^21 = 2097152 entries
15/02/07 11:39:26 INFO util.GSet: recommended=2097152, actual=2097152
15/02/07 11:39:27 INFO namenode.FSNamesystem: fsOwner=root
15/02/07 11:39:27 INFO namenode.FSNamesystem: supergroup=supergroup
15/02/07 11:39:27 INFO namenode.FSNamesystem: isPermissionEnabled=true
15/02/07 11:39:27 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
15/02/07 11:39:27 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
15/02/07 11:39:27 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
15/02/07 11:39:27 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/02/07 11:39:28 INFO common.Storage: Image file /opt/hadoop-1.2.1/hdfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
15/02/07 11:39:28 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/opt/hadoop-1.2.1/hdfs/name/current/edits
15/02/07 11:39:28 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/opt/hadoop-1.2.1/hdfs/name/current/edits
15/02/07 11:39:28 INFO common.Storage: Storage directory /opt/hadoop-1.2.1/hdfs/name has been successfully formatted.
15/02/07 11:39:28 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at sambafs/10.122.129.160
************************************************************/

 
 

9. Once the format has finished, you can start the Hadoop daemons.
Go into /opt/hadoop-1.2.1/bin and run ./start-all.sh. You will be prompted for the root password a few times; this is an SSH permissions matter.
root@sambafs:/opt/hadoop-1.2.1/bin# ./start-all.sh
namenode running as process 2318. Stop it first.
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is d4:60:13:4b:7f:c7:a6:dc:73:59:a9:fa:ed:18:80:83.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password:
localhost: starting datanode, logging to /opt/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-sambafs.out
root@localhost's password:
localhost: starting secondarynamenode, logging to /opt/hadoop-1.2.1/libexec/../logs/hadoop-root-secondarynamenode-sambafs.out
jobtracker running as process 2489. Stop it first.
root@localhost's password:
localhost: starting tasktracker, logging to /opt/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-sambafs.out
root@sambafs:/opt/hadoop-1.2.1/bin#
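If you would rather not type the root password for every daemon, you can set up passwordless SSH to localhost first. A minimal sketch, assuming root has no SSH key yet:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa            # generate a key with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys     # authorize it for localhost logins
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                       # should now connect without a password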

 
 

10. Verify that everything started, using the JDK's jps command:
root@sambafs:/opt/hadoop-1.2.1/bin# jps
2489 JobTracker
3594 Jps
3521 TaskTracker
2938 DataNode
3227 SecondaryNameNode
2318 NameNode
root@sambafs:/opt/hadoop-1.2.1/bin#

All five Hadoop daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) show up alongside Jps, so the installation is working and you can start using Hadoop.
http://192.168.1.180:50030/jobtracker.jsp     Hadoop JobTracker admin web UI

 
 

http://192.168.1.180:50060/tasktracker.jsp    Hadoop TaskTracker status

 
 

http://192.168.1.180:50070/dfshealth.jsp      Hadoop DFS (HDFS) status

 
 

Pseudo-distributed Hadoop is now installed. Let's run WordCount once more to get a feel for the MapReduce flow.
Note that this time the program runs against DFS, so the files it creates and reads all live in the distributed file system.
 
 

11. DFS test
Use the following command to create a directory in the distributed file system:
hadoop dfs -mkdir cfei_input

 
 

Then copy our JDK install script into cfei_input:
hadoop dfs -copyFromLocal /root/cfei/install_jdk.sh cfei_input
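You can verify the upload; a small sketch (run from anywhere once the daemons are up):
hadoop dfs -ls cfei_input                 # should list install_jdk.sh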

 
 

Now run the wordcount test; note that this time I am in the /opt/hadoop-1.2.1/ directory:
hadoop jar hadoop-examples-1.2.1.jar wordcount cfei_input output

 
 

If the job succeeds, the map and reduce progress reach 100% and a counter summary is printed, much like the local run in step 4.
Once it has succeeded, take a look at its output; see the sketch below.
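A hedged way to inspect the result (with a single reducer the output file is usually named part-r-00000, but check the listing for the exact name):
hadoop dfs -ls output
hadoop dfs -cat output/part-r-00000       # the word counts, one word per line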
That's it for the happy path. Of course, a first install may not go quite this smoothly, and you may need to work through a few errors; let's deal with those next.

 
 
 
 

Troubleshooting
What if the test hadoop jar hadoop-examples-1.2.1.jar wordcount cfei_input output stays stuck at INFO mapred.JobClient:  map 100% reduce 0%?
You can fix your hostname resolution. For example, my hostname is sambafs, and I handled it like this:
(1) Run vim /etc/hosts

 
 

(2) In that file, add an entry that maps your hostname to the server's IP address (see the example just below this item), then reboot the server with the reboot command.
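A minimal example entry, assuming the hostname and IP that appear earlier in this article; substitute your own values:
10.122.129.160    sambafs                 # map the machine's hostname to its address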

 
 

(3) After the reboot, the hadoop command may not work right away; source hadoop-env.sh again so the environment variables take effect.

 
 

(4) Also use jps to check whether all the daemons are up; if they are not, run ./start-all.sh again. If you run hadoop commands before the daemons have finished starting, you will only get connection errors.

 
 

(5) What if all the daemons are running but the job still fails? In my case that happened because the output directory from the earlier test, output2, already existed. Delete that directory, or switch to a different output directory name; a deletion sketch follows.
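A hedged sketch of the cleanup, assuming the stale output2 directory lives in DFS (for the earlier local-mode run, a plain rm -r output2 on the local file system is the equivalent):
hadoop dfs -rmr output2                   # remove the old output directory so the job can recreate it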

 
 

With all of that sorted out, you should be able to run the wordcount test command without any problems:
 hadoop jar hadoop-examples-1.2.1.jar wordcount cfei_input output

 
 

Once it runs cleanly, the pseudo-distributed installation and configuration is done. If you hit problems following this article, feel free to email me at it@cfei.net.
In the next post, we will configure fully distributed mode.