Step-by-Step Hadoop Setup on Linux
Published: 2019-06-27


Prerequisites: a virtual machine with Linux installed, with the host and guest networks able to reach each other, and the JDK already installed on the Linux system.
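As a quick sanity check before starting (the JDK path below is the JAVA_HOME used throughout this post):

java -version                        # should print the installed JDK version
ls /home/hadoop/export/jdk/bin/java  # confirms the install location referenced below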

1: On Linux, run vi /etc/profile and add HADOOP_HOME (alongside JAVA_HOME):

export JAVA_HOME=/home/hadoop/export/jdk
export HADOOP_HOME=/home/hadoop/export/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
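The new variables only take effect in the current shell after reloading the profile. A quick check (the expected values assume the paths above):

source /etc/profile
echo $HADOOP_HOME   # should print /home/hadoop/export/hadoop
hadoop version      # should report Hadoop 1.2.1 if PATH is set correctly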
2: Edit hadoop-env.sh under hadoop/conf (line 9), setting JAVA_HOME:

export JAVA_HOME=/home/hadoop/export/jdk
3: Edit core-site.xml under hadoop/conf:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/.../tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://127.0.0.1:9000</value>
  </property>
</configuration>
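hadoop.tmp.dir is the base directory under which HDFS keeps its name and data directories, so it must exist and be writable by the hadoop user. A minimal sketch, assuming the /home/hadoop/tmp path that shows up in the logs later:

mkdir -p /home/hadoop/tmp   # create the working directory HDFS will use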
4: Edit hdfs-site.xml under hadoop/conf:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
5: Edit mapred-site.xml under hadoop/conf:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>127.0.0.1:9001</value>
  </property>
</configuration>
That completes the configuration.

Change to the hadoop/bin directory and run hadoop namenode -format.

Output like the following indicates success:
Warning: $HADOOP_HOME is deprecated.
14/07/15 16:06:27 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.7.0_55
************************************************************/
14/07/15 16:07:09 INFO util.GSet: Computing capacity for map BlocksMap
14/07/15 16:07:09 INFO util.GSet: VM type       = 32-bit
14/07/15 16:07:09 INFO util.GSet: 2.0% max memory = 1013645312
14/07/15 16:07:09 INFO util.GSet: capacity      = 2^22 = 4194304 entries
14/07/15 16:07:09 INFO util.GSet: recommended=4194304, actual=4194304
14/07/15 16:07:10 INFO namenode.FSNamesystem: fsOwner=hadoop
14/07/15 16:07:10 INFO namenode.FSNamesystem: supergroup=supergroup
14/07/15 16:07:10 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/07/15 16:07:10 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/07/15 16:07:10 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/07/15 16:07:10 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/07/15 16:07:10 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/07/15 16:07:10 INFO common.Storage: Image file /home/hadoop/tmp/dfs/name/current/fsimage of size 118 bytes saved in 0 seconds.
14/07/15 16:07:10 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/hadoop/tmp/dfs/name/current/edits
14/07/15 16:07:10 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/hadoop/tmp/dfs/name/current/edits
14/07/15 16:07:10 INFO common.Storage: Storage directory /home/hadoop/tmp/dfs/name has been successfully formatted.
14/07/15 16:07:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
Some people will hit a failure at this step. If that happens, be sure to check the logs directory under hadoop; the exceptions recorded there are very specific.

If the first format attempt fails, remember to delete everything under the tmp directory before retrying, because leftover files can cause an incompatibility.
Then run start-all.sh:

Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-namenode-ubuntu.out
localhost: starting datanode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-secondarynamenode-ubuntu.out
starting jobtracker, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-tasktracker-ubuntu.out
During this step you may be prompted for a password. You can set up passwordless SSH login to avoid that (covered elsewhere on my blog).
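A minimal sketch of passwordless SSH to localhost, assuming an RSA key and the default ~/.ssh paths:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa         # empty passphrase; skip if the key exists
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys  # authorize the key for local logins
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                    # should now log in without a password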
Run jps; it prints something like the following (a DataNode is missing; I deliberately introduced an error here):
10666 NameNode
11547 Jps
11445 TaskTracker
11130 SecondaryNameNode
11218 JobTracker

Check the logs:

2014-07-15 16:13:43,032 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-07-15 16:13:43,094 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2014-07-15 16:13:43,098 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-07-15 16:13:43,118 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2014-07-15 16:13:43,999 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2014-07-15 16:13:44,044 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-15 16:13:45,484 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/hadoop/tmp/dfs/data: namenode namespaceID = 224603228; datanode namespaceID = 566757162
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
At this point you only need to delete the files under tmp (then re-format and restart), and the problem is solved.
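A minimal sketch of the recovery, assuming hadoop.tmp.dir is /home/hadoop/tmp as in the logs above; note that this wipes all HDFS data:

stop-all.sh                  # stop whatever daemons did come up
rm -rf /home/hadoop/tmp/*    # remove the stale namespaceID state (destroys HDFS data!)
hadoop namenode -format      # re-format the NameNode
start-all.sh
jps                          # all five daemons should now be listed

Once everything is up, you can also confirm through the web UIs: the NameNode serves one on port 50070 and the JobTracker on port 50030.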

Now you can run an example. The detailed steps are as follows.

hadoop@ubuntu:~/export/hadoop$ ls
bin          hadoop-ant-1.2.1.jar          ivy          README.txt
build.xml    hadoop-client-1.2.1.jar       ivy.xml      sbin
c++          hadoop-core-1.2.1.jar         lib          share
CHANGES.txt  hadoop-examples-1.2.1.jar     libexec      src
conf         hadoop-minicluster-1.2.1.jar  LICENSE.txt  webapps
contrib      hadoop-test-1.2.1.jar         logs
docs         hadoop-tools-1.2.1.jar        NOTICE.txt
Upload a file to HDFS:
hadoop@ubuntu:~/export/hadoop$ hadoop fs -put README.txt /
Warning: $HADOOP_HOME is deprecated.
No error output beyond the deprecation warning means the upload succeeded.
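To double-check, you can list the HDFS root or print the file back; both are standard hadoop fs subcommands:

hadoop fs -ls /            # README.txt should appear in the listing
hadoop fs -cat /README.txt # prints the uploaded file's contents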
Run the wordcount example program on README.txt:

hadoop@ubuntu:~/export/hadoop$ hadoop jar hadoop-examples-1.2.1.jar wordcount /README.txt /wordcountoutput
Warning: $HADOOP_HOME is deprecated.
14/07/15 15:23:01 INFO input.FileInputFormat: Total input paths to process : 1
14/07/15 15:23:01 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/07/15 15:23:01 WARN snappy.LoadSnappy: Snappy native library not loaded
14/07/15 15:23:02 INFO mapred.JobClient: Running job: job_201407141636_0001
14/07/15 15:23:03 INFO mapred.JobClient:  map 0% reduce 0%
14/07/15 15:23:15 INFO mapred.JobClient:  map 100% reduce 0%
14/07/15 15:23:30 INFO mapred.JobClient:  map 100% reduce 100%
14/07/15 15:23:32 INFO mapred.JobClient: Job complete: job_201407141636_0001
14/07/15 15:23:32 INFO mapred.JobClient: Counters: 29
14/07/15 15:23:32 INFO mapred.JobClient:   Job Counters
14/07/15 15:23:32 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/15 15:23:32 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=12563
14/07/15 15:23:32 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/07/15 15:23:32 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/07/15 15:23:32 INFO mapred.JobClient:     Launched map tasks=1
14/07/15 15:23:32 INFO mapred.JobClient:     Data-local map tasks=1
14/07/15 15:23:32 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=14550
14/07/15 15:23:32 INFO mapred.JobClient:   File Output Format Counters
14/07/15 15:23:32 INFO mapred.JobClient:     Bytes Written=1306
14/07/15 15:23:32 INFO mapred.JobClient:   FileSystemCounters
14/07/15 15:23:32 INFO mapred.JobClient:     FILE_BYTES_READ=1836
14/07/15 15:23:32 INFO mapred.JobClient:     HDFS_BYTES_READ=1463
14/07/15 15:23:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=120839
14/07/15 15:23:32 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1306
14/07/15 15:23:32 INFO mapred.JobClient:   File Input Format Counters
14/07/15 15:23:32 INFO mapred.JobClient:     Bytes Read=1366
14/07/15 15:23:32 INFO mapred.JobClient:   Map-Reduce Framework
14/07/15 15:23:32 INFO mapred.JobClient:     Map output materialized bytes=1836
14/07/15 15:23:32 INFO mapred.JobClient:     Map input records=31
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce shuffle bytes=1836
14/07/15 15:23:32 INFO mapred.JobClient:     Spilled Records=262
14/07/15 15:23:32 INFO mapred.JobClient:     Map output bytes=2055
14/07/15 15:23:32 INFO mapred.JobClient:     Total committed heap usage (bytes)=212611072
14/07/15 15:23:32 INFO mapred.JobClient:     CPU time spent (ms)=2430
14/07/15 15:23:32 INFO mapred.JobClient:     Combine input records=179
14/07/15 15:23:32 INFO mapred.JobClient:     SPLIT_RAW_BYTES=97
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce input records=131
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce input groups=131
14/07/15 15:23:32 INFO mapred.JobClient:     Combine output records=131
14/07/15 15:23:32 INFO mapred.JobClient:     Physical memory (bytes) snapshot=177545216
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce output records=131
14/07/15 15:23:32 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=695681024
14/07/15 15:23:32 INFO mapred.JobClient:     Map output records=179
hadoop@ubuntu:~/export/hadoop$ hadoop fs -ls /
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r--   1 hadoop supergroup       1366 2014-07-15 15:21 /README.txt
drwxr-xr-x   - hadoop supergroup          0 2014-07-14 16:36 /home
drwxr-xr-x   - hadoop supergroup          0 2014-07-15 15:23 /wordcountoutput
hadoop@ubuntu:~/export/hadoop$ hadoop fs -get /wordcountoutput /home/hadoop/
Warning: $HADOOP_HOME is deprecated.
You can open the downloaded result and take a look; it looks like this:
(see	1
5D002.C.1,	1
740.13)	1
Administration	1
Apache	1
BEFORE	1
BIS	1
Bureau	1
Commerce,	1
Commodity	1
Control	1
Core	1
Department	1
ENC	1
Exception	1
Export	2
For	1
Foundation	1
Government	1
Hadoop	1
Hadoop,	1
Industry	1
Jetty	1
License	1
Number	1
Regulations,	1
SSL	1
Section	1
Security	1
See	1
Software	2
Technology	1
The	4
This	1
U.S.	1
Unrestricted	1
about	1
algorithms.	1
and	6
and/or	1
another	1
any	1
as	1
asymmetric	1
at:	2
both	1
by	1
check	1
classified	1
code	1
code.	1
concerning	1
country	1
country's	1
country,	1
cryptographic	3
currently	1
details	1
distribution	2
eligible	1
encryption	3
exception	1
export	1
following	1
for	3
form	1
from	1
functions	1
has	1
have	1
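You can also read the result straight out of HDFS without downloading it; with the default single reducer, the wordcount example writes its output to a part-r-00000 file (the exact part-file name is the usual convention for this example, so treat it as an assumption):

hadoop fs -ls /wordcountoutput                # shows the _SUCCESS marker and the part file
hadoop fs -cat /wordcountoutput/part-r-00000  # word<TAB>count pairs, as above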

Source: http://nmhnl.baihongyu.com/
