Deploying a Hadoop 2.7.2 cluster: configuring HDFS HA + Federation based on ZooKeeper
Published: 2019-06-20


Reposted from: http://www.2cto.com/os/201605/510489.html

Hadoop 1 had two core components, HDFS and MapReduce; in Hadoop 2 these became HDFS and YARN. In the new HDFS the NameNode is no longer singular: there can be several (currently only two are supported per cluster), each with identical responsibilities.

Two NameNodes

While the cluster is running, only the NameNode in the active state does real work; the standby NameNode waits in reserve, continuously synchronizing the active NameNode's data. If the active NameNode fails, a manual or automatic switchover promotes the standby NameNode to active so the cluster keeps working. This is the high-availability mechanism.

When a NameNode fails

The two NameNodes in fact share their data in real time. The new HDFS uses a shared-storage mechanism for this: either a JournalNode cluster or NFS. NFS operates at the operating-system level, while JournalNodes are part of Hadoop itself; here we use a JournalNode cluster to share the data.

Automatic NameNode failover

This requires a ZooKeeper ensemble to perform the election. Both NameNodes of an HDFS cluster register with ZooKeeper; when the active NameNode fails, ZooKeeper detects the failure and automatically switches the standby NameNode to active.

HDFS Federation

The NameNode is the central node: it holds all of HDFS's metadata in memory, so its capacity is bounded by the server's RAM. Once the NameNode's memory can hold no more metadata, the HDFS cluster can hold no more data, and its scalability is exhausted. HDFS Federation means several HDFS namespaces (clusters) working side by side, so the total capacity is in theory no longer bounded; loosely speaking, it scales without limit.

 

Node layout

 

Configuration in detail

There are six configuration files in total: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves. Except for hdfs-site.xml, which differs between the two clusters, the files are configured identically on the four nodes and can simply be copied.
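Since the shared files are identical everywhere, they can be pushed out in one loop. A minimal sketch, assuming passwordless SSH between the nodes and the /opt/ha/hadoop-2.7.2 install path used throughout this post (the loop itself is not from the original article):

```shell
# Push the shared configuration files from the hadoop node to the others.
# Assumes passwordless SSH and the install path used elsewhere in this post.
CONF=/opt/ha/hadoop-2.7.2/etc/hadoop
for host in hadoop1 hadoop2 slave1 slave2; do
    scp "$CONF"/hadoop-env.sh "$CONF"/core-site.xml "$CONF"/mapred-site.xml \
        "$CONF"/yarn-site.xml "$CONF"/slaves "$host":"$CONF"/
done
# hdfs-site.xml differs between cluster1 and cluster2, so copy it per cluster.
```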

 

hadoop-env.sh

 

core-site.xml

fs.defaultFS: the default HDFS path. When multiple HDFS clusters run at the same time, this decides which cluster is used when the user omits a cluster name. The value must match a nameservice defined in hdfs-site.xml.

hadoop.tmp.dir: the common base directory under which the NameNode, DataNode, JournalNode, etc. store their data.

ha.zookeeper.quorum: the addresses and ports of the ZooKeeper ensemble. Note that the number of ZooKeeper nodes must be odd.

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cluster1</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/ha/hadoop-2.7.2/data/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop:2181,hadoop1:2181,hadoop2:2181,slave1:2181,slave2:2181</value>
    </property>
</configuration>

 

hdfs-site.xml

Here the value of dfs.namenode.shared.edits.dir ends in cluster1 on hadoop1/hadoop2 and in cluster2 on slave1/slave2; other names work too, as long as the two are distinct. Also, fs.defaultFS in core-site.xml can be changed to cluster2 on slave1 and slave2.
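The hdfs-site.xml contents are not reproduced in this post (the original showed them as an image). Purely as a sketch, a federated HA nameservice definition for cluster1 along the lines described above might look like the following; the NameNode IDs nn1/nn2, the ports, and the fencing key path are assumptions, not values from the original:

```xml
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>cluster1,cluster2</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.cluster1</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.cluster1.nn1</name>
        <value>hadoop1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.cluster1.nn2</name>
        <value>hadoop2:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.cluster1.nn1</name>
        <value>hadoop1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.cluster1.nn2</name>
        <value>hadoop2:50070</value>
    </property>
    <!-- On hadoop1/hadoop2 this path ends in cluster1;
         on slave1/slave2 it ends in cluster2. -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop:8485;hadoop1:8485;hadoop2:8485/cluster1</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.cluster1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
</configuration>
```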

 

yarn-site.xml

 

 

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop</value>
</property>
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>

 

 

mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.job.tracker</name>
    <value>hdfs://hadoop:9001</value>
    <final>true</final>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop:19888</value>
</property>

slaves

 

 

hadoop
hadoop1
hadoop2
slave1
slave2

 

Startup procedure

Start ZooKeeper on all ZooKeeper nodes

 

 

hadoop@hadoop:hadoop-2.7.2$ zkServer.sh start

Format the HA state in the ZooKeeper cluster (once per nameservice)

 

 

[hadoop@hadoop1 hadoop-2.7.2]$ bin/hdfs zkfc -formatZK
[hadoop@slave1 hadoop-2.7.2]$ bin/hdfs zkfc -formatZK
[hadoop@slave1 hadoop-2.7.2]$ zkCli.sh
[zk: localhost:2181(CONNECTED) 5] ls /hadoop-ha
[cluster2, cluster1]

Start the JournalNode on all nodes

 

 

hadoop@hadoop:hadoop-2.7.2$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-journalnode-hadoop.out
hadoop@hadoop:hadoop-2.7.2$

Format the NameNode on nn1 of cluster1, verify, and start it

 

 

[hadoop@hadoop1 hadoop-2.7.2]$ bin/hdfs namenode -format -clusterId hadoop1
16/05/19 15:43:01 INFO common.Storage: Storage directory /opt/ha/hadoop-2.7.2/data/dfs/name has been successfully formatted.
16/05/19 15:43:01 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/05/19 15:43:01 INFO util.ExitUtil: Exiting with status 0
16/05/19 15:43:01 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop1/192.168.2.10
************************************************************/
[hadoop@hadoop1 hadoop-2.7.2]$ ls data/dfs/name/current/
fsimage_0000000000000000000      seen_txid
fsimage_0000000000000000000.md5  VERSION
[hadoop@hadoop1 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-namenode-hadoop1.out
[hadoop@hadoop1 hadoop-2.7.2]$ jps
9551 NameNode
9423 JournalNode
9627 Jps
9039 QuorumPeerMain

 

View at http://hadoop1:50070

On the other cluster1 node, sync the formatted data (bootstrapStandby) and start it

[hadoop@hadoop2 hadoop-2.7.2]$ bin/hdfs namenode -bootstrapStandby
......
16/05/19 15:48:27 INFO common.Storage: Storage directory /opt/ha/hadoop-2.7.2/data/dfs/name has been successfully formatted.
16/05/19 15:48:27 INFO namenode.TransferFsImage: Opening connection to
16/05/19 15:48:28 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
16/05/19 15:48:28 INFO namenode.TransferFsImage: Transfer took 0.00s at 0.00 KB/s
16/05/19 15:48:28 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 353 bytes.
16/05/19 15:48:28 INFO util.ExitUtil: Exiting with status 0
16/05/19 15:48:28 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop2/192.168.2.11
************************************************************/
[hadoop@hadoop2 hadoop-2.7.2]$ ls data/dfs/name/current/
fsimage_0000000000000000000      seen_txid
fsimage_0000000000000000000.md5  VERSION
[hadoop@hadoop2 hadoop-2.7.2]$
[hadoop@hadoop2 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-namenode-hadoop2.out
[hadoop@hadoop2 hadoop-2.7.2]$ jps
7196 Jps
6980 JournalNode
7120 NameNode
6854 QuorumPeerMain

View at http://hadoop2:50070; it shows the following

 

Use the same steps to start the two NameNodes of cluster2; omitted here.
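For reference, the cluster2 steps mirror the ones just shown. A sketch, assuming slave1 is cluster2's first NameNode and slave2 bootstraps from it; note that in a federation every NameNode must be formatted with the same clusterId (here the `hadoop1` id used above):

```shell
# On slave1: format cluster2's first NameNode with the SAME clusterId
# used for cluster1, then start it.
bin/hdfs namenode -format -clusterId hadoop1
sbin/hadoop-daemon.sh start namenode

# On slave2: copy the formatted metadata over from slave1, then start.
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
```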

Then start all the DataNodes, and start YARN (which must be started on the hadoop node, where the ResourceManager runs).

 

[hadoop@hadoop1 hadoop-2.7.2]$ sbin/hadoop-daemons.sh start datanode
hadoop1: starting datanode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-datanode-hadoop1.out
slave2: starting datanode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-datanode-slave2.out
hadoop2: starting datanode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-datanode-hadoop2.out
slave1: starting datanode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-datanode-slave1.out
hadoop: starting datanode, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-datanode-hadoop.out
hadoop@hadoop:hadoop-2.7.2$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/ha/hadoop-2.7.2/logs/yarn-hadoop-resourcemanager-hadoop.out
hadoop2: starting nodemanager, logging to /opt/ha/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-hadoop2.out
hadoop1: starting nodemanager, logging to /opt/ha/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-hadoop1.out
slave2: starting nodemanager, logging to /opt/ha/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-slave2.out
hadoop: starting nodemanager, logging to /opt/ha/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-hadoop.out
slave1: starting nodemanager, logging to /opt/ha/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-slave1.out
hadoop@hadoop:hadoop-2.7.2$ jps
19384 JournalNode
19013 QuorumPeerMain
20649 Jps
20241 ResourceManager
20396 NodeManager
19815 DataNode

[hadoop@hadoop1 hadoop-2.7.2]$ jps
10091 NodeManager
9551 NameNode
9822 DataNode
9423 JournalNode
10232 Jps
9039 QuorumPeerMain
[hadoop@hadoop2 hadoop-2.7.2]$ jps
7450 NodeManager
7295 DataNode
6980 JournalNode
7120 NameNode
6854 QuorumPeerMain
7580 Jps
[hadoop@slave1 hadoop-2.7.2]$ jps
3706 DataNode
3988 Jps
3374 JournalNode
3591 NameNode
3860 NodeManager
3184 QuorumPeerMain
[hadoop@slave2 hadoop-2.7.2]$ jps
3023 QuorumPeerMain
3643 NodeManager
3782 Jps
3177 JournalNode
3497 DataNode
3383 NameNode

http://hadoop:8088/cluster/nodes/

 

 

Start zkfc on all NameNode nodes

 

[hadoop@hadoop1 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start zkfc
starting zkfc, logging to /opt/ha/hadoop-2.7.2/logs/hadoop-hadoop-zkfc-hadoop1.out
[hadoop@hadoop1 hadoop-2.7.2]$ jps
10665 DFSZKFailoverController
9551 NameNode
9822 DataNode
9423 JournalNode
10739 Jps
9039 QuorumPeerMain
10483 NodeManager

Upload a file to test

 

 

[hadoop@hadoop1 hadoop-2.7.2]$ bin/hdfs dfs -mkdir /test
16/05/19 16:09:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop1 hadoop-2.7.2]$ bin/hdfs dfs -put etc/hadoop/*.xml /test
16/05/19 16:09:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
# view from slave1
[hadoop@slave1 hadoop-2.7.2]$ bin/hdfs dfs -ls -R /
16/05/19 16:11:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - hadoop supergroup          0 2016-05-19 16:09 /test
-rw-r--r--   2 hadoop supergroup       4436 2016-05-19 16:09 /test/capacity-scheduler.xml
-rw-r--r--   2 hadoop supergroup       1185 2016-05-19 16:09 /test/core-site.xml
-rw-r--r--   2 hadoop supergroup       9683 2016-05-19 16:09 /test/hadoop-policy.xml
-rw-r--r--   2 hadoop supergroup       3814 2016-05-19 16:09 /test/hdfs-site.xml
-rw-r--r--   2 hadoop supergroup        620 2016-05-19 16:09 /test/httpfs-site.xml
-rw-r--r--   2 hadoop supergroup       3518 2016-05-19 16:09 /test/kms-acls.xml
-rw-r--r--   2 hadoop supergroup       5511 2016-05-19 16:09 /test/kms-site.xml
-rw-r--r--   2 hadoop supergroup       1170 2016-05-19 16:09 /test/mapred-site.xml
-rw-r--r--   2 hadoop supergroup       1777 2016-05-19 16:09 /test/yarn-site.xml
[hadoop@slave1 hadoop-2.7.2]$

Verify YARN

 

 

[hadoop@hadoop1 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /test /out
16/05/19 16:15:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/19 16:15:26 INFO client.RMProxy: Connecting to ResourceManager at hadoop/192.168.2.3:8032
16/05/19 16:15:27 INFO input.FileInputFormat: Total input paths to process : 9
16/05/19 16:15:27 INFO mapreduce.JobSubmitter: number of splits:9
16/05/19 16:15:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463644924165_0001
16/05/19 16:15:27 INFO impl.YarnClientImpl: Submitted application application_1463644924165_0001
16/05/19 16:15:27 INFO mapreduce.Job: The url to track the job:
16/05/19 16:15:27 INFO mapreduce.Job: Running job: job_1463644924165_0001
16/05/19 16:15:35 INFO mapreduce.Job: Job job_1463644924165_0001 running in uber mode : false
16/05/19 16:15:35 INFO mapreduce.Job:  map 0% reduce 0%
16/05/19 16:15:44 INFO mapreduce.Job:  map 11% reduce 0%
16/05/19 16:15:59 INFO mapreduce.Job:  map 11% reduce 4%
16/05/19 16:16:08 INFO mapreduce.Job:  map 22% reduce 4%
16/05/19 16:16:10 INFO mapreduce.Job:  map 22% reduce 7%
16/05/19 16:16:22 INFO mapreduce.Job:  map 56% reduce 7%
16/05/19 16:16:26 INFO mapreduce.Job:  map 100% reduce 67%
16/05/19 16:16:29 INFO mapreduce.Job:  map 100% reduce 100%
16/05/19 16:16:29 INFO mapreduce.Job: Job job_1463644924165_0001 completed successfully
16/05/19 16:16:31 INFO mapreduce.Job: Counters: 51
    File System Counters
        FILE: Number of bytes read=25164
        FILE: Number of bytes written=1258111
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=32620
        HDFS: Number of bytes written=13523
        HDFS: Number of read operations=30
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Killed map tasks=2
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=8
        Rack-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=381816
        Total time spent by all reduces in occupied slots (ms)=42021
        Total time spent by all map tasks (ms)=381816
        Total time spent by all reduce tasks (ms)=42021
        Total vcore-milliseconds taken by all map tasks=381816
        Total vcore-milliseconds taken by all reduce tasks=42021
        Total megabyte-milliseconds taken by all map tasks=390979584
        Total megabyte-milliseconds taken by all reduce tasks=43029504
    Map-Reduce Framework
        Map input records=963
        Map output records=3041
        Map output bytes=41311
        Map output materialized bytes=25212
        Input split bytes=906
        Combine input records=3041
        Combine output records=1335
        Reduce input groups=673
        Reduce shuffle bytes=25212
        Reduce input records=1335
        Reduce output records=673
        Spilled Records=2670
        Shuffled Maps =9
        Failed Shuffles=0
        Merged Map outputs=9
        GC time elapsed (ms)=43432
        CPU time spent (ms)=30760
        Physical memory (bytes) snapshot=1813704704
        Virtual memory (bytes) snapshot=8836780032
        Total committed heap usage (bytes)=1722810368
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=31714
    File Output Format Counters
        Bytes Written=13523

View at http://hadoop:8088/

 

Result

 

[hadoop@slave1 hadoop-2.7.2]$ bin/hdfs dfs -lsr /out
lsr: DEPRECATED: Please use 'ls -R' instead.
16/05/19 16:22:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   2 hadoop supergroup          0 2016-05-19 16:16 /out/_SUCCESS
-rw-r--r--   2 hadoop supergroup      13523 2016-05-19 16:16 /out/part-r-00000
[hadoop@slave1 hadoop-2.7.2]$

Test automatic failover

 

Right now the web UI shows hadoop1 and slave1 in the Active state. Stop those two NameNodes, then check again.

 

[hadoop@hadoop1 hadoop-2.7.2]$ jps
10665 DFSZKFailoverController
9551 NameNode
12166 Jps
9822 DataNode
9423 JournalNode
9039 QuorumPeerMain
10483 NodeManager
[hadoop@hadoop1 hadoop-2.7.2]$ sbin/hadoop-daemon.sh stop namenode
stopping namenode
[hadoop@hadoop1 hadoop-2.7.2]$ jps
10665 DFSZKFailoverController
9822 DataNode
9423 JournalNode
12221 Jps
9039 QuorumPeerMain
10483 NodeManager

 

 

[hadoop@slave1 hadoop-2.7.2]$ sbin/hadoop-daemon.sh stop namenode
stopping namenode
[hadoop@slave1 hadoop-2.7.2]$ jps
3706 DataNode
3374 JournalNode
4121 NodeManager
5460 Jps
4324 DFSZKFailoverController
3184 QuorumPeerMain

 

At this point the active NameNodes have failed over to hadoop2 and slave2 respectively.
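The failover can also be confirmed from the command line instead of the web UI. A sketch, assuming the NameNode IDs are nn1/nn2 (hypothetical names, since the hdfs-site.xml is not reproduced in this post):

```shell
# Query each nameservice's NameNode states after the failover.
bin/hdfs haadmin -ns cluster1 -getServiceState nn1   # fails: this NameNode is stopped
bin/hdfs haadmin -ns cluster1 -getServiceState nn2   # should report "active"
bin/hdfs haadmin -ns cluster2 -getServiceState nn2   # should report "active"
```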

The above covers the basic procedure for configuring HDFS HA with automatic failover, HDFS Federation, and YARN on Hadoop 2.7.2. Further configuration can be added as needed. ZooKeeper and JournalNode also do not have to run on every node; an odd number of each is enough, and in a larger cluster these roles can each be placed on dedicated hosts.

