Hadoop Cluster Setup
Hadoop Configuration
Versions: VMware Workstation Pro 16.2, Hadoop 2.7, CentOS 7
Go to the hadoop/etc/hadoop directory.
Configure core-site.xml
```xml
<configuration>
    <!-- With HA and automatic failover, the default FS should point at the
         nameservice (mycluster) rather than a single NameNode host -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
    </property>
    <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>10000</value>
    </property>
</configuration>
```
Configure hdfs-site.xml
```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/namenode/data</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/datanode/data</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop02:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop02:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/journalnode/data</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>
```
Configure yarn-site.xml
```xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>my-yarn-cluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop02</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop03</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop02:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop03:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>
```
Configure mapred-site.xml
```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```
Configure slaves
hadoop01
hadoop02
hadoop03
Distribute the installation
```shell
# Copy the Hadoop installation to hadoop02
scp -r <Hadoop installation path> hadoop02:<target path>
# Copy the Hadoop installation to hadoop03
scp -r <Hadoop installation path> hadoop03:<target path>
```
Initialization
Run the NameNode format command on hadoop01:

```shell
hdfs namenode -format
```
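In an HA setup with a QJM shared edits directory, running the format command by itself is usually not enough: the JournalNodes must already be running before the format, the standby NameNode must copy the metadata rather than format again, and the failover state in ZooKeeper must be initialized. A sketch of the full first-time initialization order, assuming ZooKeeper is installed on all three nodes and the Hadoop scripts are on the PATH:

```shell
# 1. On hadoop01/02/03: start ZooKeeper (assumes zkServer.sh is on the PATH)
zkServer.sh start

# 2. On hadoop01/02/03: start a JournalNode so the shared edits dir is reachable
hadoop-daemon.sh start journalnode

# 3. On hadoop01 only: format and start the first NameNode (nn1)
hdfs namenode -format
hadoop-daemon.sh start namenode

# 4. On hadoop02: copy nn1's metadata instead of formatting a second time
hdfs namenode -bootstrapStandby

# 5. On hadoop01: initialize the automatic-failover state in ZooKeeper
hdfs zkfc -formatZK
```

Formatting both NameNodes independently would give them different namespace IDs and break the cluster, which is why nn2 uses -bootstrapStandby.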
Start the cluster
From the ${HADOOP_HOME}/sbin directory on hadoop01, start Hadoop. The related services on hadoop02 and hadoop03 will be started as well:
```shell
# Start the HDFS daemons
start-dfs.sh
# Start the YARN daemons
start-yarn.sh
```
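One caveat worth noting: in Hadoop 2.x, start-yarn.sh only starts the ResourceManager on the host where it is run, and this configuration places rm1/rm2 on hadoop02/hadoop03. A sketch of starting the ResourceManagers and checking the HA state, using the standard daemon and admin commands:

```shell
# On hadoop02 and hadoop03: start the ResourceManager manually,
# since start-yarn.sh run on hadoop01 will not start a remote RM
yarn-daemon.sh start resourcemanager

# Check which NameNode is active (should report active/standby)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Check which ResourceManager is active
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```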
Verify the cluster
Run jps to check the service processes; if they match the figure, the configuration succeeded.
Open http://<hostname>:50070 in a browser to reach the HDFS Web UI.
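Given the configuration above (NameNodes on hadoop01/02, ResourceManagers on hadoop02/03, and all three hosts listed in slaves), jps on each node should show roughly the following daemons; this is a sketch, and pids and ordering will differ:

```shell
# Expected daemons per node with this configuration:
# hadoop01: NameNode, DFSZKFailoverController, DataNode, NodeManager,
#           JournalNode, QuorumPeerMain
# hadoop02: NameNode, DFSZKFailoverController, DataNode, NodeManager,
#           JournalNode, ResourceManager, QuorumPeerMain
# hadoop03: DataNode, NodeManager, JournalNode, ResourceManager, QuorumPeerMain
jps
```

If a daemon is missing, check its log under ${HADOOP_HOME}/logs on that node.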