[TPC-DS] Trino + S3 + Hive + PostgreSQL Performance Testing ---- Hadoop + MySQL + Hive Installation and Deployment (Part 3)


Overall process

1. Install hadoop-3.2.0

2. Install hive-standalone-metastore-3.0.0

3. Configure environment variables

 

Install hadoop-3.2.0

  • Download Hadoop from https://archive.apache.org/dist/hadoop/common/ and choose version 3.2.0; make sure to pick the hadoop-3.2.0.tar.gz package
  • Extract the archive
tar -zxvf hadoop-3.2.0.tar.gz -C /opt/
  • Create symlinks (this puts the hadoop-aws and aws-java-sdk jars on Hadoop's common classpath, which S3A access requires)
ln -sv /opt/hadoop-3.2.0 /opt/hadoop
ln -sv /opt/hadoop/share/hadoop/tools/lib/hadoop-aws* /opt/hadoop/share/hadoop/common/lib/
ln -sv /opt/hadoop/share/hadoop/tools/lib/aws-java-sdk* /opt/hadoop/share/hadoop/common/lib/
  • Configure the Hadoop environment
vim /etc/profile

Add the following content:

#set java environment
export JAVA_HOME CLASSPATH PATH
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
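After saving /etc/profile, reloading it and printing the Hadoop version is a quick way to confirm the installation; a minimal check:

# reload the environment and verify the hadoop binary resolves
source /etc/profile
hadoop version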

 

Install hive-standalone-metastore-3.0.0

  • Download the package: http://apache.uvigo.es/hive/hive-standalone-metastore-3.0.0/hive-standalone-metastore-3.0.0-bin.tar.gz
  • Extract the archive
tar -zxvf hive-standalone-metastore-3.0.0-bin.tar.gz -C /opt/
  • Create a symlink and patch the scripts (the first sed sends startup output to /tmp/metastore_start.log instead of /dev/null; the second raises the metastore heap from 256 MB to 1024 MB)
ln -s /opt/apache-hive-metastore-3.0.0-bin /opt/hive-metastore
sed -i 's#/dev/null#/tmp/metastore_start.log#' /opt/apache-hive-metastore-3.0.0-bin/bin/start-metastore
sed -i 's/256/1024/g' /opt/hive-metastore/bin/metastore-config.sh
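To confirm that both sed edits landed, grep for the new values; each command should print the modified line:

# the start script should now log to /tmp/metastore_start.log
grep -n 'metastore_start.log' /opt/hive-metastore/bin/start-metastore
# the heap setting should now read 1024 (MB) instead of 256
grep -n '1024' /opt/hive-metastore/bin/metastore-config.sh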

Install mysql-connector-java-5.1.47

  • Download the package: https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.47.tar.gz
  • Extract the archive
tar -zxvf mysql-connector-java-5.1.47.tar.gz -C /opt/
  • Create symlinks so both Hadoop and the metastore can load the JDBC driver
ln -s /opt/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar /opt/hadoop/share/hadoop/common/lib/ 
ln -s /opt/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar /opt/hive-metastore/lib/
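A quick check that both driver symlinks resolve; ls -L follows the links and reports an error if either target is missing:

# dereference the symlinks; an error here means a link is broken
ls -lL /opt/hadoop/share/hadoop/common/lib/mysql-connector-java-5.1.47.jar \
       /opt/hive-metastore/lib/mysql-connector-java-5.1.47.jar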

 

Modify and add the configuration file metastore-site.xml

  • Go to the directory /opt/hive-metastore/conf/ and rename hive-site.xml to metastore-site.xml, then replace the contents of metastore-site.xml with the following:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>metastore.task.threads.always</name>
        <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
    </property>
    <property>
        <name>metastore.expression.proxy</name>
        <value>org.apache.hadoop.hive.metastore.DefaultPartitionExpressionProxy</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>s3a://fastdata-tsdb/syw/warehouse/</value>
    </property>
    <property>
        <name>hive.metastore.local</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://10.201.0.124:9083</value>
    </property>
    <property>
        <name>fs.s3a.fast.upload</name>
        <value>true</value>
    </property>
    <property>
        <name>fs.s3a.multipart.size</name>
        <value>100M</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://10.201.0.125:3306/metastore_db?useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
</configuration>

[Important] The following fields must be adjusted to match your environment; everything else stays unchanged:

  • hive.metastore.uris: your own server address; replace 10.201.0.124 with your server's IP
  • javax.jdo.option.ConnectionURL: the connection URL of your own MySQL installation
  • javax.jdo.option.ConnectionUserName: your MySQL user name
  • javax.jdo.option.ConnectionPassword: your MySQL password
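Before moving on, the edited file can be checked for well-formed XML; a minimal sketch, assuming xmllint (from libxml2) is installed on the host:

# no output from --noout means the XML parses cleanly
xmllint --noout /opt/hive-metastore/conf/metastore-site.xml && echo "metastore-site.xml OK"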

Modify and add the configuration file core-site.xml

  • Go to /opt/hadoop/etc/hadoop/ and edit core-site.xml, replacing its original contents with the code below; no further changes are needed
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.s3a.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>
    <property>
        <name>fs.s3a.access.key</name>
        <value>GVO55HIKELKQIN4QSOPL</value>
    </property>
    <property>
        <name>fs.s3a.secret.key</name>
        <value>XgdhYuGsXHHg3yi0WySVQ8GwztW4SpIrL5irUdlA</value>
    </property>
    <property>
        <name>fs.s3a.endpoint</name>
        <value>obs.cn-southwest-2.myhuaweicloud.com</value>
    </property>
    <property>
        <name>fs.s3a.list.version</name>
        <value>1</value>
    </property>
</configuration>
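With core-site.xml in place, listing the warehouse bucket is a quick way to confirm the S3A credentials and endpoint work; this assumes the fastdata-tsdb bucket from hive.metastore.warehouse.dir above:

# list the bucket root; any auth or endpoint problem will surface here
hadoop fs -ls s3a://fastdata-tsdb/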

 

Add the hive, postgres, and iceberg configuration files under catalog

Note: every server needs these files. Apart from the master IP called out below, they can be copied and pasted as-is; a sketch for pushing them to all nodes follows the file listings.

  • Go to the directory /root/trino/trino-server-363/etc/catalog
  • Create the files hive.properties, postgresql.properties, and iceberg.properties
  • hive.properties. Change 10.201.0.124 to the master's IP; leave everything else unchanged
connector.name=hive-hadoop2
hive.metastore.uri=thrift://10.201.0.124:9083
hive.allow-drop-table=true
hive.max-partitions-per-writers=10000
hive.max-partitions-per-scan=1000000
hive.s3.endpoint=obs.cn-southwest-2.myhuaweicloud.com
hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
hive.s3.max-connections=100

  • postgresql.properties. (Note: tpcds is the name of the database in PostgreSQL)

connector.name=postgresql
connection-url=jdbc:postgresql://10.201.0.125:5432/tpcds
connection-user=postgres
connection-password=public
  • iceberg.properties. Change 10.201.0.124 to the master's IP; leave everything else unchanged
connector.name=iceberg
hive.metastore.uri=thrift://10.201.0.124:9083
hive.max-partitions-per-scan=1000000
hive.s3.endpoint=obs.cn-southwest-2.myhuaweicloud.com
hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
hive.s3.max-connections=100
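Since every node needs identical catalog files, a small loop can push them out; a sketch, where the IP list is a placeholder for your own nodes:

# replace the IP list with your own worker nodes
for node in 10.201.0.125 10.201.0.126; do
    scp /root/trino/trino-server-363/etc/catalog/{hive,postgresql,iceberg}.properties \
        root@${node}:/root/trino/trino-server-363/etc/catalog/
done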

Start the metastore (on 124)

Note: the metastore expects the metastore_db schema from the next two steps to already exist in MySQL; if startup fails because the schema is missing, complete those steps first and retry.

/opt/hive-metastore/bin/start-metastore -p 9083
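To confirm the metastore actually came up, check the startup log (redirected to /tmp/metastore_start.log earlier) and the Thrift port; a minimal sketch, assuming ss is available (use netstat -tlnp on older systems):

# show the tail of the startup log and confirm port 9083 is listening
tail -n 20 /tmp/metastore_start.log
ss -tlnp | grep 9083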

Create the metastore_db database

  • Connect to MySQL and manually create a database named metastore_db
mysql -uroot -h 10.201.0.125 -p
# enter the password when prompted
create database metastore_db;

Create the tables in the metastore_db database

/opt/hive-metastore/bin/schematool --verbose -initSchema -dbType mysql -userName root -passWord 123456 -url "jdbc:mysql://10.201.0.125:3306/metastore_db?useSSL=false"

[Important] Remember to change 10.201.0.125 to the address of your own MySQL server

  • Verify the tables in the database from within MySQL, as shown below
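A non-interactive variant of the check, using the credentials configured above:

# list the tables schematool just created
mysql -uroot -h 10.201.0.125 -p -e "use metastore_db; show tables;"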

Start Trino

Note: if Trino was already started earlier, it does not need to be started again here

Verify that the Trino environment is working

Verify the environment:

./trino --server 10.201.0.125:8080 --catalog postgresql
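The CLI can also run a query non-interactively; if the catalogs show up, the wiring is complete:

# confirm hive, postgresql, and iceberg appear in the catalog list
./trino --server 10.201.0.125:8080 --execute "show catalogs"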