presto安装及查询MySQL、Hive和Kafka

安装presto

1、下载服务端和客户端

服务端

1https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.234/presto-server-0.234.jar 2 3

客户端

1https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.234/presto-cli-0.234-executable.jar 2 3

2、安装

解压

1tar -zxvf presto-server-0.234.tar.gz 2 3

创建配置目录

1cd presto-server-0.234/ 2 3mkdir etc 4 5mkdir /newdisk/presto/data(最好将数据目录放到presto-server-0.234以外) 6 7cd etc 8 9mkdir catalog 10 11

3、配置

创建并配置config.properties

1#是否为Coordinator调度节点 2coordinator=true 3#是否也作为work。对于大型集群来说,在coordinator里做worker的工作会影响查询性能 4node-scheduler.include-coordinator=true 5#指定HTTP端口。Presto使用HTTP来与外部和内部进行交流 6http-server.http.port=8083 7#查询能用到的最大总内存 8query.max-memory=500MB 9#查询能用到的最大单结点内存 10query.max-memory-per-node=200MB 11#query.max-total-memory-per-node=2GB 12#Presto使用Discovery服务去找到集群中的所有结点。每个Presto实例在启动时都会在Discovery服务里注册。这样可以简化部署, 不需要额外的服务,Presto的coordinator内置一个Discovery服务。也是使用HTTP端口 13discovery-server.enabled=true 14#Coordinator节点的域名或者IP,Presto使用Discovery服务去找到集群中的所有结点。每个Presto实例在启动时都会在Discovery服务里注册。这样可以简化部署, 不需要额外的服务,Presto的coordinator内置一个Discovery服务。也是使用HTTP端口 15discovery.uri=http://192.168.72.141:8083 16 17

创建并配置jvm.config

1-server 2-Xmx16G 3-XX:+UseG1GC 4-XX:G1HeapRegionSize=32M 5-XX:+UseGCOverheadLimit 6-XX:+ExplicitGCInvokesConcurrent 7-XX:+HeapDumpOnOutOfMemoryError 8-XX:+ExitOnOutOfMemoryError 9 10

创建并配置log.properties

1com.facebook.presto = INFO 2 3

创建并配置node.properties

1#自定义环境名字,Presto集群中的结点的环境名字都必须是一样的 2node.environment=production 3#唯一标识,每个结点的标识都必须是为一的。就算重启或升级Presto都必须还保持原来的标识 4node.id=presto1 5#数据目录,Presto用它来保存log和其他数据 6node.data-dir=/newdisk/presto/data 7 8

4、启动

启动服务端

1cd /presto-server-0.234/bin 2 3./launcher start 后台启动 4 5./launcher stop停止 6 7./launcher run 前台启动,输出日志 8 9./launcher restart 重启 10 11

启动客户端

1#重命名客户端jar包presto-cli-0.228-executable.jar为presto 可执行文件 2mv presto-cli-0.228-executable.jar presto 3 4使用命令: 5./presto --server locahost:8083--catalog mysql --schema test11 6 7

presto操作MySQL

1、在 /etc/catalog/目录下创建mysql.properties

1connector.name=mysql 2connection-url=jdbc:mysql://192.168.72.143:3306 3connection-user=root 4connection-password=root 5 6

2、重启presto-server

1/launcher restart 2 3

3、测试

1在客户端安装目录执行 2./presto --server locahost:8083--catalog mysql--schema test11 3presto> select * from mysql.test11.test; 4 5

presto操作hive

1、在 /etc/catalog/目录下创建hive.properties

1connector.name=hive-hadoop2 2hive.metastore.uri=thrift://localhost:9083 3hive.config.resources=/export/servers/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml,/export/servers/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml 4 5

2、重启presto-server

1/launcher restart 2 3

3、测试

1在客户端安装目录执行 2./presto --server locahost:9080 --catalog hive--schema default 3presto> select * from hive.default.test; 4 5

presto操作kafka

1、在 /etc/catalog/目录下创建kafka.properties

1connector.name=kafka 2kafka.table-names=hive2kafka,hive2kafka2,kafkatopic,hivetopic,hivetopic2,prestotopic2,tpch.kafkatopic2 3kafka.nodes=node01:9092,node02:9092,node03:9092 4kafka.hide-internal-columns=false 5kafka.table-description-dir=/newdisk/presto/presto-server-0.234/etc/kafka 6 7

2、创建kafka目录,设置schema

1mkdir kafka 2cd kafka 3vim kafkatopic.json 4 5
1{ 2 "tableName": "kafkatopic", 3 "schemaName": "tpch", 4 "topicName": "tpch.kafkatopic", 5 "message": { 6 "dataFormat": "json", 7 "fields": [ 8 { 9 "name": "p0", 10 "mapping": "p0", 11 "type": "DOUBLE" 12 }, 13 { 14 "name": "p1", 15 "mapping": "p1", 16 "type": "DOUBLE" 17 }, 18 { 19 "name": "p2", 20 "mapping": "p2", 21 "type": "DOUBLE" 22 }, 23 { 24 "name": "mp_id", 25 "mapping": "mp_id", 26 "type": "BIGINT" 27 }, 28 { 29 "name": "p3", 30 "mapping": "p3", 31 "type": "DOUBLE" 32 }, 33 { 34 "name": "i1", 35 "mapping": "i1", 36 "type": "DOUBLE" 37 }, 38 { 39 "name": "i2", 40 "mapping": "i2", 41 "type": "DOUBLE" 42 }, 43 { 44 "name": "i3", 45 "mapping": "i3", 46 "type": "DOUBLE" 47 }, 48 { 49 "name": "data_date", 50 "mapping": "data_date", 51 "type": "VARCHAR" 52 }, 53 { 54 "name": "sjd", 55 "mapping": "sjd", 56 "type": "BIGINT" 57 }, 58 { 59 "name": "tg_id", 60 "mapping": "tg_id", 61 "type": "DOUBLE" 62 }, 63 { 64 "name": "t_factor", 65 "mapping": "t_factor", 66 "type": "BIGINT" 67 }, 68 { 69 "name": "u1", 70 "mapping": "u1", 71 "type": "DOUBLE" 72 }, 73 { 74 "name": "u2", 75 "mapping": "u2", 76 "type": "DOUBLE" 77 }, 78 { 79 "name": "u3", 80 "mapping": "u3", 81 "type": "DOUBLE" 82 } 83 ] 84 } 85} 86 87

3、测试
注意:kafka的topic名字要写成tpch.kafkatopic这种

1在客户端安装目录执行 2./presto --server locahost:8083--catalog kafka--schema tpch 3presto> select * from tpch.kafkatopic; 4 5

在这里插入图片描述

多数据源查询

1./presto --server locahost:8083--catalog mysql 2 3presto:es> select * from mysql.test11.user t left join hive.default.es_test t1 on t.age=t1.age; 4 5

代码交流 2021