Doris: Paimon Catalog

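Usage Notes

  1. When data is stored on HDFS, place core-site.xml, hdfs-site.xml, and hive-site.xml in the conf directory of both FE and BE. Doris reads the Hadoop configuration files in the conf directory first, and then the configuration files referenced by the HADOOP_CONF_DIR environment variable.
  2. The currently supported Paimon version is 0.8.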

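Create a Catalog

Paimon Catalog currently supports creating a Catalog on two types of Metastore:

  • filesystem (default): stores both metadata and data in the filesystem.
  • hive metastore: additionally stores metadata in Hive Metastore, so users can access these tables directly from Hive.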

Create a Catalog Based on FileSystem

HDFS
CREATE CATALOG `paimon_hdfs` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "hdfs://HDFS8000871/user/paimon",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.username" = "hadoop"
);

CREATE CATALOG `paimon_kerberos` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "hdfs://HDFS8000871/user/paimon",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos",
    "hadoop.kerberos.keytab" = "/doris/hdfs.keytab",
    "hadoop.kerberos.principal" = "hdfs@HADOOP.COM"
);

MinIO
CREATE CATALOG `paimon_s3` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "s3://bucket_name/paimons3",
    "s3.endpoint" = "http://<ip>:<port>",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk"
);

OBS
CREATE CATALOG `paimon_obs` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "obs://bucket_name/paimon",
    "obs.endpoint" = "obs.cn-north-4.myhuaweicloud.com",
    "obs.access_key" = "ak",
    "obs.secret_key" = "sk"
);

COS
CREATE CATALOG `paimon_cos` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "cosn://paimon-1308700295/paimoncos",
    "cos.endpoint" = "cos.ap-beijing.myqcloud.com",
    "cos.access_key" = "ak",
    "cos.secret_key" = "sk"
);

OSS
CREATE CATALOG `paimon_oss` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "oss://paimon-zd/paimonoss",
    "oss.endpoint" = "oss-cn-beijing.aliyuncs.com",
    "oss.access_key" = "ak",
    "oss.secret_key" = "sk"
);

Create a Catalog Based on Hive Metastore

CREATE CATALOG `paimon_hms` PROPERTIES (
    "type" = "paimon",
    "paimon.catalog.type" = "hms",
    "warehouse" = "hdfs://HDFS8000871/user/zhangdong/paimon2",
    "hive.metastore.uris" = "thrift://172.21.0.44:7004",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.username" = "hadoop"
);

CREATE CATALOG `paimon_hms_kerberos` PROPERTIES (
    "type" = "paimon",
    "paimon.catalog.type" = "hms",
    "warehouse" = "hdfs://HDFS8000871/user/zhangdong/paimon2",
    "hive.metastore.uris" = "thrift://172.21.0.44:7004",
    "hive.metastore.sasl.enabled" = "true",
    "hive.metastore.kerberos.principal" = "hive/xxx@HADOOP.COM",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos",
    "hadoop.kerberos.principal" = "hdfs@HADOOP.COM",
    "hadoop.kerberos.keytab" = "/doris/hdfs.keytab"
);

Create a Catalog Based on Aliyun DLF

This feature is supported since versions 2.1.7 and 3.0.3.

CREATE CATALOG `paimon_dlf` PROPERTIES (
    "type" = "paimon",
    "paimon.catalog.type" = "dlf",
    "warehouse" = "oss://xx/yy/",
    "dlf.proxy.mode" = "DLF_ONLY",
    "dlf.uid" = "xxxxx",
    "dlf.region" = "cn-beijing",
    "dlf.access_key" = "ak",
    "dlf.secret_key" = "sk"
    -- Optional properties. To enable them, add a comma after the line above
    -- and uncomment the lines below:
    -- "dlf.endpoint" = "dlf.cn-beijing.aliyuncs.com",
    -- "dlf.catalog.id" = "xxxx"
);
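
Once a catalog has been created with any of the statements above, it can be browsed and queried like any other Doris catalog. The following is a minimal sketch rather than a definitive recipe: it assumes the `paimon_hdfs` catalog from the HDFS example, and `example_db` and `example_tbl` are hypothetical names that stand in for a database and table that actually exist in your Paimon warehouse.

```sql
-- List all catalogs known to this Doris cluster; the new Paimon catalog should appear here.
SHOW CATALOGS;

-- Switch the current session to the Paimon catalog.
SWITCH paimon_hdfs;

-- Browse databases and tables (example_db is a hypothetical name).
SHOW DATABASES;
USE example_db;
SHOW TABLES;

-- Query a Paimon table in the current catalog (example_tbl is a hypothetical name).
SELECT * FROM example_tbl LIMIT 10;

-- Alternatively, use the fully qualified name without switching catalogs.
SELECT * FROM paimon_hdfs.example_db.example_tbl LIMIT 10;
```

The fully qualified form is convenient when a single query joins Paimon tables with tables in other catalogs.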

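Column Type Mapping

| Paimon Data Type | Doris Data Type | Comment |
| --- | --- | --- |
| BooleanType | Boolean | |
| TinyIntType | TinyInt | |
| SmallIntType | SmallInt | |
| IntType | Int | |
| FloatType | Float | |
| BigIntType | BigInt | |
| DoubleType | Double | |
| VarCharType | VarChar | |
| CharType | Char | |
| VarBinaryType, BinaryType | String | |
| DecimalType(precision, scale) | Decimal(precision, scale) | |
| TimestampType, LocalZonedTimestampType | DateTime | |
| DateType | Date | |
| ArrayType | Array | Supports nested Array |
| MapType | Map | Supports nested Map |
| RowType | Struct | Supports nested Struct (supported since versions 2.0.10 and 2.1.3) |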
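
To check how the columns of a particular Paimon table were mapped, one option is to describe the table from Doris. This is only a sketch: it assumes the `paimon_hdfs` catalog from the HDFS example, and `example_db` and `example_tbl` are hypothetical names.

```sql
-- Show the columns of a Paimon table as Doris sees them, including the mapped Doris types.
-- example_db and example_tbl are hypothetical names; replace them with a real database and table.
DESC paimon_hdfs.example_db.example_tbl;
```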
