
HDFS 3.4.1: Integrating Kerberos for Account Authentication

Integrating HDFS with Kerberos for account authentication: an investigation

Table of contents

  • Integrating HDFS with Kerberos for account authentication: an investigation
    • Environment preparation
    • Deployment and integration
      • HDFS service layout
      • Creating the Kerberos principals
        • Create principals for each Hadoop service
        • Create the directory for keytab files
        • Export keytab files from Kerberos
        • Adjust keytab file permissions
      • Secure DataNode
      • Configuring HTTPS access
      • Modifying the Hadoop configuration files
        • core-site.xml
        • hdfs-site.xml
        • hdfs-rbf-site.xml
    • Verifying Kerberos access
      • Verification clients
      • Verification items
    • Problems encountered
      • Problem 1
      • Problem 2
      • Problem 3
      • Problem 4
      • Problem 5
      • Problem 6
      • Problem 7
      • Problem 8
      • Problem 9

Environment preparation

The software packages involved in this installation are:

  • jdk1.8.0_161.tgz
  • hadoop-3.4.1.tar.gz
  • zookeeper-3.8.4.tar.gz
  • The Kerberos service is already deployed, and all 11 servers involved have the Kerberos client installed; for the deployment itself, see "Deploying a Kerberos cluster on CentOS 7.3".
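
Before going further it is worth confirming, on every node, that the Kerberos client actually works against the KDC. A minimal sketch, assuming the CentOS 7 package names and the EXAMPLE.COM realm used throughout this write-up:

# Sketch: per-node sanity check of the Kerberos client.
rpm -q krb5-workstation krb5-libs        # client packages installed?
grep default_realm /etc/krb5.conf        # realm configured (EXAMPLE.COM here)?
kinit -V bigdata@EXAMPLE.COM             # can a ticket be obtained? (any existing principal works; bigdata is created later in this write-up)
klist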

Deployment and integration

HDFS service layout

ZooKeeper does not have Kerberos authentication enabled.

| IP | Service | Hostname | Principal |
| --- | --- | --- | --- |
| 192.168.1.1 | NameNode | hadoop3test1-01.test.com | nn/hadoop3test1-01.test.com |
| 192.168.1.2 | NameNode | hadoop3test1-02.test.com | nn/hadoop3test1-02.test.com |
| 192.168.1.3 | JournalNode | hadoop3test1-03.test.com | jn/hadoop3test1-03.test.com |
| 192.168.1.4 | JournalNode | hadoop3test1-04.test.com | jn/hadoop3test1-04.test.com |
| 192.168.1.5 | JournalNode | hadoop3test1-05.test.com | jn/hadoop3test1-05.test.com |
| 192.168.1.6 | DataNode | hadoop3test1-06.test.com | dn/hadoop3test1-06.test.com |
| 192.168.1.7 | DataNode | hadoop3test1-07.test.com | dn/hadoop3test1-07.test.com |
| 192.168.1.8 | DataNode | hadoop3test1-08.test.com | dn/hadoop3test1-08.test.com |
| 192.168.1.9 | Router | hadoop3test1-09.test.com | router/hadoop3test1-09.test.com |
| 192.168.1.10 | Router | hadoop3test1-10.test.com | router/hadoop3test1-10.test.com |
| 192.168.1.11 | Router | hadoop3test1-11.test.com | router/hadoop3test1-11.test.com |
| 192.168.1.1 | HTTP | hadoop3test1-01.test.com | HTTP/hadoop3test1-01.test.com |
| 192.168.1.2 | HTTP | hadoop3test1-02.test.com | HTTP/hadoop3test1-02.test.com |
| 192.168.1.3 | HTTP | hadoop3test1-03.test.com | HTTP/hadoop3test1-03.test.com |
| 192.168.1.4 | HTTP | hadoop3test1-04.test.com | HTTP/hadoop3test1-04.test.com |
| 192.168.1.5 | HTTP | hadoop3test1-05.test.com | HTTP/hadoop3test1-05.test.com |
| 192.168.1.6 | HTTP | hadoop3test1-06.test.com | HTTP/hadoop3test1-06.test.com |
| 192.168.1.7 | HTTP | hadoop3test1-07.test.com | HTTP/hadoop3test1-07.test.com |
| 192.168.1.8 | HTTP | hadoop3test1-08.test.com | HTTP/hadoop3test1-08.test.com |
| 192.168.1.9 | HTTP | hadoop3test1-09.test.com | HTTP/hadoop3test1-09.test.com |
| 192.168.1.10 | HTTP | hadoop3test1-10.test.com | HTTP/hadoop3test1-10.test.com |
| 192.168.1.11 | HTTP | hadoop3test1-11.test.com | HTTP/hadoop3test1-11.test.com |

Creating the Kerberos principals

Create principals for each Hadoop service

Note: with -randkey the password is randomized; with -pw a password is specified explicitly and has to be typed by hand.

# Run the following commands on the Kerberos server (KDC)
kadmin.local -q "addprinc -pw 123456 nn/hadoop3test1-01.test.com";
kadmin.local -q "addprinc -pw 123456 nn/hadoop3test1-02.test.com";
kadmin.local -q "addprinc -pw 123456 dn/hadoop3test1-06.test.com";
kadmin.local -q "addprinc -pw 123456 dn/hadoop3test1-07.test.com";
kadmin.local -q "addprinc -pw 123456 dn/hadoop3test1-08.test.com";
kadmin.local -q "addprinc -pw 123456 jn/hadoop3test1-03.test.com";
kadmin.local -q "addprinc -pw 123456 jn/hadoop3test1-04.test.com";
kadmin.local -q "addprinc -pw 123456 jn/hadoop3test1-05.test.com";
kadmin.local -q "addprinc -pw 123456 router/hadoop3test1-09.test.com";
kadmin.local -q "addprinc -pw 123456 router/hadoop3test1-10.test.com";
kadmin.local -q "addprinc -pw 123456 router/hadoop3test1-11.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-01.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-02.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-03.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-04.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-05.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-06.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-07.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-08.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-09.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-10.test.com";
kadmin.local -q "addprinc -pw 123456 HTTP/hadoop3test1-11.test.com";

# List the principals in the database; the newly created principals should show up
kadmin.local -q "listprincs"

# todo
# kadmin.local -q 'modprinc -maxrenewlife 360day +allow_renewable bigdata@EXAMPLE.COM'

Create the directory for keytab files

# Create the directory on every node
mkdir -p /opt/keytabs;
ssh 192.168.1.1 "mkdir -p /opt/keytabs";
ssh 192.168.1.2 "mkdir -p /opt/keytabs";
ssh 192.168.1.3 "mkdir -p /opt/keytabs";
ssh 192.168.1.4 "mkdir -p /opt/keytabs";
ssh 192.168.1.5 "mkdir -p /opt/keytabs";
ssh 192.168.1.6 "mkdir -p /opt/keytabs";
ssh 192.168.1.7 "mkdir -p /opt/keytabs";
ssh 192.168.1.8 "mkdir -p /opt/keytabs";
ssh 192.168.1.9 "mkdir -p /opt/keytabs";
ssh 192.168.1.10 "mkdir -p /opt/keytabs";
ssh 192.168.1.11 "mkdir -p /opt/keytabs";
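
The per-host commands in this write-up are spelled out one by one for clarity. Assuming passwordless SSH from the admin node, they can be collapsed into a loop, for example:

# Sketch: same effect as the commands above, driven by a host loop.
HOSTS=$(seq -f "192.168.1.%g" 1 11)
for h in $HOSTS; do
  ssh "$h" "mkdir -p /opt/keytabs"
done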
Export keytab files from Kerberos

Write the Hadoop service principals into keytab files, then distribute the keytab files to each node as needed.

There are two approaches:

  1. Group principals by service, writing each service's principals into its own keytab file: nn.service.keytab, jn.service.keytab, dn.service.keytab, router.service.keytab and spnego.service.keytab.
  2. Write all HDFS-related principals into a single keytab file, e.g. hdfs.service.keytab plus spnego.service.keytab.

When exporting the keytabs, add the -norandkey option; otherwise ktadd re-randomizes the principal's password and a later kinit with the original password fails with a password error.

kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/nn.service.keytab nn/hadoop3test1-01.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-01.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/nn.service.keytab nn/hadoop3test1-02.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-02.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/jn.service.keytab jn/hadoop3test1-03.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-03.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/jn.service.keytab jn/hadoop3test1-04.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-04.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/jn.service.keytab jn/hadoop3test1-05.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-05.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/dn.service.keytab dn/hadoop3test1-06.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-06.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/dn.service.keytab dn/hadoop3test1-07.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-07.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/dn.service.keytab dn/hadoop3test1-08.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-08.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/router.service.keytab router/hadoop3test1-09.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-09.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/router.service.keytab router/hadoop3test1-10.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-10.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/router.service.keytab router/hadoop3test1-11.test.com@EXAMPLE.COM";
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/spnego.service.keytab HTTP/hadoop3test1-11.test.com@EXAMPLE.COM";
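
Before distributing anything, it helps to confirm on the KDC that each exported keytab contains the expected principals. A minimal sketch:

# Sketch: on the KDC, list the principals captured in each exported keytab.
for kt in /opt/keytabs/*.keytab; do
  echo "== $kt"
  klist -kt "$kt"
done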
Adjust keytab file permissions

For ease of verification, the files are simply set to mode 775 here. In practice, set the owner and group of each file according to its service, e.g. chown -R bigdata:bigdata /opt/keytabs/hdfs.service.keytab

chmod 775 /opt/keytabs/*
# Pull the keytabs to the distribution host, then push them out to every node
scp root@192.168.1.9:/opt/keytabs/* /root/gd/keytabs/
scp /root/gd/keytabs/* root@192.168.1.1:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.2:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.3:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.4:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.5:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.6:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.7:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.8:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.9:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.10:/opt/keytabs/;
scp /root/gd/keytabs/* root@192.168.1.11:/opt/keytabs/;
ssh 192.168.1.1 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.2 "chmod 775 /opt/keytabs/*";	
ssh 192.168.1.3 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.4 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.5 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.6 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.7 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.8 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.9 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.10 "chmod 775 /opt/keytabs/*";
ssh 192.168.1.11 "chmod 775 /opt/keytabs/*";

The klist -kt command lists the principals contained in a given keytab file (read permission on the file is required), for example: klist -kt /opt/keytabs/spnego.service.keytab
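
On each node, a quick way to prove that the distributed keytab actually works against the KDC is to authenticate with it once by hand. A sketch (the principal must match the node's own hostname; shown here for a DataNode):

# Sketch: on a DataNode, log in with the service keytab and inspect the ticket.
kinit -kt /opt/keytabs/dn.service.keytab dn/$(hostname -f)@EXAMPLE.COM
klist
kdestroy   # drop the manual ticket again; the daemons log in from the keytab themselves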

Secure DataNode

Once Kerberos is enabled, the DataNodes must also run as secure DataNodes. There are two ways to do this:

  1. Start the DataNode as root so it can bind the privileged ports, then switch to the user specified by HDFS_DATANODE_SECURE_USER to keep running. Because the DataNode data transfer protocol does not use the Hadoop RPC framework, the DataNode must authenticate itself by using the privileged ports configured via dfs.datanode.address and dfs.datanode.http.address; this authentication rests on the assumption that an attacker cannot obtain root on the DataNode host. At startup, HDFS_DATANODE_SECURE_USER and JSVC_HOME must be set as environment variables (in hadoop-env.sh). In short: set up a jsvc environment, set HDFS_DATANODE_SECURE_USER, configure dfs.datanode.address and dfs.datanode.http.address to privileged ports, and finally start the DataNode as root (a hadoop-env.sh sketch follows this list).

    (the jsvc program)

    Privileged ports are the Linux privileged port numbers, generally those below 1024, used by well-known services (hence also called well-known ports).

  2. Starting with version 2.6.0, SASL can be used to authenticate the data transfer protocol. This requires enabling HTTPS on the servers, setting dfs.data.transfer.protection to authentication in hdfs-site.xml, setting dfs.datanode.address to a non-privileged port, and setting dfs.http.policy to HTTPS_ONLY.

    Only HDFS clients of version 2.6.0 or later can connect to a DataNode that uses SASL to authenticate the data transfer protocol. If both kinds of secure DataNode exist among the DNs, an HDFS client with SASL enabled can talk to both.

    dfs.data.transfer.protection

    Setting this property enables SASL authentication of the data transfer protocol. When it is enabled, dfs.datanode.address must use a non-privileged port, dfs.http.policy must be set to HTTPS_ONLY, and the HDFS_DATANODE_SECURE_USER environment variable must be left undefined when starting the DataNode process. Possible values:

    1. authentication: authentication only;
    2. integrity: integrity checking in addition to authentication;
    3. privacy: data encryption in addition to integrity. The property is unspecified by default.
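
This write-up uses option 2 (SASL plus HTTPS). For reference, a minimal hadoop-env.sh sketch for option 1 (root plus jsvc) might look like the following; the jsvc path and the service user are assumptions for this environment:

# Sketch only: option 1 (root + jsvc) settings in hadoop-env.sh.
export HDFS_DATANODE_SECURE_USER=bigdata   # unprivileged user the DataNode drops to after binding ports
export JSVC_HOME=/usr/bin                  # directory containing the jsvc binary (placeholder path)
# dfs.datanode.address and dfs.datanode.http.address must then point at privileged
# ports (<1024) in hdfs-site.xml, and the DataNode is started as root.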

Configuring HTTPS access

Steps 1-10 are executed as root. For convenience, wherever a password needs to be entered, 123456 is used throughout. A quick verification sketch follows the steps below.

  1. Install openssl on every machine

    yum install openssl
    
  2. Generate the X.509 key and certificate (the CA)

    openssl req -new -x509 -keyout bd_ca_key -out bd_ca_cert -days 36500 -subj '/C=CN/ST=jiangsu/L=nanjing/O=test/OU=test/CN=test'
    

    Parameter explanation

    • -new: generate a new certificate request.
    • -x509: output a self-signed certificate directly instead of a certificate request (CSR).
    • -keyout bd_ca_key: write the generated private key to the file bd_ca_key.
    • -out bd_ca_cert: write the generated certificate to the file bd_ca_cert.
    • -days 36500: set the certificate validity to 36,500 days (roughly 100 years).
    • -subj "/CN=your_common_name": set the certificate subject in the form /type0=value0/type1=value1..., overriding the values that would otherwise be prompted for. An empty value means the default from the config file is used; a value of . leaves that field blank. Recognized types (see man req) include: C = Country, ST = State/Province, L = Locality (city), O = Organization, OU = Organizational Unit, and CN = Common Name (for example, your own domain).
  3. Distribute the key and the certificate

    # Create the directory that will hold the SSL material
    ssh 192.168.1.1 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.2 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.3 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.4 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.5 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.6 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.7 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.8 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.9 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.10 "mkdir -p /opt/kerberos_https";
    ssh 192.168.1.11 "mkdir -p /opt/kerberos_https";
    # Distribute the key and the certificate
    scp /root/gd/kerberos_https/* root@192.168.1.1:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.2:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.3:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.4:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.5:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.6:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.7:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.8:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.9:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.10:/opt/kerberos_https/;
    scp /root/gd/kerberos_https/* root@192.168.1.11:/opt/kerberos_https/;
  4. Generate a keystore file on each node

    # Run on hadoop3test1-01.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-01.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-01.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-02.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-02.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-02.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-03.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-03.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-03.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-04.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-04.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-04.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-05.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-05.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-05.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-06.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-06.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-06.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-07.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-07.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-07.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-08.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-08.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-08.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-09.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-09.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-09.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-10.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-10.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-10.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    # Run on hadoop3test1-11.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-11.test.com -genkey -keyalg RSA -dname "CN=hadoop3test1-11.test.com, OU=dev, O=dev, L=nanjing, ST=jiangsu, C=CN";
    
    • keytool: the command-line key and certificate management utility that ships with Java.
    • -keystore /opt/kerberos_https/keystore: location and name of the keystore file.
    • -alias: the alias identifying the key pair stored in the keystore; here it is set to each node's hostname.
    • -genkey: generate a new key pair.
    • -keyalg RSA: the key algorithm, RSA here.
    • -dname "CN=dev1, OU=dev2, O=dev3, L=dev4, ST=dev5, C=CN": the subject distinguished name used for the generated certificate, made up of:
      • CN (Common Name): the common name; it is recommended to use each node's hostname.
      • OU (Organizational Unit): the unit or department within the organization.
      • O (Organization): the organization name.
      • L (Locality): the city or locality.
      • ST (State or Province): the state or province.
      • C (Country): the country or region, CN (China) here.
  5. Generate the truststore file

    Run on every node. The truststore stores the trusted root certificate under the alias CARoot.

    keytool -keystore /opt/kerberos_https/truststore -alias CARoot -import -file /opt/kerberos_https/bd_ca_cert
    
  6. Export a certificate request (cert) from the keystore

    Run on every node; this extracts a certificate request and saves it as cert.

    # Run on hadoop3test1-01.test.com
    keytool -certreq -alias hadoop3test1-01.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-02.test.com
    keytool -certreq -alias hadoop3test1-02.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-03.test.com
    keytool -certreq -alias hadoop3test1-03.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-04.test.com
    keytool -certreq -alias hadoop3test1-04.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-05.test.com
    keytool -certreq -alias hadoop3test1-05.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-06.test.com
    keytool -certreq -alias hadoop3test1-06.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-07.test.com
    keytool -certreq -alias hadoop3test1-07.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-08.test.com
    keytool -certreq -alias hadoop3test1-08.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-09.test.com
    keytool -certreq -alias hadoop3test1-09.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-10.test.com
    keytool -certreq -alias hadoop3test1-10.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    # Run on hadoop3test1-11.test.com
    keytool -certreq -alias hadoop3test1-11.test.com -keystore /opt/kerberos_https/keystore -file /opt/kerberos_https/cert;
    
  7. Generate the signed certificate

    Run on every node; this produces the CA-signed certificate file cert_signed.

    openssl x509 -req -CA /opt/kerberos_https/bd_ca_cert -CAkey /opt/kerberos_https/bd_ca_key -in /opt/kerberos_https/cert -out /opt/kerberos_https/cert_signed -days 36500 -CAcreateserial;
    
  8. Import the CA certificate into the keystore

    Run on every node; import the bd_ca_cert certificate generated earlier into the keystore.

    keytool -keystore /opt/kerberos_https/keystore -alias CARoot -import -file /opt/kerberos_https/bd_ca_cert;
    
  9. Import the signed certificate into the keystore

    Run on every node.

    # Run on hadoop3test1-01.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-01.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-02.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-02.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-03.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-03.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-04.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-04.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-05.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-05.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-06.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-06.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-07.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-07.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-08.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-08.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-09.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-09.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-10.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-10.test.com -import -file /opt/kerberos_https/cert_signed;
    # Run on hadoop3test1-11.test.com
    keytool -keystore /opt/kerberos_https/keystore -alias hadoop3test1-11.test.com -import -file /opt/kerberos_https/cert_signed;
    
  10. Hand the keystore and truststore files over to the HDFS user

    HDFS uses the keystore and truststore files. Normally they would live under a path owned by a dedicated user; this example simply keeps them in /opt/kerberos_https/ and changes the directory's owner and group to the user that starts HDFS.

    # Change the owner and group of the certificate material
    ssh 192.168.1.1 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.2 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.3 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.4 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.5 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.6 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.7 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.8 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.9 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.10 "chown bigdata:bigdata -R /opt/kerberos_https/";
    ssh 192.168.1.11 "chown bigdata:bigdata -R /opt/kerberos_https/";
    
  11. Configure ssl-server.xml

    From this point on, the Hadoop-related files are configured. Run as the user that starts HDFS. The server side needs this file.

    SSL is only required here because secure DataNode is enabled; whether the other service components could serve without SSL configured is left open. It is tied to dfs.http.policy in hdfs-site.xml.

    cp /home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml.example /home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml
    vim /home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml
    

    The content of ssl-server.xml is as follows:

    <configuration>
      <property>
        <name>ssl.server.truststore.location</name>
        <value>/opt/kerberos_https/truststore</value>
        <description>Truststore to be used by NN and DN. Must be specified.</description>
      </property>
      <property>
        <name>ssl.server.truststore.password</name>
        <value>123456</value>
        <description>Optional. Default value is "".</description>
      </property>
      <property>
        <name>ssl.server.truststore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".</description>
      </property>
      <property>
        <name>ssl.server.truststore.reload.interval</name>
        <value>10000</value>
        <description>Truststore reload check interval, in milliseconds. Default value is 10000 (10 seconds).</description>
      </property>
      <property>
        <name>ssl.server.keystore.location</name>
        <value>/opt/kerberos_https/keystore</value>
        <description>Keystore to be used by NN and DN. Must be specified.</description>
      </property>
      <property>
        <name>ssl.server.keystore.password</name>
        <value>123456</value>
        <description>Must be specified.</description>
      </property>
      <property>
        <name>ssl.server.keystore.keypassword</name>
        <value>123456</value>
        <description>Must be specified.</description>
      </property>
      <property>
        <name>ssl.server.keystore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".</description>
      </property>
      <property>
        <name>ssl.server.exclude.cipher.list</name>
        <value>TLS_ECDHE_RSA_WITH_RC4_128_SHA,SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA,SSL_RSA_WITH_DES_CBC_SHA,SSL_DHE_RSA_WITH_DES_CBC_SHA,SSL_RSA_EXPORT_WITH_RC4_40_MD5,SSL_RSA_EXPORT_WITH_DES40_CBC_SHA,SSL_RSA_WITH_RC4_128_MD5</value>
        <description>Optional. The weak security cipher suites that you want excluded from SSL communication.</description>
      </property>
    </configuration>
    
  12. Configure ssl-client.xml

    Both the Hadoop server side and Hadoop clients need this file.

    cp /home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml.example /home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml
    vim /home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml
    

    The configured ssl-client.xml is as follows:

    <configuration>
      <property>
        <name>ssl.client.truststore.location</name>
        <value>/opt/kerberos_https/truststore</value>
        <description>Truststore to be used by clients like distcp. Must be specified.</description>
      </property>
      <property>
        <name>ssl.client.truststore.password</name>
        <value>123456</value>
        <description>Optional. Default value is "".</description>
      </property>
      <property>
        <name>ssl.client.truststore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".</description>
      </property>
      <property>
        <name>ssl.client.truststore.reload.interval</name>
        <value>10000</value>
        <description>Truststore reload check interval, in milliseconds. Default value is 10000 (10 seconds).</description>
      </property>
      <property>
        <name>ssl.client.keystore.location</name>
        <value>/opt/kerberos_https/keystore</value>
        <description>Keystore to be used by clients like distcp. Must be specified.</description>
      </property>
      <property>
        <name>ssl.client.keystore.password</name>
        <value>123456</value>
        <description>Optional. Default value is "".</description>
      </property>
      <property>
        <name>ssl.client.keystore.keypassword</name>
        <value>123456</value>
        <description>Optional. Default value is "".</description>
      </property>
      <property>
        <name>ssl.client.keystore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".</description>
      </property>
    </configuration>
    
  13. Distribute ssl-client.xml and ssl-server.xml

    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.1:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.1:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.2:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.2:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.3:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.3:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.4:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.4:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.5:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.5:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.6:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.6:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.7:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.7:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.8:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.8:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.9:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.9:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.10:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.10:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
    scp /root/gd/kerberos_https/ssl-server.xml root@192.168.1.11:/home/bigdata/software/hadoop/etc/hadoop/ssl-server.xml;
    scp /root/gd/kerberos_https/ssl-client.xml root@192.168.1.11:/home/bigdata/software/hadoop/etc/hadoop/ssl-client.xml;
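
With the certificates, keystores and SSL files in place, a quick sanity check is worthwhile before restarting services. The following is a hedged sketch; 9871 is the Hadoop 3.x default NameNode HTTPS port and has to be adjusted to the actual dfs.namenode.https-address:

# Sketch: verify the SSL material on one node.
keytool -list -keystore /opt/kerberos_https/keystore -storepass 123456     # expect CARoot plus the host alias
keytool -list -keystore /opt/kerberos_https/truststore -storepass 123456   # expect CARoot
openssl verify -CAfile /opt/kerberos_https/bd_ca_cert /opt/kerberos_https/cert_signed
# Once the NameNode is up with dfs.http.policy=HTTPS_ONLY, its web UI should answer over HTTPS:
curl --cacert /opt/kerberos_https/bd_ca_cert -I https://hadoop3test1-01.test.com:9871/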
    

Modifying the Hadoop configuration files

Add the configuration items related to enabling Kerberos.

core-site.xml
<!-- Enable Kerberos authentication -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<!-- Enable Hadoop cluster authorization -->
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<!-- Mechanism used to map external identities to Hadoop users -->
<property>
  <name>hadoop.security.auth_to_local.mechanism</name>
  <value>MIT</value>
</property>
<!-- Concrete rules mapping Kerberos principals to Hadoop users -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1/$2@$0]([ndj]n\/.*@EXAMPLE\.COM)s/.*/bigdata/
    RULE:[2:$1/$2@$0](router\/.*@EXAMPLE\.COM)s/.*/bigdata/
    RULE:[2:$1/$2@$0](HTTP\/.*@EXAMPLE\.COM)s/.*/bigdata/
    DEFAULT
  </value>
</property>

The mapping rules in hadoop.security.auth_to_local map certain Kerberos principals to OS accounts, such as the HDFS process user. The bigdata user here is a superuser; this is a simplification, and tightening security would require more fine-grained mapping rules. The RULE format is as follows (a test sketch follows the list):

RULE:[n:string](regexp)s/pattern/replacement/g

  • n is the number of components the principal has.
  • $1 is the principal's primary component, e.g. nn.
  • $2 is the principal's instance component, e.g. the hostname.
  • $0 is the principal's realm, e.g. EXAMPLE.COM.
  • regexp is the regular expression the Kerberos principal must match.
  • replacement is the text substituted in, i.e. the resulting OS account.
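
A minimal way to sanity-check these rules, assuming the client picks up the core-site.xml above, is Hadoop's built-in HadoopKerberosName helper, which prints the short name a given principal maps to:

# Sketch: each of these should report a mapping to "bigdata".
hadoop org.apache.hadoop.security.HadoopKerberosName nn/hadoop3test1-01.test.com@EXAMPLE.COM
hadoop org.apache.hadoop.security.HadoopKerberosName HTTP/hadoop3test1-06.test.com@EXAMPLE.COM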

hdfs-site.xml

(Open question: should this be split into separate hdfs-site.xml files for the JN, DN and NN?)

<!-- Require Kerberos-authenticated block access tokens when accessing DataNode blocks -->
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<!-- Kerberos principal for the NameNode service -->
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file for the NameNode service -->
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/opt/keytabs/nn.service.keytab</value>
</property>
<!-- Kerberos principal for the DataNode service -->
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>dn/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file for the DataNode service -->
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/opt/keytabs/dn.service.keytab</value>
</property>
<!-- Kerberos principal for the JournalNode service -->
<property>
  <name>dfs.journalnode.kerberos.principal</name>
  <value>jn/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file for the JournalNode service -->
<property>
  <name>dfs.journalnode.keytab.file</name>
  <value>/opt/keytabs/jn.service.keytab</value>
</property>
<!-- DataNode data transfer protection policy: authentication only -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
<!-- Principal for the HDFS web UI (SPNEGO) -->
<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file for the HDFS web UI (SPNEGO) -->
<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/opt/keytabs/spnego.service.keytab</value>
</property>
<!-- SPNEGO principal for the NameNode web UI -->
<property>
  <name>dfs.namenode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<!-- SPNEGO principal for the JournalNode web UI -->
<property>
  <name>dfs.journalnode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
  <description>All enabled web endpoints use HTTPS only</description>
</property>

In the principal-related settings, the _HOST variable is automatically replaced at runtime with the hostname of the node the service runs on.

hdfs-rbf-site.xml

To simplify things, the dfs.http.policy item can be dropped from hdfs-site.xml here.

The Router applies this file; add the following to it:

<!-- Kerberos principal for the Router service -->
<property>
  <name>dfs.federation.router.kerberos.principal</name>
  <value>router/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file for the Router service -->
<property>
  <name>dfs.federation.router.keytab.file</name>
  <value>/opt/keytabs/router.service.keytab</value>
</property>
<!-- SPNEGO principal for the Router web UI -->
<property>
  <name>dfs.federation.router.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>dfs.federation.router.secret.manager.class</name>
  <value>org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl</value>
</property>
<property>
  <name>zk-dt-secret-manager.zkAuthType</name>
  <value>none</value>
</property>
<property>
  <name>zk-dt-secret-manager.zkConnectionString</name>
  <value>hadoop3test1-03.test.com:2181,hadoop3test1-04.test.com:2181,hadoop3test1-05.test.com:2181</value>
</property>
<property>
  <name>zk-dt-secret-manager.kerberos.keytab</name>
  <value>/opt/keytabs/router.service.keytab</value>
</property>
<property>
  <name>zk-dt-secret-manager.kerberos.principal</name>
  <value>router/_HOST@EXAMPLE.COM</value>
</property>

When Kerberos is enabled on the Router (version 3.4.1), the Router startup code inevitably instantiates ZKDelegationTokenSecretManagerImpl, which uses the configuration items zk-dt-secret-manager.zkAuthType, zk-dt-secret-manager.zkConnectionString, zk-dt-secret-manager.kerberos.keytab and zk-dt-secret-manager.kerberos.principal.

Verifying Kerberos access

Verification clients

Shell client

# Authenticate the user first
kinit -kt /opt/keytabs/huatuo.keytab huatuo
# Run HDFS shell commands
hdfs dfs -ls /
# Renew the Kerberos ticket
kinit -R

Java client

Authentication-related dependencies needed in pom.xml:

    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>3.4.1</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.4.1</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.4.1</version>
      </dependency>
    </dependencies>

Configuration properties the code needs:

// The JVM must be told where krb5.conf is
System.setProperty("java.security.krb5.conf", "D:/code/KerberosHadoopClient/HDFS-3.4.1/target/classes/kerberos/krb5.conf");
// org.apache.hadoop.conf.Configuration needs the Kerberos-related properties
Configuration conf = new Configuration();
.........
// Enable authorization
conf.setBoolean("hadoop.security.authorization", true);
// Use kerberos as the authentication mode
conf.set("hadoop.security.authentication", "kerberos");
// DataNode data transfer protection: authentication only
conf.set("dfs.data.transfer.protection", "authentication");
// Enable automatic renewal of keytab-based kerberos logins
conf.setBoolean("hadoop.kerberos.keytab.login.autorenewal.enabled", true);
// Register the configuration with UserGroupInformation so the kerberos mode takes effect
UserGroupInformation.setConfiguration(conf);
// userPrincipal is the principal to log in as
String userPrincipal = "gudong";
// userKerberosKeytabPath is the keytab file for userPrincipal
String userKerberosKeytabPath = "D:/code/KerberosHadoopClient/HDFS-3.4.1/target/classes/kerberos/gudong.keytab";
// Authenticate the user via UserGroupInformation
UserGroupInformation.loginUserFromKeytab(userPrincipal, userKerberosKeytabPath);
FileSystem fileSystem = FileSystem.get(conf);
......

Verification items

  1. Create a Kerberos principal for the superuser bigdata and one for the ordinary user huatuo, and export a keytab for each user.
  2. Authenticate with the keytabs.
  3. As each user, create an HDFS file, write data to it, and read it back.
  4. As the superuser, operate on the ordinary user's path.
  5. As the ordinary user, operate on another user's path. (A shell sketch of this flow follows the list.)
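
A hedged sketch of the verification flow above; the principals, keytab paths and the /user layout are assumptions for this environment:

# On the KDC: create the test principals and export their keytabs.
kadmin.local -q "addprinc -pw 123456 bigdata"
kadmin.local -q "addprinc -pw 123456 huatuo"
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/bigdata.keytab bigdata"
kadmin.local -q "ktadd -norandkey -kt /opt/keytabs/huatuo.keytab huatuo"

# As the ordinary user: write and read back a file under the user's own path.
kinit -kt /opt/keytabs/huatuo.keytab huatuo
echo "hello kerberos" > /tmp/hello.txt
hdfs dfs -mkdir -p /user/huatuo
hdfs dfs -put -f /tmp/hello.txt /user/huatuo/
hdfs dfs -cat /user/huatuo/hello.txt

# As the superuser the same path should be accessible; as huatuo,
# touching another user's path is expected to be denied.
kinit -kt /opt/keytabs/bigdata.keytab bigdata
hdfs dfs -ls /user/huatuo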

Problems encountered

Problem 1

Kerberos authentication fails: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]

This means the client host has not authenticated with Kerberos; run a Kerberos authentication (kinit) before issuing hdfs commands, as in the "Verification clients" section.

[bigdata@hadoop3test1-01 ~]$ hadoop fs -ls /
2025-06-03 16:35:25,444 WARN ipc.Client: Exception encountered while connecting to the server hadoop3test1-01.test.com/192.168.1.1:9000
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:179)at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:399)at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:578)at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:364)at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:799)at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:795)at java.security.AccessController.doPrivileged(Native Method)at javax.security.auth.Subject.doAs(Subject.java:422)at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:795)at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:364)at org.apache.hadoop.ipc.Client.getConnection(Client.java:1649)at org.apache.hadoop.ipc.Client.call(Client.java:1473)at org.apache.hadoop.ipc.Client.call(Client.java:1426)at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.lambda$getFileInfo$41(ClientNamenodeProtocolTranslatorPB.java:820)at org.apache.hadoop.ipc.internal.ShadedProtobufHelper.ipc(ShadedProtobufHelper.java:160)at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:820)at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:498)at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:437)at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:170)at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:162)at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:100)at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:366)at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1770)at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1828)at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1825)at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1840)at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)at org.apache.hadoop.fs.Globber.doGlob(Globber.java:362)at org.apache.hadoop.fs.Globber.glob(Globber.java:202)at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2225)at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:348)at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:265)at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:248)at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:105)at org.apache.hadoop.fs.shell.Command.run(Command.java:192)at org.apache.hadoop.fs.FsShell.run(FsShell.java:327)at 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
ls: DestHost:destPort hadoop3test1-01.test.com:9000 , LocalHost:localPort hadoop3test1-01.test.com/192.168.1.1:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]

Problem 2

When using the Java client, changing the ticket lifetime parameter ticket_lifetime = 10m in krb5.conf has no effect.

Same issue as https://stackoverflow.com/questions/38555244/how-do-you-set-the-kerberos-ticket-lifetime-from-java

This appears to be Java bug JDK-8044500; use a JDK of version 9 or later.

Problem 3

2025-05-30 11:28:54,975 ERROR [main] server.JournalNode: Failed to start journalnode.
java.lang.IllegalArgumentException: Can't get Kerberos realmat org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:71)at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:314)at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:299)at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:394)at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:388)at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:315)at org.apache.hadoop.hdfs.qjournal.server.JournalNode.start(JournalNode.java:237)at org.apache.hadoop.hdfs.qjournal.server.JournalNode.run(JournalNode.java:216)at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)at org.apache.hadoop.hdfs.qjournal.server.JournalNode.main(JournalNode.java:458)
Caused by: java.lang.IllegalArgumentException: KrbException: Cannot locate default realmat javax.security.auth.kerberos.KerberosPrincipal.<init>(KerberosPrincipal.java:154)at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:120)at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:69)... 10 more

The default_realm setting under [libdefaults] in /etc/krb5.conf on the client host was not enabled. Uncomment it and set it to the realm of the Kerberos service (a quick check is sketched below).
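
A minimal check, assuming the EXAMPLE.COM realm used in this write-up:

# Sketch: confirm the default realm is set on the affected node.
grep -A3 '\[libdefaults\]' /etc/krb5.conf | grep default_realm
# Expected output, roughly:
#   default_realm = EXAMPLE.COM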

Problem 4

025-05-30 11:52:42,635 INFO [Socket Reader #1 for port 9000] ipc.Server: Socket Reader #1 for port 9000: readAndProcess from client 10.37.74.29:15089 threw exception [org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]]
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]at org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:2582)at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:2531)at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:1650)at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1505)at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1476)

This happens when an HDFS client talks to a server that has Kerberos enabled but the client itself is not configured accordingly, so it falls back to the default SIMPLE authentication and mismatches the server. Confirm the following three settings on the client (a quick check is sketched after the list):

  1. hadoop.security.authorization:true
  2. hadoop.security.authentication:kerberos
  3. dfs.data.transfer.protection:authentication
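
A minimal client-side check, assuming the hdfs command on the client picks up the intended configuration directory:

# Sketch: print the effective client-side values; they must match the server side.
hdfs getconf -confKey hadoop.security.authentication   # expect: kerberos
hdfs getconf -confKey hadoop.security.authorization    # expect: true
hdfs getconf -confKey dfs.data.transfer.protection     # expect: authentication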

Problem 5

2025-05-30 14:18:26,296 ERROR [main] datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode due to incorrect config. See https://cwiki.apache.org/confluence/display/HADOOP/Secure+DataNode for details.at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:2037)at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1890)at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:592)at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:3400)at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:3260)at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:3350)at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:3494)at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:3518)
2025-05-30 14:18:26,299 INFO [main] util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: Cannot start secure DataNode due to incorrect config. See https://cwiki.apache.org/confluence/display/HADOOP/Secure+DataNode for details.

The DataNode had Kerberos enabled but was started without secure DataNode being enabled; completing the HTTPS configuration enables the secure DataNode.

Problem 6

In the web UI, some addresses cannot be accessed:

  1. The HDFS file system cannot be browsed through the web UI
  2. Logs cannot be accessed
  3. Log Level cannot be accessed
2025-06-03 18:07:59,795 INFO [qtp2104973502-4547] requests.namenode: 10.46.61.83 - - [03/Jun/2025:10:07:59 +0000] "GET /logs/ HTTP/1.1" 403 570 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36"

Problem 7

SSL authentication failure

2025-06-04 09:20:03,871 ERROR [nioEventLoopGroup-3-5] web.DatanodeHttpServer: Exception in RestCsrfPreventionFilterHandler
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLException: Received fatal alert: certificate_unknownat io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499)at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLException: Received fatal alert: certificate_unknownat sun.security.ssl.Alerts.getSSLException(Alerts.java:208)at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1666)at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1634)at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1800)at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1083)at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:309)at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1441)at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1334)at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1383)at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)... 17 more

The SSL key material is probably not accessible, e.g. the bigdata user cannot read it; change the owner and group of the SSL key files to bigdata (a quick check is sketched below).
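
A minimal check on an affected DataNode, assuming the paths and passwords used earlier:

# Sketch: can the HDFS user actually read the SSL material?
ls -l /opt/kerberos_https/
sudo -u bigdata keytool -list -keystore /opt/kerberos_https/keystore -storepass 123456
# If this fails with a permission error, fix the ownership:
chown -R bigdata:bigdata /opt/kerberos_https/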

Problem 8

Router startup fails with a ZooKeeper SASL authentication error

2025-06-05 20:32:40,332 ERROR [main-SendThread(hadoop3test1-04.test.com:2181)] client.ZooKeeperSaslClient: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS 
initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]) occurred when evaluating Zookeeper Quorum Member
's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
2025-06-05 20:32:40,332 ERROR [main-SendThread(hadoop3test1-04.test.com:2181)] zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum member failed.
javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mech
anism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state. [Cau
sed by java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerber
os database (7) - LOOKING_UP_SERVER)]]
...
...
...

ZooKeeper does not have Kerberos authentication enabled. Why the Router tries to authenticate the ZooKeeperSaslClient at startup was not analysed; instead, -Dzookeeper.sasl.client=false was added to the export HDFS_DFSROUTER_OPTS line in hadoop-env.sh to skip the ZooKeeper SASL client.

export HDFS_DFSROUTER_OPTS="-Xms16G -Xmx16G -XX:NewRatio=7 -Dzookeeper.sasl.client=false  -XX:SurvivorRatio=4 -Xloggc:${HADOOP_LOG_DIR}/hdfs-router.gc.%t.log $HDFS_BASE_OPTS"

Problem 9

Errors appear while the Router starts, but creating files, writing data, and deleting data on HDFS through the Router shows no problems. Not resolved.

2025-06-06 10:59:04,040 ERROR [Curator-SafeNotifyService-0] delegation.ZKDelegationTokenSecretManager: Error while processing Curator keyCacheListener NODE_CREATED / NODE_CHANGED event
2025-06-06 10:59:04,040 ERROR [Curator-SafeNotifyService-0] listen.MappingListenerManager: Listener (org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl$2@59e505b2) threw an excep
tion
java.io.UncheckedIOException: java.io.EOFExceptionat org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.lambda$startThreads$0(ZKDelegationTokenSecretManager.java:302)at org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl.lambda$forCreatesAndChanges$2(CuratorCacheListenerBuilderImpl.java:69)at org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl$2.lambda$event$0(CuratorCacheListenerBuilderImpl.java:149)at java.util.ArrayList.forEach(ArrayList.java:1257)at org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl$2.event(CuratorCacheListenerBuilderImpl.java:149)at org.apache.curator.framework.recipes.cache.CuratorCacheImpl.lambda$putStorage$7(CuratorCacheImpl.java:279)at org.apache.curator.framework.listen.MappingListenerManager.lambda$forEach$0(MappingListenerManager.java:92)at org.apache.curator.framework.listen.MappingListenerManager.forEach(MappingListenerManager.java:89)at org.apache.curator.framework.listen.StandardListenerManager.forEach(StandardListenerManager.java:89)at org.apache.curator.framework.recipes.cache.CuratorCacheImpl.lambda$callListeners$10(CuratorCacheImpl.java:293)at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFExceptionat java.io.DataInputStream.readFully(DataInputStream.java:197)at java.io.DataInputStream.readFully(DataInputStream.java:169)at org.apache.hadoop.security.token.delegation.DelegationKey.readFields(DelegationKey.java:117)at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.processKeyAddOrUpdate(ZKDelegationTokenSecretManager.java:395)at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.lambda$startThreads$0(ZKDelegationTokenSecretManager.java:298)... 13 more
2025-06-06 10:59:04,045 INFO [main] token.ZKDelegationTokenSecretManagerImpl: Zookeeper delegation token secret manager instantiated
2025-06-06 10:59:04,046 INFO [Thread[Thread-3,5,main]] delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2025-06-06 10:59:04,056 ERROR [Curator-SafeNotifyService-0] delegation.ZKDelegationTokenSecretManager: Error while processing Curator tokenCacheListener NODE_CREATED / NODE_CHANGED event
2025-06-06 10:59:04,056 ERROR [Curator-SafeNotifyService-0] listen.MappingListenerManager: Listener (org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl$2@73d983ea) threw an exception
java.io.UncheckedIOException: java.io.IOException: Unknown version of delegation token 49at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.lambda$startThreads$2(ZKDelegationTokenSecretManager.java:326)at org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl.lambda$forCreatesAndChanges$2(CuratorCacheListenerBuilderImpl.java:69)at org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl$2.lambda$event$0(CuratorCacheListenerBuilderImpl.java:149)at java.util.ArrayList.forEach(ArrayList.java:1257)at org.apache.curator.framework.recipes.cache.CuratorCacheListenerBuilderImpl$2.event(CuratorCacheListenerBuilderImpl.java:149)at org.apache.curator.framework.recipes.cache.CuratorCacheImpl.lambda$putStorage$7(CuratorCacheImpl.java:279)at org.apache.curator.framework.listen.MappingListenerManager.lambda$forEach$0(MappingListenerManager.java:92)at org.apache.curator.framework.listen.MappingListenerManager.forEach(MappingListenerManager.java:89)at org.apache.curator.framework.listen.StandardListenerManager.forEach(StandardListenerManager.java:89)at org.apache.curator.framework.recipes.cache.CuratorCacheImpl.lambda$callListeners$10(CuratorCacheImpl.java:293)at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Unknown version of delegation token 49at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier.readFields(AbstractDelegationTokenIdentifier.java:193)at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.processTokenAddOrUpdate(ZKDelegationTokenSecretManager.java:415)at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.lambda$startThreads$2(ZKDelegationTokenSecretManager.java:322)... 13 more
2025-06-06 10:59:04,064 INFO [Thread[Thread-3,5,main]] delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
