Deploying Spark on K8s

I. Environment Preparation

1. Install JDK 1.8

(1) Download JDK 1.8 from: https://www.oracle.com/java/technologies/downloads/archive/

Extract the archive to the /opt/ directory:

tar zxf jdk-8u212-linux-x64.tar.gz -C /opt/

(2) Configure environment variables

Edit the configuration file with vi /etc/profile and add the following:

# jdk1.8.0_212
export JAVA_HOME=/opt/jdk1.8.0_212
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

Apply the environment variables:

source /etc/profile
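
To confirm that the variables took effect, a quick check along the following lines may help. This is only a sketch: the helper name check_java_home is hypothetical, and it assumes the JDK was unpacked to /opt/jdk1.8.0_212 as described above.

```shell
# Hypothetical helper: succeeds when the given directory contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    # java -version writes to stderr, so redirect it before taking the first line
    "$JAVA_HOME/bin/java" -version 2>&1 | head -n 1
else
    echo "JAVA_HOME is unset or does not contain bin/java: ${JAVA_HOME:-<unset>}"
fi
```

If the second message appears, re-check the JAVA_HOME line in /etc/profile and run source /etc/profile again in the current shell.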

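(3) Add a JDK security provider

Add the three required JAR packages to the JDK (the original post listed them only in a screenshot).

Then add the following line to the java.security file (for JDK 8, located at $JAVA_HOME/jre/lib/security/java.security):

security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider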

2. Obtain the Spark installation package

(1) Download the Spark v3.2.3 package with wget:

wget https://archive.apache.org/dist/spark/spark-3.2.3/spark-3.2.3-bin-hadoop3.2.tgz

(2) Extract and rename

mkdir -p /opt/module
tar -zxvf spark-3.2.3-bin-hadoop3.2.tgz -C /opt/module
mv /opt/module/spark-3.2.3-bin-hadoop3.2 /opt/module/spark-3.2.3

3. Initialize the K8s environment

(1) Create the metasphere Namespace

Write metaSphere-namespace.yaml:

vi metaSphere-namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: metasphere
  labels:
    app.kubernetes.io/name: metasphere
    app.kubernetes.io/instance: metasphere

Apply the YAML to create the namespace:

kubectl apply -f metaSphere-namespace.yaml

List the namespaces:

kubectl get ns

(2) Create the ServiceAccount

Write spark-service-account.yaml:

vi spark-service-account.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: metasphere
  name: spark-service-account
  labels:
    app.kubernetes.io/name: metasphere
    app.kubernetes.io/instance: metasphere
    app.kubernetes.io/version: v3.2.3

Apply the YAML to create the ServiceAccount:

kubectl apply -f spark-service-account.yaml

List the ServiceAccounts:

kubectl get sa -n metasphere

(3) Create the Role and RoleBinding

Write spark-role.yaml:

vi spark-role.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app.kubernetes.io/name: metasphere
    app.kubernetes.io/instance: metasphere
    app.kubernetes.io/version: v3.2.3
  namespace: metasphere
  name: spark-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list", "create", "delete"]
  - apiGroups: ["extensions", "apps"]
    resources: ["deployments"]
    verbs: ["get", "watch", "list", "create", "delete"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "create", "update", "delete"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app.kubernetes.io/name: metasphere
    app.kubernetes.io/instance: metasphere
    app.kubernetes.io/version: v3.2.3
  name: spark-role-binding
  namespace: metasphere
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: spark-role
subjects:
  - kind: ServiceAccount
    name: spark-service-account
    namespace: metasphere

Apply the YAML to create the Role and RoleBinding:

kubectl apply -f spark-role.yaml

List the Role and RoleBinding:

kubectl get role -n metasphere
kubectl get rolebinding -n metasphere

(4) Create the ClusterRole and ClusterRoleBinding

Write cluster-role.yaml:

vi cluster-role.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/name: metasphere
    app.kubernetes.io/instance: metasphere
    app.kubernetes.io/version: v3.2.3
  name: apache-spark-clusterrole
rules:
  - apiGroups:
      - ''
    resources:
      - configmaps
      - endpoints
      - nodes
      - pods
      - secrets
      - namespaces
    verbs:
      - list
      - watch
      - get
  - apiGroups:
      - ''
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - events
    verbs:
      - create
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: metasphere
    app.kubernetes.io/instance: metasphere
    app.kubernetes.io/version: v3.2.3
  name: apache-spark-clusterrole-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: apache-spark-clusterrole
subjects:
  - kind: ServiceAccount
    name: spark-service-account
    namespace: metasphere

Apply the YAML to create the ClusterRole and ClusterRoleBinding:

kubectl apply -f cluster-role.yaml

List the ClusterRole and ClusterRoleBinding:

kubectl get ClusterRole | grep spark
kubectl get ClusterRoleBinding | grep spark
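
Listing the objects shows that they exist, but not that the permissions work. One way to check this is kubectl auth can-i with impersonation, sketched below. The helper name sa_subject and the choice of verb/resource pairs are illustrative assumptions, and the current kubectl user needs permission to impersonate (a cluster-admin user normally has it).

```shell
# Build the user name Kubernetes assigns to a ServiceAccount:
# system:serviceaccount:<namespace>:<name>
sa_subject() {
    printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

SUBJECT=$(sa_subject metasphere spark-service-account)

if command -v kubectl >/dev/null 2>&1; then
    # Each call prints "yes" or "no" for one verb/resource pair.
    for check in "create pods" "delete pods" "create services" "create configmaps"; do
        printf '%-20s ' "$check"
        # $check is left unquoted on purpose so it splits into verb and resource
        kubectl auth can-i $check --as="$SUBJECT" -n metasphere --request-timeout=5s || true
    done
fi
```

With the Role from spark-role.yaml applied, all four checks would be expected to answer yes; a no usually points to a typo in the Role, the RoleBinding subject, or the namespace.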

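II. Basic Spark on K8s Test

1. Pull the Apache Spark image

Find the Apache Spark image on Docker Hub and pull it locally:

docker pull apache/spark:v3.2.3

If the image cannot be downloaded because of network problems, use the following mirror instead:

docker pull registry.cn-hangzhou.aliyuncs.com/cm_ns01/apache-spark:v3.2.3

2. Find the K8s master URL

Get the Kubernetes control plane URL: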

kubectl cluster-info
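
spark-submit expects this URL prefixed with k8s://. As a rough sketch, the value can also be read from the current kubeconfig context and converted in one step; the helper name to_spark_master is hypothetical and not part of Spark or kubectl.

```shell
# Prefix an API server address with k8s://, the form spark-submit expects for --master.
to_spark_master() {
    printf 'k8s://%s' "$1"
}

if command -v kubectl >/dev/null 2>&1; then
    # Read the API server address of the current kubeconfig context.
    API_SERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}' 2>/dev/null || true)
    if [ -n "$API_SERVER" ]; then
        echo "Use: --master $(to_spark_master "$API_SERVER")"
    fi
fi
```

For example, to_spark_master https://localhost:6443 yields k8s://https://localhost:6443, which is the value used in the next step.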

3. Submit a Spark program to run on K8s

/opt/module/spark-3.2.3/bin/spark-submit \
  --name SparkPi \
  --verbose \
  --master k8s://https://localhost:6443 \
  --deploy-mode cluster \
  --conf spark.network.timeout=300 \
  --conf spark.executor.instances=3 \
  --conf spark.driver.cores=1 \
  --conf spark.executor.cores=1 \
  --conf spark.driver.memory=1024m \
  --conf spark.executor.memory=1024m \
  --conf spark.kubernetes.namespace=metasphere \
  --conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
  --conf spark.kubernetes.container.image=registry.cn-hangzhou.aliyuncs.com/cm_ns01/apache-spark:v3.2.3 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-service-account \
  --conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-service-account \
  --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
  --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.2.3.jar \
  3000

Parameter notes:

--master is the Kubernetes control plane URL from step 2, prefixed with k8s://

--deploy-mode cluster runs both the driver and the executors inside K8s

--conf spark.kubernetes.namespace is the metasphere namespace created earlier

--conf spark.kubernetes.container.image is the address of the Spark image

--conf spark.kubernetes.authenticate.executor.serviceAccountName is the spark-service-account created earlier

--class is the main class of the Spark program

local:///opt/spark/examples/jars/spark-examples_2.12-3.2.3.jar is the JAR file containing the Spark program. spark-examples_2.12-3.2.3.jar ships inside the Spark image, so the local scheme is used

3000 is the argument passed to the main class of the Spark program

4. Watch the driver pod and executor pods

watch -n 1 kubectl get all -owide -n metasphere

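5. View the log output (the pod name below comes from one particular run; substitute the driver pod name from your own submission)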

kubectl logs sparkpi-b9de1a887b1163f1-driver -n metasphere
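
Because the driver pod name contains a generated suffix, it can be more convenient to look the pod up by label. Spark on K8s labels driver pods with spark-role=driver, so a lookup along the following lines should work; the helper name pod_name is hypothetical, and the grep pattern assumes the SparkPi example, which prints a line starting with "Pi is roughly".

```shell
NAMESPACE=metasphere

# Hypothetical helper: strip the "pod/" prefix that `kubectl get -o name` prints.
pod_name() {
    printf '%s' "${1#pod/}"
}

if command -v kubectl >/dev/null 2>&1; then
    # Select every driver pod in the namespace by the label Spark attaches to it.
    for p in $(kubectl get pods -n "$NAMESPACE" -l spark-role=driver -o name --request-timeout=5s 2>/dev/null || true); do
        echo "=== $(pod_name "$p") ==="
        # Show only the result line of the SparkPi example.
        kubectl logs "$(pod_name "$p")" -n "$NAMESPACE" | grep "Pi is roughly" || true
    done
fi
```

The same label selector can be reused for cleanup, for example with kubectl delete pod -l spark-role=driver -n metasphere, although that removes every driver pod in the namespace rather than a single one.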

6. Clean up the driver pod

kubectl delete pod sparkpi-b9de1a887b1163f1-driver -n metasphere