Reading a MySQL Database Table with Spark

Official documentation: JDBC To Other Databases - Spark 4.0.0 Documentation

Official example:

import java.util.Properties

// Note: JDBC loading and saving can be achieved via either the load/save or jdbc methods
// Loading data from a JDBC source
val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .load()

val connectionProperties = new Properties()
connectionProperties.put("user", "username")
connectionProperties.put("password", "password")
val jdbcDF2 = spark.read
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

// Specifying the custom data types of the read schema
connectionProperties.put("customSchema", "id DECIMAL(38, 0), name STRING")
val jdbcDF3 = spark.read
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

// Saving data to a JDBC source
jdbcDF.write
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .save()

jdbcDF2.write
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

// Specifying create table column data types on write
jdbcDF.write
  .option("createTableColumnTypes", "name CHAR(64), comments VARCHAR(1024)")
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

Verification (Maven dependencies and a minimal test program):

<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.13.16</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>4.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>4.0.0</version>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>3.8.1</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.15</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.13</artifactId>
    <version>4.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_2.13</artifactId>
    <version>4.0.0</version>
  </dependency>
</dependencies>
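With the mysql-connector-java 8.x artifact above on the classpath, Spark can normally resolve the driver from a jdbc:mysql:// URL. If resolution fails (for example when the jar is supplied only at submit time), the driver class can be pinned explicitly. This is a sketch, not part of the original post: driver is a standard Spark JDBC option, and com.mysql.cj.jdbc.Driver is the Connector/J 8.x driver class; the connection details mirror the test program below.

// Optional: name the JDBC driver class explicitly (Connector/J 8.x)
val explicitDriverDF = spark.read
  .format("jdbc")
  .option("driver", "com.mysql.cj.jdbc.Driver")
  .option("url", "jdbc:mysql://node11:3306/wjobs?useSSL=false")
  .option("dbtable", "user")
  .option("user", "root")
  .option("password", "root123")
  .load()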
The test program reads the user table from the wjobs database on node11 and prints it:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object SparkCase04 {
  def main(args: Array[String]): Unit = {
    // Run locally on a single core
    val conf = new SparkConf().setMaster("local[1]").setAppName("WC")
    val spark = SparkSession.builder().config(conf).getOrCreate()

    // Read the `user` table from the `wjobs` MySQL database
    val df = spark.read.format("jdbc")
      .option("url", "jdbc:mysql://node11:3306/wjobs?useSSL=false")
      .option("dbtable", "user")
      .option("user", "root")
      .option("password", "root123")
      .load()

    df.show()
    spark.stop()
  }
}
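To round-trip the verification, the DataFrame can also be written back over the same connection. A minimal sketch, assuming a writable target table user_copy (the table name is hypothetical); with mode("append") Spark inserts rows, creating the table on the first write if it does not exist:

// Write the rows just read into a separate table over the same connection
df.write
  .format("jdbc")
  .option("url", "jdbc:mysql://node11:3306/wjobs?useSSL=false")
  .option("dbtable", "user_copy")  // hypothetical target table
  .option("user", "root")
  .option("password", "root123")
  .mode("append")
  .save()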
