Spark eventlog

Eventlog example
{
    "Event": "org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart",
    "executionId": 0,
    "rootExecutionId": 0,
    "description": "select round(a, 2), a from double_table",
    "details": "org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)\nsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\nsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\nsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.lang.reflect.Method.invoke(Method.java:498)\norg.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)\norg.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)\norg.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)\norg.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)\norg.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)\norg.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)\norg.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)\norg.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)",
    "physicalPlanDescription": "== Physical Plan ==\n* Project (3)\n+- * ColumnarToRow (2)\n   +- Scan parquet spark_catalog.default.double_table (1)\n    \n\n(1) Scan parquet spark_catalog.default.double_table\nOutput [1]: [a#0]\nBatched: true\nLocation: InMemoryFileIndex [file:/home/hadoop/files/double_table]\nReadSchema: struct<a:double>\n\n(2) ColumnarToRow [codegen id : 1]\nInput [1]: [a#0]\n\n(3) Project [codegen id : 1]\nOutput [2]: [round(a#0, 2) AS round(a, 2)#1, a#0]\nInput [1]: [a#0]\n\n",
    "sparkPlanInfo": {
        "nodeName": "WholeStageCodegen (1)",
        "simpleString": "WholeStageCodegen (1)",
        "children": [
            {
                "nodeName": "Project",
                "simpleString": "Project [round(a#0, 2) AS round(a, 2)#1, a#0]",
                "children": [
                    {
                        "nodeName": "ColumnarToRow",
                        "simpleString": "ColumnarToRow",
                        "children": [
                            {
                                "nodeName": "InputAdapter",
                                "simpleString": "InputAdapter",
                                "children": [
                                    {
                                        "nodeName": "Scan parquet spark_catalog.default.double_table",
                                        "simpleString": "FileScan parquet spark_catalog.default.double_table[a#0] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/home/hadoop/files/double_table], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:double>",
                                        "children": [

                                        ],
                                        "metadata": {
                                            "Location": "InMemoryFileIndex(1 paths)[file:/home/hadoop/files/double_table]",
                                            "ReadSchema": "struct<a:double>",
                                            "Format": "Parquet",
                                            "Batched": "true",
                                            "PartitionFilters": "[]",
                                            "PushedFilters": "[]",
                                            "DataFilters": "[]"
                                        },
                                        "metrics": [
                                            {
                                                "name": "number of files read",
                                                "accumulatorId": 5,
                                                "metricType": "sum"
                                            },
                                            {
                                                "name": "scan time",
                                                "accumulatorId": 4,
                                                "metricType": "timing"
                                            },
                                            {
                                                "name": "metadata time",
                                                "accumulatorId": 6,
                                                "metricType": "timing"
                                            },
                                            {
                                                "name": "size of files read",
                                                "accumulatorId": 7,
                                                "metricType": "size"
                                            },
                                            {
                                                "name": "number of output rows",
                                                "accumulatorId": 3,
                                                "metricType": "sum"
                                            }
                                        ]
                                    }
                                ],
                                "metadata": {

                                },
                                "metrics": [

                                ]
                            }
                        ],
                        "metadata": {

                        },
                        "metrics": [
                            {
                                "name": "number of output rows",
                                "accumulatorId": 1,
                                "metricType": "sum"
                            },
                            {
                                "name": "number of input batches",
                                "accumulatorId": 2,
                                "metricType": "sum"
                            }
                        ]
                    }
                ],
                "metadata": {

                },
                "metrics": [

                ]
            }
        ],
        "metadata": {

        },
        "metrics": [
            {
                "name": "duration",
                "accumulatorId": 0,
                "metricType": "timing"
            }
        ]
    },
    "time": 1741661558528,
    "modifiedConfigs": {

    },
    "jobTags": [

    ]
}
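
The nested `sparkPlanInfo` object is a tree: every node carries `nodeName`, `children`, and `metrics`. A minimal Python sketch of walking it is given here, assuming the event log is stored one JSON event per line (the event above is pretty-printed for readability). The helper names `walk_plan`, `accumulator_index`, and `sql_start_events` are hypothetical, and `sample_line` is a trimmed copy of the event above that keeps only the fields the sketch reads.

```python
import json

SQL_START = "org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart"


def walk_plan(node, depth=0):
    """Yield (depth, nodeName, metrics) for a sparkPlanInfo node and its children."""
    yield depth, node["nodeName"], node.get("metrics", [])
    for child in node.get("children", []):
        yield from walk_plan(child, depth + 1)


def accumulator_index(event):
    """Map accumulatorId -> (nodeName, metric name, metricType) for one event."""
    index = {}
    for _, node_name, metrics in walk_plan(event["sparkPlanInfo"]):
        for metric in metrics:
            index[metric["accumulatorId"]] = (
                node_name, metric["name"], metric["metricType"])
    return index


def sql_start_events(lines):
    """Parse JSON lines and keep only SparkListenerSQLExecutionStart events."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("Event") == SQL_START:
            yield event


# Trimmed copy of the event above (abbreviated for illustration only).
sample_line = json.dumps({
    "Event": SQL_START,
    "executionId": 0,
    "sparkPlanInfo": {
        "nodeName": "WholeStageCodegen (1)",
        "children": [{
            "nodeName": "Project",
            "children": [{
                "nodeName": "ColumnarToRow",
                "children": [{
                    "nodeName": "InputAdapter",
                    "children": [{
                        "nodeName": "Scan parquet spark_catalog.default.double_table",
                        "children": [],
                        "metrics": [
                            {"name": "number of files read", "accumulatorId": 5, "metricType": "sum"},
                            {"name": "scan time", "accumulatorId": 4, "metricType": "timing"},
                        ],
                    }],
                    "metrics": [],
                }],
                "metrics": [
                    {"name": "number of output rows", "accumulatorId": 1, "metricType": "sum"},
                ],
            }],
            "metrics": [],
        }],
        "metrics": [{"name": "duration", "accumulatorId": 0, "metricType": "timing"}],
    },
})

for event in sql_start_events([sample_line]):
    # Print the operator tree, indented by depth.
    for depth, node_name, _ in walk_plan(event["sparkPlanInfo"]):
        print("  " * depth + node_name)
```

Indexing by `accumulatorId` is a design choice for this sketch: it gives one lookup table per execution, which should make it easier to match metric values reported elsewhere in the log back to the operator that owns them.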

== Physical Plan ==
* Project (3)
+- * ColumnarToRow (2)
   +- Scan parquet spark_catalog.default.double_table (1)

corresponds to

== Physical Plan ==\n* Project (3)\n+- * ColumnarToRow (2)\n   +- Scan parquet spark_catalog.default.double_table (1)\n    \n\n
(1) Scan parquet spark_catalog.default.double_table
Output [1]: [a#0]
Batched: true
Location: InMemoryFileIndex [file:/home/hadoop/files/double_table]
ReadSchema: struct<a:double>

(2) ColumnarToRow [codegen id : 1]
Input [1]: [a#0]

(3) Project [codegen id : 1]
Output [2]: [round(a#0, 2) AS round(a, 2)#1, a#0]
Input [1]: [a#0]

corresponds to

 "physicalPlanDescription": "== Physical Plan ==\n* Project (3)\n+- * ColumnarToRow (2)\n   +- Scan parquet spark_catalog.default.double_table (1)\n    \n\n(1) Scan parquet spark_catalog.default.double_table\nOutput [1]: [a#0]\nBatched: true\nLocation: InMemoryFileIndex [file:/home/hadoop/files/double_table]\nReadSchema: struct<a:double>\n\n(2) ColumnarToRow [codegen id : 1]\nInput [1]: [a#0]\n\n(3) Project [codegen id : 1]\nOutput [2]: [round(a#0, 2) AS round(a, 2)#1, a#0]\nInput [1]: [a#0]\n\n",
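
The correspondence between the escaped string and the readable plan is plain JSON string decoding: each `\n` escape becomes a real line break. A small Python sketch, assuming the value has been copied out as a JSON string literal; `split_plan` is a hypothetical helper that separates the operator tree from the per-node detail blocks by splitting on blank lines.

```python
import json
import re

# The physicalPlanDescription value as a JSON string literal (raw, still escaped).
raw = (
    r'"== Physical Plan ==\n* Project (3)\n+- * ColumnarToRow (2)\n'
    r'   +- Scan parquet spark_catalog.default.double_table (1)\n    \n\n'
    r'(1) Scan parquet spark_catalog.default.double_table\nOutput [1]: [a#0]\n'
    r'Batched: true\nLocation: InMemoryFileIndex [file:/home/hadoop/files/double_table]\n'
    r'ReadSchema: struct<a:double>\n\n(2) ColumnarToRow [codegen id : 1]\nInput [1]: [a#0]\n\n'
    r'(3) Project [codegen id : 1]\nOutput [2]: [round(a#0, 2) AS round(a, 2)#1, a#0]\n'
    r'Input [1]: [a#0]\n\n"'
)


def split_plan(description):
    """Split a decoded plan into (tree lines, {node id: detail lines})."""
    # Blocks are separated by one or more blank (or whitespace-only) lines.
    blocks = re.split(r"\n\s*\n", description.strip())
    tree = blocks[0].splitlines()
    details = {}
    for block in blocks[1:]:
        lines = block.splitlines()
        match = re.match(r"\((\d+)\) ", lines[0])
        if match:
            details[int(match.group(1))] = lines
    return tree, details


plan = json.loads(raw)          # decodes the \n escapes into real newlines
tree, details = split_plan(plan)

print("\n".join(tree))          # the "== Physical Plan ==" tree
print("\n".join(details[1]))    # the detail block for node (1)
```

The numbers in parentheses in the tree, such as `(1)`, are the keys of `details`, so a tree line can be matched to its detail block by that id.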
