
Validating PDF/UA Documents in .NET C# with TX Text Control

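Creating accessible, standards-compliant PDF documents is becoming an increasingly important requirement across industries. In this post, we look at how to validate PDF/UA documents with Text Control's .NET libraries to make sure generated PDFs meet accessibility standards. TX Text Control 34.0 will allow developers to generate PDF/UA and PDF/A-3a documents directly, a major step forward for long-term, compliant document archiving.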


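What Are PDF/UA and PDF/A-3a?

PDF/UA (Universal Accessibility) defines a standard for making PDF documents accessible to everyone, including people who rely on assistive technologies such as screen readers.

PDF/UA requires that a document's structure, reading order, and element descriptions are properly defined so that all content can be understood semantically.

PDF/A-3a, on the other hand, is part of the ISO-standardized PDF/A family of archival formats. It ensures that documents can be reproduced unchanged, including embedded attachments and accessible content (the "a" stands for "accessible").

Both standards require documents to contain a logical structure, semantic tags, and metadata that accurately describes the content.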

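Why Validation Matters

While generating PDF/UA files or designing templates, tags or descriptive elements are often missing or applied incorrectly. A document can look fine visually and still fail a compliance check because it does not meet accessibility or archiving standards.

For example:

  • A figure may be missing its descriptive text.
  • A table may lack a proper header definition.
  • The reading order or tag hierarchy may be broken.
  • Metadata such as the language or document title may not be set.

Without validation, these problems are easy to overlook.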

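PDF/UA Validation

With version 34.0, we will introduce a validation library designed to help developers check generated PDF documents for compliance.

The library analyzes:

  • The document structure tree and tag hierarchy
  • Metadata and language settings
  • Descriptive text for tables, figures, form fields, and hyperlinks
  • The relationships between table headers and data cells
  • Other accessibility-related properties required by the PDF/UA specification

It produces a detailed report in structured JSON format and also provides text output for console applications. This lets developers integrate validation directly into automated tests or quality assurance (QA) processes.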

Example Usage in C#

Here is a simple example of how to use the validation library in a C# application:

using TXTextControl.PDF.Validation;

// Validate the PDF file and print the findings as text to the console.
var report = PdfUaValidator.Validate("documents/hyperlink.pdf");
report.PrintText();

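In this example, the document is validated with a single call to PdfUaValidator.Validate. The result is printed to the console and can also be serialized to JSON for further analysis.

The generated JSON report gives a detailed overview of every check that was run and of any compliance issues found in the document: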

{
  "filePath": "documents/hyperlink.pdf",
  "pdfVersion": "1.7",
  "isPass": true,
  "documentTitle": "This is a sample PDF/UA document",
  "documentLanguage": "en-US",
  "findings": [
    {"ruleId": "UA-CONFORMANCE", "severity": "Info", "passed": true, "message": "PDF/UA-1 conformance declaration found in XMP."},
    {"ruleId": "PDFA-CONFORMANCE", "severity": "Info", "passed": true, "message": "PDF/A-3A declaration found in XMP."},
    {"ruleId": "PDF-HEADER", "severity": "Error", "passed": true, "message": "Found PDF header %PDF-1.7."},
    {"ruleId": "PDF-XREF", "severity": "Warning", "passed": true, "message": "Cross-reference table/stream appears present."},
    {"ruleId": "UA-CATALOG", "severity": "Error", "passed": true, "message": "Catalog dictionary present."},
    {"ruleId": "UA-MARKED", "severity": "Error", "passed": true, "message": "/MarkInfo \u003C\u003C /Marked true \u003E\u003E found (Tagged PDF)."},
    {"ruleId": "UA-STRUCT", "severity": "Error", "passed": true, "message": "/StructTreeRoot present."},
    {"ruleId": "UA-MCID-ANCHOR", "severity": "Info", "passed": true, "message": "Marked content (/MCID) present and at least one page has /StructParents anchors."},
    {"ruleId": "UA-TEXT-MAPPING", "severity": "Info", "passed": true, "message": "Font ToUnicode maps present (text is likely accessible)."},
    {"ruleId": "UA-LANG", "severity": "Error", "passed": true, "message": "/Lang present at document/page level."},
    {"ruleId": "UA-METADATA", "severity": "Warning", "passed": true, "message": "XMP metadata packet detected."},
    {"ruleId": "UA-TITLE", "severity": "Error", "passed": true, "message": "Document title found (Info or XMP dc:title)."},
    {"ruleId": "UA-TABS", "severity": "Warning", "passed": true, "message": "Page /Tabs setting present."},
    {"ruleId": "UA-FIG-ALT", "severity": "Info", "passed": true, "message": "Figures detected: 3; descriptive text tokens (/Alt or /ActualText): 3."},
    {"ruleId": "UA-LINK-DESC", "severity": "Info", "passed": true, "message": "Links: 2; all appear to have nearby tooltip/contents/ActualText."},
    {"ruleId": "UA-FORMS-TU", "severity": "Info", "passed": true, "message": "AcroForm detected; tooltips (/TU) count: 3."},
    {"ruleId": "UA-TABLE-A-SUMMARY", "severity": "Info", "passed": true, "message": "Tables: 3; all have /A with /Summary."},
    {"ruleId": "UA-TABLE-HEADERS", "severity": "Info", "passed": true, "message": "Tables with headers: OK=1, missing/invalid=0."}
  ],
  "tableSummaries": [
    {"index": 1, "summaryText": "Table description", "summaryRaw": "(\u00FE\u00FF\u0000T\u0000a\u0000b\u0000l\u0000e\u0000 \u0000d\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000o\u0000n)", "hasOTable": true, "source": "Obj 58: A 74 0 R", "thTotal": 3, "thWithScope": 3, "tdWithHeaders": 0, "headersOk": true, "headersApplicable": true},
    {"index": 2, "summaryText": "Inner table", "summaryRaw": "(\u00FE\u00FF\u0000I\u0000n\u0000n\u0000e\u0000r\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)", "hasOTable": true, "source": "Obj 59: A 96 0 R", "thTotal": 0, "thWithScope": 0, "tdWithHeaders": 0, "headersOk": true, "headersApplicable": false},
    {"index": 3, "summaryText": "Third table", "summaryRaw": "(\u00FE\u00FF\u0000T\u0000h\u0000i\u0000r\u0000d\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)", "hasOTable": true, "source": "Obj 60: A 122 0 R", "thTotal": 0, "thWithScope": 0, "tdWithHeaders": 0, "headersOk": true, "headersApplicable": false}
  ],
  "links": [
    {"index": 1, "linkText": "Descriptive Text", "linkTextRaw": "(\u00FE\u00FF\u0000D\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000v\u0000e\u0000 \u0000T\u0000e\u0000x\u0000t)", "targetType": "URI", "targetValue": "http://www.textcontrol.com", "targetRaw": "(http://www.textcontrol.com)", "source": "Annot window"},
    {"index": 2, "linkText": "Descriptive Text", "linkTextRaw": "(\u00FE\u00FF\u0000D\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000v\u0000e\u0000 \u0000T\u0000e\u0000x\u0000t)", "targetType": "URI", "targetValue": "http://www.textcontrol.com", "targetRaw": "(http://www.textcontrol.com)", "source": "Annot window"}
  ],
  "figures": [
    {"index": 1, "altText": "image in  table", "altRaw": "(\u00FE\u00FF\u0000i\u0000m\u0000a\u0000g\u0000e\u0000 \u0000i\u0000n\u0000 \u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)", "source": "Figure obj 55"},
    {"index": 2, "altText": "Barcode not in table", "altRaw": "(\u00FE\u00FF\u0000B\u0000a\u0000r\u0000c\u0000o\u0000d\u0000e\u0000 \u0000n\u0000o\u0000t\u0000 \u0000i\u0000n\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)", "source": "Figure obj 56"},
    {"index": 3, "altText": "Image description", "altRaw": "(\u00FE\u00FF\u0000I\u0000m\u0000a\u0000g\u0000e\u0000 \u0000d\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000o\u0000n)", "source": "Figure obj 57"}
  ],
  "forms": [
    {"index": 1, "fieldName": "list item", "fieldNameRaw": "(list item)", "fieldType": "Ch", "tooltip": "list item", "tooltipRaw": "(list item)", "source": "Obj 10"},
    {"index": 2, "fieldName": "company_name", "fieldNameRaw": "(company_name)", "fieldType": "Tx", "tooltip": "company_name", "tooltipRaw": "(company_name)", "source": "Obj 13"},
    {"index": 3, "fieldName": "is_client", "fieldNameRaw": "(is_client)", "fieldType": "Btn", "tooltip": "is_client", "tooltipRaw": "(is_client)", "source": "Obj 15"}
  ],
  "standards": [
    {"standard": "PDF/UA", "part": "1", "conformance": null, "source": "XMP"},
    {"standard": "PDF/A", "part": "3", "conformance": "A", "source": "XMP"}
  ]
}
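
The fields ending in "Raw" in this report appear to hold PDF string literals as stored in the file: the leading \u00FE\u00FF is the UTF-16BE byte order mark that PDF uses for Unicode text strings, followed by two bytes per character. As a minimal sketch of how such a value maps to its readable counterpart, the snippet below decodes one of them. The helper name DecodePdfTextString is hypothetical and not part of the validation library, and escape sequences inside the literal are deliberately not handled.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of TX Text Control.
static string DecodePdfTextString(string raw)
{
    // Strip the surrounding parentheses of the PDF literal string, if present.
    if (raw.Length >= 2 && raw[0] == '(' && raw[^1] == ')')
        raw = raw[1..^1];

    // A UTF-16BE text string starts with the bytes 0xFE 0xFF.
    if (raw.Length >= 2 && raw[0] == '\u00FE' && raw[1] == '\u00FF')
    {
        // Each char in the JSON value stands for one byte of the PDF string.
        byte[] bytes = raw.Skip(2).Select(c => (byte)c).ToArray();
        return Encoding.BigEndianUnicode.GetString(bytes);
    }

    // No byte order mark: return the single-byte text unchanged.
    return raw;
}

// "summaryRaw" of the second table in the sample report.
string summaryRaw = "(\u00FE\u00FF\u0000I\u0000n\u0000n\u0000e\u0000r\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)";
Console.WriteLine(DecodePdfTextString(summaryRaw)); // prints "Inner table"
```

The decoded value matches the "summaryText" field reported next to it, which suggests the report already performs this step and the raw value is mainly useful for debugging.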

The returned report object provides structured access to the validation results, which makes it easy to integrate into existing workflows.
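
Since the members of the report object are not listed in this post, one way to wire the result into an automated test or QA step is to work on the JSON report itself. The following is a minimal sketch using System.Text.Json. It assumes only the isPass, findings, ruleId, severity, passed, and message fields visible in the sample report above; the helper name FailedFindings and the shortened inline JSON (including its failing entry) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: collects one line per finding whose "passed" flag is false.
static List<string> FailedFindings(string reportJson)
{
    var failed = new List<string>();
    using JsonDocument doc = JsonDocument.Parse(reportJson);

    foreach (JsonElement finding in doc.RootElement.GetProperty("findings").EnumerateArray())
    {
        if (finding.GetProperty("passed").GetBoolean())
            continue;

        var ruleId = finding.GetProperty("ruleId").GetString();
        var severity = finding.GetProperty("severity").GetString();
        var message = finding.GetProperty("message").GetString();
        failed.Add($"{ruleId} [{severity}]: {message}");
    }
    return failed;
}

// Shortened report in the same shape as the sample; the second finding is invented.
string json = @"{
  ""isPass"": false,
  ""findings"": [
    { ""ruleId"": ""UA-LANG"", ""severity"": ""Error"", ""passed"": true, ""message"": ""/Lang present at document/page level."" },
    { ""ruleId"": ""UA-TITLE"", ""severity"": ""Error"", ""passed"": false, ""message"": ""(made-up failure for illustration)"" }
  ]
}";

List<string> failures = FailedFindings(json);
foreach (string line in failures)
    Console.WriteLine(line);

Console.WriteLine(failures.Count == 0 ? "PASS" : "FAIL");
```

In a build pipeline, the final check could be turned into a non-zero exit code or a failing unit test assertion so that a non-compliant document stops the run.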

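Conclusion

The upcoming TX Text Control 34.0 release will give developers the tools to create and validate PDF/UA and PDF/A-3a compliant documents directly in their .NET applications. The validation library simplifies the process of ensuring accessibility and compliance, so developers can meet industry standards with confidence.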
