
Testing an optimization of calls to the expat library's XML_Parse function

The expat documentation says:

XML_Parse

    enum XML_Status XMLCALL
    XML_Parse(XML_Parser p, const char *s, int len, int isFinal);

    enum XML_Status {
      XML_STATUS_ERROR = 0,
      XML_STATUS_OK = 1
    };
Parse some more of the document. The string s is a buffer containing part (or perhaps all) of the document. The number of bytes of s that are part of the document is indicated by len. This means that s doesn't have to be null-terminated. It also means that if len is larger than the number of bytes in the block of memory that s points at, then a memory fault is likely. Negative values for len are rejected since Expat 2.2.1. The isFinal parameter informs the parser that this is the last piece of the document. Frequently, the last piece is empty (i.e. len is zero.)

If a parse error occurred, it returns XML_STATUS_ERROR. Otherwise it returns XML_STATUS_OK value. Note that regardless of the return value, there is no guarantee that all provided input has been parsed; only after the concluding call will all handler callbacks and parsing errors have happened.

Simplified, XML_Parse can be considered a convenience wrapper that is pairing calls to XML_GetBuffer and XML_ParseBuffer (when Expat is built with macro XML_CONTEXT_BYTES defined to a positive value, which is both common and default). XML_Parse is then functionally equivalent to calling XML_GetBuffer, memcpy, and XML_ParseBuffer.

To avoid double copying of the input, direct use of functions XML_GetBuffer and XML_ParseBuffer is advised for most production use, e.g. if you're using read or similar functionality to fill your buffers, fill directly into the buffer from XML_GetBuffer, then parse with XML_ParseBuffer.

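The final part of this passage says that XML_Parse is essentially a wrapper around XML_GetBuffer and XML_ParseBuffer, with a copy from the user's buffer into the parser's buffer inserted between the two calls. If the read call uses the parser's buffer directly as its destination, the memcpy can be skipped.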

I tested this with my earlier XML-to-CSV conversion program.
The original code, expatfile.c, calls XML_Parse:

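    char buffer[8192];
    int done;
    do {
        /* Read the next chunk into our own buffer. */
        size_t len = fread(buffer, 1, sizeof(buffer), file);
        done = (len < sizeof(buffer));  /* short read: end of file or error */
        /* XML_Parse copies the chunk into the parser's internal buffer. */
        if (XML_Parse(parser, buffer, (int)len, done) == XML_STATUS_ERROR) {
            break;
        }
    } while (!done);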

The modified version, expatfile2.c, calls XML_GetBuffer and XML_ParseBuffer:

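    int done;
    do {
        /* Ask the parser for room for up to 8192 bytes. */
        void *buff = XML_GetBuffer(parser, 8192);
        if (buff == NULL) {
            break;  /* the parser could not provide a buffer */
        }
        /* Read directly into the parser's buffer, so no memcpy is needed. */
        size_t len = fread(buff, 1, 8192, file);
        done = (len < 8192);  /* short read: end of file or error */
        if (XML_ParseBuffer(parser, (int)len, done) == XML_STATUS_ERROR) {
            break;
        }
    } while (!done);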

Compile and run:

gcc expatfile.c -o expatfile -lexpat -O3
time ./expatfile /par/lineitem/xl/worksheets/sheet1.xml A1:P1000000
CSV saved to /par/lineitem/xl/worksheets/sheet1.csv

real	0m18.882s
user	0m18.168s
sys	0m0.324s

gcc expatfile2.c -o expatfile2 -lexpat -O3
time ./expatfile2 /par/lineitem/xl/worksheets/sheet1.xml A1:P1000000
CSV saved to /par/lineitem/xl/worksheets/sheet1.csv

real	0m18.909s
user	0m18.116s
sys	0m0.284s

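The test shows almost no difference between the two approaches: 18.882 s versus 18.909 s of real time, and 18.168 s versus 18.116 s of user time, a gap that a single run of each cannot distinguish from noise. Perhaps memcpy is now fast enough that its cost simply does not show up.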
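One way to sanity-check the guess about memcpy is to time the copy in isolation. The sketch below is not taken from the original program: `time_chunked_memcpy` is a hypothetical helper that copies a given number of bytes in 8192-byte chunks, the chunk size used in the read loops above, and returns the CPU time consumed. The copies go through a volatile function pointer so that an optimizing compiler cannot drop them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define CHUNK 8192 /* same chunk size as the read loops above */

/* Hypothetical helper, not part of expat or the original program.
 * Copies total_bytes (rounded up to whole chunks) from one buffer to
 * another in CHUNK-sized pieces, roughly the extra work XML_Parse does
 * on each call, and returns the CPU time spent in seconds.
 * *checksum_out receives the XOR of the first byte of every copied chunk. */
double time_chunked_memcpy(size_t total_bytes, unsigned char *checksum_out)
{
    static unsigned char src[CHUNK], dst[CHUNK];
    /* Calling through a volatile pointer keeps the compiler from
     * inlining or eliminating the copies. */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;
    unsigned char checksum = 0;

    assert(checksum_out != NULL);
    memset(src, 'x', sizeof src);

    clock_t start = clock();
    for (size_t copied = 0; copied < total_bytes; copied += CHUNK) {
        src[0] = (unsigned char)(copied / CHUNK); /* vary the input */
        copy(dst, src, CHUNK);
        checksum ^= dst[0]; /* consume the result */
    }
    clock_t end = clock();

    *checksum_out = checksum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Passing the size of the input file in bytes gives a rough estimate of the copy cost that the XML_GetBuffer variant avoids, which can then be compared with the roughly 18 seconds of total run time measured above. The figure is only indicative, since both buffers stay in the CPU cache in this loop.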

