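Lately XML has been driving me up the wall. In an earlier small assignment, parsing XML files with ElementTree went very smoothly, but once I swapped in a file of roughly 2 GB, it simply gave up!
The XML file is structured as shown in the figure below:
[Figure: structure of the XML file]
My earlier parsing code was: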

def load_xml(self, file):
    '''
    function: use the file in XML format
    Args:
        file (str): path to the xml file
    Returns:
    '''
    # with open(file, 'r', encoding='utf-8') as f:
    #     xml = f.read()  # load the xml file
    # root = ET.fromstring(xml)  # root.tag is row
    tree = ET.parse(file)
    root = tree.getroot()
    for child in root:
        docID = child.find('DOCNO').text  # get docID
        # print(docID)
        content = (child.find('TITLE').text + child.find('ARTIST').text
                   + child.find('YEAR').text + child.find('LYRICS').text
                   + child.find('GENRE').text)  # get content
        # print("content", content)
        # use a dataframe to store each doc: docID, content (headline+text)
        self.docs_df.loc[docID] = content

The earlier error message:

Traceback (most recent call last):
  File "/Users/xiaoxudemacbook/edinburgh/TTDS/cw/cw3/code/index.py", line 115, in <module>
    index = Index('data.xml', 'englishST.txt')
  File "/Users/xiaoxudemacbook/edinburgh/TTDS/cw/cw3/code/index.py", line 21, in __init__
    self.load_xml(xml_path)
  File "/Users/xiaoxudemacbook/edinburgh/TTDS/cw/cw3/code/index.py", line 34, in load_xml
    root = ET.fromstring(xml) #root.tag is document
  File "/Users/xiaoxudemacbook/opt/anaconda3/lib/python3.9/xml/etree/ElementTree.py", line 1344, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: out of memory: line 1, column 0

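I searched for ways to fix this. Most people suggest lxml and iterparse. That is the top result on Google, but the answer dates from 2014 and the change looked fairly involved. I also posted a question on CSDN, and then suddenly remembered our dear ChatGPT. It gave four very clear solutions, which I have pasted below. (PS: the first, very simple method solved it for me. Judging from the three approaches listed in the Runoob tutorial on Python XML parsing, though, SAX (Simple API for XML) looks like the right remedy for large XML files. I will come back to it when I have time; for now, meeting the deadline comes first.)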

This error message suggests that the XML file being parsed by the load_xml() function is too large to fit in memory. When an XML file is parsed using the ET.fromstring() function, the entire contents of the file are loaded into memory as a string, which can be a problem for very large files.

To resolve this error, you can try one or more of the following solutions:

1. Use the ET.parse() function instead of ET.fromstring() to parse the XML file.

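This function takes a file path as an argument, so the file is read in chunks instead of first being loaded into one huge Python string. The complete element tree is still built in memory, but avoiding the extra copy of the raw text can be enough to get through. Here’s an example of how you can modify the code: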

Replace this line:

root = ET.fromstring(xml)

With this line:

tree = ET.parse(xml_path)
root = tree.getroot()

2. Split the XML file into smaller files before parsing them

If the XML file contains a large number of documents, you can split it into smaller files that each contain a subset of the documents. This way, you can parse each file separately and avoid loading the entire XML file into memory at once.
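
A minimal sketch of the splitting step, assuming each document sits in its own element directly under the root. The tag name `DOC`, the function name `split_xml`, and the `part_0000.xml` naming are made up for illustration; the post does not show them. It relies on `ET.iterparse` to walk the big file without loading it whole, so in practice splitting already needs the streaming approach from option 4.

```python
import os
import xml.etree.ElementTree as ET


def split_xml(source, out_dir, docs_per_file=1000, doc_tag="DOC"):
    """Write every `docs_per_file` documents to their own small XML file.

    doc_tag is an assumption: the per-document element name is not given
    in the post. Returns the list of paths that were written.
    """
    paths, batch = [], []

    def flush(root_tag):
        # Wrap the collected documents in a fresh root with the original tag.
        part = ET.Element(root_tag)
        part.extend(batch)
        path = os.path.join(out_dir, f"part_{len(paths):04d}.xml")
        ET.ElementTree(part).write(path, encoding="utf-8", xml_declaration=True)
        paths.append(path)
        batch.clear()

    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == doc_tag:
            batch.append(elem)
            root.clear()  # detach finished documents from the growing tree
            if len(batch) == docs_per_file:
                flush(root.tag)
    if batch:
        flush(root.tag)  # write the final, possibly smaller, batch
    return paths
```

Each resulting file can then be handed to the existing `load_xml` one at a time, for example `for path in split_xml('data.xml', 'parts'): index.load_xml(path)`, where the `parts` directory is assumed to exist already.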

3. Increase the amount of memory available to the Python process.

If your system has enough free memory, check that nothing is capping the Python process, such as a 32-bit interpreter or an operating-system resource limit. A related knob is the PYTHONMALLOC environment variable, which can be set to malloc before running the script:

PYTHONMALLOC=malloc python index.py

This only tells Python to use the C library's malloc instead of its own small-object allocator (pymalloc). It does not raise any memory limit, so on its own it is unlikely to fix an out-of-memory ParseError.

4. Use a streaming XML parser.

If the XML file is too large to fit in memory even after trying the above solutions, you can use a streaming XML parser such as xml.sax, or iterparse() from xml.etree.ElementTree or lxml.etree. These parsers allow you to process the XML file incrementally, without loading the entire file into memory at once. However, using a streaming parser can be more complex and require more code changes than the other solutions.
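
As a rough sketch of what the streaming route could look like with the standard library's `iterparse()`: the field names come from the `load_xml` code above, while the per-document tag `DOC` and the helper name `iter_docs` are assumptions, since the post never names them.

```python
import xml.etree.ElementTree as ET

# Field names taken from the load_xml code in the post.
FIELDS = ("TITLE", "ARTIST", "YEAR", "LYRICS", "GENRE")


def iter_docs(source, doc_tag="DOC"):
    """Yield (docID, content) pairs one document at a time.

    doc_tag is an assumption: the per-document element name is not given
    in the post. `source` may be a file path or a binary file object.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == doc_tag:
            doc_id = elem.findtext("DOCNO", default="")
            # Concatenate the fields without separators, as load_xml does.
            content = "".join(elem.findtext(f, default="") for f in FIELDS)
            yield doc_id, content
            root.clear()  # drop finished documents so memory use stays flat
```

Inside `load_xml`, the loop over `root` could then become `for docID, content in iter_docs(file): self.docs_df.loc[docID] = content`. Unlike `child.find(...).text`, `findtext` with a default returns an empty string for a missing field instead of raising an error.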

Appendix: Runoob tutorial on Python XML parsing: https://www.runoob.com/python/python-xml.html
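
For the SAX approach mentioned earlier, a handler might look roughly like the sketch below. It uses the standard `xml.sax` module; the class name `DocHandler`, the `on_doc` callback, and the per-document tag `DOC` are hypothetical and not taken from the original code.

```python
import xml.sax

# Field names taken from the load_xml code in the post.
FIELDS = {"DOCNO", "TITLE", "ARTIST", "YEAR", "LYRICS", "GENRE"}


class DocHandler(xml.sax.ContentHandler):
    """Collect the fields of each document and pass them to a callback."""

    def __init__(self, on_doc, doc_tag="DOC"):
        super().__init__()
        self.on_doc = on_doc      # called with a dict for every finished document
        self.doc_tag = doc_tag    # assumed tag name, not given in the post
        self.current = None       # fields of the document being read
        self.field = None         # name of the field being read, if any
        self.buffer = []          # text chunks of the current field

    def startElement(self, name, attrs):
        if name == self.doc_tag:
            self.current = {}
        elif self.current is not None and name in FIELDS:
            self.field = name
            self.buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate them.
        if self.field is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == self.field:
            self.current[name] = "".join(self.buffer)
            self.field = None
        elif name == self.doc_tag and self.current is not None:
            self.on_doc(self.current)
            self.current = None
```

It could be driven with `xml.sax.parse('data.xml', DocHandler(handle_doc))`, where `handle_doc` is whatever function stores one document. Only the document currently being read is kept in memory, which is why SAX suits very large files, at the cost of more bookkeeping than ElementTree.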

