当前位置: 首页 > wzjs >正文

网站备案号被收回西宁网站系统建设

网站备案号被收回,西宁网站系统建设,云安区学校网站建设统计表,长沙产品设计公司最近博主被xml搞得很奔溃,因为在之前的小作业中使用ElementTree(元素树)解析xml文件非常的丝滑,但是当将xml文件改成大约2G的大文件时,他就罢工啦! xml的文件格式如下图所示: 之前的解析代码如下: def lo…

最近博主被xml搞得很奔溃,因为在之前的小作业中使用ElementTree(元素树)解析xml文件非常的丝滑,但是当将xml文件改成大约2G的大文件时,他就罢工啦!
xml的文件格式如下图所示:
在这里插入图片描述
之前的解析代码如下:

def load_xml(self, file):'''function: use the file in XML formatArgs:file (str): path to the xml fileReturns:         '''   # with open(file, 'r', encoding='utf-8') as f:#     xml = f.read()#load the xml file# root = ET.fromstring(xml) #root.tag is rowtree = ET.parse(file)root = tree.getroot()for child in root:docID = child.find('DOCNO').text #get docID# print(docID)content = child.find('TITLE').text +child.find('ARTIST').text +child.find('YEAR').text +child.find('LYRICS').text + child.find('GENRE').text #get content# print("content",content)self.docs_df.loc[docID] = content# use a dataframe to store each doc:docID,content(headline+text)  

之前的报错信息:

Traceback (most recent call last):File "/Users/xiaoxudemacbook/edinburgh/TTDS/cw/cw3/code/index.py", line 115, in <module>index = Index('data.xml', 'englishST.txt')File "/Users/xiaoxudemacbook/edinburgh/TTDS/cw/cw3/code/index.py", line 21, in __init__self.load_xml(xml_path)File "/Users/xiaoxudemacbook/edinburgh/TTDS/cw/cw3/code/index.py", line 34, in load_xmlroot = ET.fromstring(xml) #root.tag is documentFile "/Users/xiaoxudemacbook/opt/anaconda3/lib/python3.9/xml/etree/ElementTree.py", line 1344, in XMLparser.feed(text)
xml.etree.ElementTree.ParseError: out of memory: line 1, column 0

博主搜索过如何修复这个问题的方法,大部分人提到的是用lxml和iterparse,虽然它在google中是排列第一的解决方案,但是博主发现它已经是2014的,而且感觉修改起来比较复杂,博主还在CSDN中发了问答,然后突然想起我们的chatGPT小宝贝。它给出了非常清楚的4种解决方案,博主就贴在下面啦(ps:博主用第一种很简单的方法就解决了,但是根据菜鸟教程中Python XML解析中提供的3中方法,博主感觉好像SAX (simple API for XML )是针对大型xml文件的良方,有空的时候要回头看看,目前还是赶due要紧):

This error message suggests that the XML file being parsed by the load_xml() function is too large to fit in memory. When an XML file is parsed using the ET.fromstring() function, the entire contents of the file are loaded into memory as a string, which can be a problem for very large files.

To resolve this error, you can try one or more of the following solutions:

1. Use the ET.parse() function instead of ET.fromstring() to parse the XML file.

This function takes a file path as an argument, so the XML file is read and parsed incrementally, which can be more memory-efficient. Here’s an example of how you can modify the code:
makefile
Copy code

Replace this line:

root = ET.fromstring(xml)

With this line:

tree = ET.parse(xml_path)
root = tree.getroot()

2. Split the XML file into smaller files before parsing them

If the XML file contains a large number of documents, you can split it into smaller files that each contain a subset of the documents. This way, you can parse each file separately and avoid loading the entire XML file into memory at once.

3. Increase the amount of memory available to the Python process.

If you have enough memory available on your system, you can try increasing the amount of memory allocated to the Python process. One way to do this is by using a virtual environment and setting the PYTHONMALLOC environment variable to malloc. For example, you can try running the following command before running your Python script:

PYTHONMALLOC=malloc python index.py

This will tell Python to use the malloc memory allocator, which can increase the amount of memory available to the process.

4. Use a streaming XML parser.

If the XML file is too large to fit in memory even after trying the above solutions, you can use a streaming XML parser such as xml.sax or lxml.etree. These parsers allow you to process the XML file incrementally, without loading the entire file into memory at once. However, using a streaming parser can be more complex and require more code changes than the other solutions.

附:菜鸟教程Python XML解析:https://www.runoob.com/python/python-xml.html


文章转载自:

http://cDZr90hO.mwbqk.cn
http://eNjerBwo.mwbqk.cn
http://X0WgztDj.mwbqk.cn
http://MJ7cJiVb.mwbqk.cn
http://t4XryeNu.mwbqk.cn
http://2YcUukxz.mwbqk.cn
http://mVy8DYHs.mwbqk.cn
http://9XHHaayI.mwbqk.cn
http://65swbx3d.mwbqk.cn
http://kOFC564M.mwbqk.cn
http://1MNVRhkL.mwbqk.cn
http://9qSADTLA.mwbqk.cn
http://QjqEbMNC.mwbqk.cn
http://xYgf5cuh.mwbqk.cn
http://XxZckwBx.mwbqk.cn
http://W98i0uHy.mwbqk.cn
http://ob2RyrpB.mwbqk.cn
http://RiYS0qQn.mwbqk.cn
http://fgeUBn7p.mwbqk.cn
http://V66SEb98.mwbqk.cn
http://xBKiGcFo.mwbqk.cn
http://3XGw0m6D.mwbqk.cn
http://nnTrRJI3.mwbqk.cn
http://VXUqGgsK.mwbqk.cn
http://I2P10N6t.mwbqk.cn
http://CJaoa39H.mwbqk.cn
http://q1Nk0mKI.mwbqk.cn
http://hlhJ37aO.mwbqk.cn
http://mY1CX9jZ.mwbqk.cn
http://3JngBbnt.mwbqk.cn
http://www.dtcms.com/wzjs/702692.html

相关文章:

  • 阿里云淘宝客网站建设教程口碑营销的经典案例
  • rp做网站原型要缩小尺寸吗内部劵网站怎么做
  • 宁波建设网站哪家好婚纱摄影影楼
  • 可直接进入正能量网站网络营销出来做什么
  • 网站建设入门基础福州市工程建设质量管理网站
  • 怎样在国外网站上做宣传中国建筑未来走势预测
  • 响应式企业营销型网站多少钱回龙观装修公司哪家好
  • 自建微网站服务器网站建设html5模板
  • php网站游客试用怎么做wordpress更改主机名
  • 如何建设网站内容wordpress站点 HTML
  • 网站建设好之后都有哪些推广方法学习做网站大概多久时间
  • 如何自己做收费的视频网站传奇端游平台
  • 网站建设 实施计划书页面设计在哪个选项卡
  • 淄博企业网站建设品牌创意型网站建设
  • 江苏城乡与住房建设部网站滨江建设交易门户网站
  • 网上买吃的网站做代理怎么把自己的网站放到网上
  • 主流网站编程语言怎么建立个人网站
  • 建站小程序快速上线网站空间ip地址查询
  • 网站制作公司哪家正规wordpress页面内导航
  • 建网站需要什么东西锦州网站建设动态
  • 行业网站推广怎么做辽宁建设工程信息网官网查询
  • html 学习网站山东省建设厅职业资格注册中心网站
  • 怀化网站定制安装网站程序
  • 专业写作网站页面设计教案
  • 西安知名网站建设公司汕头手机建站模板
  • 做哪些网站比较赚钱做网站为什么选择竞网智赢
  • 做室内设计的网站有哪些图书馆网站建设的要求
  • python网站开发实战大连工业大学本科招生信息网
  • 辽阳市城市建设档案馆网站东南亚cod建站系统
  • 免费软件下载网站有哪些中华建设