
IOError with lxml etree parse function

Developer (https://www.devze.com) 2023-03-13 07:07 Source: web

I have logic like this:

for root, dirs, files in os.walk(os.getcwd()):
    if "info.xml" in files:
        root = lxml.etree.parse("%s/info.xml" % root)
        tag = root.xpath("/info/tagname")[0].text

When it parses an info.xml that sits very deep under the current path, I get this error message:

Traceback (most recent call last):
  File "/home/work/mergefile.py", line 365, in <module>
  File "/home/work/mergefile.py", line 344, in merge_ejb_files
  File "/home/work/mergefile.py", line 63, in __init__
  File "/home/work/mergefile.py", line 78, in _parse_info2doc
  File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
  File "parser.pxi", line 1491, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71205)
  File "parser.pxi", line 1520, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:71488)
  File "parser.pxi", line 1420, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:70583)
  File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:67736)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
  File "parser.pxi", line 563, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64056)
IOError: Error reading file '/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml': failed to load external entity "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml"

But the file "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml" exists, and I can parse it with lxml from an IPython session.

Do you know what the problem is? If so, please help. Thank you!


Here's my solution, as per my comment above. I open each file for reading and close it right after, so I don't hit the 1024 open-file limit.

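import os
import lxml.etree as etree

for root, dirs, files in os.walk(os.getcwd()):
    if "info.xml" in files:
        # Open the file ourselves so it is closed as soon as the block exits.
        # Binary mode lets the parser read the encoding from the XML declaration.
        with open(os.path.join(root, "info.xml"), "rb") as processfile:
            xml = etree.parse(processfile)
        # Query the parsed document, not the directory path string `root`.
        tag = xml.xpath("/info/tagname")[0].text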
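If the cause really is the per-process open-file limit mentioned above, it can help to check that limit and watch how many descriptors the process holds while the walk runs. The following is only a rough sketch, not part of the original answer: the helper names `fd_limits` and `open_fd_count` are made up for illustration, the `resource` module is Unix-only, and the `/proc/self/fd` listing assumes a Linux-style /proc filesystem.

```python
import os
import resource  # Unix-only standard-library module


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count the descriptors currently open (assumes a Linux-style /proc)."""
    # The listing itself briefly holds one descriptor, so treat the
    # number as relative rather than exact.
    return len(os.listdir("/proc/self/fd"))


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit: %s, hard limit: %s" % (soft, hard))
    print("open descriptors: %d" % open_fd_count())
```

Calling `open_fd_count()` every few hundred directories inside the `os.walk` loop should show whether the count keeps climbing toward the soft limit; if it stays flat, the IOError probably has a different cause.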