
Get paths from XML files (Python)


I have 300 XML files, each of which contains a path (see the code below), and I want to use Python to collect these paths into a list (.CSV).

 <da:AdminData>
    <da:Datax />
    <da:DataID>223</da:DataID>
    <da:Date>2013-08-19</da:Date>
    <da:Time>13:27:25</da:Time>
    <da:Modification>2013-08-19</da:Modification>
    <da:ModificationTime>13:27:25</da:ModificationTime>
    **<da:Path>D:\08\06\xxx-aaa_20130806_111339.dat</da:Path>**
    <da:ID>xxx-5225-fff</da:ID>

I wrote the following code, but it does not work for subdirectories:

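import glob
import os
import re

xmlpath = r'D:'
outfilename = "result.csv"

# Only matches *.xml directly inside xmlpath; subdirectories are not searched.
xml_files = glob.glob(os.path.join(xmlpath, '*.xml'))

# Compile the pattern once, outside the loop.
pattern = re.compile("<da:Path>(.*)</da:Path>")
output = ""

for xml_file in xml_files:
    with open(xml_file) as fh:
        text = fh.read()
    a = pattern.search(text)
    if a:
        output += '\n' + a.group(1)

# The original opened the undefined name `outfile`; the variable is `outfilename`.
with open(outfilename, "w") as logfile:
    logfile.write(output)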

To glob recursively, it is best to combine os.walk with fnmatch.fnmatch. For example:

import os
import fnmatch


def recursive_glob(rootdir, pattern):
    matching_files = []
    for d, _, fnames in os.walk(rootdir):
        matching_files.extend(
            os.path.join(d, fname) for fname in fnames
            if fnmatch.fnmatch(fname, pattern)
        )
    return matching_files


# A raw string cannot end in a single backslash, so escape it instead.
xmlfiles = recursive_glob("D:\\", "*.xml")
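
On Python 3, the same recursive search can probably be written more compactly with pathlib. The following is only a sketch of that alternative, not part of the original answer; it reuses the name recursive_glob for convenience and returns the matches sorted, which the os.walk version does not do.

```python
from pathlib import Path


def recursive_glob(rootdir, pattern):
    """Return paths of all files below rootdir whose name matches pattern."""
    root = Path(rootdir)
    # rglob() descends into every subdirectory, unlike a plain glob("*.xml").
    return sorted(str(p) for p in root.rglob(pattern) if p.is_file())
```

Called as recursive_glob("D:\\", "*.xml"), it should find XML files at any depth below the drive root, which is the part the code in the question was missing.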

You should not use regexps to parse XML. Use a proper XML parser.
What does a sample entry with subdirectories look like?
I get an empty list :(
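
As a rough illustration of the XML parser suggestion in the comments, here is a minimal sketch using the standard library's xml.etree.ElementTree. It assumes each file is well-formed and declares the da namespace prefix somewhere (the fragment in the question does not show the declaration, so the namespace URI is unknown); to avoid guessing it, the code matches on the local tag name Path only. The function names extract_paths and write_paths_csv and the two-column CSV layout are illustrative choices, not something from the original thread.

```python
import csv
import xml.etree.ElementTree as ET


def extract_paths(xml_file):
    """Return the text of every element whose local name is 'Path'."""
    paths = []
    for elem in ET.parse(xml_file).iter():
        # Namespaced tags look like '{uri}Path'; strip the '{uri}' part.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "Path" and elem.text:
            paths.append(elem.text.strip())
    return paths


def write_paths_csv(xml_files, csv_file):
    """Write one row (source file, extracted path) per match."""
    with open(csv_file, "w", newline="") as fh:
        writer = csv.writer(fh)
        for xml_file in xml_files:
            for path in extract_paths(xml_file):
                writer.writerow([xml_file, path])
```

Combined with a recursive file search, something like write_paths_csv(xmlfiles, "result.csv") would then produce the CSV the question asks for. Files that are not well-formed XML will raise xml.etree.ElementTree.ParseError, so a real script may want to catch that per file.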