Python function for XML lists

Tags: python, xml, parsing

I have parsed an XML file that looks like this. I may not have copied it perfectly, but it should be close enough; here it is:

     <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE raml SYSTEM 'raml20.dtd'>
        <raml version="2.0" xmlns="raml20.xsd">
        <cmData type="actual">
            <managedObject class="LN" distName="PTR" id="2425">
              <p name="aak">220</p>
              <p name="orp">05</p>
              <p name="name">Portro</p>
              <p name="optres">false</p>
              <p name="optblu">false</p>
              <p name="aoptdet">false</p>
              <p name="advcell">false</p>
              <list name="sibList">
                <item>
                  <p name="sibcity">177</p>
                  <p name="sibrep">2</p>
                </item>
                <item>
                  <p name="sibcity">177</p>
                  <p name="sibrep">1</p>
                </item>
              </list>
            </managedObject>
            <managedObject class="LN" distName="KRNS" id="93886">
              <p name="aak">150</p>
              <p name="orp">05</p>
              <p name="name">Portro</p>
              <p name="optres">false</p>
              <p name="optblu">tru</p>
              <p name="aoptdet">false</p>
              <p name="advcell">true</p>
              <list name="sibList">
                <item>
                  <p name="sibcity">177</p>
                  <p name="sibrep">1</p>
                </item>
                <item>
                  <p name="sibcity">180</p>
                  <p name="sibrep">2</p>
                </item>
               </list>
            </managedObject>
             ....
            <managedObject>
             ...
            </managedObject>

            ...
        </cmData>
        </raml>

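Here is a solution that parses the XML file, compares the managedObject elements with each other (as written, each object is compared with the next one in the file), and prints the resulting diff objects: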

import json
from xml.etree import ElementTree


tree = ElementTree.parse('raml20.xml')

ns = {'ns': 'raml20.xsd'}
nsP, nsList, nsItem = ('{%s}%s' % (ns['ns'], i) for i in ('p', 'list', 'item'))


def pkv(o):
    """Return dict with name:text of p elements"""
    return {k.attrib['name']: k.text for k in o.iter(nsP)}


def parse(tree):
    """Return dict mapping each managedObject's distName to its parameters"""
    root = tree.getroot()
    objs = {}
    for mo in root.findall('./ns:cmData/ns:managedObject', ns):
        obj = pkv(mo)
        for i in mo.iter(nsList):
            obj[i.attrib['name']] = [pkv(j) for j in i.iter(nsItem)]
        objs[mo.attrib['distName']] = obj
    return objs


def diff_dicts(d1, d2, ignore_keys=frozenset()):
    """Return dict with differences between the dicts provided as arguments"""
    k1 = set(d1.keys())
    k2 = set(d2.keys())
    diff = {}
    diff.update(
        {i: (d1[i], d2[i]) for i in (k1 & k2) - ignore_keys if d1[i] != d2[i]})
    diff.update({i: (d1.get(i), d2.get(i)) for i in (k1 ^ k2) - ignore_keys})
    return diff


def diff_lists(l1, l2):
    """Return dict with differences between lists of dicts provided as arguments"""
    diff = {}
    # note: assumes that lists are of same length
    for i, (d1, d2) in enumerate(zip(l1, l2)):
        d = diff_dicts(d1, d2)
        if d:
            diff[i] = d
    return diff


def diff_objects(o1, o2):
    """Return dict with differences between two objects (dicts) provided as arguments"""
    listkeys = set(
        i for o in (o1, o2) for i in o if isinstance(o.get(i), list))
    diff = diff_dicts(o1, o2, listkeys)
    for i in listkeys:
        if i in o1 and i in o2:
            diff.update({i: diff_lists(o1[i], o2[i])})
        else:
            diff.update({i: (o1.get(i), o2.get(i))})
    return diff


def compare_objects(objs):
    """Return list of (name1, name2, diff) for each pair of consecutive objects"""
    diffs = []
    keys = list(objs)
    for k1, k2 in zip(keys[:-1], keys[1:]):
        o1, o2 = objs[k1], objs[k2]
        diff = diff_objects(o1, o2)
        if diff:
            diffs.append((k1, k2, diff))
    return diffs


res = compare_objects(parse(tree))
print(json.dumps(res, indent=2))

I tested it with the following raml20.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE raml SYSTEM 'raml20.dtd'>
<raml version="2.0" xmlns="raml20.xsd">
  <cmData type="actual">
    <managedObject class="LN" distName="PTR" id="2425">
      <p name="aak">220</p>
      <p name="orp">05</p>
      <p name="name">Portro</p>
      <p name="optres">false</p>
      <p name="optblu">false</p>
      <p name="aoptdet">false</p>
      <p name="advcell">false</p>
      <list name="sibList">
        <item>
          <p name="sibcity">177</p>
          <p name="sibrep">2</p>
        </item>
        <item>
          <p name="sibcity">177</p>
          <p name="sibrep">1</p>
        </item>
      </list>
    </managedObject>
    <managedObject class="LN" distName="KRNS" id="93886">
      <p name="aak">150</p>
      <p name="orp">05</p>
      <p name="name">Portro</p>
      <p name="optres">false</p>
      <p name="optblu">tru</p>
      <p name="aoptdet">false</p>
      <p name="advcell">true</p>
      <list name="sibList">
        <item>
          <p name="sibcity">177</p>
          <p name="sibrep">1</p>
        </item>
        <item>
          <p name="sibcity">180</p>
          <p name="sibrep">2</p>
        </item>
       </list>
    </managedObject>
  </cmData>
</raml>
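
One detail of the answer's pkv helper is worth spelling out: Element.iter walks all descendants, so the p elements nested inside list/item are also collected at the top level of each object (which is why sibcity and sibrep show up next to aak in the result). If that is not wanted, a possible variant is to look only at direct children with findall. The sketch below is an assumption about what the asker may prefer, not part of the original answer; the names pkv_direct, parse_direct and SAMPLE are made up for illustration.

```python
from xml.etree import ElementTree

NS = "raml20.xsd"  # default namespace declared on the <raml> root element
P, LIST, ITEM, MO = ("{%s}%s" % (NS, tag) for tag in ("p", "list", "item", "managedObject"))

# Shortened, hypothetical sample in the same shape as the test file.
SAMPLE = """<raml version="2.0" xmlns="raml20.xsd">
  <cmData type="actual">
    <managedObject class="LN" distName="PTR" id="2425">
      <p name="aak">220</p>
      <list name="sibList">
        <item>
          <p name="sibcity">177</p>
          <p name="sibrep">2</p>
        </item>
      </list>
    </managedObject>
  </cmData>
</raml>"""


def pkv_direct(elem):
    """Return {name: text} for <p> elements that are direct children of elem."""
    return {p.attrib["name"]: p.text for p in elem.findall(P)}


def parse_direct(root):
    """Return {distName: parameters}, keeping list items only under their list name."""
    objs = {}
    for mo in root.iter(MO):
        obj = pkv_direct(mo)
        for lst in mo.findall(LIST):
            obj[lst.attrib["name"]] = [pkv_direct(item) for item in lst.findall(ITEM)]
        objs[mo.attrib["distName"]] = obj
    return objs


objs = parse_direct(ElementTree.fromstring(SAMPLE))
print(objs)
```

With this variant sibcity and sibrep appear only inside sibList, so each sibList stays attached to its own managedObject.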

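Your question is hard to understand. So, basically, you want to traverse the XML and process all managedObject nodes? Have you tried lxml or BeautifulSoup?

Yes, I have tried. I have updated my code. But now I can't attribute a sibList to a specific managedObject. In the end I need an Excel file with the managedObjects as columns and the parameters as rows. The values would be the text, for example: 220, 05, Portro, etc. I should mention that I used the etree parser. @techouse

The indentation of that code block is very erratic; please paste your code in, highlight it, and use the {} button to format the code block. Your description of what you want to do is very vague, but I'd say there is probably no existing function that does exactly what you want, and you may need to write it yourself.

@jovicbg You're welcome. If the answer solved your problem, please mark it as the accepted answer.

That's great. Now I need to compare all managedObjects with one reference object.

If you have a reference managedObject, just modify the compare_objects function to compare against that specific object. For an example, see this gist.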
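
The gist mentioned in the last comment is not reproduced here, so the following is only a minimal sketch of the idea under stated assumptions: compare_to_reference is a hypothetical name, and diff_objects is a simplified stand-in for the answer's function (it diffs flat dicts and does not descend into lists).

```python
def diff_objects(o1, o2):
    """Simplified stand-in for the answer's diff_objects: {key: (value1, value2)}."""
    missing = object()  # sentinel so a missing key never equals a real value
    diff = {}
    for key in sorted(set(o1) | set(o2)):
        if o1.get(key, missing) != o2.get(key, missing):
            diff[key] = (o1.get(key), o2.get(key))
    return diff


def compare_to_reference(objs, ref_name):
    """Diff every object against objs[ref_name]; return (ref_name, name, diff) tuples."""
    ref = objs[ref_name]
    diffs = []
    for name, obj in objs.items():
        if name == ref_name:
            continue  # do not compare the reference with itself
        diff = diff_objects(ref, obj)
        if diff:
            diffs.append((ref_name, name, diff))
    return diffs


# Hypothetical, already-parsed objects keyed by distName (values taken from the test file).
objs = {
    "PTR": {"aak": "220", "orp": "05", "optblu": "false"},
    "KRNS": {"aak": "150", "orp": "05", "optblu": "tru"},
}
result = compare_to_reference(objs, "PTR")
print(result)
```

The only change relative to the pairwise loop is that the first operand is fixed to the reference object instead of moving through the list.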
Output of the answer's script for the raml20.xml test file:
[
  [
    "PTR",
    "KRNS",
    {
      "advcell": [
        "false",
        "true"
      ],
      "optblu": [
        "false",
        "tru"
      ],
      "sibcity": [
        "177",
        "180"
      ],
      "aak": [
        "220",
        "150"
      ],
      "sibrep": [
        "1",
        "2"
      ],
      "sibList": {
        "0": {
          "sibrep": [
            "2",
            "1"
          ]
        },
        "1": {
          "sibcity": [
            "177",
            "180"
          ],
          "sibrep": [
            "1",
            "2"
          ]
        }
      }
    }
  ]
]
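
In the comments the asker also mentions needing a spreadsheet with managedObjects as columns and parameters as rows. A rough sketch of that layout, assuming the objects are already parsed into dicts keyed by distName and hold only scalar parameters (list parameters such as sibList would need their own flattening rule): it writes CSV, which Excel can open, using only the standard library. The name to_rows and the "parameter" header are made up for illustration; producing a native .xlsx file would need a third-party library.

```python
import csv
import io


def to_rows(objs):
    """Build rows with one parameter per row and one managedObject per column."""
    names = list(objs)
    params = []
    for obj in objs.values():
        for key in obj:
            if key not in params:  # keep first-seen order of parameter names
                params.append(key)
    rows = [["parameter"] + names]
    for param in params:
        # Empty cell when an object does not define the parameter.
        rows.append([param] + [objs[name].get(param, "") for name in names])
    return rows


# Hypothetical, already-parsed objects (scalar parameters only).
objs = {
    "PTR": {"aak": "220", "name": "Portro"},
    "KRNS": {"aak": "150", "name": "Portro"},
}

buf = io.StringIO()  # stand-in for an open file, so the sketch has no side effects
csv.writer(buf, lineterminator="\n").writerows(to_rows(objs))
print(buf.getvalue())
```

To write an actual file, the StringIO buffer can be replaced with a file opened with newline="" as the csv module documentation recommends.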