Python lxml etree: find the nearest preceding element
The XML document is structured as follows:
<a>
    <b>
        <d/>
    </b>
    <c attr1="important"/>
    <b>
        <d/>
    </b>
    <c attr1="so important" />
    <b></b>
</a>
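The task now is:
Get the attribute from the nearest <c> tag preceding the current <d> tag (if there is one).
The naive approach is to do something like this: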
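for el in elems:
    # Start at the parent, then walk back through previous siblings,
    # climbing one level whenever the siblings run out.
    it = el.getparent()
    while it is not None and it.tag != 'c':
        prev = it.getprevious()
        if prev is None:
            it = it.getparent()
        else:
            it = prev
    if it is not None:
        print(el, it.get("attr1"))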
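But this does not look simple to me. Am I missing something in the documentation? How can I solve this without implementing my own iterator?
Use the XPath preceding axis:
The preceding axis indicates all the nodes that precede the context node in the document, except any ancestor, attribute and namespace nodes.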
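To make the "except any ancestor" part concrete, here is a minimal sketch. The two-element document and its attribute values are made up for illustration and are not part of the question; the outer <c> wraps the <d>, so it is an ancestor rather than a preceding node.

```python
from lxml import etree

# Hypothetical document: the second <c> encloses <d>, so it is an ancestor of <d>.
root = etree.fromstring('<a><c attr1="sibling"/><c attr1="wrapper"><d/></c></a>')
d = root.find(".//d")

# preceding:: skips ancestors, so only the sibling <c> is returned.
preceding = [c.get("attr1") for c in d.xpath("preceding::c")]
# ancestor:: is the axis that would pick up the enclosing <c>.
ancestors = [c.get("attr1") for c in d.xpath("ancestor::c")]

print(preceding)
print(ancestors)
```

If an enclosing <c> should count as well, the two axes would have to be combined; the question's document does not need that, because its <c> elements are empty.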
Demo:
>>> from lxml import etree
>>>
>>> data = """
... <a>
...     <b>
...         <d/>
...     </b>
...
...     <c attr1="important"/>
...     <b>
...         <d/>
...     </b>
...     <c attr1="so important" />
...     <b></b>
... </a>
... """
>>> xmltree = etree.fromstring(data)
>>> elems = xmltree.xpath('//d')
>>>
>>> for el in elems:
...     try:
...         print(el.xpath("preceding::c[@attr1]")[-1].get("attr1"))
...     except IndexError:
...         print("No preceding 'c' element.")
...
No preceding 'c' element.
important
Since I need the nearest one, I think it should be el.xpath("preceding::c[@attr1]")[-1], shouldn't it? Everything else: thank you very much.
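As a possible variation, the nearest match can be picked inside the XPath expression itself. On a reverse axis such as preceding, an XPath 1.0 positional predicate counts backwards from the context node, so [1] refers to the closest match. The sketch below is only one way to wrap this: the helper name nearest_preceding_attr1 is hypothetical, and the sample adds a <d/> to the last <b> (not in the original document) so that one element has two preceding <c> candidates.

```python
from lxml import etree

# Same shape as the question's document, plus an assumed <d/> in the last <b>.
SAMPLE = """
<a>
    <b><d/></b>
    <c attr1="important"/>
    <b><d/></b>
    <c attr1="so important"/>
    <b><d/></b>
</a>
"""


def nearest_preceding_attr1(el):
    """Return attr1 of the closest preceding <c>, or None if there is none.

    Hypothetical helper: [1] on the reverse 'preceding' axis selects the
    nearest match, and the trailing /@attr1 step yields its value.
    """
    hits = el.xpath("preceding::c[@attr1][1]/@attr1")
    return hits[0] if hits else None


root = etree.fromstring(SAMPLE)
results = [nearest_preceding_attr1(d) for d in root.iter("d")]
print(results)
```

Returning None instead of relying on IndexError keeps the calling loop free of try/except; whether that is preferable depends on how missing matches should be handled.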