Python: error in a conditional with lxml etree


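I am trying to remove everything between <question> and </question> when the text between <answer> and </answer> is the number -66.

I get the following error: TypeError: argument of type 'NoneType' is not iterable, raised on the line: if element.tag == 'answer' and '-66' in element.text:

What is wrong here? Any help is appreciated.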

#!/usr/local/bin/python2.7
# -*- coding: UTF-8 -*- 

from lxml import etree

planhtmlclear_utf=u"""
<questionaire>
<question>
<questiontext>What's up?</questiontext>
<answer></answer>
</question>
<question>
<questiontext>Cool?</questiontext>
<answer>-66</answer>
</question>
</questionaire>

"""

html = etree.fromstring(planhtmlclear_utf)
questions = html.xpath('/questionaire/question')
for question in questions:
    for element in question.getchildren():
        if element.tag == 'answer' and '-66' in element.text:
            html.xpath('/questionaire')[0].remove(question)
print etree.tostring(html) 

element.text is None on some iterations. The error says that it cannot search for '-66' inside None, so check first that element.text is not None, like this:

html = etree.fromstring(planhtmlclear_utf)
questions = html.xpath('/questionaire/question')
for question in questions:
    for element in question.getchildren():   
        if element.tag == 'answer' and element.text and '-66' in element.text:
            html.xpath('/questionaire')[0].remove(question)
print etree.tostring(html)

The full script, with the result pretty-printed through BeautifulSoup:

from lxml import etree
import BeautifulSoup

planhtmlclear_utf=u"""
<questionaire>
<question>
<questiontext>What's up?</questiontext>
<answer></answer>
</question>
<question>
<questiontext>Cool?</questiontext>
<answer>-66</answer>
</question>
</questionaire>"""

html = etree.fromstring(planhtmlclear_utf)
questions = html.xpath('/questionaire/question')
for question in questions:
    for element in question.getchildren():   
        if element.tag == 'answer' and element.text and '-66' in element.text:
            html.xpath('/questionaire')[0].remove(question)

soup = BeautifulSoup.BeautifulStoneSoup(etree.tostring(html))
print soup.prettify()
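The line in your XML on which it fails is <answer></answer>, because an empty element has no text and its .text is therefore None.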
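The failure mode can be reproduced in isolation. The following is a minimal sketch in Python 3 syntax (the thread itself uses Python 2.7); the helper name has_marker is hypothetical and only illustrates the guard, it is not part of lxml.

```python
from lxml import etree

# One empty <answer> and one with text, mirroring the question's data.
root = etree.fromstring(
    "<question><answer></answer><answer>-66</answer></question>"
)
empty, filled = root.findall("answer")

# lxml reports the text of an empty element as None, not as "".
print(repr(empty.text))
print(repr(filled.text))

# `in` needs something searchable on its right-hand side, so None raises.
try:
    "-66" in empty.text
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)


def has_marker(element, marker="-66"):
    """Hypothetical helper: True only if the element has text containing marker."""
    return element.text is not None and marker in element.text


print(has_marker(empty), has_marker(filled))
```

Testing `element.text is not None` first lets the `and` short-circuit, so the membership test only runs when there is a string to search.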


Or, to do this in a more compact way:

from lxml import etree
import BeautifulSoup    

# abbreviating to reduce answer length...
planhtmlclear_utf=u"<questionaire>.........</questionaire>"

html = etree.fromstring(planhtmlclear_utf)
[question.getparent().remove(question) for question in html.xpath('/questionaire/question[answer/text()="-66"]')]
print BeautifulSoup.BeautifulStoneSoup(etree.tostring(html)).prettify()
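Since the compact version above abbreviates the XML, here is a self-contained sketch of the same idea in Python 3 syntax. It assumes the XML from the question and uses a plain for loop instead of a list comprehension, because the loop runs only for its side effect; the variable names xml, root and remaining are illustrative.

```python
from lxml import etree

# The XML from the question, condensed onto fewer lines.
xml = """<questionaire>
<question><questiontext>What's up?</questiontext><answer></answer></question>
<question><questiontext>Cool?</questiontext><answer>-66</answer></question>
</questionaire>"""

root = etree.fromstring(xml)

# The predicate selects only questions whose <answer> text equals "-66".
# An empty <answer> has no text node, so it never matches and no None check is needed.
for question in root.xpath('/questionaire/question[answer/text()="-66"]'):
    question.getparent().remove(question)

# Collect the question texts that survived the removal.
remaining = [q.findtext("questiontext") for q in root.findall("question")]
print(remaining)
```

Note that the predicate tests for equality with "-66", whereas the original condition `'-66' in element.text` is a substring test; the two agree on the sample data but are not identical in general.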

An alternative to checking whether element.text is None is to refine the XPath:

questions = html.xpath('/questionaire/question[answer/text()="-66"]')
for question in questions:
    question.getparent().remove(question)
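The brackets [...] mean "such that". So

question                          # find all question elements
[                                 # such that
  answer                          # it has an answer subelement
    /text()                       # whose text
  =                               # equals
  "-66"                           # "-66"
]

Wow, that was really helpful! Thanks! Maybe you can help me one step further :-P I now get the output: What's up?....... so the answer is not shown completely... why?

This solves the problem; it does not touch the other answer elements... With the example above I got the answer elements... but I don't know why... Anyway, this solution works!

No, sorry... it is cutting the empty answer tags... why does it always do that?

I'm not sure I understand the problem. Do you mean that <answer></answer> is shortened to <answer/>? That's fine; they are equivalent.

Yes, that is what I mean... but what can I do to prevent it? I need the tags correctly formatted. Thanks a lot!

import lxml.html as lh. Then lh.tostring(html)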
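
The suggestion in the comments, serializing with lxml.html instead of etree, can be tried out with a small sketch like the one below (Python 3 syntax). It assumes that HTML serialization writes an empty, non-void element with separate opening and closing tags; the variable names xml_out and html_out are illustrative.

```python
from lxml import etree
import lxml.html as lh

# A tree with one empty <answer> element, parsed as XML.
root = etree.fromstring("<question><answer></answer></question>")

# XML serialization may collapse the empty element into a self-closing tag.
xml_out = etree.tostring(root)

# lxml.html.tostring serializes with method="html" by default.
html_out = lh.tostring(root)

print(xml_out)
print(html_out)
```

Both forms are equivalent to an XML parser, so this only matters when a downstream consumer expects the explicit open/close pair.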