Python: How to find all src attributes with lxml and replace them

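I want to use lxml to get the src values and replace them with empty strings, but body is still unchanged afterwards. Please help me, thank you.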

import re
import lxml.html
#the content of source.log is a webpage source code I got by scrapy
with open("source.log", "r") as bb:
    c_str = bb.read()
    body =  c_str.decode('utf-8')


doc  = lxml.html.fromstring(body)
src  = doc.xpath("//@src")

for ss in src:
    re.search(ss,body)
    body.replace(str(ss),'')
    print body
For example, if body is:

'src="http://pic/1379181836.jpg"/><br>紅心<br></div><div>tel:12345678</div>' \
           'src="http://pic/4447918.jpg"/>'
The result I want is:

'src=""/><br>紅心<br></div><div>tel:12345678</div>' \
           'src=""/>'

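At the very least, you need to assign the result of the replacement back to body, i.e. body = body.replace(str(ss), ''). Python strings are immutable, so replace() returns a new string and leaves body itself untouched.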
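To make the difference concrete, here is a minimal sketch in plain Python 3. The sample string is a shortened version of the one in the question, and the `urls` list is a hypothetical stand-in for what `doc.xpath("//@src")` would return, so that the snippet runs without lxml.

```python
# Shortened sample input, adapted from the question.
body = 'src="http://pic/1379181836.jpg"/><br>tel:12345678</div>src="http://pic/4447918.jpg"/>'

# Hypothetical stand-in for the values returned by doc.xpath("//@src").
urls = ["http://pic/1379181836.jpg", "http://pic/4447918.jpg"]

# Version from the question: the return value of replace() is discarded,
# so the string never changes.
ignored = body
for url in urls:
    ignored.replace(url, "")

# Fixed version: rebind the name to the new string on every iteration.
fixed = body
for url in urls:
    fixed = fixed.replace(url, "")

print(ignored == body)  # the discarded-result version left the string as it was
print(fixed)
```

Note that this approach edits the raw text, so a URL that also appears outside a src attribute would be removed as well.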

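Personally, though, I don't like this approach. It is better to find all elements that have a src attribute and set the attribute value to an empty string: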

for element in doc.xpath("//*[@src]"):
    element.attrib['src'] = ''

print lxml.html.tostring(doc)
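The snippet above is written for Python 2. As a rough Python 3 sketch of the same idea, the following assumes lxml is installed and uses a made-up HTML fragment in place of the contents of source.log:

```python
import lxml.html

# Made-up sample markup (an assumption for illustration), wrapped in a
# single root element so fromstring() returns that element directly.
html = (
    '<div><img src="http://pic/1379181836.jpg"/><br>tel:12345678'
    '<img src="http://pic/4447918.jpg"/></div>'
)

doc = lxml.html.fromstring(html)

# Select every element that carries a src attribute and blank the value.
for element in doc.xpath("//*[@src]"):
    element.attrib["src"] = ""

# encoding="unicode" makes tostring() return str instead of bytes.
result = lxml.html.tostring(doc, encoding="unicode")
print(result)
```

Because the change is made on the parsed tree, only real src attributes are affected, and the serialized output may differ from the input in small ways such as how empty tags are written.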

Thank you very much. You are right. Your code is very elegant.