Python: Replacing HTML img src after extracting with XPath

I extracted some HTML from a web page, and everything displays correctly except the images, because their src values are incorrect.

#!C:/Python27/python
from lxml import etree
import requests

q = "http://www.dlib.org/dlib/november14/giannakopoulos/11giannakopoulos.html"
page = requests.get(q)
tree = etree.HTML(page.text)
element = tree.xpath('./body/form/table[3]/tr/td/table[5]')
content = etree.tostring(element[0])
print "Content-type: text/html\n"
print content.strip()

Next, I read the correct img src values (the ones I want) and put them into an array:

pic=[]
link = q.rsplit("/",1)
images = tree.xpath("//img/@src")
for i in images:
    if i.find('.gif') == -1:
        pic.append(link[0]+"/"+i)

How can I replace the scraped src values with the ones from the array?

I'm fairly sure this is what you are looking for:

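# Base URL of the page: everything before the last "/"
link = q.rsplit("/",1)
images = tree.xpath("//img")

# Rewrite each non-GIF src in place on the parsed tree
for image in images:
    if '.gif' not in image.attrib['src']:
        image.attrib['src'] = link[0]+'/'+image.attrib['src']

# Print every src to check the result
for image in images:
    print image.attrib['src']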

It loops over every selected image and, if '.gif' is not in the image's src attribute, updates the src attribute to the PNG/JPG path you specified.

Output:

../../../img2/space.gif
../../../img2/search2.gif
../../../img2/space.gif
../../../img2/D-Lib-blocks.gif
../../../img2/transparent.gif
../../../img2/magazine.gif
../../../img2/transparent.gif
../../../img2/transparent.gif
../../../img2/space.gif
../../../img2/space.gif
http://www.dlib.org/dlib/november14/giannakopoulos/giann-formula1.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig1-sm.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig2.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig3.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig4.png
http://www.dlib.org/dlib/november14/giannakopoulos/giannakopoulos.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/foufoulas.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/stamatogiannakis.png
http://www.dlib.org/dlib/november14/giannakopoulos/dimitropoulos.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/manola.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/ioannidis.png
../../../img2/transparent.gif
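
Because the loop changes the attributes on the parsed tree itself, serializing the extracted element after the loop should already contain the new src values. The following is a minimal sketch of that idea, not the answer's exact code. It is written for Python 3 (the snippets above are Python 2) and swaps the manual string concatenation for the standard library's urljoin. The markup in SAMPLE_HTML, the id "content", and the helper name absolutize_images are hypothetical stand-ins so the example runs without a network request. Only the page URL and the image file names are taken from the question and the output above.

from urllib.parse import urljoin

from lxml import etree

# Page URL from the question; used as the base for relative paths.
PAGE_URL = "http://www.dlib.org/dlib/november14/giannakopoulos/11giannakopoulos.html"

# Hypothetical markup standing in for the real article page.
SAMPLE_HTML = """
<html><body>
  <table id="content"><tr><td>
    <img src="../../../img2/space.gif">
    <img src="giann-fig1-sm.png">
    <img alt="image without a src attribute">
  </td></tr></table>
</body></html>
"""


def absolutize_images(root, page_url, skip=".gif"):
    """Rewrite <img src> values under root in place (hypothetical helper)."""
    # Selecting img[@src] skips images that have no src attribute at all.
    for img in root.xpath(".//img[@src]"):
        src = img.get("src")
        if skip in src:
            continue  # leave GIFs untouched, as the answer's loop does
        img.set("src", urljoin(page_url, src))
    return root


tree = etree.HTML(SAMPLE_HTML)
element = tree.xpath('//table[@id="content"]')[0]
absolutize_images(element, PAGE_URL)

# Serialize only after the attributes have been updated.
content = etree.tostring(element, encoding="unicode", method="html")
print(content)

In the question's script this would mean running the replacement loop before the etree.tostring(element[0]) call rather than after it. urljoin also resolves paths such as ../../../img2/space.gif against the page URL, so the skip check could be dropped if the GIFs should become absolute as well.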