Saving an image/table from a web page when scraping with Python


I need to scrape an image from this website: for example, for
stackoverflow, towardsdatascience

URL

stackoverflow.com
towardsdatascience.com
I don't know how to get the information from the table/image contained in:

<div class="sparkline" style="width: 1225px;"><div id="wm-graph-anchor"><div id="wm-ipp-sparkline" title="Explore captures for this URL" style="height: 77px;"><canvas class="sparkline-canvas" width="1225" height="75" alt="sparklines"></canvas></div></div><div id="year-labels"><span class="sparkline-year-label">1996</span><span class="sparkline-year-label">1997</span><span class="sparkline-year-label">1998</span><span class="sparkline-year-label">1999</span><span class="sparkline-year-label">2000</span><span class="sparkline-year-label">2001</span><span class="sparkline-year-label">2002</span><span class="sparkline-year-label">2003</span><span class="sparkline-year-label">2004</span><span class="sparkline-year-label">2005</span><span class="sparkline-year-label">2006</span><span class="sparkline-year-label">2007</span><span class="sparkline-year-label">2008</span><span class="sparkline-year-label">2009</span><span class="sparkline-year-label">2010</span><span class="sparkline-year-label">2011</span><span class="sparkline-year-label">2012</span><span class="sparkline-year-label">2013</span><span class="sparkline-year-label">2014</span><span class="sparkline-year-label">2015</span><span class="sparkline-year-label">2016</span><span class="sparkline-year-label">2017</span><span class="sparkline-year-label">2018</span><span class="sparkline-year-label">2019</span><span class="sparkline-year-label selected-year">2020</span></div></div>

Can you give me some pointers on how to get these images?

If you have each image URL available in a for loop, you can download the images with the urlretrieve function from Python's urllib.request library.

First, at the start of the script, add:

import os
from urllib.parse import urlparse
import urllib.request
Then use:

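# df_url is the DataFrame that holds the 'URL' column shown in the question
for url in df_url['URL']:
    # save each download under the last path component of its URL
    urllib.request.urlretrieve(url, os.path.basename(urlparse(url).path))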

If you don't want to save the files under the URL basename, you can skip the first two imports.
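The loop above can be wrapped into a small helper. This is only a sketch: `filename_from_url`, `download_all`, and the `download.bin` fallback name are hypothetical names introduced here, not part of the original answer. The fallback is there because a URL with no path component (for example a bare domain) has an empty basename, which cannot be used as a file name.

```python
import os
from urllib.parse import urlparse
import urllib.request


def filename_from_url(url, fallback="download.bin"):
    """Return the last path component of url, or fallback if there is none."""
    name = os.path.basename(urlparse(url).path)
    return name or fallback


def download_all(urls, target_dir="."):
    """Download every URL in urls into target_dir and return the saved paths."""
    saved = []
    for url in urls:
        path = os.path.join(target_dir, filename_from_url(url))
        urllib.request.urlretrieve(url, path)  # performs the network request
        saved.append(path)
    return saved
```

With the DataFrame from the question, this would be called as `download_all(df_url['URL'])`. Note that several URLs without a path would all map to the same fallback name and overwrite each other.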

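Thanks. Do you know whether div class="sparkline" is the right tag to target? I don't know which one corresponds to that image. Update: it doesn't work, because I get a ValueError:
ValueError: unknown url type:
The URL type is correct, but I get this error when using urllib.request: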
for url in df_url['URL']:
  urllib.request.urlretrieve(url,os.path.basename(urlparse(url).path))
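One possible cause of this error, assuming the URL column holds bare hostnames like the ones listed in the question (stackoverflow.com, towardsdatascience.com): urllib.request raises "ValueError: unknown url type" when a URL has no scheme such as https://. A minimal sketch of a workaround follows; `ensure_scheme` and `has_known_url_type` are hypothetical helpers introduced here, and defaulting to https is an assumption.

```python
import urllib.request


def ensure_scheme(url, default_scheme="https"):
    """Prefix url with default_scheme when it has no 'scheme://' part."""
    url = url.strip()
    if "://" in url:
        return url
    return f"{default_scheme}://{url}"


def has_known_url_type(url):
    """Return True if urllib accepts url, False if it raises ValueError."""
    try:
        urllib.request.Request(url)  # only parses the URL, sends nothing
    except ValueError:
        return False
    return True
```

In the loop, the URL passed to urlretrieve would then be `ensure_scheme(url)` instead of `url`.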