How to get the full image data from a thumbnail using BeautifulSoup in Python?

I'm trying to write a Python program that downloads a random image based on a search query. Here is what I have so far:

import requests
from bs4 import BeautifulSoup
import random

query = 'pets' #This can be anything, this is just for demonstration 
adlt = 'on'
count = '10'

#I tried using Google but Bing is more cooperative
URL='https://bing.com/images/search?q=' + query + '&safeSearch=' + adlt + '&count=' + count

html_page = requests.get(URL)

soup = BeautifulSoup(html_page.content, 'html.parser')

images = soup.find_all('img')

example = random.choice(images)

imageLink = example.attrs['src']

print(imageLink)

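So, what this code does is go to Bing's image search and grab all the img tags on the page. It then picks one at random and prints its URL to the terminal. But as you may know, what Bing's and Google's image search pages show isn't the real image but a smaller version of it, and you have to click it to reach the real image. So, given the data from this thumbnail, how can I get to the real image?

Here is the thumbnail's HTML, in case it's needed: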

<img class="mimg" style="color: rgb(157, 102, 46);" height="180" width="323" src="https://th.bing.com/th/id/OIP.1lJSjlsM4xmvJQTDwkOcbgHaEH?w=323&h=180&c=7&o=5&dpr=1.25&pid=1.7" alt="Image result for pets" data-thhnrepbd="1" data-bm="180">

And here is the markup for the full image behind that thumbnail:

<img src="http://www.insuranceportals.us/wp-content/uploads/2018/07/Pets-Health-Insurance-Wise-Investment-Or-Waste-of-Money.jpeg" alt="See the source image" class=" nofocus" tabindex="0" aria-label="See the source image">

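The page is loaded dynamically, so requests alone won't work, since it doesn't run the page's JavaScript. We can use Selenium to scrape the page instead.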

Install it with:

pip install selenium

Download the correct ChromeDriver for your Chrome version.
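The approach described above might look roughly like the sketch below. It is not the answer's original code: it assumes Selenium 4 with a Chrome and ChromeDriver setup that Selenium can find, and the helper names `build_search_url`, `fetch_rendered_html`, and `random_img_tag` are hypothetical. The fixed sleep is a crude stand-in for a proper wait. Selenium and BeautifulSoup are imported inside the functions that need them, so the URL helper works on its own.

```python
import random
import time
from urllib.parse import urlencode


def build_search_url(query, adlt="on", count="10"):
    """Build the Bing image search URL from the question, escaping the query."""
    params = {"q": query, "safeSearch": adlt, "count": count}
    return "https://bing.com/images/search?" + urlencode(params)


def fetch_rendered_html(url, wait_seconds=3, headless=True):
    """Load the page in Chrome and return the HTML after its scripts have run."""
    from selenium import webdriver  # imported lazily; requires `pip install selenium`

    options = webdriver.ChromeOptions()
    if headless:
        options.add_argument("--headless")  # run without opening a browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(wait_seconds)  # assumption: a few seconds is enough for the results to load
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails


def random_img_tag(html):
    """Parse the rendered HTML and return one randomly chosen <img> tag, or None."""
    from bs4 import BeautifulSoup  # imported lazily; requires `pip install beautifulsoup4`

    images = BeautifulSoup(html, "html.parser").find_all("img")
    return random.choice(images) if images else None


# Typical use (needs Chrome, Selenium, and network access):
#   html = fetch_rendered_html(build_search_url("pets"))
#   print(random_img_tag(html))
```

Passing `headless=False` opens a visible browser window, which can help when checking what the page actually renders.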

Output:

<img alt="Turtle" data-bm="78" data-priority="2" data-thhnrepbd="1" height="42" src2="https://th.bing.com/th?q=Pet+Turtle&amp;w=42&amp;h=42&amp;c=1&amp;p=0&amp;pid=InlineBlock&amp;mkt=en-US&amp;adlt=moderate&amp;t=1" width="42"/>

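You answered it yourself: you need to "click it" to see the image. That is, find the URL the user is sent to on click and follow it. The thumbnail is a cached copy of the file stored on Bing's servers, and the original URL can't be reached from the thumbnail alone.

@JeffUK So, is there a way to "click the image" using BeautifulSoup or any other API?

You can use Selenium with Python.

@Gealber As far as I know, Selenium is basically a browser emulator. Every time Selenium starts, the browser you chose opens, which isn't convenient. I'd prefer something that runs in the background.

Yes, that's true, but you can also configure Selenium to run in the background as a headless browser. Something like driver_options.add_argument("--headless"). The problem is that you can't simulate a click with bs4, because it is just an HTML parser; a good one, but nothing more. It can't handle JavaScript for you.

This is probably just another way of doing what my code already does. The problem remains: the output you get is a link to the image's thumbnail, not to the actual image.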
<img alt="Turtle" data-bm="78" data-priority="2" data-thhnrepbd="1" height="42" src2="https://th.bing.com/th?q=Pet+Turtle&amp;w=42&amp;h=42&amp;c=1&amp;p=0&amp;pid=InlineBlock&amp;mkt=en-US&amp;adlt=moderate&amp;t=1" width="42"/>
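One possible way to act on the advice in the comments, finding the URL behind the click rather than the thumbnail's own `src`, is sketched below. It rests on an assumption the thread does not confirm: that each result thumbnail is wrapped in an anchor with class `iusc` whose `m` attribute holds JSON containing a `murl` field with the original image URL. Bing has served markup like this, but it is undocumented and may change, and the anchors may only appear in rendered HTML or in responses to a browser-like request. The sketch uses the standard library's `HTMLParser` so it runs without extra packages; with BeautifulSoup the equivalent lookup would be `soup.find_all("a", class_="iusc")` followed by `json.loads(a["m"])["murl"]`. The class name `FullImageLinkParser`, the function `extract_full_image_urls`, and the sample markup are all made up for illustration.

```python
import json
from html.parser import HTMLParser


class FullImageLinkParser(HTMLParser):
    """Collect full-size image URLs from <a class="iusc" m="{...}"> anchors."""

    def __init__(self):
        super().__init__()
        self.full_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if "iusc" not in classes or not attr.get("m"):
            return
        try:
            meta = json.loads(attr["m"])  # HTMLParser has already unescaped &quot;
        except json.JSONDecodeError:
            return  # the attribute was not valid JSON; skip this anchor
        if isinstance(meta, dict) and "murl" in meta:
            self.full_urls.append(meta["murl"])


def extract_full_image_urls(html):
    """Return the `murl` values found in the given HTML, in document order."""
    parser = FullImageLinkParser()
    parser.feed(html)
    return parser.full_urls


# Hand-written sample imitating the assumed markup; this is not real Bing output.
SAMPLE = (
    '<a class="iusc" m="{&quot;murl&quot;:&quot;http://example.com/full.jpeg&quot;,'
    '&quot;turl&quot;:&quot;https://th.bing.com/th/id/OIP.example&quot;}">'
    '<img class="mimg" src="https://th.bing.com/th/id/OIP.example"></a>'
)

print(extract_full_image_urls(SAMPLE))
```

If the assumption holds for the page you fetch, picking a random entry from the returned list would give a link to the original image instead of the thumbnail.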