Python scraping with bs4 produces the wrong output


I am trying to extract the src from this HTML code:

from bs4 import BeautifulSoup


soup = BeautifulSoup(data.text, 'html.parser')
title = soup.find_all(attrs={'class': 'main-image-class'})[0].get('src')
But the output is data:image/gif;base64. How can I get the src link?

Website code:

<img src="https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=843" alt="NIKE AIR JORDAN 1 MID, GREEN" title="NIKE AIR JORDAN 1 MID, GREEN" class="main-image-class" srcset="https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=710 710w, https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=1420 710w, https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=556 556w, https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=1112 556w, https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=843 843w, https://en.aw-lab.com/dw/image/v2/BCLG_PRD/on/demandware.static/-/Sites-awlab-master-catalog/default/dwbf1e5118/images/bata/large-sport-shoe-8047751-0.jpg?sw=1686 843w" sizes="(max-width: 767px) 710px, (max-width: 1199px) and (min-width: 768px) 556px, (min-width: 1200px) 843px">


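Your code works for me on the HTML you provided. There must be another element with the same class.

I checked it as well and it works as expected, returning the URL as output.

If I scrape the entire HTML with bs4, I get the following src link: "data:image/gif;base64,r0lgodlhaqaaaaaaap//waaach5baeaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaicraeaw==".

What is the length of the collection returned by soup.find_all(attrs={'class': 'main-image-class'})?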
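
One way to check whether several elements share the class is to count the matches and skip any whose src is an inline data: URI. The snippet below is only a sketch: the HTML string and the example.com URL are made-up stand-ins for the real page, and the helper name real_src is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the full page: a placeholder <img> with an
# inline data: URI comes first, followed by the real product image.
html = """
<img class="main-image-class" src="data:image/gif;base64,AAAA">
<img class="main-image-class" src="https://example.com/real.jpg">
"""

soup = BeautifulSoup(html, "html.parser")

# Count how many elements carry the class; more than one would explain
# why index [0] returns the placeholder instead of the real image.
images = soup.find_all("img", class_="main-image-class")
print(len(images))


def real_src(tag):
    """Return the tag's src unless it is missing or an inline data: URI."""
    src = tag.get("src", "")
    if src and not src.startswith("data:"):
        return src
    return None


# Keep only the sources that point at an actual URL.
urls = [u for u in (real_src(t) for t in images) if u]
print(urls)
```

If the count is 1 and the only src is still a data: URI, the real URL is probably not in the src attribute of the HTML that was downloaded, so filtering like this would return an empty list.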