
Trying to scrape image URLs, but unable to get them with BeautifulSoup and Python


I am scraping this link (the URL is in the code below):

and trying to get the image URLs:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import json


AMEXurl = ['https://www.americanexpress.com/in/credit-cards/all-cards/?sourcecode=A0000FCRAA&cpid=100370494&dsparms=dc_pcrid_408453063287_kword_american%20express%20credit%20card_match_e&gclid=Cj0KCQiApY6BBhCsARIsAOI_GjaRsrXTdkvQeJWvKzFy_9BhDeBe2L2N668733FSHTHm96wrPGxkv7YaAl6qEALw_wcB&gclsrc=aw.ds']
identity = ['filmstrip_container']

# Download the page and parse it with lxml
html_1 = urlopen(AMEXurl[0])
soup_1 = BeautifulSoup(html_1, 'lxml')
# Locate the <div class="filmstrip_container"> that holds the card strip
address = soup_1.find('div', attrs={"class": identity[0]})

# For each card container, try to read the image URL
for x in address.find_all('div', class_='filmstrip-imgContainer'):
    print(x.find('div').get('img'))
But the output I get is as follows:

None
None
None
None
None
None
None
Below is an image of the HTML code from which I am trying to get the image URLs:

This is the section of the page I want to get the URLs from.

I would like to know what changes to the code would let me get all of the image URLs.
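
For what it's worth, x.find('div').get('img') asks the nested div for an HTML attribute literally named img, and BeautifulSoup's Tag.get() returns None when that attribute does not exist, which is why every iteration prints None. Below is a hedged sketch of the lookup that was presumably intended, continuing from the snippet above (card_images is just an illustrative name); note that, as the answers explain, the card artwork is injected by script, so the src values may still be missing from the statically served HTML.

# Hedged sketch: read the src attribute of the nested <img> tag instead of
# asking the <div> for an attribute named "img". This may still yield None or
# placeholder values if the images are filled in by JavaScript after page load.
card_images = []
for x in address.find_all('div', class_='filmstrip-imgContainer'):
    img = x.find('img')                      # the <img> nested inside the container
    if img is not None:
        card_images.append(img.get('src'))   # attribute value, or None if absent

print(card_images)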

Try this:

import requests
from bs4 import BeautifulSoup

AMEXurl = ['https://www.americanexpress.com/in/credit-cards/all-cards/?sourcecode=A0000FCRAA&cpid=100370494&dsparms=dc_pcrid_408453063287_kword_american%20express%20credit%20card_match_e&gclid=Cj0KCQiApY6BBhCsARIsAOI_GjaRsrXTdkvQeJWvKzFy_9BhDeBe2L2N668733FSHTHm96wrPGxkv7YaAl6qEALw_wcB&gclsrc=aw.ds']
identity = ['filmstrip_container']

# Fetch the page once with requests and parse the response body with lxml
r = requests.get(AMEXurl[0])
soup_1 = BeautifulSoup(r.content, 'lxml')
Extract all images.
Show all image tags that are .png files.
Show all image tags that are .svg files.
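
A minimal sketch of what those three steps could look like with BeautifulSoup, continuing from soup_1 above (filtering on the src attribute is an assumption here):

import re

# Extract all <img> tags
images = soup_1.find_all('img')

# Show all image tags whose src points at a .png file
png_images = soup_1.find_all('img', src=re.compile(r'\.png'))

# Show all image tags whose src points at an .svg file
svg_images = soup_1.find_all('img', src=re.compile(r'\.svg'))

print(len(images), len(png_images), len(svg_images))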

They are loaded dynamically from a script tag. You can easily regex them out of the response's .text. The regex below specifically matches the 7 images you said you want to retrieve and showed in your picture:

import requests, re

# The card data, including the image URLs, sits inside a <script> tag,
# so fetch the raw page text and pull out every "imgurl":"..." value.
r = requests.get('https://www.americanexpress.com/in/credit-cards/all-cards/?sourcecode=A0000FCRAA&cpid=100370494&dsparms=dc_pcrid_408453063287_kword_american%20express%20credit%20card_match_e&gclid=Cj0KCQiApY6BBhCsARIsAOI_GjaRsrXTdkvQeJWvKzFy_9BhDeBe2L2N668733FSHTHm96wrPGxkv7YaAl6qEALw_wcB&gclsrc=aw.ds').text
p = re.compile(r'imgurl":"(.*?)"')
links = p.findall(r)
print(links)

Regex explanation: the pattern matches the literal text imgurl":" and then lazily captures everything up to the next double quote, i.e. the image URL value.


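If the captured imgurl values turn out to be relative paths rather than full URLs (an assumption; inspect the printed list first), they can be resolved against the site root with urllib.parse.urljoin:

from urllib.parse import urljoin

# Resolve possibly-relative paths against the site root;
# already-absolute URLs pass through urljoin unchanged.
base = 'https://www.americanexpress.com'
absolute_links = [urljoin(base, u) for u in links]
print(absolute_links)
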
Should you instead decide to go with the more expensive option of Selenium, the matching approach would be:

links = [i.get_attribute('src') for i in driver.find_elements_by_css_selector('.filmstrip-imgContainer img')]
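
For reference, a fuller, self-contained sketch of the Selenium route, assuming Selenium 4's By-based locators and a locally installed Chrome driver (the tracking parameters from the original URL are omitted here):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

URL = 'https://www.americanexpress.com/in/credit-cards/all-cards/'

driver = webdriver.Chrome()
try:
    driver.get(URL)
    # Wait until the card-strip images have been rendered by the page's JavaScript
    WebDriverWait(driver, 15).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.filmstrip-imgContainer img'))
    )
    links = [img.get_attribute('src')
             for img in driver.find_elements(By.CSS_SELECTOR, '.filmstrip-imgContainer img')]
    print(links)
finally:
    driver.quit()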


Thank you very much for the code, but unfortunately I could not get the image URLs for all 7 cards with it. If you don't mind, I would be glad if you could also help me apply this now and learn more about these URLs.
I am at work right now; I will answer you later.