Image crawler AttributeError: 'int' object has no attribute 'img' (Python 3.7.6.4)
You are storing item as an int in the for loop over range(1, 3), and then you try to access .img on it. What you actually want to apply .img to is your BeautifulSoup object, soup.
You also need to clean up your lists. In one of them you correctly prepend https:, while in the other you do not.
Take a look at the modifications I made to your code:
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36"
}


def main(url):
    with requests.Session() as req:
        # "page" is the int page number; it is never a parsed tag.
        for page in range(1, 3):
            print(f"Extracting Page# {page}")
            r = req.get(url.format(page), headers=headers)
            soup = BeautifulSoup(r.content, 'html.parser', from_encoding='utf-8')
            target = []
            # .img is looked up on the tags selected from soup, not on the page number.
            for item in soup.select("dt.image"):
                src = item.img['src']
                # Prepend the scheme only when the src does not already carry one.
                if 'http' not in src:
                    src = f'https:{src}'
                target.append([item.img['alt'], src])
            for el in target:
                print(f"{el[0]}.jpg")
                r = req.get(el[1])
                with open(f"{el[0]}.jpg", 'wb') as f:
                    f.write(r.content)


main("https://www.coupang.com/np/categories/311357?page={}")
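
For reference, the error can be reproduced without any network access. The snippet below is only an illustrative sketch (the function name reproduce_error is made up here, it is not from the question): it shows that the loop variable coming from range() is a plain int, and that in Python 3 a list comprehension's variable does not leak out and rebind the outer name.

```python
def reproduce_error():
    """Return the message raised when .img is looked up on the loop variable."""
    for item in range(1, 3):
        try:
            item.img  # item is an int (1, then 2), not a BeautifulSoup tag
        except AttributeError as exc:
            return str(exc)


message = reproduce_error()
print(message)  # 'int' object has no attribute 'img'

# A comprehension has its own scope in Python 3, so reusing the name "item"
# inside it does not change the outer "item".
item = 1
upper = [item.upper() for item in ["a", "b"]]
print(type(item).__name__, upper)  # int ['A', 'B']
```

This is why a check written before the comprehension still sees the int, even though the comprehension reuses the same variable name for the selected tags.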
Welcome to StackOverflow. Please take a look, then edit your question. To fix the error, change the line
if 'http' in item.img['src']:
to
if 'http' in soup.img['src']:
import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36"
}


def main(url):
    with requests.Session() as req:
        for item in range(1, 3):
            print("Extracting Page# %s" % item)
            r = req.get(url.format(item), headers=headers)
            soup = BeautifulSoup(r.content, 'html.parser', from_encoding='utf-8')
            # soup.img is the first <img> tag on the page; item is only the page number.
            if 'http' in soup.img['src']:
                # The src already carries a scheme, so use it unchanged.
                target = [[each.img['alt'], each.img["src"]]
                          for each in soup.select("dt.image")]
            else:
                # The src has no scheme (for example //host/path), so prepend one.
                target = [[each.img['alt'], 'https:' + each.img["src"]]
                          for each in soup.select("dt.image")]
            for el in target:
                print('%s.jpg' % el[0])
                r = req.get(el[1])
                with open('%s.jpg' % el[0], 'wb') as f:
                    f.write(r.content)


main("https://www.coupang.com/np/categories/311357?page={}")
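
As an alternative to checking for 'http' by hand, the scheme handling could be delegated to the standard library. This is a minimal sketch, not part of the answers above: normalize_src and PAGE_URL are hypothetical names, and the img.example.com addresses are placeholders. urllib.parse.urljoin resolves a src against the page it came from, so scheme-less, absolute, and site-relative values are all handled by one call.

```python
from urllib.parse import urljoin

# Assumed base: the page the <img> tags were scraped from.
PAGE_URL = "https://www.coupang.com/np/categories/311357?page=1"


def normalize_src(src, base=PAGE_URL):
    """Resolve an <img> src against the URL of the page it was found on."""
    return urljoin(base, src)


print(normalize_src("//img.example.com/a.jpg"))       # https://img.example.com/a.jpg
print(normalize_src("http://img.example.com/a.jpg"))  # http://img.example.com/a.jpg
print(normalize_src("/static/a.jpg"))                 # https://www.coupang.com/static/a.jpg
```

With a helper like this, each entry could be built as [each.img['alt'], normalize_src(each.img['src'])], and the if/else on the first image would no longer be needed.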