Python scraping with bs4: TypeError: 'NoneType' object is not subscriptable


python web-scraping


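I hope you're well. Can you tell me why I can't get my scraping script to work correctly? :) It works with other websites. I'm a beginner, so I have probably made a basic mistake.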

import requests
from bs4 import BeautifulSoup
import time
import csv

links = []
for i in range(1):
    url = '*******/recettes/?page={}' + str(i)
    res  = requests.get(url,headers={'User-Agent': 'Mozilla/5.0'})
    response = requests.get(url)
    print(response)
    if response.ok:
        print('Page: ' + str(i))
        soup = BeautifulSoup(response.text, "html.parser")
        divs = soup.findAll('div', class_ = 'field-item even')
        for div in divs:
            a = div.find('a')
            link = a['href']
            links.append('*******' + link)
        time.sleep(3)
print(len(links))

with open('urls3.txt', 'w') as file:
    for link in links:
        file.write(link + '\n')

"""

with open('urls3.txt', 'r') as inf:
  with open('recipes3.csv', 'w') as outf:
        outf.write('titre,image,url,\n')
        for row in inf:
            url = row.strip()
            response = requests.get(url)
            if response.ok:
                soup = BeautifulSoup(response.text, "html.parser")
                titre = soup.find('h1')
                image = soup.find('img', {"id":"recipe-media-viewer-thumbnail-1"})['src']
                print(titre.text, image, url)
                outf.write(str(titre.text) + ',' + str(image) + ',' + str(url) +  '\n')
            time.sleep(1)    
"""

Can you tell me why I get this error:

<Response [200]>
Page: 0
Traceback (most recent call last):
  File "ex3.py", line 18, in <module>
    link = a['href']
TypeError: 'NoneType' object is not subscriptable



I found the answer, so I'm posting it here :) for anyone who is interested.

try:    
    image = soup.find('img', {"id":"recipe-media-viewer-thumbnail-1"})['src']
except Exception as e:
    image = None
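
As a narrower alternative to the blanket try/except above, here is a minimal sketch that checks the result of find() for None before subscripting it. The helper name get_attr_or_none and the two HTML fragments are made up for illustration; only the img id comes from the script above.

```python
from bs4 import BeautifulSoup


def get_attr_or_none(soup, name, attrs, attr):
    """Return tag[attr] for the first matching tag, or None if the tag or attribute is missing."""
    tag = soup.find(name, attrs)
    if tag is None:
        return None
    return tag.get(attr)


# Hypothetical page fragments standing in for response.text
with_image = '<h1>Cake</h1><img id="recipe-media-viewer-thumbnail-1" src="/img/cake.jpg">'
without_image = '<h1>Cake au chorizo</h1>'

for html in (with_image, without_image):
    soup = BeautifulSoup(html, "html.parser")
    image = get_attr_or_none(soup, "img", {"id": "recipe-media-viewer-thumbnail-1"}, "src")
    print(image)
```

Unlike `except Exception`, this only covers the "tag not found" case, so unrelated errors (a network failure, a typo in a variable name) still surface.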

It seems that there is no a tag under the div. But without the URL, it is hard to tell.

a = div.find('a')

No a is found here, so a is None, and subscripting None with a['href'] raises the TypeError.
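
The None result described above can be guarded inside the loop. This is only a sketch: the HTML below is invented to reproduce the situation (one div without a link), since the real page is not available.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second div has no <a>, as the traceback suggests
html = """
<div class="field-item even"><a href="/recettes/one">One</a></div>
<div class="field-item even">No link here</div>
<div class="field-item even"><a href="/recettes/two">Two</a></div>
"""

soup = BeautifulSoup(html, "html.parser")
links = []
for div in soup.find_all("div", class_="field-item even"):
    a = div.find("a")
    if a is None or not a.has_attr("href"):
        continue  # nothing to subscript, so skip this div
    links.append(a["href"])

print(links)
```

The divs that do contain a link are still collected; only the empty one is skipped.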

How can I still display the other results even when some a tags are missing? For example, another script works, but after showing 5-6 results I get the same error again:

Cake au chorizo
Traceback (most recent call last):
  File "ex2.py", line 38, in <module>
    image = soup.find('img', {"id":"recipe-media-viewer-thumbnail-1"})['src']
TypeError: 'NoneType' object is not subscriptable
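
One possible way to keep going when a page lacks the thumbnail is to write an empty field for it instead of failing. The sketch below assumes the pages have already been fetched; the pages dict, its example.com URLs, and the in-memory output are placeholders, not part of the original script. It also uses csv.writer, which quotes titles that contain commas.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-ins for the pages fetched in the second script
pages = {
    "https://example.com/a": '<h1>Tarte, version 2</h1>'
                             '<img id="recipe-media-viewer-thumbnail-1" src="/a.jpg">',
    "https://example.com/b": "<h1>Cake au chorizo</h1>",  # no thumbnail on this page
}

out = io.StringIO()  # in a real run: open('recipes3.csv', 'w', newline='')
writer = csv.writer(out)
writer.writerow(["titre", "image", "url"])

for url, html in pages.items():
    soup = BeautifulSoup(html, "html.parser")
    titre = soup.find("h1")
    img = soup.find("img", {"id": "recipe-media-viewer-thumbnail-1"})
    writer.writerow([
        titre.get_text(strip=True) if titre else "",  # empty if no <h1>
        img.get("src", "") if img else "",            # empty if no thumbnail
        url,
    ])

print(out.getvalue())
```

Every page produces a row, so a recipe without an image shows up with a blank image column rather than stopping the run.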