HTTP Error 508: Loop Detected with Python urllib.request

python web-scraping web-crawler

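I am scraping a website with the code below. After running it twice, the third run fails with this error:

HTTP Error 508: Loop Detected

How can I prevent this? Sometimes the script works, and other times it raises this error.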

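My guess is that this is a kind of "internal server error" indicating that the server entered a loop, as described below:

So this is a server-side error, not a problem in your code.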

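from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

# url is the start page; its value is not given in the original post
headers = {'User-Agent': 'Mozilla/5.0'}
req = Request(url, headers=headers)
webpage = urlopen(req).read()
soup = BeautifulSoup(webpage, 'html.parser')

liList = soup.find('div', attrs={'class': 'columns-list'})
for a in liList.find_all('a'):
    # Fetch each linked page and print its headings
    req = Request(a.attrs['href'], headers=headers)
    webpage = urlopen(req).read()
    data = BeautifulSoup(webpage, 'html.parser')
    h = data.find('div', attrs={'class': 'first-h2'})

    print(h.h2.text)
    print(data.find('h5'))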

It indicates that the server terminated an operation because it encountered
an infinite loop while processing a request with "Depth: infinity". This
status indicates that the entire operation failed.
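
Since the error comes from the server and only shows up some of the time, the client cannot really prevent it, but it can retry. Below is a minimal sketch of that idea, not a confirmed fix: it retries only on status 508 and waits longer after each failure. The function name fetch_with_retry and the parameters retries, delay, opener and sleep are hypothetical names introduced for illustration; whether pausing actually helps with this particular site is an assumption.

```python
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_with_retry(url, retries=3, delay=1.0, opener=urlopen, sleep=time.sleep):
    """Fetch url, retrying when the server answers with HTTP 508.

    opener and sleep are injectable only so the sketch can be tested
    without network access (an assumption, not part of the original post).
    """
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(retries + 1):
        try:
            with opener(req) as response:
                return response.read()
        except HTTPError as err:
            # Re-raise anything that is not a 508, or a 508 on the last attempt
            if err.code != 508 or attempt == retries:
                raise
            # Wait longer after each failure: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** attempt)
```

In the scraping loop above, urlopen(req).read() could then be replaced by fetch_with_retry(a.attrs['href']). Any other HTTP error is raised immediately, so real problems such as a 404 are not hidden by the retries.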