Web crawler - TooManyRedirects: Exceeded 30 redirects. (Python)

python web-crawler

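I tried to follow one of the YouTube tutorials, but I ran into some problems. I'm new to Python. I know there are one or two similar questions, but I read them and didn't understand the answers. Can someone help me? Thanks.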

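Well, the page you are trying to crawl appears to be completely broken. Try opening the URL in your web browser. When I try it with Chrome, I get the error message:
This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS
You have to decide for yourself how to handle such broken pages in your crawler.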
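One way to handle such a broken page is to catch the exception and skip the page. The following is a minimal sketch, not a definitive implementation: `fetch_html` is a hypothetical helper name, and the optional `session` argument and the 10-second timeout are assumptions added so the function is easy to try out.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the page text, or None if the URL is stuck in a redirect loop.

    `fetch_html` is a hypothetical helper, not part of requests.
    """
    # Use the given session if there is one, otherwise the module-level API.
    http = requests if session is None else session
    try:
        response = http.get(url, timeout=timeout)
    except requests.exceptions.TooManyRedirects:
        # requests gave up following redirects; treat the page as broken.
        print("Skipping (redirect loop):", url)
        return None
    return response.text
```

Inside the crawl loop you could then call `fetch_html(url)` and `continue` to the next page whenever it returns `None`.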
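The forum's URL has changed.
Two modifications to the code:
1. Changed the forum URL (the base string concatenated with str(page)).
2. allow_redirects=False (disables redirects, if any).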
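Note that with `allow_redirects=False` you get back the first response exactly as the server sent it. If that response is itself a redirect, its body will probably not contain the post links. A rough sketch of how you might check for that, where `redirect_target` and `fetch_without_following` are hypothetical helper names and the timeout is an assumption:

```python
import requests


def redirect_target(response):
    """Return the Location header if `response` is a redirect, else None.

    `redirect_target` is a hypothetical helper used for illustration.
    """
    if response.is_redirect:
        return response.headers.get("Location")
    return None


def fetch_without_following(url, timeout=10):
    # allow_redirects=False returns the first response as-is, even if it is a 3xx.
    response = requests.get(url, allow_redirects=False, timeout=timeout)
    target = redirect_target(response)
    if target is not None:
        print(url, "redirects to", target, "with status", response.status_code)
    return response
```

Printing the target this way shows where the server is trying to send you, which helps when working out whether the URL has moved.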
"Hi, thanks for your help. Could you explain it to me?" Hamster, I hope this makes it understandable.
import requests
from bs4 import BeautifulSoup
def trade_spider(max_pages):
    page = 1
    while page <= max_pages:
        url = "https://www.thenewboston.com/forum/home.php?page=" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('a', {'class': 'post-title'}):
            href = link.get('href')
            print(href)
        page += 1
trade_spider(2)

Traceback (most recent call last):
  File "C:/Users/User/PycharmProjects/Basic/WebCrawlerTest.py", line 19, in <module>
    trade_spider(2)
  File "C:/Users/User/PycharmProjects/Basic/WebCrawlerTest.py", line 9, in trade_spider
    source_code = requests.get(url)
  File "C:\Users\User\AppData\Roaming\Python\Python34\site-packages\requests\api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python34\site-packages\requests\api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python34\site-packages\requests\sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python34\site-packages\requests\sessions.py", line 594, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Users\User\AppData\Roaming\Python\Python34\site-packages\requests\sessions.py", line 594, in <listcomp>
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Users\User\AppData\Roaming\Python\Python34\site-packages\requests\sessions.py", line 114, in resolve_redirects
    raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
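The "30" in the last line of the traceback is the session's `max_redirects` setting, which the `raise` line formats into the message. If you would rather fail sooner on a looping URL, you can lower it on a `requests.Session`. A small sketch, where `make_session` is a hypothetical helper name and the limit of 5 is an arbitrary choice:

```python
import requests


def make_session(max_redirects=5):
    """Build a session that gives up after `max_redirects` redirects.

    `make_session` is a hypothetical helper; 5 is an arbitrary example limit.
    """
    session = requests.Session()
    # Session.max_redirects is the limit reported in the TooManyRedirects message.
    session.max_redirects = max_redirects
    return session
```

A call such as `make_session().get(url)` still raises `TooManyRedirects` on a redirect loop, only after fewer hops, so it does not fix the broken page by itself.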

import requests
from bs4 import BeautifulSoup


def trade_spider(max_pages):
    page = 1
    while page <= max_pages:
        # Change 1: use the forum's new URL.
        url = "https://www.thenewboston.com/forum/recent_activity.php?page=" + str(page)
        print(url)
        # Change 2: do not follow redirects, so a redirect loop cannot raise TooManyRedirects.
        source_code = requests.get(url, allow_redirects=False)
        plain_text = source_code.text
        # Name the parser explicitly so BeautifulSoup does not have to guess one.
        soup = BeautifulSoup(plain_text, "html.parser")
        for link in soup.find_all('a', {'class': 'post-title'}):
            href = link.get('href')
            print(href)
        page += 1


trade_spider(2)