Web scraping: retrieving all tweets from a hashtag. Problem resuming from the rate limit with tweepy

Tags: web-scraping, twitter, tweepy, ratelimit

I am trying to scrape all tweets with the hashtag '#nationaldoughnutday', but I keep failing because of the rate limit.

Referring to the code below, I tried putting the code in a while loop so that when the rate limit resets, I can resume scraping from the last crawled date (until_date).

However, I keep getting the error below over and over, and my crawler does not seem to resume crawling after sleeping for a long time:

TweepError Failed to send request: ('Connection aborted.', error (10054, 'An existing connection was forcibly closed by the remote host'))
Sleeping...
TweepError Failed to send request: ('Connection aborted.', error (10054, 'An existing connection was forcibly closed by the remote host'))
Sleeping...
TweepError Failed to send request: ('Connection aborted.', error (10054, 'An existing connection was forcibly closed by the remote host'))
I have tried removing the inner try/except, but that did not help either.

import datetime
import time
import tweepy

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
query = '#nationaldoughnutday'
until_date = '2019-07-01'  # Twitter's `until` parameter expects YYYY-MM-DD

while True:
    try:  # outer try/except
        tweets = tweepy.Cursor(api.search, q=query + ' -filter:retweets',
                               count=100, lang='en', tweet_mode='extended',
                               until=until_date).items()
        for tweet in tweets:
            try:  # inner try/except
                print("tweet : ", tweet.created_at)
                # this is so that if I reconnect with the cursor, I will
                # start with the date before the last crawled tweet
                until_date = tweet.created_at.date() - datetime.timedelta(days=1)
            except tweepy.TweepError as e:
                print('Inner TweepError', e)
                time.sleep(17 * 60)
                break
    except tweepy.TweepError as e:
        print('Outer TweepError', e)
        print("sleeping ....")
        time.sleep(17 * 60)
        continue
    except StopIteration:
        break
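
I suspect the exception is raised by the for statement itself when tweepy fetches the next page, so my inner try/except never sees it. Here is a rough sketch of the direction I am considering, wrapping cursor.next() directly (adapted from the rate-limit handling pattern in tweepy's documentation; the limit_handled name is only illustrative):

import time
import tweepy

def limit_handled(cursor):
    # Yield items from a tweepy Cursor, sleeping through errors that
    # are raised while the cursor fetches the next page of results.
    while True:
        try:
            yield cursor.next()
        except StopIteration:
            return  # cursor exhausted: end the generator cleanly
        except tweepy.TweepError as e:
            print('TweepError while paging:', e)
            time.sleep(15 * 60)  # wait out the 15-minute rate-limit window

# Usage: paging errors are now caught inside the generator.
# for tweet in limit_handled(tweepy.Cursor(api.search, q=query).items()):
#     print(tweet.created_at)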

Thank you in advance.

Try adding this:

wait_on_rate_limit=True

This will not solve the problem itself, since the rate limit is imposed by the Twitter API, but it still helps stop the errors from being shown.
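
For reference, a minimal sketch of constructing the client with both flags (tweepy 3.x; the credential variables are placeholders):

import tweepy

# placeholder credentials
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# wait_on_rate_limit=True makes tweepy sleep automatically whenever the
# rate limit is reached; wait_on_rate_limit_notify=True prints a notice
# when that happens, so no rate-limit errors are surfaced to the caller
# (connection errors like 10054 still need their own handling).
api = tweepy.API(auth,
                 wait_on_rate_limit=True,
                 wait_on_rate_limit_notify=True)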