Python: handling connection interruptions in a loop, unexpected behavior


I have the following function, driven by a for loop, which uses Tweepy to fetch the follower IDs for a list of users:

def download_followers(user, api):
    all_followers = []
    try:
        for page in tweepy.Cursor(api.followers_ids, screen_name=user).pages():
            all_followers.extend(map(str, page))
        return all_followers
    except tweepy.TweepError:
        print('Could not access user {}. Skipping...'.format(user))
The function is called as follows:

for username in lookup_users:
    user_followers = download_followers(username, main_api)
    if user_followers:

        new_followers = pd.DataFrame({
            "Handles": username,
            "Follower_ID": user_followers,
            "Start_Date": today})

        new_followers_df = new_followers_df.append(new_followers)


        print('Finished outputting: {} at {}'.format(username, datetime.now().strftime('%Y/%m/%d %H:%M:%S')))
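Depending on how many followers each user has, Twitter's API may need to be called two or three times to fetch all of that user's followers.

As a result, there is a 15-minute wait before another call can be made to the API. This is handled by passing the following parameters to Tweepy: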

main_api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
The result looks like this:

Rate limit reached. Sleeping for: 895
Rate limit reached. Sleeping for: 895
Finished outputting: @barackobama at 2017/07/01 10:36:07
In this case, the API hit its rate limit twice, waited 15 minutes each time, and then fetched all of @barackobama's followers.

However, the loop sometimes fails and prints the message:

    'Could not access user @barackobama. Skipping...'
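This is mainly due to connection problems, a request to the Twitter API not completing correctly, or an account with so many followers that the Tweepy package cannot handle it properly.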

To account for possible connection failures, I tried wrapping the API call in a while True loop, as follows:

def download_followers(user, api):
    all_followers = []
    while True:
        try:
            for page in tweepy.Cursor(api.followers_ids, screen_name=user).pages():
                all_followers.extend(map(str, page))
                return all_followers
        except tweepy.TweepError:
            print('Could not access user {}. Trying Again...'.format(user))
            continue
        break
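However, with the function wrapped this way, the for loop does not work properly: it iterates only once for each user instead of fetching all of their followers before moving on to the next user in the lookup_users list.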

For example, instead of fetching every follower for each user in turn, it behaves as follows:

Rate limit reached. Sleeping for: 895
'Could not access user @barackobama. Trying again...'
Rate limit reached. Sleeping for: 895
Finished outputting: @barackobama at 2017/07/01 10:36:07
Rate limit reached. Sleeping for: 895
Rate limit reached. Sleeping for: 895
Rate limit reached. Sleeping for: 895
Finished outputting: @donaldtrump at 2017/07/01 10:36:07
Finished outputting: @barackobama at 2017/07/01 10:36:07
Finished outputting: @donaldtrump at 2017/07/01 10:36:07
Finished outputting: @georgebush at 2017/07/01 10:36:07
Rate limit reached. Sleeping for: 895
Finished outputting: @richardnixon at 2017/07/01 10:41:08
So it iterates only once for each user.


Is there something I am doing wrong?

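The return statement is inside the for loop, so the function returns after the first iteration of the for loop, having collected only the first page of followers.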
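One way to apply this fix is sketched below, under some assumptions that are not in the question. The page source is passed in as a hypothetical zero-argument callable named fetch_pages so that the example runs without Tweepy or network access. The retry count is capped by a hypothetical max_retries parameter instead of looping forever. The list is reset on each attempt so that a failed attempt does not leave duplicate IDs behind. The return statement sits after the for loop, so it runs only once every page has been consumed.

```python
import time


def download_followers(fetch_pages, retry_on=Exception, max_retries=3, delay=0):
    """Collect every page of IDs, restarting from scratch when a fetch fails.

    fetch_pages, retry_on, max_retries and delay are illustrative names,
    not part of the original question or of Tweepy.
    """
    for attempt in range(1, max_retries + 1):
        all_followers = []  # reset so a failed attempt leaves no partial data
        try:
            for page in fetch_pages():
                all_followers.extend(map(str, page))
        except retry_on:
            print('Attempt {} failed. Trying again...'.format(attempt))
            time.sleep(delay)
            continue
        # Reached only after the for loop has consumed every page.
        return all_followers
    return None  # every attempt failed


# Stand-in for a paginated cursor: the first call fails after one page.
calls = {"n": 0}


def flaky_pages():
    calls["n"] += 1
    yield [1, 2]
    if calls["n"] == 1:
        raise ConnectionError("simulated dropped connection")
    yield [3]


result = download_followers(flaky_pages, retry_on=ConnectionError)
print(result)
```

With the code from the question, fetch_pages would presumably be something like lambda: tweepy.Cursor(api.followers_ids, screen_name=user).pages(), and retry_on would be tweepy.TweepError. Returning None after the last failed attempt keeps the caller's existing if user_followers: check working.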