Python: How do I create a pandas DataFrame from the Twitter Search API?


I am working with the Twitter Search API, which returns a dictionary of dictionaries. My goal is to create a DataFrame from a list of keys in the response dictionary.

An example API response is here:
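Roughly, the search response has this shape (an abbreviated sketch with hypothetical values, not the linked example itself):

query = {
    "statuses": [
        {
            "created_at": "Mon Sep 24 03:35:21 +0000 2018",
            "text": "example tweet text",
            "in_reply_to_screen_name": None,
            "source": '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
            "user": {"screen_name": "example_user"},
        },
        # ...one dict per returned tweet
    ],
    "search_metadata": {},  # query metadata, omitted here
}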

I have a list of keys from the statuses dictionary:

keys = ["created_at", "text", "in_reply_to_screen_name", "source"]
I want to loop over the values of each of those keys as returned in the statuses dictionaries and put them in a DataFrame with the keys as columns.

I currently have code that loops over a single key, assigns the values to a list, and appends it to a DataFrame, but I need a way to do several keys at once (one possible approach is sketched after the code below). The current code:

import pandas as pd

# w is the word to be queried
w = 'keyword'
# count of tweets to return
count = 1000

# API call (assumes an authenticated `twitter` search client)
query = twitter.search.tweets(q=w, count=count)

def data_l2(q, k1, k2):
    # Collect q[k1][i][k2] from every dict in the list q[k1].
    data = []
    for result in q[k1]:
        data.append(result[k2])
    return data

# Assumed definition of data_l3: like data_l2, one nesting level deeper.
def data_l3(q, k1, k2, k3):
    data = []
    for result in q[k1]:
        data.append(result[k2][k3])
    return data

tweets = data_l2(query, "statuses", "text")  # "text" assumed as the tweet column
screen_names = data_l3(query, "statuses", "user", "screen_name")

data = {'screen_names': screen_names,
        'tweets': tweets}
frame = pd.DataFrame(data)
frame
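For the multiple-keys case, one option (a minimal sketch, assuming query has the structure shown above) is to build a dict of lists with one entry per key:

keys = ["created_at", "text", "in_reply_to_screen_name", "source"]

# One column per key; each column holds that key's value from every status.
data = {k: [status[k] for status in query["statuses"]] for k in keys}
frame = pd.DataFrame(data)

Equivalently, pandas can consume the list of status dicts directly and then be trimmed to the wanted columns: frame = pd.DataFrame(query["statuses"])[keys].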

I will share a more general solution that I came up with while working with the Twitter API. Suppose you have the IDs of the tweets you want to fetch in a list called my_ids:

# Fetch tweets from the Twitter API using the following loop
# (assumes an authenticated tweepy `api` object; a setup sketch follows below):
list_of_tweets = []
# Tweets that can't be found are saved in the list below:
cant_find_tweets_for_those_ids = []
for each_id in my_ids:
    try:
        list_of_tweets.append(api.get_status(each_id))
    except Exception:
        cant_find_tweets_for_those_ids.append(each_id)
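The api object here is an authenticated tweepy client; a minimal setup sketch, with placeholder credentials in place of your own app's keys:

import tweepy

# Placeholder credentials -- substitute your application's keys.
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')
api = tweepy.API(auth, wait_on_rate_limit=True)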
Then, in this code block, we pull the JSON part out of each downloaded tweepy status object and add them all to a list:

my_list_of_dicts = []
for each_json_tweet in list_of_tweets:
    my_list_of_dicts.append(each_json_tweet._json)
…and we write this list to a .txt file:

import json

with open('tweet_json.txt', 'w') as file:
    file.write(json.dumps(my_list_of_dicts, indent=4))
Now we create a DataFrame from the tweet_json.txt file (I added the keys that were relevant to my use case, but you can pull out your own specific keys instead):


import pandas as pd

my_demo_list = []
with open('tweet_json.txt', encoding='utf-8') as json_file:
    all_data = json.load(json_file)
    for each_dictionary in all_data:
        tweet_id = each_dictionary['id']
        whole_tweet = each_dictionary['text']
        # Keep only the URL portion at the end of the tweet text:
        only_url = whole_tweet[whole_tweet.find('https'):]
        favorite_count = each_dictionary['favorite_count']
        retweet_count = each_dictionary['retweet_count']
        created_at = each_dictionary['created_at']
        whole_source = each_dictionary['source']
        # The source field is an HTML anchor; slice out the device name
        # between 'rel="nofollow">' and the closing '</a>':
        only_device = whole_source[whole_source.find('rel="nofollow">') + 15:-4]
        source = only_device
        # 'retweeted_status' is only present on retweets:
        retweeted_status = each_dictionary.get('retweeted_status', 'Original tweet')
        if retweeted_status == 'Original tweet':
            url = only_url
        else:
            retweeted_status = 'This is a retweet'
            url = 'This is a retweet'

        my_demo_list.append({'tweet_id': str(tweet_id),
                             'favorite_count': int(favorite_count),
                             'retweet_count': int(retweet_count),
                             'url': url,
                             'created_at': created_at,
                             'source': source,
                             'retweeted_status': retweeted_status,
                             })

# Build the DataFrame once, after the loop:
tweet_json = pd.DataFrame(my_demo_list, columns=['tweet_id', 'favorite_count',
                                                 'retweet_count', 'created_at',
                                                 'source', 'retweeted_status', 'url'])
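As a quick check, tweet_json.head() should now show one row per fetched tweet. For the flat fields you can also skip the manual loop; a sketch, assuming pandas >= 1.0 for pd.json_normalize:

flat = pd.json_normalize(all_data)
flat = flat[['id', 'favorite_count', 'retweet_count', 'created_at', 'source']]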