Downloading tweets with Python, but the hashtags are missing
I am trying to download all of a user's tweets, but the data I download is missing its hashtags. For example, this tweet should have 5 hashtags, but the data I downloaded looks like this:
b'RT @gcosma1: Fantastic opportunity! PhD Studentship: Energy Prediction in Buildings using Artificial Intelligence\nthe_url #\xe2\x80\xa6'
Does anyone know why this happens? It has been bothering me for a long time and I cannot find a solution.
Here is my code:
import tweepy
import csv
import json

consumer_key = 'XXX'
consumer_secret = 'XXX'
access_token = 'XXX'
access_token_secret = 'XXX'

def get_all_tweets(screen_name):
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
    all_the_tweets = []
    new_tweets = api.user_timeline(screen_name=screen_name, count=200)
    all_the_tweets.extend(new_tweets)
    oldest_tweet = all_the_tweets[-1].id - 1
    t_no = 201
    while len(all_the_tweets) != t_no:
        new_tweets = api.user_timeline(screen_name=screen_name, count=200, max_id=oldest_tweet, tweet_mode="extended")
        t_no = len(all_the_tweets)
        all_the_tweets.extend(new_tweets)
        oldest_tweet = all_the_tweets[-1].id - 1
        print('...%s tweets have been downloaded so far' % len(all_the_tweets))
    # transforming the tweets into a 2D array that will be used to populate the csv
    outtweets = [[tweet.id_str, tweet.created_at,
                  tweet.text.encode('utf8')] for tweet in all_the_tweets]
    # writing to the csv file
    with open(screen_name + '_tweets.csv', 'w', encoding='utf8') as f:
        writer = csv.writer(f)
        writer.writerow(['id', 'created_at', 'text'])
        writer.writerows(outtweets)

if __name__ == '__main__':
    get_all_tweets(input("Enter the twitter handle of the person whose tweets you want to download:- "))
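This seems to happen only with retweets. The text of the original tweet appears to contain all the hashtags. If you look at the original tweet, you will see that it is:
'Fantastic opportunity! PhD Studentship: Energy Prediction in Buildings using Artificial Intelligence\nthe_url #DeepLearning #MachineLearning #AI #DataScience #PhD the_url2'
So you can do something like this: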
new_tweets = api.user_timeline(screen_name='gcosma1', count=200, tweet_mode="extended")
tweet_text = []
for tweet in new_tweets:
    # Check if it is a retweet. If yes, add the original tweet
    if hasattr(tweet, 'retweeted_status'):
        tweet_text.append(tweet.retweeted_status.full_text)
    else:
        tweet_text.append(tweet.full_text)
print(tweet_text)
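To fold this check into the CSV export from the question, the retweet logic could be pulled into a small helper. What follows is only a sketch: `full_text_of` and `write_tweets_csv` are hypothetical names, and the `SimpleNamespace` objects are stand-ins that mimic just the attributes used here, so the example runs without credentials or network access. It also passes `str` rather than the result of `.encode('utf8')` to the CSV writer, since `csv` writes bytes values as their `b'...'` representation, which is likely why the downloaded text in the question starts with `b'`.

```python
import csv
import io
from types import SimpleNamespace


def full_text_of(status):
    """Return the untruncated text of a tweet-like object.

    For a retweet, read the text from the nested `retweeted_status`;
    otherwise read it from the object itself. Falls back to `text`
    when `full_text` is absent (request made without extended mode).
    """
    source = getattr(status, "retweeted_status", status)
    text = getattr(source, "full_text", None)
    if text is None:
        text = getattr(source, "text", "")
    return text


def write_tweets_csv(statuses, fileobj):
    """Write id, created_at and full text rows to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(["id", "created_at", "text"])
    for status in statuses:
        # Pass str, not bytes: csv would write bytes as their b'...' repr.
        writer.writerow([status.id_str, status.created_at, full_text_of(status)])


# Stand-in objects (assumption: they only imitate the attributes used above).
original = SimpleNamespace(
    id_str="1",
    created_at="2019-01-01",
    full_text="Fantastic opportunity! #DeepLearning #AI",
)
retweet = SimpleNamespace(
    id_str="2",
    created_at="2019-01-02",
    full_text="RT @gcosma1: Fantastic opportunity! #\u2026",
    retweeted_status=original,
)

buffer = io.StringIO()
write_tweets_csv([original, retweet], buffer)
print(buffer.getvalue())
```

With real data, the list returned by `api.user_timeline(..., tweet_mode="extended")` would take the place of the stand-in objects, and an opened file (ideally with `newline=''`) would replace the `StringIO` buffer.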