Python 3.x: Twitter API geo/search contained_within parameter not working as expected
I am working on a project that involves searching Twitter for tweets and analyzing sentiment toward the US government. To do this, I defined a list of the 100 most populous US cities, then used the search method together with the "contained_within" parameter to query tweets for each city, as described in the Twitter API documentation. I am using Python 3.6.5 and Tweepy 3.6.0. Here is my code:
import pandas as pd
import tweepy
# creating authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# connecting to the Twitter API using the above authentication
api = tweepy.API(auth)
cities_data = {
    'Akron': {'region': 'Midwest', 'state': 'Ohio'},
    'Albuquerque': {'region': 'West', 'state': 'New Mexico'},
}
# let's say we want to gauge how Americans feel about the immigration
# policies of the Trump Administration. So the following is our query
q = '(immigration OR immigrants) AND (families OR family) AND (separate OR separation) AND trump'
# define a function to convert a list of tweets into a pandas dataframe
def toDataFrame(tweets):
    df = pd.DataFrame()
    df['tweetID'] = [tweet.id for tweet in tweets]
    df['tweetText'] = [tweet.text for tweet in tweets]
    df['tweetRetweetCt'] = [tweet.retweet_count for tweet in tweets]
    df['tweetFavoriteCt'] = [tweet.favorite_count for tweet in tweets]
    df['tweetSource'] = [tweet.source for tweet in tweets]
    df['tweetCreated'] = [tweet.created_at for tweet in tweets]
    df['userID'] = [tweet.user.id for tweet in tweets]
    df['userScreen'] = [tweet.user.screen_name for tweet in tweets]
    df['userName'] = [tweet.user.name for tweet in tweets]
    df['userCreateDt'] = [tweet.user.created_at for tweet in tweets]
    df['userDesc'] = [tweet.user.description for tweet in tweets]
    df['userFollowerCt'] = [tweet.user.followers_count for tweet in tweets]
    df['userFollowingCt'] = [tweet.user.friends_count for tweet in tweets]
    df['userLocation'] = [tweet.user.location for tweet in tweets]
    df['userTimezone'] = [tweet.user.time_zone for tweet in tweets]
    df['coordinates'] = [tweet.coordinates for tweet in tweets]
    df['geolocation'] = [tweet.geo for tweet in tweets]
    df['place'] = [tweet.place for tweet in tweets]
    return df
tweets_df = pd.DataFrame() # initialize an empty global dataframe
# loop through the dictionary of cities_data and retrieve the tweets in
# each city
for city, city_data in cities_data.items():
    # get the city's place_id
    city_id = api.geo_search(
        query='{}'.format(city),
        wait_on_rate_limit=True,
        granularity='city'
    )[0].id
    cities_data[city]['id'] = city_id
    city_tweets = []  # create an empty list to contain all the tweets
    # get the tweets in each city
    for tweet in tweepy.Cursor(
        api.search,
        q=q + ' -filter:retweets',
        contained_within=city_id,
        granularity='city',
        wait_on_rate_limit=True,
        lang="en"
    ).items(10):
        city_tweets.append(tweet)
    # convert list of tweets into a pandas dataframe of tweets
    city_tweets_df = toDataFrame(city_tweets)
    # tag each row with city name, state name, and region
    city_tweets_df['city'] = city
    city_tweets_df['state'] = city_data['state']
    city_tweets_df['region'] = city_data['region']
    # add city dataframe to global dataframe
    tweets_df = pd.concat([tweets_df, city_tweets_df], ignore_index=True)
As you can see, I use the contained_within parameter so that only tweets specific to the city identified by its place_id are retrieved.
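However, anyone who displays the resulting global dataframe (tweets_df) will see that exactly the same tweets are returned for both Akron and Albuquerque, even though Akron is in the Midwest and Albuquerque is in the West.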
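The overlap described above can be confirmed without reading the dataframe by eye. The following is a minimal sketch, not part of the original code: the helper name `shared_tweet_ids` and the sample pairs are made up for illustration, and it assumes only that each row carries a tweet ID and a city label.

```python
from collections import defaultdict


def shared_tweet_ids(pairs):
    """Map each tweet ID seen under more than one city to the set of those cities."""
    cities_by_tweet = defaultdict(set)
    for tweet_id, city in pairs:
        cities_by_tweet[tweet_id].add(city)
    # Keep only IDs that were returned for at least two different cities.
    return {tid: cities for tid, cities in cities_by_tweet.items() if len(cities) > 1}


# Made-up (tweetID, city) pairs, for illustration only.
sample_pairs = [
    (101, 'Akron'), (102, 'Akron'),
    (101, 'Albuquerque'), (102, 'Albuquerque'), (103, 'Albuquerque'),
]
overlap = shared_tweet_ids(sample_pairs)
print(sorted(overlap))
```

With the real data, the pairs could be built with `zip(tweets_df['tweetID'], tweets_df['city'])`. If every ID shows up under every city, the location filter most likely had no effect on the search.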
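If anyone can explain why this is happening, or tell me what I am doing wrong, I would be very grateful. Thanks a lot in advance.

OK, so I contacted the Twitter technical team, and they told me that I cannot get accurate geolocation-based tweets using their free API. Instead, I would have to pay a monthly fee for their premium API. I hope this helps anyone else who is trying to analyze tweets with the free API.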
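For anyone who wants to see how many of the tweets they already retrieved carry a place tag matching a given city, a client-side check along these lines may help. It is only a sketch under assumptions: the helper name `tweets_tagged_with_place` and the ids are hypothetical, the stand-in objects mimic only the `place` attribute that the question's code reads from each tweet, and a tweet's place may be tagged at a finer or coarser level than the city, so a mismatch does not prove the tweet came from elsewhere.

```python
from types import SimpleNamespace


def tweets_tagged_with_place(tweets, place_id):
    """Keep only tweets whose `place` attribute is set and whose place id matches."""
    return [
        t for t in tweets
        if getattr(t, 'place', None) is not None and t.place.id == place_id
    ]


# Stand-in objects with made-up ids; they are not real tweepy Status objects.
sample = [
    SimpleNamespace(id=1, place=SimpleNamespace(id='aaa111')),
    SimpleNamespace(id=2, place=None),  # most tweets carry no place tag at all
    SimpleNamespace(id=3, place=SimpleNamespace(id='bbb222')),
]
matching = tweets_tagged_with_place(sample, 'aaa111')
print([t.id for t in matching])
```

This does not make the free search geographically accurate; it only shows which of the returned tweets are explicitly tagged with the place in question.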