Python 3.x: I'm scraping Twitter data with Beautiful Soup. I can fetch the data, but I can't save it to a CSV file


python-3.x twitter web-scraping


I'm scraping Twitter for usernames, tweets, replies and retweets, but I can't save them to a CSV file.

Here is the code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
file = "5_twitterBBC.csv"
f = open(file, "w")
Headers = "tweet_user, tweet_text,  replies,  retweets\n"
f.write(Headers)
for page in range(0,5):
    url = "https://twitter.com/BBCWorld".format(page)
    html = urlopen(url)
    soup = BeautifulSoup(html, "html.parser")
    tweets = soup.find_all("div", "class":"js-stream-item"})
    for tweet in tweets:
        try:
            if tweet.find('p',{"class":'tweet-text'}):
                tweet_user = tweet.find('span',{"class":'username'}).text.strip()
                tweet_text = tweet.find('p',{"class":'tweet-text'}).text.encode('utf8').strip()
                replies = tweet.find('span',{"class":"ProfileTweet-actionCount"}).text.strip()
                retweets = tweet.find('span', {"class" : "ProfileTweet-action--retweet"}).text.strip()
                print(tweet_user, tweet_text, replies, retweets)
                f.write("{}".format(tweet_user).replace(",", " ") + ",{}".format(tweet_text) + ",{}".format(replies).replace(",", " ") + ",{}".format(retweets) + "\n")
        except: AttributeError
f.close()

I get the data, but I can't save it to the CSV file. Can someone explain how to save the data to a CSV file?

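As you can see, you made just one small mistake when looking up the tweets. In this line

tweets = soup.find_all("div", "class":"js-stream-item"})

you forgot to pass the keyword name of the argument. It should look like this:

tweets = soup.find_all("div", attrs={"class":"js-stream-item"})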
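To make the fix above concrete, here is a small sketch of three ways Beautiful Soup accepts a class filter in `find_all`. The HTML snippet is made up for illustration and only borrows the class names from the question; it is not Twitter's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the class names used in the question.
SAMPLE_HTML = """
<ol>
  <li class="js-stream-item stream-item"><p class="tweet-text">first</p></li>
  <li class="js-stream-item stream-item"><p class="tweet-text">second</p></li>
  <li class="other">ignored</li>
</ol>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# 1) Explicit attrs= keyword with a dict
by_attrs = soup.find_all("li", attrs={"class": "js-stream-item"})
# 2) class_ keyword (the trailing underscore avoids the reserved word "class")
by_keyword = soup.find_all("li", class_="js-stream-item")
# 3) The dict passed positionally as the second argument
by_positional = soup.find_all("li", {"class": "js-stream-item"})

# Pull the text out of each matched item
texts = [item.find("p", {"class": "tweet-text"}).text for item in by_attrs]
print(texts)
```

All three calls should select the same two `<li>` elements. What does not work is a bare `"class": "..."` pair without the surrounding braces, which is a syntax error in Python.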

filename = "output.csv"
f = open(filename, "w", encoding="utf-8")
headers = "tweet_user,tweet_text,replies,retweets\n"
f.write(headers)

# ... your scraping code ...

for tweet in tweets:  # your loop
    # ... extract tweet_user, tweet_text, replies and retweets as strings ...
    f.write(",".join([tweet_user, tweet_text, replies, retweets]) + "\n")
f.close()
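One caveat with joining fields by hand: a tweet whose text contains a comma or a quote will shift the columns. A possible alternative, sketched here with the standard library's `csv` module, lets the writer do the quoting. The sample rows and the helper name `rows_to_csv` are invented for illustration, and an in-memory buffer stands in for the real file.

```python
import csv
import io

# Made-up rows; the second field deliberately contains a comma and quotes.
rows = [
    ("@BBCWorld", "Breaking: markets fall, again", "12", "340"),
    ("@BBCWorld", 'He said "no comment"', "3", "27"),
]


def rows_to_csv(rows):
    """Serialise rows to CSV text, quoting fields where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["tweet_user", "tweet_text", "replies", "retweets"])
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)

# Reading the text back shows that the fields survive the round trip.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed[1])
```

When writing to a real file, the `csv` documentation recommends opening it with `newline=""`, for example `open(filename, "w", newline="", encoding="utf-8")`.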

This is a working solution, but it only fetches the first 20 tweets.

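Please provide some of the error messages you are getting, and ideally which lines of code are causing the problem.

There are no error messages in the code.

Note that scraping the Twitter website this way is a clear violation of the Terms of Service (section 4) and may get your IP address blocked. You should use the developer API.

Thank you very much, it works almost perfectly :-) Do you know how to get the data for a specific month? Thanks in advance!

Pls, I can't create a new thread at the moment... pls, is there any limit on the number of Twitter tweets I can embed on my website? Every post on my site will include one embedded tweet, and I have roughly 10 posts per page. Thanks

@Seybanks As many as you like, as long as you don't violate any Twitter policy.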

import csv
from urllib.request import urlopen

from bs4 import BeautifulSoup

file = "5_twitterBBC.csv"
url = "https://twitter.com/BBCWorld"
html = urlopen(url)
soup = BeautifulSoup(html, "html.parser")

# Each tweet on the page is an <li> element with the class "js-stream-item"
tweets = soup.find_all("li", attrs={"class": "js-stream-item"})

# newline="" lets the csv module control line endings; utf-8 keeps non-ASCII text intact
with open(file, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["tweet_user", "tweet_text", "replies", "retweets"])

    # Write one row per fetched tweet
    for tweet in tweets:
        try:
            if tweet.find('p', {"class": 'tweet-text'}):
                tweet_user = tweet.find('span', {"class": 'username'}).text.strip()
                # Keep the text as str; encoding it to bytes would write b'...' into the file
                tweet_text = tweet.find('p', {"class": 'tweet-text'}).text.strip()
                replies = tweet.find('span', {"class": "ProfileTweet-actionCount"}).text.strip()
                retweets = tweet.find('span', {"class": "ProfileTweet-action--retweet"}).text.strip()
                # csv.writer quotes fields that contain commas or quotes
                writer.writerow([tweet_user, tweet_text, replies, retweets])
        except AttributeError:
            # Skip tweets that lack one of the expected elements
            continue
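A note on `.text.encode('utf8')`, which appears in the question's code: in Python 3 it turns the tweet text into a `bytes` object, and formatting `bytes` into a string inserts its `b'...'` representation rather than the text itself. The short sketch below, using a made-up sample string, shows the difference. It is usually simpler to keep the text as `str` and set the encoding when opening the file.

```python
# Made-up sample text with a non-ASCII character
text = "Café prices rise"

# Encoding produces bytes; formatting bytes embeds their repr, b'...'
encoded = text.encode("utf8")
as_bytes_row = "{},{}".format("@BBCWorld", encoded)

# Keeping the text as str formats it as expected
as_str_row = "{},{}".format("@BBCWorld", text)

print(as_bytes_row)
print(as_str_row)
```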