Twitter web scraping returns an empty list (Python)


I am trying to scrape some search results from Twitter, and it throws the error below.

import requests
import re
from bs4 import BeautifulSoup

url = u'https://twitter.com/search?q='
query = u'q=cruise&src=typed_query'

r = requests.get(url+query)
soup = BeautifulSoup(r.text,'html.parser')

tweets = []

for item in soup.findAll('span',attrs={"class":"css-901oao css-16my406 r-poiln3 r-bcqeeo r-qvutc0"}):
    result = [item.get_text(strip=True, separator=" ")]
    tweets.append(result.text.encode("utf-8"))

f = open('search.csv', 'w')
f.write(r.text)

When I try print(tweets), it gives me an empty list, and f.write(r.text) gives me the following error:

'charmap' codec can't encode character '\U0001f602' in position 17391: character maps to <undefined>

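The most common problem with modern pages: Twitter uses JavaScript to add elements to the HTML, but requests/BeautifulSoup cannot run JavaScript, so the elements you are searching for are not in the HTML you download. You may need to control a real web browser, which can run JavaScript. Or you should use the Twitter API to get the data without scraping.

Check print('\U0001f602'): it gives me an emoji. For the file, you could get r.content (bytes before decoding) instead of r.text (string after decoding) and save it in bytes mode: open(..., 'wb').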
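
The saving advice above can be sketched as follows. This is a minimal illustration, not the asker's exact script: the helper names save_bytes and save_text, the sample strings, and the temporary file names are assumptions made for the example, and the sample data stands in for r.content and r.text so that no network request is needed. The 'charmap' error typically appears when open() falls back to a legacy platform encoding (such as a Windows code page) that has no mapping for the emoji, so either writing raw bytes or passing an explicit encoding should avoid it.

```python
import os
import tempfile


def save_bytes(content: bytes, path: str) -> None:
    """Write the raw body unchanged; no text encoding is involved."""
    with open(path, "wb") as f:
        f.write(content)


def save_text(text: str, path: str) -> None:
    """Write decoded text with an explicit UTF-8 encoding
    instead of relying on the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Hypothetical stand-ins for r.text and r.content: a fragment
# containing the same emoji that appears in the error message.
sample_text = "<p>cruise \U0001f602</p>"
sample_bytes = sample_text.encode("utf-8")

out_dir = tempfile.mkdtemp()
bytes_path = os.path.join(out_dir, "search_bytes.html")
text_path = os.path.join(out_dir, "search_text.html")

save_bytes(sample_bytes, bytes_path)  # equivalent of writing r.content
save_text(sample_text, text_path)     # equivalent of writing r.text safely

# Read both files back to confirm the emoji survived.
with open(bytes_path, "rb") as f:
    bytes_roundtrip = f.read()
with open(text_path, encoding="utf-8") as f:
    text_roundtrip = f.read()
```

In the original script this would mean replacing f.write(r.text) with either save_bytes(r.content, 'search.csv') or save_text(r.text, 'search.csv'). Neither change makes the tweets list non-empty; that part still depends on running the page's JavaScript or using the API.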
Thanks. I tried it with Selenium and it worked well.