Extracting data from Wikipedia into a txt file using Python
I have implemented the following code to extract data from a Wikipedia page:
import bs4
import sys
import requests
res = requests.get('https://en.wikipedia.org/wiki/Agriculture')
res.raise_for_status()
wiki = bs4.BeautifulSoup(res.text, "html.parser")
for i in wiki.select('p'):
    print(i.getText())
This code extracts all the data I need from the page. However, I want to store it in a text file using Python, and I haven't been able to.
Ideally, the text file's name should be derived from the URL itself, so the script can be reused across multiple wiki pages. Try this:
wiki_page = 'Agriculture'
res = requests.get(f'https://en.wikipedia.org/wiki/{wiki_page}')
res.raise_for_status()
wiki = bs4.BeautifulSoup(res.text, "html.parser")
# open a file named after your wiki page in write mode
with open(wiki_page + ".txt", "w", encoding="utf-8") as f:
    for i in wiki.select('p'):
        # write each paragraph to the file
        f.write(i.getText())
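Note the mode argument to `open()`: `"w"` truncates the file on every run, while the `"a"` (append) mode used in the other answer below keeps adding to the existing contents. A minimal standalone sketch of the difference, using a temporary file so no network access is needed:

```python
import os
import tempfile

# "w" truncates the file each time it is opened; "a" appends.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("first run\n")
with open(path, "w", encoding="utf-8") as f:
    f.write("second run\n")  # replaces the first run entirely

with open(path, "a", encoding="utf-8") as f:
    f.write("third run\n")   # appended after the existing content

with open(path, encoding="utf-8") as f:
    print(f.read())  # second run\nthird run\n
```

So rerunning the script with `"w"` always regenerates the file, whereas `"a"` would duplicate the paragraphs on each run.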
Try this. Here is the reference material.
The following is compatible with all Python versions:
import bs4
import sys
import requests

url = "https://en.wikipedia.org/wiki/Agriculture"
res = requests.get(url)
res.raise_for_status()
wiki = bs4.BeautifulSoup(res.text, "html.parser")
# open in binary append mode, so the UTF-8 encoded bytes below can be
# written on both Python 2 and Python 3
file_to_write = open(url.split('/')[-1] + ".txt", "ab")  # append mode
for i in wiki.select('p'):
    text_to_write = i.getText().encode('utf-8')
    print(text_to_write)
    file_to_write.write(text_to_write)
file_to_write.close()
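The filename here comes from `url.split('/')[-1]`, which takes the last path segment of the article URL. A quick standalone check of that expression (the helper name is hypothetical, and no network access is needed):

```python
def page_filename(url):
    # Last path segment of the article URL, plus a .txt suffix
    return url.split('/')[-1] + ".txt"

print(page_filename("https://en.wikipedia.org/wiki/Agriculture"))  # Agriculture.txt
```

This is what lets the same script be pointed at any wiki page and produce a matching output file.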
Hi, trying it gives the following error: UnicodeEncodeError: 'charmap' codec can't encode character '\u016b' in position 74: character maps to <undefined>
Happy to help :)
Hi. Still, why isn't the .txt file being created?
@Sca Updated; Agriculture.txt will now be created in your working directory.
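The UnicodeEncodeError in the comments occurs when the platform's default file encoding (often cp1252 on Windows, which Python reports as the 'charmap' codec) cannot represent a character such as 'ū' (U+016B). A small sketch reproducing the failure and showing why passing encoding="utf-8" to open() avoids it (the sample string is made up for illustration):

```python
text = "j\u016bs"  # contains 'ū' (U+016B), which cp1252 cannot encode

try:
    text.encode("cp1252")
except UnicodeEncodeError as e:
    print("cp1252 fails:", e.reason)

# UTF-8 can represent every Unicode character, so a file opened with
# encoding="utf-8" round-trips the text without error.
encoded = text.encode("utf-8")
print(encoded.decode("utf-8") == text)  # True
```

This is why the answer using `open(..., "w", encoding="utf-8")` works regardless of the operating system's default codec.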