Python: error when writing an HTML page from a website to a CSV file
When I try to write a web page's HTML to my_html.html, this error pops up. Please guide me on how to write it successfully.
Error:
File "C:\Users\DRB\AppData\Local\Programs\Python38-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u21e3' in position 84032: character maps to <undefined>
import requests

def url_to_file(url, fname= "web_txt.html"):
    response = requests.get(url)
    html_text = response.text
    if response.status_code == 200:
        with open(fname, "w") as r:
            r.write(str(html_text))
        return html_text
    return "Failed to perform its task."

url = "https://www.geeksforgeeks.org/absolute-relative-pathnames-unix/"
print(url_to_file(url))
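The failure does not depend on the network request. As a minimal sketch, assuming the file was opened with the cp1252 codec that the traceback points to, encoding the character U+21E3 with that codec raises the same kind of exception. The sample string is made up for illustration and is not taken from the page.

```python
# Hypothetical sample text containing U+21E3, the character named in the traceback.
sample = "download \u21e3"

try:
    # cp1252 has no mapping for U+21E3, so this raises UnicodeEncodeError.
    sample.encode("cp1252")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

print(error_message)
```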
Try opening the file in binary mode and saving the response's .content instead of .text:
import requests

def url_to_file(url, fname="web_txt.html"):
    response = requests.get(url)
    html_content = response.content  # <-- use .content
    if response.status_code == 200:
        with open(fname, "wb") as r:  # <-- open file in binary mode
            r.write(html_content)
        return html_content.decode('utf-8', 'ignore')  # <-- decode content as utf-8
    return "Failed to perform its task."

url = "https://www.geeksforgeeks.org/absolute-relative-pathnames-unix/"
print(url_to_file(url))
This saves web_txt.html and prints the page's HTML:
<!DOCTYPE html>
<!--[if IE 7]>
<html class="ie ie7" lang="en-US" prefix="og: http://ogp.me/ns#">
<![endif]-->
...
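An alternative sketch, not part of the answer above: keep working with text but pass an explicit encoding to open(), so the file is written as UTF-8 rather than in the platform's default code page. The helper name save_text, the temporary output path, and the sample string are hypothetical. The function takes the text directly so it can be tried without a network request.

```python
import os
import tempfile

def save_text(text, fname):
    """Write text to fname as UTF-8, independent of the platform default encoding."""
    with open(fname, "w", encoding="utf-8") as f:
        f.write(text)

# Hypothetical stand-in for response.text; it contains U+21E3 from the traceback.
sample_html = "<p>download \u21e3</p>"
out_path = os.path.join(tempfile.mkdtemp(), "web_txt.html")
save_text(sample_html, out_path)

# Read the file back with the same encoding to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
print(round_trip == sample_html)
```

In the question's function, this would correspond to changing the call to open(fname, "w", encoding="utf-8") and leaving the rest as it is.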