Python: how to fix UnicodeEncodeError: 'ascii' codec can't encode character
I wrote the following code in Python:
import urllib
from BeautifulSoup import BeautifulSoup
from yattag import Doc, indent

urls = []
sock1 = urllib.urlopen("http://www.tubtun.com/videos/")
htmlSource1 = sock1.read()
sock1.close()
soup1 = BeautifulSoup(htmlSource1)
f = open('sitemap.xml', 'w')
for i in soup1.findAll('a'):
    if (i.get('href')):
        if (i["href"].find("http://www.tubtun.com/video/") == 0):
            if(i["href"][0:1]=="u'"):
                i["href"]=i["href"][2:]
            sock = urllib.urlopen(i["href"])
            htmlSource = sock.read()
            sock.close()
            soup = BeautifulSoup(htmlSource)
            thumbnailUrl = soup.find('meta',{'itemprop':"thumbnailUrl"})
            name = soup.find('meta',{'itemprop':"name"})
            description = soup.find('meta',{'itemprop':"description"})
            duration = soup.find('meta',{'itemprop':"duration"})
            contentURL = soup.find('meta',{'itemprop':"contentURL"})
            embedURL = soup.find('meta',{'itemprop':"embedURL"})
            uploadDate = soup.find('meta',{'itemprop':"uploadDate"})
            vif = soup.find('meta',{'property':"og:video"})
            doc, tag, text = Doc().tagtext()
            with tag('url'):
                with tag('loc'):
                    text(i["href"])
                with tag('video:video'):
                    with tag('video:title'):
                        text(name["content"])
                    with tag('video:description'):
                        text(description["content"])
                    with tag('video:thumbnail_loc'):
                        text(thumbnailUrl["content"])
                    with tag('video:player_loc'):
                        text(vif["content"])
                    with tag('video:publication_date'):
                        text(uploadDate["content"][0:10])
            result = indent(
                doc.getvalue(),
                indentation = ' '*4,
                newline = '\r\n'
            )
            f.write(result)
f.close()
The script generates a video sitemap for the tubtun.com website.
When I run the script, I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character
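Is there a solution?
Try f.write(result.encode('UTF-8')).
Same error: UnicodeError: URL u'\u304b\u308f\u3044\u3044' contains non-ASCII characters
What is the whole error message? On which line does the error occur?
The script ran fine at first, but after generating 150 video sitemap entries it showed this error: URL u'tubtun.com/video/Beauty_of_wild_Animages_4kUltra-HD_--Sony_FDR-AX1_\u304b\u308f\u3044\u3044' contains non-ASCII characters. When I open the video link's URL, I get "the video no longer exists"!!!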