Python 2.7, Unicode: ordinal not in range
I'm trying to pull a string from an XML file into another file (HTML), but when I try to run the script I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 124: ordinal not in range(128)
This is the Python code:
f = open('web/tv.html', 'a')
counter = 0
for showname in os.listdir('xml/additional'):
    tree = et.parse('xml/additional/%s/en.xml' % showname)
    root = tree.getroot()
    series = root.find('Series')
    description = series.find('Overview').text
    cell = '\n<tr><td>' + showname + '</td><td>' + description + '</td></tr>'
    f.write(cell)
f.append(u'</table></div></body></html>')
This is an example of the XML file:
<Series>
<Overview>From Emmy Award-winner Dan Harmon comes "Community", a smart comedy series about higher education – and lower expectations. The student body at Greendale Community College is made up of high-school losers, newly divorced housewives, and old people who want to keep their minds active. Within these not-so-hallowed halls, Community focuses on a band of misfits, at the center of which is a fast-talkin' lawyer whose degree has been revoked, who form a study group and end up learning a lot more about themselves than they do about their course work.</Overview>
<other>stuff</other>
</Series>
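Can someone tell me what I'm doing wrong? I find Unicode very complicated.

You are mixing Unicode with byte strings; the XML result is a Unicode value, and it includes the character u'\u2013' (an en dash), which has no ASCII equivalent. You cannot write the result to a plain text file without encoding it first. Use the following code to encode description to ASCII text: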
description = description.encode('ascii', 'xmlcharrefreplace')
This uses numeric character references (HTML entities) for any code point outside of ASCII:
>>> description = u'... a smart comedy series about higher education – and lower expectations.'
>>> description.encode('ascii', 'xmlcharrefreplace')
'... a smart comedy series about higher education &#8211; and lower expectations.'
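Also, f.append should fail on a file object: Python file objects have no append method, so use f.write for the closing markup as well.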
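Putting both points together, the loop from the question could be sketched roughly as follows. This is only an illustration under some assumptions: the function name append_show_rows and its two parameters are hypothetical stand-ins for the hard-coded 'xml/additional' and 'web/tv.html' paths, the show directory names are assumed to be plain ASCII, and the Overview text is written without any HTML escaping, as in the original code.

```python
import os
import xml.etree.ElementTree as et


def append_show_rows(xml_dir, html_path):
    """Append one table row per show directory, then the closing markup.

    xml_dir and html_path are hypothetical parameters standing in for
    'xml/additional' and 'web/tv.html' from the question.
    """
    # Binary append mode, because we write already-encoded ASCII bytes.
    with open(html_path, 'ab') as f:
        # Assumption: directory names are ASCII, so mixing them into a
        # Unicode string is safe on Python 2.7 as well.
        for showname in sorted(os.listdir(xml_dir)):
            tree = et.parse(os.path.join(xml_dir, showname, 'en.xml'))
            series = tree.getroot().find('Series')
            # .text is None for an empty element, so fall back to u''.
            description = series.find('Overview').text or u''
            cell = u'\n<tr><td>%s</td><td>%s</td></tr>' % (showname, description)
            # Code points outside ASCII become references such as &#8211;
            f.write(cell.encode('ascii', 'xmlcharrefreplace'))
        # File objects have write(), not append().
        f.write(b'</table></div></body></html>')
```

Calling append_show_rows('xml/additional', 'web/tv.html') would then mirror the original script. Encoding the whole row once, just before writing, keeps everything as Unicode up to the point where bytes are actually needed.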