Python JSON dump Unicode error
I am trying to store a dictionary as a UTF-8 encoded JSON document, but I seem to be doing something wrong and cannot tell what. I have posted the stack trace and the function below.
def parse_contents(res_dict, file):
    content_payload = res_dict['parse']['wikitext']['*']
    sections_payload = res_dict['parse']['sections']
    db = {}
    #parse_captures = ("Owner", "Description", "Usage", "Examples", "Options", "Misc.")
    def now_next_iter(iterable):
        import itertools
        a, b = itertools.tee(sections_payload)
        next(b, None)
        return itertools.izip(a, b)
    def remove_tags(text):
        import re
        return re.sub('<[^<]+?>', '', text)
    for cur, nxt in now_next_iter(sections_payload):
        if cur['toclevel'] == 2:
            head = cur['line']
            db[head] = {}
        elif cur['toclevel'] == 3:
            line = cur['line']
            ibo = cur['byteoffset']
            fbo = nxt['byteoffset']
            content = remove_tags(content_payload[ibo:fbo])
            db[head][line] = content #.encode('utf-8')
    with io.open(file, 'w', encoding='utf8') as json_db:
        s = json.dumps( db, sort_keys=True, indent=4,
            separators=(',', ': '))
        json_db.write(s.encode('utf-8'))
Output:
This is confusing, because I thought s.encode('utf-8') was supposed to change it to unicode.
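You may need to pass the optional argument ensure_ascii=False to json.dumps, and/or set encoding='UTF8' in json.dumps itself rather than only in the io.open() call. That lets the json package handle non-ASCII data through its own options.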
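To illustrate what ensure_ascii changes, here is a small sketch. It assumes Python 3 (where json.dumps no longer accepts an encoding argument), and the sample dictionary is made up for the demonstration.

```python
import json

# Hypothetical sample data containing one non-ASCII character.
data = {"name": "Zoë"}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False: non-ASCII characters are kept as-is in the result.
literal = json.dumps(data, ensure_ascii=False)

print(escaped)
print(literal)
```

Both strings decode back to the same dictionary; the option only affects how the text looks in the file.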
See the documentation here:
Updated the question per your suggestion, no luck :(
nvm, I'm an idiot. I took out the s.encode in attempt 1 and it worked. Thanks
with io.open(file, 'w', encoding='utf8') as json_db:
    s = json.dumps( db, sort_keys=True, indent=4,
        ensure_ascii=False, encoding='UTF8', separators=(',', ': '))
    s = s.encode('utf-8')
    json_db.write(s)
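For comparison, a minimal sketch of the variant without the s.encode call, which is what the asker reported as working. It assumes Python 3; write_json and the sample dictionary are hypothetical names used only for this illustration. A file opened with io.open(..., encoding='utf8') expects text and does the encoding itself, so handing it already-encoded bytes is what fails.

```python
import io
import json
import os
import tempfile


def write_json(db, path):
    """Write db to path as a UTF-8 encoded JSON document (hypothetical helper)."""
    s = json.dumps(db, sort_keys=True, indent=4,
                   ensure_ascii=False, separators=(',', ': '))
    # s is text. The file object encodes it to UTF-8 on write,
    # so there is no s.encode('utf-8') here.
    with io.open(path, 'w', encoding='utf8') as json_db:
        json_db.write(s)


# Made-up sample data with non-ASCII keys and values.
sample = {"Überschrift": {"Zeile": "naïve café"}}
path = os.path.join(tempfile.mkdtemp(), "db.json")
write_json(sample, path)

# Read the file back to confirm it round-trips.
with io.open(path, encoding='utf8') as f:
    loaded = json.load(f)
```

Reading the file back should give the original dictionary, with the non-ASCII characters stored as UTF-8 bytes on disk rather than as \uXXXX escapes.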