Python json.dump fails with "must be unicode, not str" TypeError

python json python-2.7 unicode encoding

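I have a JSON file that happens to contain a large number of Chinese and Japanese (and other language) characters. I'm loading it into my Python 2.7 script using io.open as follows: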
with io.open('multiIdName.json', encoding="utf-8") as json_data:
    cards = json.load(json_data)

I add a new property to the JSON, and everything works. Then I try to write it back out to another file:
with io.open("testJson.json",'w',encoding="utf-8") as outfile:
    json.dump(cards, outfile, ensure_ascii=False)

That's when I get the error TypeError: must be unicode, not str

I tried writing the outfile as binary (with io.open("testJson.json", "wb") as outfile:), but then I end up with output like this:
{"multiverseid": 262906, "name": "\u00e6\u00b8\u00b8\u00e9\u009a\u00bc\u00e7\u008b\u00ae\u00e9\u00b9\u00ab", "language": "Chinese Simplified"}

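I thought that opening and writing the file with the same encoding would be enough, together with the ensure_ascii flag, but clearly it isn't. I just want to preserve the characters that were in the file before I ran my script, without turning them into \u escapes.

Can you try the following?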
with io.open("testJson.json",'w',encoding="utf-8") as outfile:
  outfile.write(unicode(json.dumps(cards, ensure_ascii=False)))
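One caveat with the snippet above: in Python 2, calling unicode() on a byte string without naming an encoding decodes it with the default ASCII codec, so it can fail as soon as json.dumps returns a str that contains non-ASCII bytes. The minimal sketch below illustrates the difference using explicit .decode() calls, so that it behaves the same on Python 2 and Python 3; the variable names are made up for illustration.

```python
# UTF-8 bytes for the single character U+00E4, similar to what
# json.dumps(..., ensure_ascii=False) may return as a str on Python 2.
raw = u"\u00e4".encode("utf-8")

# Decoding as ASCII (what an implicit str -> unicode conversion does
# in Python 2 by default) fails on the first non-ASCII byte.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Naming the encoding explicitly works.
decoded = raw.decode("utf-8")
```

Passing the encoding explicitly, for example json.dumps(...).decode("utf-8") when the result is a byte string, avoids relying on the default.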

The reason for this error is the rather silly behaviour of json.dumps in Python 2:
>>> json.dumps({'a': 'a'}, ensure_ascii=False)
'{"a": "a"}'
>>> json.dumps({'a': u'a'}, ensure_ascii=False)
u'{"a": "a"}'
>>> json.dumps({'a': 'ä'}, ensure_ascii=False)
'{"a": "\xc3\xa4"}'
>>> json.dumps({u'a': 'ä'}, ensure_ascii=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 210, in encode
    return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

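In this case, I believe you can use json.dump to write to an open binary file; however, if you need to do anything more complicated with the resulting object, you will probably need the code above.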

One solution is to end all this encoding/decoding madness by switching to Python 3.
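As a rough illustration of what that looks like, here is a minimal Python 3 sketch. In Python 3, json.dump always writes str (text), so a file opened in text mode with an explicit encoding accepts it directly. The sample data and the temporary file location are assumptions made for the example, not the asker's real data.

```python
import json
import os
import tempfile

# Hypothetical sample data standing in for the asker's card list.
cards = [{"multiverseid": 1, "name": "\u65e5\u672c\u8a9e", "language": "Japanese"}]

# Written to a temporary directory so the sketch has no side effects.
path = os.path.join(tempfile.mkdtemp(), "testJson.json")

# Text mode with an explicit encoding; ensure_ascii=False keeps the
# characters as-is instead of emitting \uXXXX escapes.
with open(path, "w", encoding="utf-8") as outfile:
    json.dump(cards, outfile, ensure_ascii=False)

# Read the file back to confirm the round trip.
with open(path, encoding="utf-8") as infile:
    text = infile.read()

reloaded = json.loads(text)
```

No isinstance check or manual decode step is needed here, because the str/unicode split that causes the original error does not exist in Python 3.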
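The JSON module handles encoding and decoding for you, so you can simply open the input and output files in binary mode. The JSON module assumes UTF-8 encoding, but this can be changed with the encoding argument of load() and dump().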
with open('multiIdName.json', 'rb') as json_data:
    cards = json.load(json_data)

Then:
with open("testJson.json", 'wb') as outfile:
    json.dump(cards, outfile, ensure_ascii=False)

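Thanks to @Antti Haapala: the Python 2.x JSON module returns either unicode or str depending on the contents of the object. You will have to add a sanity check to make sure the result is unicode before writing it through io: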
with io.open("testJson.json", 'w', encoding="utf-8") as outfile:
    my_json_str = json.dumps(my_obj, ensure_ascii=False)
    if isinstance(my_json_str, str):
        my_json_str = my_json_str.decode("utf-8")

    outfile.write(my_json_str)
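The check above can be wrapped in a small helper. The following is only a sketch: dump_json_utf8 and text_type are hypothetical names, and the decode step assumes that any byte strings inside the object were UTF-8 encoded. It is written to run on both Python 2 and Python 3 (where the decode branch is never taken).

```python
import io
import json
import os
import tempfile

text_type = type(u"")  # unicode on Python 2, str on Python 3


def dump_json_utf8(obj, path):
    """Hypothetical helper: write obj to path as UTF-8 JSON, unescaped."""
    payload = json.dumps(obj, ensure_ascii=False)
    if not isinstance(payload, text_type):
        # Python 2 only: json.dumps returned a byte string.
        # Assumption: those bytes are UTF-8.
        payload = payload.decode("utf-8")
    with io.open(path, "w", encoding="utf-8") as outfile:
        outfile.write(payload)


# Usage with made-up data and a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "testJson.json")
dump_json_utf8({u"name": u"\u65e5\u672c\u8a9e"}, demo_path)
```

Note that this does not help with the mixed case shown in the traceback earlier (unicode keys together with non-ASCII byte-string values), because there json.dumps itself raises before returning anything.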

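I'm not sure. I believe it is because you are opening the file pointer as a UTF-8 encoded file, but you are dumping a string type object (cards).

Ah, I should have mentioned that cards is a JSON object: cards = json.load(json_data)

What is the new property you added? Could you write it out?

That seems to have worked, thanks. I assume outfile.write takes the output from json.dumps and then writes it to the file?

Great :) Yes. write(content) writes the content to the output file, and outfile refers to the "testJson.json" file.

Danger! You've got an implicit str -> unicode conversion with no encoding defined. In Python 2.x the default encoding is ASCII, so if the JSON contains non-ASCII characters you will get a UnicodeDecodeError exception.

You can give 8-bit strings to JSON, and the output will still break.

When I do this, I get: UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

Are you sure you set the b mode on both open() calls?

Yes, I'm sure. It's on the json.dump line.

I think your answer answers the question better. Do you want to add the write part to your answer, and I'll delete mine?

@AlastairMcCormack naah, I'm busy right now, and I've already hit the rep cap today :D

If I successfully convert my script to py3, how will the encoding handling change?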