Python: how do I keep all key-value pairs from a JSON document that contains an object with duplicate keys?
I'm having some trouble getting this right, but I have data like the following:
{
"completedProtocol": "Extract",
"map": [
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024254" }, { "clarityId": "claritySample1", "espId": "ESP024255" }, { "clarityId": "claritySample1", "espId": "ESP024256"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
],
"map": [
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024258" }, { "clarityId": "claritySample1", "espId": "ESP024259" }, { "clarityId": "claritySample1", "espId": "ESP024260"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
]
}
I want to convert it into:
[{"map": [
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024254" }, { "clarityId": "claritySample1", "espId": "ESP024255" }, { "clarityId": "claritySample1", "espId": "ESP024256"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
]},
{"map":[
{
"sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024258" }, { "clarityId": "claritySample1", "espId": "ESP024259" }, { "clarityId": "claritySample1", "espId": "ESP024260"}],
"sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
"files":["http://fileserver.net/path/to/datafile3"]
}
]}]
My code so far is:
import json
obj = json.loads(body)
newData = [dct for dct in obj if 'map' in dct]
But this only yields:
[u'map']
If I just use json.loads on the body, it returns only the second value of map, which overwrites the first.
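Both symptoms can be reproduced with a minimal sketch. The body string here is a shortened, hypothetical stand-in for the real document (the values [1] and [2] are placeholders). By default json.loads() builds a plain dict, so a repeated key keeps only its last value; iterating over that dict then yields its keys, and 'map' in dct becomes a substring test on each key string.

```python
import json

# Hypothetical, shortened stand-in for the real document.
body = '{"completedProtocol": "Extract", "map": [1], "map": [2]}'

obj = json.loads(body)
# The default decoder builds a dict, so the second "map" replaces the first.

# Iterating over a dict yields its keys (strings), so the condition
# 'map' in dct is a substring test on each key, not a key lookup.
new_data = [dct for dct in obj if 'map' in dct]

print(obj)
print(new_data)
```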
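Note: I want a list of single-item dicts; I don't want to collect the values under a single key.
Any ideas?

You can use a custom object_pairs_hook function to force json.loads() to return a list of single-item dicts, rather than a single dict in which duplicate keys overwrite each other: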
import json
def keep_duplicates(ordered_pairs):
    result = []
    for key, value in ordered_pairs:
        result.append({key: value})
    return result
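From the documentation:
object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict() will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.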
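The quoted behaviour can be checked with a small sketch. The names record_pairs and never_called are hypothetical helpers introduced only for illustration. The hook receives the (key, value) pairs of each object in document order, it is called for nested objects before the object that contains them, and object_hook is ignored when both hooks are supplied.

```python
import json
from collections import OrderedDict

calls = []  # keys of every object the decoder hands to the hook, in call order

def record_pairs(ordered_pairs):
    # Hypothetical helper: remember the keys seen, then build an OrderedDict.
    calls.append([key for key, _ in ordered_pairs])
    return OrderedDict(ordered_pairs)

def never_called(dct):
    # Hypothetical helper: would raise if the decoder used object_hook.
    raise AssertionError("object_hook should be ignored")

decoded = json.loads(
    '{"b": 1, "a": {"c": 2}}',
    object_pairs_hook=record_pairs,
    object_hook=never_called,
)

print(calls)           # the inner object is reported first
print(list(decoded))   # keys in document order
```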
Usage:
>>> json.loads('{"a": 1, "a": 2, "a": 3}', object_pairs_hook=keep_duplicates)
[{u'a': 1}, {u'a': 2}, {u'a': 3}]
In your case, since you apparently aren't interested in anything other than the "map" key, you can filter the result afterwards:
all_data = json.loads(body, object_pairs_hook=keep_duplicates)
map_data = [x for x in all_data if 'map' in x]
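…which gives you a list with one single-item dict per "map" key, as requested in the question. (json.loads() calls the hook for every object in the document, so nested objects come back as lists of single-item dicts as well.)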
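To keep objects without repeated keys as ordinary dicts, a variant of the hook can split an object only when one of its keys actually repeats. The following is a minimal sketch and not part of the original answer: split_on_duplicates is a hypothetical name, and body is an abbreviated copy of the data from the question.

```python
import json

def split_on_duplicates(ordered_pairs):
    # Hypothetical variant: split into single-item dicts only when a key repeats.
    keys = [key for key, _ in ordered_pairs]
    if len(keys) == len(set(keys)):
        return dict(ordered_pairs)  # no duplicates: behave like the default decoder
    return [{key: value} for key, value in ordered_pairs]

# Abbreviated version of the document from the question.
body = '''
{
  "completedProtocol": "Extract",
  "map": [{"sampleIDsIn": [{"clarityId": "claritySample1", "espId": "ESP024254"}]}],
  "map": [{"sampleIDsIn": [{"clarityId": "claritySample1", "espId": "ESP024258"}]}]
}
'''

all_data = json.loads(body, object_pairs_hook=split_on_duplicates)
map_data = [item for item in all_data if 'map' in item]
print(map_data)
```

With this variant the top level is a list only when it really contains duplicate keys; if it does not, all_data is a plain dict and the filtering step would need to handle that case separately.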