Union of multiple nested JSON files in Python
Tags: python, json, dictionary, merge, union

I have multiple JSON files containing relational data that I need to merge. Each file holds records keyed by a common key that appears in every file; in the example below, a0 and a1 are the common keys. Each value is a nested dictionary with several keys (key1, key2, and so on), as shown below. I need to merge the JSON files and get the output shown in dboutput.json, with the file name used as the index in the merge. Similar questions describe merges that lose information, but in my case I don't want any update to replace an existing key or to be skipped: when an existing key is hit, another nested dictionary indexed by file name should be created, as shown below.

For example, file db1.json:

    "a0": {
        "commonkey": [
            "a1",
            "parentkeyvalue1"
        ],
        "key1": "kvalue1",
        "key2": "kvalue2",
        "keyp": "kvalue2abc"
    },
    "a1": {
        ...
    }
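The "commonkey" entry is the same in all files, so it does not need to be repeated.

A straightforward solution is to iterate through each JSON object and add dictionary pairs under each common key you see. Here is an example in which the JSON file names are kept in a list, and each file is loaded and merged in turn: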
    #!/usr/bin/python
    import json

    # Hardcoded list of JSON files
    dbs = ["db1.json", "db2.json"]
    output = dict()  # stores all the merged output

    for db in dbs:
        # Name the JSON obj after its file and load it
        db_name = db.split(".json")[0]
        with open(db) as f:
            obj = json.load(f)

        # Iterate through the common keys, adding them only if they're new
        for common_key, data in obj.items():
            if common_key not in output:
                output[common_key] = dict(commonkey=data["commonkey"])

            # Within each common key, add key, val pairs
            # subindexed by the database name
            for key, val in data.items():
                if key != "commonkey":
                    if key in output[common_key]:
                        output[common_key][key][db_name] = val
                    else:
                        output[common_key][key] = {db_name: val}

    # Output resulting json to file
    with open("dboutput.json", "w") as f:
        f.write(json.dumps(output, sort_keys=True, indent=4, separators=(',', ': ')))
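To try the same merge without touching the file system, the loop above can be wrapped in a function that takes already-parsed objects. This is only a minimal sketch: the function name `merge_records`, its `shared_key` parameter, and the sample contents of `db2` are assumptions made for illustration and do not come from the original files.

```python
import json


def merge_records(named_dbs, shared_key="commonkey"):
    """Merge parsed JSON objects, indexing each value by its source name.

    named_dbs maps a source name (e.g. "db1") to its parsed JSON object.
    """
    output = {}
    for db_name, obj in named_dbs.items():
        for common_key, data in obj.items():
            # Create the record on first sight, copying the shared entry once
            merged = output.setdefault(common_key, {shared_key: data[shared_key]})
            for key, val in data.items():
                if key != shared_key:
                    # Every other value is stored under its source name
                    merged.setdefault(key, {})[db_name] = val
    return output


# Hypothetical sample data, loosely modelled on the db1.json excerpt
db1 = {"a0": {"commonkey": ["a1", "parentkeyvalue1"],
              "key1": "kvalue1", "key2": "kvalue2"}}
db2 = {"a0": {"commonkey": ["a1", "parentkeyvalue1"],
              "key2": "kvalue2xyz", "key3": "kvalue3"}}

merged = merge_records({"db1": db1, "db2": db2})
print(json.dumps(merged, sort_keys=True, indent=4))
```

With this sample input, `key2` ends up as a dictionary holding both the `db1` and the `db2` value, while `commonkey` is kept once and unindexed. Passing the shared key as a parameter is one way to tell the merge which key should not be split by file name.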
I finally ended up with:
    import collections
    import json


    class NestedDict(collections.OrderedDict):
        """Implementation of perl's autovivification feature."""
        def __getitem__(self, item):
            try:
                return dict.__getitem__(self, item)
            except KeyError:
                # Missing keys are created on access as nested NestedDicts
                value = self[item] = type(self)()
                return value


    def mergejsons(jsns):
        # Use the autovivifying NestedDict
        op = NestedDict()
        for j in jsns:
            with open(j) as f:
                jdata = json.load(f)
            # Last two characters of the name before the first dot
            jname = j.split('.')[0][-2:]
            for commnkey, val in jdata.items():
                for k, v in val.items():
                    if k != 'commonkey':
                        op[commnkey][k][jname] = v
                    elif 'commonkey' not in op[commnkey]:
                        # commonkey is the same in every file, so store it once
                        op[commnkey][k] = v
        return op
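The same autovivification idea can also be sketched with the standard library's `collections.defaultdict` instead of a custom subclass. This is an alternative under assumptions, not the code used above: the names `tree` and `merge_with_tree` and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def tree():
    """Return a dict whose missing keys are created on access as nested trees."""
    return defaultdict(tree)


def merge_with_tree(named_dbs, shared_key="commonkey"):
    op = tree()
    for name, jdata in named_dbs.items():
        for common_key, record in jdata.items():
            for k, v in record.items():
                if k == shared_key:
                    # Stored once, not indexed by source name
                    op[common_key][k] = v
                else:
                    op[common_key][k][name] = v
    return op


# Hypothetical sample data for illustration only
sample = {
    "db1": {"a0": {"commonkey": ["a1"], "key1": "x"}},
    "db2": {"a0": {"commonkey": ["a1"], "key1": "y"}},
}
result = merge_with_tree(sample)
# defaultdict is a dict subclass, so json.dumps can serialize it directly
text = json.dumps(result, sort_keys=True)
print(text)
```

One trade-off to keep in mind: a `defaultdict` does not raise `KeyError` on a mistyped key, it silently creates an empty entry, so it may be worth converting the result to plain dictionaries (for example via the JSON round trip) before handing it to other code.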
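Comments:

You treat the keys "a0" and "commonkey" differently from "key2".

@JanneKarila, could you be a bit clearer?

What I'm pointing out is that your desired output does not contain "commonkey": {"db1": ["a1", "parentkeyvalue1"], "db2": ["a1", "parentkeyvalue1"]}. How would a program know which keys need merging and which don't?

@JanneKarila It's because commonkey really is common and doesn't need to be repeated; it is the same in all files.

+1 Thanks, I've solved a similar problem with a nested dict and autovivification.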