Python: How do I parse a nested dictionary into a DataFrame?
I have a JSON file in which each line looks like this:
{
  "id": {
    "val": "dkjbskjb",
    "type": "cookie"
  },
  "country": "US",
  "region": "Blank",
  "events": [
    {
      "tap": "Device",
      "c": 98678,
      "ts": 12988685,
      "remove": [
        12,
        13
      ]
    }
  ]
}
How should I parse this in Python and save it to a DataFrame with the columns:
and then
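What have you tried so far to solve this? It looks to me like you need a guide, tutorial, or documentation rather than Stack Overflow. It isn't even clear to me what creating columns from events would mean, since it is a nested list.
Well, I flattened the file and parsed it so that columns get created, even for the lists! The problem is how to run that over the whole dataset. I have posted the code.
I still don't understand what exactly events contains.
In the actual file, the "remove" data has a different length on each line.
Can you share enough data to make the format and its quirks apparent?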
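Since the "remove" lists vary in length from line to line, one option is to keep each list in a single cell and expand it only when needed. The sketch below shows pandas' DataFrame.explode doing that. The second row (its ts value and its one-element list) is invented purely to show a different length and is not taken from the real file.

```python
import pandas as pd

# Hypothetical rows: the second row is made up to show a shorter "remove" list.
events = pd.DataFrame(
    {
        "tap": ["Device", "Device"],
        "ts": [12988685, 12988690],
        "remove": [[12, 13], [7]],
    }
)

# explode() emits one row per list element and repeats the other columns,
# so the number of columns does not depend on how long each list is.
long_form = events.explode("remove", ignore_index=True)
print(long_form)
```

This gives a "long" table (one row per removed value) rather than a "wide" one with columns such as remove0 and remove1, which tends to be easier to work with when list lengths differ.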
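def flatten_json(y):
    """Flatten nested dicts and lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for a in x:
                flatten(x[a], name + a + '_')
        elif isinstance(x, list):
            # List items are keyed by their position: 0, 1, 2, ...
            for i, a in enumerate(x):
                flatten(a, name + str(i) + '_')
        else:
            # Leaf value: drop the trailing '_' from the accumulated name.
            out[name[:-1]] = x

    flatten(y)
    return out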
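To make it concrete what the flattening step produces, here is a small sketch that runs it on the sample record from the question. The function is repeated in a lightly tidied form only so that the snippet runs on its own. The variable name record is an assumption for illustration.

```python
def flatten_json(y):
    """Same idea as the function above, repeated so this snippet is runnable."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key, value in x.items():
                flatten(value, name + key + "_")
        elif isinstance(x, list):
            for i, value in enumerate(x):
                flatten(value, name + str(i) + "_")
        else:
            out[name[:-1]] = x

    flatten(y)
    return out


# The sample record from the question.
record = {
    "id": {"val": "dkjbskjb", "type": "cookie"},
    "country": "US",
    "region": "Blank",
    "events": [
        {"tap": "Device", "c": 98678, "ts": 12988685, "remove": [12, 13]}
    ],
}

flat = flatten_json(record)
for key, value in flat.items():
    print(key, "->", value)
```

Keys that came from a list carry the list index in the middle (events_0_tap, events_0_remove_1), while top-level values such as country have no index. That is the distinction the parsing code further down relies on.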
import json
import re

import pandas as pd

# Flatten the first JSON string in the 'mess' column.
jsonObj = json.loads(behavior_s3['mess'][0])
flat = flatten_json(jsonObj)

results = pd.DataFrame()
special_cols = []  # keys with no list index, e.g. 'country'
for item in flat:
    # Keys that came from a list contain '_<index>_', e.g. 'events_0_tap'.
    match = re.search(r'_(\d+)_(.*)', item)
    if match is None:
        special_cols.append(item)
        continue
    row_idx = int(match.group(1))
    column = match.group(2).replace('_', '')
    results.loc[row_idx, column] = flat[item]

# Top-level scalars are repeated on every row.
for item in special_cols:
    results[item] = flat[item]
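For running this over the whole dataset rather than one record, a possible alternative to the hand-written flattening is pandas' json_normalize, which accepts a list of parsed records. The following is only a sketch under some assumptions: every line is a complete JSON document with an "events" list, the in-memory sample_file stands in for the real file, and its second line is invented for illustration.

```python
import io
import json

import pandas as pd

# Stand-in for the real file: one JSON document per line.
# The second line is made up to show a record with two events.
sample_file = io.StringIO(
    '{"id": {"val": "dkjbskjb", "type": "cookie"}, "country": "US", '
    '"region": "Blank", "events": [{"tap": "Device", "c": 98678, '
    '"ts": 12988685, "remove": [12, 13]}]}\n'
    '{"id": {"val": "example2", "type": "cookie"}, "country": "US", '
    '"region": "Blank", "events": [{"tap": "Device", "c": 1, "ts": 2, '
    '"remove": [5]}, {"tap": "Device", "c": 3, "ts": 4, "remove": []}]}\n'
)

# Parse every non-empty line into a dict.
records = [json.loads(line) for line in sample_file if line.strip()]

# One output row per element of "events"; the listed top-level fields are
# copied onto each row. "remove" stays a list in a single cell.
df = pd.json_normalize(
    records,
    record_path="events",
    meta=[["id", "val"], ["id", "type"], "country", "region"],
)
print(df)
```

If the JSON strings already sit in a DataFrame column, as behavior_s3['mess'] appears to in the code above, the records list could presumably be built with [json.loads(s) for s in behavior_s3['mess']] instead of reading a file.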