Python 如何用字符串规范化嵌套JSON？_Python_Json_Pandas_Dataframe_Nested

Python 如何用字符串规范化嵌套JSON？

python json pandas dataframe

Python 如何用字符串规范化嵌套JSON？,python,json,pandas,dataframe,nested,Python,Json,Pandas,Dataframe,Nested,我想用包含另一个dict的字符串从嵌套的JSON规范化并创建dataframe。我已经试过了 with open('/content/drive/My Drive/conversation_data.json', 'r') as f: data = json.load(f) table = pd.json_normalize(data, 'conversations') table 但它返回所有以行分隔的单个字符串。如何返回具有对话id、作者id等的数据帧表这是JSON： [ {

我想用包含另一个dict的字符串从嵌套的JSON规范化并创建dataframe。我已经试过了

with open('/content/drive/My Drive/conversation_data.json', 'r') as f:
  data = json.load(f)

table = pd.json_normalize(data, 'conversations')
table

但它返回所有以行分隔的单个字符串。如何返回具有对话id、作者id等的数据帧表

这是JSON：

[
  {
    "data_loaded": "2019-12-21 12:00:22.189441 UTC",
    "ticket_id": "222815",
    "ticket_created_at": "2019-12-21T12:07:52Z",
    "conversations": "{\"conversations\":[{\"conversation_id\":\"866229422292\",\"author_id\":\"391349919632\",\"body\":\"==========Write below this ...\",\"created_at\":\"2019-12-21T12:07:52Z\",\"via_channel\":\"email\"}]}"
  }
]

字符串本身似乎是一个JSON片段。它实际上并不包含这些反斜杠（它们是打印字符串时如何表示字符串的一部分），因此您只需将其反馈给JSON解析器即可

json.load

和

json.dump

与文件一起使用；对字符串进行操作的相应函数是

json.loads

和

json.dumps

（用“s”表示“s”字符串）

例如：

# pull out the embedded JSON string from the parsed JSON, then re-parse it
conversations = json.loads(data[0]["conversations"])

字符串本身似乎是一个JSON片段。它实际上并不包含这些反斜杠（它们是打印字符串时如何表示字符串的一部分），因此您只需将其反馈给JSON解析器即可

json.load

和

json.dump

与文件一起使用；对字符串进行操作的相应函数是

json.loads

和

json.dumps

（用“s”表示“s”字符串）

例如：

# pull out the embedded JSON string from the parsed JSON, then re-parse it
conversations = json.loads(data[0]["conversations"])

请尝试以下操作：

data = [
  {
    "data_loaded": "2019-12-21 12:00:22.189441 UTC",
    "ticket_id": "222815",
    "ticket_created_at": "2019-12-21T12:07:52Z",
    "conversations": "{\"conversations\":[{\"conversation_id\":\"866229422292\",\"author_id\":\"391349919632\",\"body\":\"==========Write below this ...\",\"created_at\":\"2019-12-21T12:07:52Z\",\"via_channel\":\"email\"}]}"
  }
]

conversations = json.loads(data[0]['conversations'])

table = pd.json_normalize(conversations, 'conversations')
print(table)

请尝试以下操作：

data = [
  {
    "data_loaded": "2019-12-21 12:00:22.189441 UTC",
    "ticket_id": "222815",
    "ticket_created_at": "2019-12-21T12:07:52Z",
    "conversations": "{\"conversations\":[{\"conversation_id\":\"866229422292\",\"author_id\":\"391349919632\",\"body\":\"==========Write below this ...\",\"created_at\":\"2019-12-21T12:07:52Z\",\"via_channel\":\"email\"}]}"
  }
]

conversations = json.loads(data[0]['conversations'])

table = pd.json_normalize(conversations, 'conversations')
print(table)