How to handle a deserialized data structure in Python
I want to get the timestamp when the status is Started and when the status is Reached. However, I found a problem with the approach I am currently using (maxsplit). Because of the line below, which contains \n\n, I get the error shown further down. Some records have "\n\n" and some do not. How can this case be handled? I only need the status; is it possible to filter out the Cupdate key?
"detail": "Info from Railway\nTrain: T12049 \nStatus: Reached\nCupdate: Arrival Info: On Time.\n\nIRCTC"
Code:
response.json
{
  "ndatas": [
    {
      "results": {
        "pnr_number": "PNR9087651232",
        "Reservation Date": "2020-09-29T10:33:55.000+0000",
        "Current State": "Waiting List",
        "BookingLogs": [
          {
            "pnr_category": "agent",
            "tstp": "2020-09-29T10:54:56.000+0000",
            "detail": "Booking Closed: Updated customer"
          },
          {
            "pnr_category": "Railway",
            "tstp": "2020-09-29T10:56:41.000+0000",
            "detail": "Tatkal tickets reservation is open"
          },
          {
            "pnr_category": "booking_via_irctc",
            "tstp": "2020-09-29T10:56:54.000+0000",
            "detail": "Info from Railway\nTrain: T12049 \nStatus: Started\nCupdate: Functioning on Time"
          },
          {
            "pnr_category": "booking_via_irctc",
            "tstp": "2020-09-30T14:44:34.000+0000",
            "detail": "Info from Railway\nTrain: T12049 \nStatus: Reached\nCupdate: On Time"
          },
          {
            "pnr_category": "agent",
            "tstp": "2020-10-01T07:12:20.000+0000",
            "detail": "All bookings Truncated"
          },
          {
            "pnr_category": "booking_via_irctc",
            "tstp": "2020-10-07T15:30:16.000+0000",
            "detail": "Info from Railway\nTrain: T12049 \nStatus: Cancelled\nCupdate: Heavy Rain"
          }
        ],
        "from": "Kolkatta",
        "to loc": "Mumbai"
      }
    },
    {
      "results": {
        "pnr_number": "PNR90876512322",
        "Reservation Date": "2020-09-29T10:33:55.000+0000",
        "Current State": "Waiting List",
        "BookingLogs": [
          {
            "pnr_category": "agent",
            "tstp": "2020-09-29T10:54:56.000+0000",
            "detail": "Booking Closed: Updated customer"
          },
          {
            "pnr_category": "Railway",
            "tstp": "2020-09-29T10:56:41.000+0000",
            "detail": "Tatkal tickets reservation is open"
          },
          {
            "pnr_category": "booking_via_irctc",
            "tstp": "2020-09-29T10:56:54.000+0000",
            "detail": "Info from Railway\nTrain: T12049 \nStatus: Started\nCupdate: Functioning on Time"
          },
          {
            "pnr_category": "booking_via_irctc",
            "tstp": "2020-09-30T14:44:34.000+0000",
            "detail": "Info from Railway\nTrain: T12049 \nStatus: Reached\nCupdate: Arrival Info: On Time.\n\nIRCTC"
          },
          {
            "pnr_category": "agent",
            "tstp": "2020-10-01T07:12:20.000+0000",
            "detail": "All bookings Truncated"
          },
          {
            "pnr_category": "booking_via_irctc",
            "tstp": "2020-10-07T15:30:16.000+0000",
            "detail": "Info from Railway\nTrain: T12049 \nStatus: Cancelled\nCupdate: Heavy Rain"
          }
        ],
        "from": "Kolkatta",
        "to loc": "Mumbai"
      }
    }
  ]
}
Error:
C:\DeveloperArea\PycharmProjects\python-rest-client\venv\Scripts\python.exe C:/DeveloperArea/PycharmProjects/python-rest-client/demo_data.py
Traceback (most recent call last):
  File "C:/DeveloperArea/PycharmProjects/python-rest-client/demo_data.py", line 48, in <module>
    main()
  File "C:/DeveloperArea/PycharmProjects/python-rest-client/demo_data.py", line 31, in main
    results = [
  File "C:/DeveloperArea/PycharmProjects/python-rest-client/demo_data.py", line 34, in <listcomp>
    *extract_started_reached_tstps(record["results"]["BookingLogs"]),
  File "C:/DeveloperArea/PycharmProjects/python-rest-client/demo_data.py", line 14, in extract_started_reached_tstps
    details = dict(line.split(': ', maxsplit=1) for line in log["detail"].splitlines()[1:])
ValueError: dictionary update sequence element #3 has length 1; 2 is required
Process finished with exit code 1
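Your problem actually has nothing to do with JSON. The question is how to handle the deserialized data structure, and in particular how to parse this string. I think you should probably just use a regular expression. Here is a pattern that splits only on single newlines, using a negative lookbehind and a negative lookahead: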
>>> regex = re.compile("(?<!\n)\n(?!\n)")
>>> regex.split(s)
['Info from Railway', 'Train: T12049 ', 'Status: Reached', 'Cupdate: Arrival Info: On Time.\n\nIRCTC']
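The original code fails because splitlines() turns "\n\n" into an empty line, and an empty line yields a one-element list that dict() cannot use as a key/value pair. A minimal sketch of how the split above could feed a dictionary instead is shown here. The helper name parse_detail is hypothetical (it is not in the question's code), and the sketch assumes every field line has the form "Key: value".

```python
import re

# Split only on single newlines; a blank-line separator ("\n\n") is left intact.
SINGLE_NEWLINE = re.compile(r"(?<!\n)\n(?!\n)")


def parse_detail(detail):
    """Return the 'Key: value' fields of a detail string as a dict (hypothetical helper)."""
    fields = {}
    # Skip the first piece, which is the "Info from Railway" header.
    for line in SINGLE_NEWLINE.split(detail)[1:]:
        key, sep, value = line.partition(": ")
        if sep:  # ignore pieces that have no "Key: value" shape
            fields[key] = value.strip()
    return fields


s = "Info from Railway\nTrain: T12049 \nStatus: Reached\nCupdate: Arrival Info: On Time.\n\nIRCTC"
print(parse_detail(s)["Status"])  # Reached
```

Using partition rather than split means a piece without ": " is skipped instead of raising, so log entries such as "All bookings Truncated" simply produce an empty dict.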
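How do I incorporate the regex into my code? Can you help me? I have never used regular expressions, and this topic is very new to me. I tried something like this, but it did not work. Any pointers would be helpful.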
>>> regex = re.compile(r"Status: (\S+)")
>>> regex.search(s).group()
'Status: Reached'
>>> regex.search(s).group(1)
'Reached'
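Since only the status is needed, the Status pattern can be applied to each log entry directly, without building a dictionary at all. The following is a rough sketch of what extract_started_reached_tstps might look like with that change. The function name comes from the traceback, but its body and return shape (a pair of timestamps) are assumptions, because the original function was not posted.

```python
import re

STATUS = re.compile(r"Status: (\S+)")


def extract_started_reached_tstps(logs):
    """Return (started_tstp, reached_tstp); either is None if that status never appears.

    Assumed behaviour: if a status occurs more than once, the last entry wins.
    """
    tstps = {"Started": None, "Reached": None}
    for log in logs:
        match = STATUS.search(log["detail"])
        if match and match.group(1) in tstps:
            tstps[match.group(1)] = log["tstp"]
    return tstps["Started"], tstps["Reached"]


# Trimmed-down sample in the shape of BookingLogs from response.json
logs = [
    {"tstp": "2020-09-29T10:56:41.000+0000", "detail": "Tatkal tickets reservation is open"},
    {"tstp": "2020-09-29T10:56:54.000+0000",
     "detail": "Info from Railway\nTrain: T12049 \nStatus: Started\nCupdate: Functioning on Time"},
    {"tstp": "2020-09-30T14:44:34.000+0000",
     "detail": "Info from Railway\nTrain: T12049 \nStatus: Reached\nCupdate: Arrival Info: On Time.\n\nIRCTC"},
]
print(extract_started_reached_tstps(logs))
```

Because the pattern only looks for "Status: " followed by non-whitespace characters, the Cupdate text and any "\n\n" inside it are never parsed, so they cannot trigger the ValueError.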