Pandas: converting bytes data to a DataFrame


I have the following data:

{"links":[{"rel":"self","href":"https://api.pjm.com"},
{"rel":"next","href":"https://api.pjm.com"},{"rel":"metadata","href":"https://api.pjm.com/api/v1/ftr_cong_lmp/metadata"}],
"items":[{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02AMSTED138 KV  TR2","offpeak_clmp":-0.290000,"onpeak_clmp":-0.240000,"24hour_clmp":-0.270000,"lt_sim_offpeak_clmp":-0.240000,"lt_sim_onpeak_clmp":-0.220000,"lt_sim_clmp":-0.240000},{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02AMSTED138 KV  TR6","offpeak_clmp":-0.290000,"onpeak_clmp":-0.240000,"24hour_clmp":-0.270000,"lt_sim_offpeak_clmp":-0.240000,"lt_sim_onpeak_clmp":-0.220000,"lt_sim_clmp":-0.240000},{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02CPP_NH138 KV  TR2","offpeak_clmp":0.010000,"onpeak_clmp":1.530000,"24hour_clmp":0.660000,"lt_sim_offpeak_clmp":0.010000,"lt_sim_onpeak_clmp":1.520000,"lt_sim_clmp":0.660000}],"searchSpecification":{"rowCount":25,"sort":"terminate_day","order":"Desc","startRow":1,"isActiveMetadata":true,"fields":["24hour_clmp","effective_day","lt_sim_clmp","lt_sim_offpeak_clmp","lt_sim_onpeak_clmp","offpeak_clmp","onpeak_clmp","pnode_name","terminate_day"],"filters":[{"effective_day":"2020-01-01T00:00:00.0000000 to 2020-12-31T23:59:59.0000000"}]},"totalRows":163378}'
I am trying to load the data above into a DataFrame, so I tried the following:

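import pandas as pd
from io import StringIO

s = str(bytes_data, 'utf-8')   # decode the bytes object to a str
data = StringIO(s)             # wrap the string in a file-like object
df = pd.read_csv(data)         # parses the JSON text as if it were CSV
However, this gives me an empty DataFrame in which the entire data ends up in the column headers.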

Edit:

The data I am interested in looks like this:

{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02AMSTED138 KV  TR2","offpeak_clmp":-0.290000,"onpeak_clmp":-0.240000,"24hour_clmp":-0.270000,"lt_sim_offpeak_clmp":-0.240000,"lt_sim_onpeak_clmp":-0.220000,"lt_sim_clmp":-0.240000}

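That is, I am trying to put the above into a DataFrame whose columns are the keys of the dictionary shown above. How do I extract only these items from the raw data so that I can load them into a DataFrame?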

You can evaluate the string data into a dictionary and use that to create the DataFrame:

pd.DataFrame(eval(s)['items'])
Before doing so, you need to define the name true that appears in the expression, for example with:
true = True

Result:

         effective_day        terminate_day  ... lt_sim_onpeak_clmp  lt_sim_clmp
0  2020-12-01T00:00:00  2020-12-31T00:00:00  ...              -0.22        -0.24
1  2020-12-01T00:00:00  2020-12-31T00:00:00  ...              -0.22        -0.24
2  2020-12-01T00:00:00  2020-12-31T00:00:00  ...               1.52         0.66
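However, for security reasons it is recommended to use ast.literal_eval instead of eval. In that case the variable definition for true does not work, so it has to be replaced manually in the string: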

import ast
pd.DataFrame(ast.literal_eval(s.replace('true','True'))['items'])
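Since the payload appears to be JSON, another option (not part of the answer above) is to parse it with the standard-library json module, which maps true to True by itself and, as of Python 3.6, accepts bytes directly. The following is a minimal sketch under that assumption; the bytes_data value here is a shortened, hypothetical stand-in for the real response, and payload is just an illustrative name.

```python
import json

import pandas as pd

# Hypothetical, shortened stand-in for the bytes payload from the question.
bytes_data = (
    b'{"links":[{"rel":"self","href":"https://api.pjm.com"}],'
    b'"items":[{"pnode_name":"02AMSTED138 KV  TR2","onpeak_clmp":-0.24},'
    b'{"pnode_name":"02CPP_NH138 KV  TR2","onpeak_clmp":1.53}],'
    b'"searchSpecification":{"isActiveMetadata":true},"totalRows":163378}'
)

# json.loads accepts bytes (UTF-8/16/32) and converts JSON true/false/null
# to Python True/False/None, so no string replacement is needed.
payload = json.loads(bytes_data)

# Only the "items" list holds the records; each dict becomes one row.
df = pd.DataFrame(payload["items"])
print(df)
```

This avoids both eval and the s.replace('true', 'True') step, which could also alter the substring "true" if it occurred inside a data value.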

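The arrays in the example JSON have different lengths. Please shorten the bytestring and explain what the resulting DataFrame should look like when the columns have different lengths.

Does the edit above answer your question?

Thanks. But it is still unclear whether you have a dictionary, a bytes-like object, or something else. Also, what should your final DataFrame look like?

I have highlighted it in bold to show what the final output should contain.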