Python: How do I convert a DataFrame of JSON data into a regular DataFrame?
I have a DataFrame that contains a lot of JSON data. For example:
{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261123.328814, 0.171875, -0.9609375, 0.0234375]}
{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261125.0605137, 0.0859375, -0.984375, 0.0]}
{"serial": "000000001fb105ea", "sensorType": "strain", "data": [1603261126.3532753, 0.9649793604217437]}
{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261127.6988888, 0.0390625, -1.0, 0.125]}
{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261128.8530502, 0.078125, -0.9921875, 0.0]}
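There are two types of data: strain sensor readings and acceleration sensor readings.
I want to parse this JSON data and convert it to a standard format. I only need the data part of each JSON object. The result should have 4 columns, one for each value in data: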
Date: 21.10.2020:09:18:46 x:0.171875 y:-0.9609375 z:0.0234375
I tried json_normalize, but got this error:
AttributeError: 'str' object has no attribute 'itervalues'
How can I parse the data part into a 4-column DataFrame?
Thanks.
If the input data is in a json file, use:
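import pandas as pd
cols = ['Date','x','y','z']
# Each line of the file is one JSON object; expand its "data" list into columns
df = pd.DataFrame(pd.read_json('json.json', lines=True)['data'].tolist(), columns=cols)
# The first value in each list is a Unix timestamp in seconds
df['Date'] = pd.to_datetime(df['Date'], unit='s')
print(df)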
Date x y z
0 2020-10-21 06:18:43.328814030 0.171875 -0.960938 0.023438
1 2020-10-21 06:18:45.060513735 0.085938 -0.984375 0.000000
2 2020-10-21 06:18:46.353275299 0.964979 NaN NaN
3 2020-10-21 06:18:47.698888779 0.039062 -1.000000 0.125000
4 2020-10-21 06:18:48.853050232 0.078125 -0.992188 0.000000
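Note that in the output above the strain reading ends up in the x column, because strain records carry only two values. Here is a minimal sketch of one way to avoid that, assuming you only want the acceleration rows. The inline sample stands in for the real file, and the names accel_records and accel are made up for illustration:

```python
import io
import pandas as pd

# Sample lines from the question; io.StringIO stands in for the real file.
raw = io.StringIO(
    '{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261123.328814, 0.171875, -0.9609375, 0.0234375]}\n'
    '{"serial": "000000001fb105ea", "sensorType": "strain", "data": [1603261126.3532753, 0.9649793604217437]}\n'
    '{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261127.6988888, 0.0390625, -1.0, 0.125]}\n'
)

records = pd.read_json(raw, lines=True)

# Keep only the acceleration records, so every "data" list has 4 values.
accel_records = records[records['sensorType'] == 'acceleration']

accel = pd.DataFrame(accel_records['data'].tolist(), columns=['Date', 'x', 'y', 'z'])
accel['Date'] = pd.to_datetime(accel['Date'], unit='s')
print(accel)
```

The strain rows could be handled the same way with their own two-column layout.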
If the input is a DataFrame with a column col, then:
cols = ['Date','x','y','z']
# Assumes each value in df['col'] is already a dict, not a JSON string
df = pd.DataFrame(pd.json_normalize(df['col'])['data'].tolist(), columns=cols)
df['Date'] = pd.to_datetime(df['Date'], unit='s')
print(df)
Date x y z
0 2020-10-21 06:18:43.328814030 0.171875 -0.960938 0.023438
1 2020-10-21 06:18:45.060513735 0.085938 -0.984375 0.000000
2 2020-10-21 06:18:46.353275299 0.964979 NaN NaN
3 2020-10-21 06:18:47.698888779 0.039062 -1.000000 0.125000
4 2020-10-21 06:18:48.853050232 0.078125 -0.992188 0.000000
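The AttributeError in the question ('str' object has no attribute ...) suggests that the column holds JSON text rather than parsed dicts. In that case, a sketch like the following decodes each string with json.loads first. It assumes a column named col as above; the sample frame and the name out are made up for illustration:

```python
import json
import pandas as pd

# Hypothetical frame whose 'col' column holds JSON text, not dicts.
df = pd.DataFrame({'col': [
    '{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261123.328814, 0.171875, -0.9609375, 0.0234375]}',
    '{"serial": "000000001fb105ea", "sensorType": "strain", "data": [1603261126.3532753, 0.9649793604217437]}',
]})

# Decode each string into a dict, then pull out its "data" list.
data = [json.loads(s)['data'] for s in df['col']]

cols = ['Date', 'x', 'y', 'z']
out = pd.DataFrame(data, columns=cols)  # shorter lists are padded with NaN
out['Date'] = pd.to_datetime(out['Date'], unit='s')
print(out)
```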
Edit:
Saving a CSV file with an .xls extension is not a good idea, because read_excel then fails with strange errors, but you can use:
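import ast
import pandas as pd
# The file is really a CSV despite its .xls extension, so read it with read_csv
df = pd.read_csv('15-10-2020-OO.xls')
cols = ['Date','x','y','z']
# Turn each string in the Data column into a dict and keep its "data" list
data = [x['data'] for x in df['Data'].apply(ast.literal_eval)]
df = pd.DataFrame(data, columns=cols)
df['Date'] = pd.to_datetime(df['Date'], unit='s')
print(df)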
Date x y z
0 2020-10-15 07:21:16.159236193 0.085938 -0.972656 0.003906
1 2020-10-15 07:21:17.597931385 0.089844 -0.968750 0.003906
2 2020-10-15 07:21:18.838171959 0.089844 -0.972656 0.003906
3 2020-10-15 07:21:20.338105917 0.085938 -0.972656 0.003906
4 2020-10-15 07:21:21.768864155 0.089844 -0.984375 0.003906
... ... ... ...
8457 2020-10-15 08:59:57.907007933 0.085938 -0.972656 0.003906
8458 2020-10-15 08:59:58.371274233 0.089844 -0.976562 0.003906
8459 2020-10-15 08:59:58.833237648 0.085938 -0.976562 0.003906
8460 2020-10-15 08:59:59.313337088 1.517057 NaN NaN
8461 2020-10-15 08:59:59.863240004 0.089844 -0.968750 0.007812
[8462 rows x 4 columns]
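The question also asked for dates in a day.month.year:time layout. Here is a minimal sketch of that formatting step, assuming the timestamps are Unix seconds and are read as UTC (pd.to_datetime does no time zone conversion here). The column name Date_str is made up for illustration:

```python
import pandas as pd

# Two timestamps taken from the sample data in the question.
df = pd.DataFrame({'Date': [1603261123.328814, 1603261125.0605137],
                   'x': [0.171875, 0.0859375]})
df['Date'] = pd.to_datetime(df['Date'], unit='s')

# Render as day.month.year:hour:minute:second text (this column holds strings).
df['Date_str'] = df['Date'].dt.strftime('%d.%m.%Y:%H:%M:%S')
print(df)
```

Keeping the original datetime column alongside the text column means sorting and resampling still work on real timestamps.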
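No, the input data is in a csv file. Could you edit the answer for csv? Thanks :)
@notNowOnlyCoding - What does the csv file look like? Is it possible to change json.json to file.csv?
It looks like this. Is there any solution?
@notNowOnlyCoding - The second paragraph of my answer?
This KeyError occurred. Any idea? df is the DataFrame before the code in the picture.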
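Without seeing the file it is hard to say what caused the KeyError, but one common cause is that the column label in the CSV differs from the one used in the code (different case, stray spaces, or a missing header row). Here is a small sketch for checking this. The in-memory CSV and the helper find_column are made up for illustration:

```python
import io
import json
import pandas as pd

# Build a small in-memory CSV; its header is ' data ' with spaces, not 'Data'.
sample = pd.DataFrame({' data ': [
    '{"serial": "000000001fb105ea", "sensorType": "acceleration", "data": [1603261123.328814, 0.171875, -0.9609375, 0.0234375]}',
]})
csv_text = sample.to_csv(index=False)

df = pd.read_csv(io.StringIO(csv_text))
print(df.columns.tolist())  # inspect the real labels first

def find_column(frame, wanted):
    """Return the label matching `wanted`, ignoring case and surrounding spaces."""
    for label in frame.columns:
        if str(label).strip().lower() == wanted.lower():
            return label
    raise KeyError(wanted)

col = find_column(df, 'Data')
data = [json.loads(s)['data'] for s in df[col]]
out = pd.DataFrame(data, columns=['Date', 'x', 'y', 'z'])
out['Date'] = pd.to_datetime(out['Date'], unit='s')
print(out)
```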