Python Can';t调用“数据透视框架上的json:returns”;ValueError:标签数组大小与相应的数据形状不匹配;
我有一个包含传感器时间序列数据的数据库。我想在前端使用的图形库要求将数据重塑为每个传感器的一列,而不是数据集中的垂直格式:Python Can';t调用“数据透视框架上的json:returns”;ValueError:标签数组大小与相应的数据形状不匹配;,python,pandas,Python,Pandas,我有一个包含传感器时间序列数据的数据库。我想在前端使用的图形库要求将数据重塑为每个传感器的一列,而不是数据集中的垂直格式: >>> for d in dataset: print d ... [datetime.datetime(2014, 9, 26, 0, 56, 0, 598000), u'motion', 0.0] [datetime.datetime(2014, 9,
>>> for d in dataset: print d
...
[datetime.datetime(2014, 9, 26, 0, 56, 0, 598000), u'motion', 0.0]
[datetime.datetime(2014, 9, 26, 0, 56, 7, 698000), u'motion', 1.0]
[datetime.datetime(2014, 9, 26, 0, 58, 20, 298000), u'motion', 0.0]
[datetime.datetime(2014, 9, 26, 2, 21, 27, 893000), u'door', 0.0]
[datetime.datetime(2014, 9, 26, 2, 21, 37, 793000), u'door', 1.0]
[datetime.datetime(2014, 9, 26, 2, 21, 53, 893000), u'door', 0.0]
在stackoverflow和pandas文档的帮助下(谢谢!)我找到了数据透视的方法:
>>> import pandas as pd
>>> pd.__version__
'0.14.1'
>>>
>>> df = pd.DataFrame(dataset, columns=['tstamp', 'tag', 'value'])
>>> dfp = df.pivot('tstamp', 'tag')
>>> dfp
value
tag door motion
tstamp
2014-09-26 00:56:00.598000 NaN 0
2014-09-26 00:56:07.698000 NaN 1
2014-09-26 00:58:20.298000 NaN 0
2014-09-26 02:21:27.893000 0 NaN
2014-09-26 02:21:37.793000 1 NaN
2014-09-26 02:21:53.893000 0 NaN
>>>
现在我一直在尝试用JSON输出数据:
>>> dfp.to_json()
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/var/www/environment/default/local/lib/python2.7/site-packages/pandas/core/generic.py", line 853, in to_json
default_handler=default_handler)
File "/var/www/environment/default/local/lib/python2.7/site-packages/pandas/io/json.py", line 34, in to_json
date_unit=date_unit, default_handler=default_handler).write()
File "/var/www/environment/default/local/lib/python2.7/site-packages/pandas/io/json.py", line 77, in write
default_handler=self.default_handler)
ValueError: Label array sizes do not match corresponding data shape
但是我不确定下一步该怎么办。透视图正在制作一个列具有多索引的数据帧。由于所有列的顶层
值
,都是相同的,您可以简单地删除它:
dfp.columns = dfp.columns.droplevel(0)
然后调用到_json
工作:
In [20]: dfp.to_json()
Out[20]: '{"door":{"1411692960598":null,"1411692967698":null,"1411693100298":null,"1411698087893":0.0,"1411698097793":1.0,"1411698113893":0.0},"motion":{"1411692960598":0.0,"1411692967698":1.0,"1411693100298":0.0,"1411698087893":null,"1411698097793":null,"1411698113893":null}}'
或者,更好的是,在调用
pivot
时指定values
列:
In [26]: dfp = df.pivot(index='tstamp', columns='tag', values='value'); dfp
Out[26]:
tag door motion
tstamp
2014-09-26 00:56:00.598000 NaN 0
2014-09-26 00:56:07.698000 NaN 1
2014-09-26 00:58:20.298000 NaN 0
2014-09-26 02:21:27.893000 0 NaN
2014-09-26 02:21:37.793000 1 NaN
2014-09-26 02:21:53.893000 0 NaN
现在调用
到_json
就可以了,因为columns索引是平面的。这就行了!我发现为了保持所需的tstamp
方向,我可以使用dfp.to_json(orient='index')
。我想知道是否有一种方法可以直接使用json而不是pivot?谢谢你的补充建议。更好的是,pivot将列值更改为列标签。没有一个orient
选项(分割、记录、索引、列、值)能做到这一点,所以我认为您需要调用pivot。
In [26]: dfp = df.pivot(index='tstamp', columns='tag', values='value'); dfp
Out[26]:
tag door motion
tstamp
2014-09-26 00:56:00.598000 NaN 0
2014-09-26 00:56:07.698000 NaN 1
2014-09-26 00:58:20.298000 NaN 0
2014-09-26 02:21:27.893000 0 NaN
2014-09-26 02:21:37.793000 1 NaN
2014-09-26 02:21:53.893000 0 NaN