How to use xarray like a pandas Panel when adding new items
I have converted a pandas Panel to xarray, but I cannot add new items or extend the major and minor axes as easily as I could with the pandas Panel. The code is as follows:
import numpy as np
import pandas as pd
import xarray as xr
panel = pd.Panel(np.random.randn(3, 4, 5), items=['one', 'two', 'three'],
major_axis=pd.date_range('1/1/2000', periods=4),
minor_axis=['a', 'b', 'c', 'd','e'])
For example, if I want to add a new item, I can do:
panel.four=pd.DataFrame(np.ones((4,5)),index=pd.date_range('1/1/2000', periods=4), columns=['a', 'b', 'c', 'd','e'])
panel.four
a b c d e
2000-01-01 1.0 1.0 1.0 1.0 1.0
2000-01-02 1.0 1.0 1.0 1.0 1.0
2000-01-03 1.0 1.0 1.0 1.0 1.0
2000-01-04 1.0 1.0 1.0 1.0 1.0
I am having trouble adding items or extending the major/minor axes in xarray:
px=panel.to_xarray()
#px gives me
<xarray.DataArray (items: 3, major_axis: 5, minor_axis: 4)>
array([[[-0.440081, -0.888226, 0.158702, 2.107577],
[ 0.917835, -0.174557, 0.501626, 0.116761],
[ 0.406988, 1.95184 , -1.345948, 2.960774],
[-1.905529, 0.25793 , 0.076162, 1.954012],
[ 0.499675, 1.87567 , -1.698771, -1.143766]],
[[ 0.070269, -1.151737, -0.344155, -0.506383],
[-2.199357, -0.040909, 0.491984, -0.333431],
[-0.113155, -0.668475, 2.366683, -0.421863],
[-0.567336, -0.302224, 1.638386, -0.038545],
[ 0.55067 , -0.409266, -0.27916 , -0.942144]],
[[ 1.269171, -0.151471, -0.664072, 0.269168],
[-0.486492, 0.59632 , -0.191977, 0.22537 ],
[ 0.069231, -0.345793, -0.450797, -2.982 ],
[-0.42338 , -0.849736, 0.965738, -0.544596],
[-1.455378, -0.256441, -1.204572, -0.347749]]])
Coordinates:
* items (items) object 'one' 'two' 'three'
* major_axis (major_axis) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...
* minor_axis (minor_axis) object 'a' 'b' 'c' 'd'
#how should I add a fourth item, increase/delete major axis, minor axis?
Assignment in xarray is not as elegant as with a pandas Panel. Suppose we want to add a fourth item to the DataArray above. Here is how it works:
four=xr.DataArray(np.ones((1,4,5)), coords=[['four'],pd.date_range('1/1/2000', periods=4),['a', 'b', 'c', 'd','e']],
dims=['items','major_axis','minor_axis'])
pxc=xr.concat([px,four],dim='items')
The same logic applies whether the operation is on items or on the major/minor axis. To delete, use drop:
pxc.drop(['four'], dim='items')
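The same pattern can be sketched for the other axes. Because pd.Panel is no longer available in pandas 1.0 and later, this sketch builds an equivalent DataArray directly instead of calling panel.to_xarray(). The names extra_day, longer and shorter are illustrative, and drop_sel is assumed to be available (it is the label-based drop method in recent xarray releases).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for panel.to_xarray(): 3 items x 4 dates x 5 columns
px = xr.DataArray(
    np.random.randn(3, 4, 5),
    coords=[['one', 'two', 'three'],
            pd.date_range('2000-01-01', periods=4),
            ['a', 'b', 'c', 'd', 'e']],
    dims=['items', 'major_axis', 'minor_axis'],
)

# Extend major_axis by one date: build a slab with length 1 along that
# dimension and matching labels on the other two, then concatenate.
extra_day = xr.DataArray(
    np.zeros((3, 1, 5)),
    coords=[px['items'].values,
            pd.date_range('2000-01-05', periods=1),
            px['minor_axis'].values],
    dims=['items', 'major_axis', 'minor_axis'],
)
longer = xr.concat([px, extra_day], dim='major_axis')

# Remove a label from minor_axis; this returns a new array as well.
shorter = longer.drop_sel(minor_axis=['e'])

print(px.shape, longer.shape, shorter.shape)
```

Both concat and drop_sel return new objects, so px itself keeps its original shape.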
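An xarray.DataArray is backed internally by a single NumPy array, so it cannot be efficiently resized or appended to. Your best option is to create a new, larger DataArray with xarray.concat.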
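Since every concat allocates a new array, one reasonable approach when several items have to be added is to collect the pieces in a list and concatenate once at the end. The following is a minimal sketch under that assumption; make_item is a hypothetical helper introduced here for illustration, not part of xarray.

```python
import numpy as np
import pandas as pd
import xarray as xr

dates = pd.date_range('2000-01-01', periods=4)
columns = ['a', 'b', 'c', 'd', 'e']

def make_item(name, values):
    """Hypothetical helper: wrap a 2-D array as a DataArray with one item label."""
    da = xr.DataArray(values, coords=[dates, columns],
                      dims=['major_axis', 'minor_axis'])
    # expand_dims adds a new leading dimension of length 1 labelled with `name`
    return da.expand_dims(items=[name])

# Build all pieces first, then concatenate a single time
names = ['one', 'two', 'three', 'four']
pieces = [make_item(name, np.full((4, 5), float(i)))
          for i, name in enumerate(names)]
combined = xr.concat(pieces, dim='items')

print(combined.dims, combined.shape)
```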
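If what you want is to add items as you would with a pd.Panel, the data structure you are probably looking for is xarray.Dataset. A Dataset is easiest to construct from a DataFrame with a MultiIndex (the equivalent of a Panel):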
# First, make a DataFrame with a MultiIndex
>>> df = panel.to_frame()
>>> df.head()
one two three
major minor
2000-01-01 a 0.278958 0.676034 -1.544726
b -0.918150 -2.707339 -0.552987
c 0.023479 0.175528 -0.817556
d 1.798001 -0.142016 1.390834
e 0.256575 0.265369 -1.829766
# Now, convert the DataFrame with a MultiIndex to xarray
>>> ds = df.to_xarray()
>>> ds
<xarray.Dataset>
Dimensions: (major: 4, minor: 5)
Coordinates:
* major (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
* minor (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
one (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
two (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
three (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...
# You can assign a DataFrame if it has the right column/index names
>>> ds['four'] = pd.DataFrame(np.ones((4,5)),
... index=pd.date_range('1/1/2000', periods=4, name='major'),
... columns=pd.Index(['a', 'b', 'c', 'd', 'e'], name='minor'))
# or just pass a tuple directly:
>>> ds['five'] = (('major', 'minor'), np.zeros((4, 5)))
>>> ds
<xarray.Dataset>
Dimensions: (major: 4, minor: 5)
Coordinates:
* major (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
* minor (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
one (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
two (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
three (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...
four (major, minor) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
five (major, minor) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
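For readers on a pandas version without pd.Panel, a comparable Dataset can be built directly. The sketch below is an assumption-laden variant of the session above: it constructs the Dataset from a dict, adds and removes a variable, and stacks the variables back into a single DataArray with to_array. The dimension name items passed to to_array is chosen here to mirror the Panel terminology.

```python
import numpy as np
import pandas as pd
import xarray as xr

major = pd.date_range('2000-01-01', periods=4, name='major')
minor = pd.Index(['a', 'b', 'c', 'd', 'e'], name='minor')

# One data variable per former Panel item
ds = xr.Dataset(
    {name: (('major', 'minor'), np.random.randn(4, 5))
     for name in ['one', 'two', 'three']},
    coords={'major': major, 'minor': minor},
)

# Adding an item is plain dict-style assignment
ds['four'] = (('major', 'minor'), np.ones((4, 5)))

# Removing an item returns a new Dataset
ds = ds.drop_vars('two')

# Stack the variables into a 3-D DataArray when a single array is needed
stacked = ds.to_array(dim='items')
print(list(ds.data_vars), stacked.dims, stacked.shape)
```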
For more information on transitioning from pandas.Panel to xarray, read the section of the xarray documentation that covers this transition.