How to use xarray like a pandas Panel when adding new items


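I have converted a pandas Panel to xarray, but I cannot add new items or extend the major and minor axes as easily as I could with the pandas Panel. The code is as follows: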

import numpy as np

import pandas as pd

import xarray as xr


panel = pd.Panel(np.random.randn(3, 4, 5), items=['one', 'two', 'three'], 
                 major_axis=pd.date_range('1/1/2000', periods=4),
                 minor_axis=['a', 'b', 'c', 'd','e'])
For example, if I want to add a new item, I can do:

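# Item assignment (rather than attribute assignment) is what registers a new item on the Panel
panel['four'] = pd.DataFrame(np.ones((4, 5)), index=pd.date_range('1/1/2000', periods=4), columns=['a', 'b', 'c', 'd', 'e'])

panel['four']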

            a   b   c   d   e
2000-01-01  1.0 1.0 1.0 1.0 1.0

2000-01-02  1.0 1.0 1.0 1.0 1.0

2000-01-03  1.0 1.0 1.0 1.0 1.0

2000-01-04  1.0 1.0 1.0 1.0 1.0
I am having trouble adding items or extending the major/minor axes in xarray:

px=panel.to_xarray()

#px gives me
<xarray.DataArray (items: 3, major_axis: 5, minor_axis: 4)>

array([[[-0.440081, -0.888226,  0.158702,  2.107577],
        [ 0.917835, -0.174557,  0.501626,  0.116761],
        [ 0.406988,  1.95184 , -1.345948,  2.960774],
        [-1.905529,  0.25793 ,  0.076162,  1.954012],
        [ 0.499675,  1.87567 , -1.698771, -1.143766]],


       [[ 0.070269, -1.151737, -0.344155, -0.506383],
        [-2.199357, -0.040909,  0.491984, -0.333431],
        [-0.113155, -0.668475,  2.366683, -0.421863],
        [-0.567336, -0.302224,  1.638386, -0.038545],
        [ 0.55067 , -0.409266, -0.27916 , -0.942144]],


       [[ 1.269171, -0.151471, -0.664072,  0.269168],
        [-0.486492,  0.59632 , -0.191977,  0.22537 ],
        [ 0.069231, -0.345793, -0.450797, -2.982   ],
        [-0.42338 , -0.849736,  0.965738, -0.544596],
        [-1.455378, -0.256441, -1.204572, -0.347749]]])

Coordinates:

  * items       (items) object 'one' 'two' 'three'

  * major_axis  (major_axis) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...

  * minor_axis  (minor_axis) object 'a' 'b' 'c' 'd'


#how should I add a fourth item, increase/delete major axis, minor axis?

Assignment in xarray is not as elegant as with a pandas Panel. Suppose we want to add a fourth item to the DataArray above. Here is how it works:

four=xr.DataArray(np.ones((1,4,5)), coords=[['four'],pd.date_range('1/1/2000', periods=4),['a', 'b', 'c', 'd','e']], 
                  dims=['items','major_axis','minor_axis'])

pxc=xr.concat([px,four],dim='items')
The same logic applies whether you are operating on items or on the major/minor axis. To delete, use:

pxc.drop(['four'], dim='items')
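
As a minimal sketch of the "same logic on the other axes" remark, the snippet below extends `major_axis` by one date and then removes a label from `minor_axis`. It builds the DataArray directly rather than through `pd.Panel` (which is no longer available in recent pandas releases), and it assumes a reasonably recent xarray where `drop_sel` is available as the label-based replacement for `drop`. The names `extra_day`, `longer` and `narrower` are illustrative only.

```python
import numpy as np
import pandas as pd
import xarray as xr

dims = ['items', 'major_axis', 'minor_axis']

# Stand-in for panel.to_xarray(): a 3 x 4 x 5 labelled array
px = xr.DataArray(
    np.random.randn(3, 4, 5),
    coords=[['one', 'two', 'three'],
            pd.date_range('2000-01-01', periods=4),
            ['a', 'b', 'c', 'd', 'e']],
    dims=dims,
)

# One extra date along major_axis, sharing the other two coordinates with px
extra_day = xr.DataArray(
    np.zeros((3, 1, 5)),
    coords=[px.coords['items'].values,
            pd.date_range('2000-01-05', periods=1),
            px.coords['minor_axis'].values],
    dims=dims,
)

# concat returns a new, larger array; px itself is left unchanged
longer = xr.concat([px, extra_day], dim='major_axis')

# Remove the label 'e' from minor_axis (assumes an xarray version with drop_sel)
narrower = longer.drop_sel(minor_axis=['e'])

print(longer.sizes)
print(narrower.sizes)
```

Each step allocates a new array, so for many small additions it is usually cheaper to collect the pieces in a list and call `xr.concat` once.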

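xarray.DataArray is fundamentally based on a single NumPy array internally, so it cannot be efficiently resized or appended to. Your best option is to create a new, larger DataArray with xarray.concat.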

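If you want to add items the way you would with a pd.Panel, the data structure you are probably looking for is xarray.Dataset. These are easiest to construct from the multi-indexed DataFrame equivalent of a Panel: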

# First, make a DataFrame with a MultiIndex
>>> df = panel.to_frame()

>>> df.head()
                       one       two     three
major      minor
2000-01-01 a      0.278958  0.676034 -1.544726
           b     -0.918150 -2.707339 -0.552987
           c      0.023479  0.175528 -0.817556
           d      1.798001 -0.142016  1.390834
           e      0.256575  0.265369 -1.829766

# Now, convert the DataFrame with a MultiIndex to xarray
>>> ds = df.to_xarray()

>>> ds
<xarray.Dataset>
Dimensions:  (major: 4, minor: 5)
Coordinates:
  * major    (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * minor    (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
    one      (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
    two      (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
    three    (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...

# You can assign a DataFrame if it has the right column/index names
>>> ds['four'] = pd.DataFrame(np.ones((4,5)),
...                           index=pd.date_range('1/1/2000', periods=4, name='major'),
...                           columns=pd.Index(['a', 'b', 'c', 'd', 'e'], name='minor'))

# or just pass a tuple directly:
>>> ds['five'] = (('major', 'minor'), np.zeros((4, 5)))

>>> ds
<xarray.Dataset>
Dimensions:  (major: 4, minor: 5)
Coordinates:
  * major    (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * minor    (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
    one      (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
    two      (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
    three    (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...
    four     (major, minor) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
    five     (major, minor) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
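
For readers without `pd.Panel` available, here is a hedged sketch of the same workflow built directly as an `xarray.Dataset`. It also shows two follow-up steps not covered above: removing a variable with `drop_vars`, and stacking the variables back into a single 3-D DataArray with `to_array`. The variable names and the `items` dimension name are chosen only to mirror the Panel example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# One 2-D variable per former Panel item, sharing the major/minor coordinates
ds = xr.Dataset(
    {name: (('major', 'minor'), np.random.randn(4, 5))
     for name in ['one', 'two', 'three']},
    coords={'major': pd.date_range('2000-01-01', periods=4),
            'minor': ['a', 'b', 'c', 'd', 'e']},
)

# Adding an item is plain dict-style assignment
ds['four'] = (('major', 'minor'), np.ones((4, 5)))

# Removing an item returns a new Dataset without that variable
ds = ds.drop_vars('one')

# Stack the remaining variables into one DataArray with a new 'items' dimension
stacked = ds.to_array(dim='items')

print(stacked.dims)
print(stacked.shape)
```

The Dataset form keeps each item as its own array, which is why adding or removing items does not require copying the others.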
For more information on transitioning from pandas.Panel to xarray, read this section of the xarray documentation: