Python: adding a field to a DataFrame with MultiIndex columns (tags: python, dataframe, pandas, time-series, multi-index)

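I have been searching for an answer to this question, since it seems fairly simple, but I have not been able to find one. Apologies if I have missed something. I am using pandas 0.10.0 and have been experimenting with data of the following form: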

import pandas
import numpy as np
import datetime
start_date = datetime.datetime(2009,3,1,6,29,59)
r = pandas.date_range(start_date, periods=12)
cols_1 = ['AAPL', 'AAPL', 'GOOG', 'GOOG', 'GS', 'GS']
cols_2 = ['close', 'rate', 'close', 'rate', 'close', 'rate']
dat = np.random.randn(12, 6)
cols = pandas.MultiIndex.from_arrays([cols_1, cols_2], names=['ticker','field'])
dftst = pandas.DataFrame(dat, columns=cols, index=r)
print dftst



ticker                   AAPL                GOOG                  GS          
field                   close      rate     close      rate     close      rate
2009-03-01 06:29:59  1.956255 -2.074371 -0.200568  0.759772 -0.951543  0.514577
2009-03-02 06:29:59  0.069611 -2.684352 -0.310006  0.730205 -0.302949 -0.830452
2009-03-03 06:29:59  2.077130 -0.903784  0.449857 -1.357464 -0.469572 -0.008757
2009-03-04 06:29:59  1.585358 -2.063672  0.600889 -1.741606 -0.299875  0.565253
2009-03-05 06:29:59  0.269123  0.226593  1.132663  0.485035  0.796858 -0.423112
2009-03-06 06:29:59  0.094879 -1.040069  0.613450 -0.175266 -0.065172  3.374658
2009-03-07 06:29:59 -1.255167 -0.326474  0.437053 -0.231594  0.437703 -0.256811
2009-03-08 06:29:59  0.115454 -1.096841 -1.189211 -0.208098 -0.807860  0.158198
2009-03-09 06:29:59  2.142816  0.173878 -0.160932  0.367309 -0.449765 -0.325400
2009-03-10 06:29:59  0.470669 -0.346805  1.152648  0.844632  1.031602 -0.012502
2009-03-11 06:29:59 -1.366954  0.452177  0.010713 -1.331553  0.226781  0.456900
2009-03-12 06:29:59  2.182409  0.890023 -0.627318 -1.516574 -1.565416 -0.694320
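As you can see, I am trying to represent 3D time-series data, so I have a time-series index and MultiIndex columns. I am fairly comfortable slicing the data. If I only want a trailing mean of the close data, I can do the following: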

pandas.rolling_mean(dftst.ix[:,::2], 5)


ticker                   AAPL      GOOG        GS
field                   close     close     close
2009-03-01 06:29:59       NaN       NaN       NaN
2009-03-02 06:29:59       NaN       NaN       NaN
2009-03-03 06:29:59       NaN       NaN       NaN
2009-03-04 06:29:59       NaN       NaN       NaN
2009-03-05 06:29:59  0.410966 -0.412356  0.722951
2009-03-06 06:29:59 -0.103187 -0.497165  0.137731
2009-03-07 06:29:59  0.000194 -0.645375 -0.298504
2009-03-08 06:29:59 -0.074036 -0.541717 -0.035906
2009-03-09 06:29:59 -0.391863 -0.671918 -0.554380
2009-03-10 06:29:59 -0.336397 -0.411845 -0.992615
2009-03-11 06:29:59 -0.251645 -0.289512 -0.458246
2009-03-12 06:29:59 -0.138925  0.244572 -0.230743
What I cannot do is create a new field, say avg_close, and assign to it. Ideally, I would like to do something like the following:

dftst[:,'avg_close'] = pandas.rolling_mean(dftst.ix[:,::2], 5)

Even if I swap the levels of the MultiIndex, I cannot make it work:

dftst = dftst.swaplevel(1,0,axis=1)
print dftst['close']

ticker                   AAPL      GOOG        GS
2009-03-01 06:29:59  1.178557 -0.505672 -0.336645
2009-03-02 06:29:59  0.234305  0.581429 -0.232252
2009-03-03 06:29:59 -0.734798  0.117810  1.658418
2009-03-04 06:29:59 -1.555033 -0.298322  0.127408
2009-03-05 06:29:59  0.244102 -1.030041 -0.562039
2009-03-06 06:29:59 -0.297454  1.150564 -1.930883
2009-03-07 06:29:59  0.818910 -0.905296  1.219946
2009-03-08 06:29:59  0.586816  0.965242  0.928546
2009-03-09 06:29:59 -0.357693  0.071455  0.072956
2009-03-10 06:29:59  0.651803 -0.685937  0.805779
2009-03-11 06:29:59  0.569802 -0.062447 -1.349261
2009-03-12 06:29:59 -1.886335  0.205778 -0.864273

dftst['avg_close'] = pandas.rolling_mean(dftst['close'], 3)


----> 1 dftst['avg_close'] = pandas.rolling_mean(dftst['close'], 3)

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in
__setitem__(self, key, value)    2041         else:    2042             # set column

-> 2043             self._set_item(key, value)    2044     2045     def _boolean_set(self, key, value):

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in
_set_item(self, key, value)    2077         """    2078         value = self._sanitize_column(key, value)
-> 2079         NDFrame._set_item(self, key, value)    2080     2081     def insert(self, loc, column, value):

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in
_set_item(self, key, value)
    544 
    545     def _set_item(self, key, value):
--> 546         self._data.set(key, value)
    547         self._clear_item_cache()
    548 

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in set(self, item, value)
    951         except KeyError:
    952             # insert at end

--> 953             self.insert(len(self.items), item, value)
    954 
    955         self._known_consolidated = False

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in insert(self, loc, item, value)
    963 
    964         # new block

--> 965         self._add_new_block(item, value, loc=loc)
    966 
    967         if len(self.blocks) > 100:

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in
_add_new_block(self, item, value, loc)
    992             loc = self.items.get_loc(item)
    993         new_block = make_block(value, self.items[loc:loc+1].copy(),
--> 994                                self.items)
    995         self.blocks.append(new_block)
    996 

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in make_block(values, items, ref_items)
    463         klass = ObjectBlock
    464 
--> 465     return klass(values, items, ref_items, ndim=values.ndim)
    466 
    467 # TODO: flexible with index=None and/or items=None


/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in
__init__(self, values, items, ref_items, ndim)
     30         if len(items) != len(values):
     31             raise AssertionError('Wrong number of items passed (%d vs %d)'
---> 32                                  % (len(items), len(values)))
     33 
     34         self._ref_locs = None

AssertionError: Wrong number of items passed (1 vs 3)
If my columns are not a MultiIndex, I can assign by doing the following:

start_date = datetime.datetime(2009,3,1,6,29,59)
r = pandas.date_range(start_date, periods=12)
cols = ['AAPL', 'GOOG', 'GS']
dat = np.random.randn(12, 3)
dftst2 = pandas.DataFrame(dat, columns=cols, index=r)
print dftst2

                         AAPL      GOOG        GS
2009-03-01 06:29:59  2.476787  2.386037 -0.777566
2009-03-02 06:29:59 -0.820647  1.006159 -0.590240
2009-03-03 06:29:59  0.433960  0.104458  0.282641
2009-03-04 06:29:59  0.300190 -0.300786 -1.780412
2009-03-05 06:29:59 -0.247919  1.616572  1.145594
2009-03-06 06:29:59 -0.779130  0.695256  0.845819
2009-03-07 06:29:59  0.572073  0.349394 -3.557776
2009-03-08 06:29:59  2.019885  0.358346  1.350812
2009-03-09 06:29:59  0.472328 -0.334223 -0.605862
2009-03-10 06:29:59 -1.570479  0.410808  0.616515
2009-03-11 06:29:59  1.177562 -0.240396 -2.126951
2009-03-12 06:29:59  0.311566 -1.743213  0.382617
To add a field based on another field, I can do the following:

dftst2['GOOG_avg'] = pandas.rolling_mean(dftst2['GOOG'], 3)
print dftst2


                         AAPL      GOOG        GS  GOOG_avg
2009-03-01 06:29:59  2.476787  2.386037 -0.777566       NaN
2009-03-02 06:29:59 -0.820647  1.006159 -0.590240       NaN
2009-03-03 06:29:59  0.433960  0.104458  0.282641  1.165551
2009-03-04 06:29:59  0.300190 -0.300786 -1.780412  0.269944
2009-03-05 06:29:59 -0.247919  1.616572  1.145594  0.473415
2009-03-06 06:29:59 -0.779130  0.695256  0.845819  0.670347
2009-03-07 06:29:59  0.572073  0.349394 -3.557776  0.887074
2009-03-08 06:29:59  2.019885  0.358346  1.350812  0.467666
2009-03-09 06:29:59  0.472328 -0.334223 -0.605862  0.124506
2009-03-10 06:29:59 -1.570479  0.410808  0.616515  0.144977
2009-03-11 06:29:59  1.177562 -0.240396 -2.126951 -0.054604
2009-03-12 06:29:59  0.311566 -1.743213  0.382617 -0.524267

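I have tried using a Panel object, but so far I have not found a quick way to add a field when I have MultiIndex columns; ideally, the other level of the columns would be broadcast. Apologies if another post already answers this. Any suggestions would be much appreciated.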

I don't know how to do the broadcasting you want, but for a plain assignment this should do it:

dftst[(('GOOG', 'avg_close'))] = 7 
More specifically, but still without broadcasting:

for tic in cols_1:
   dftst[(tic, 'avg_close')] = pandas.rolling_mean(dftst[(tic, 'close')],5) 
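On current pandas releases, where the module-level pandas.rolling_mean function is no longer available, the same per-ticker loop might look like the sketch below. The seeded random data and the name df are assumptions for illustration, standing in for dftst from the question.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for dftst (seeded so the example is repeatable).
index = pd.date_range("2009-03-01 06:29:59", periods=12)
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
df = pd.DataFrame(
    np.random.default_rng(0).standard_normal((12, 6)), index=index, columns=columns
)

# Assign one (ticker, field) tuple at a time; no broadcasting is involved.
for tic in df.columns.get_level_values("ticker").unique():
    df[(tic, "avg_close")] = df[(tic, "close")].rolling(5).mean()

# New columns are appended at the end; sort to group them by ticker again.
df = df.sort_index(axis=1)
```

Each assignment targets a full column key, so pandas only ever has to place a single Series, which is why this form avoids the "Wrong number of items passed" error shown in the question.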

For this particular problem, using a Panel object seems to work. I did the following (referring to dftst from my original post):

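pn = dftst.to_panel()
print pn

Out[83]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 12 (items) x 3 (major_axis) x 2 (minor_axis)
Items axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
Major_axis axis: AAPL to GS
Minor_axis axis: close to rate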
If I then move ('close', 'rate') to the items axis by doing the following:

pn = pn.transpose(2,0,1)
print pn

Out[91]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 12 (major_axis) x 3 (minor_axis)
Items axis: close to rate
Major_axis axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
Minor_axis axis: AAPL to GS
Now I can perform a time-series operation and add the result as a field in the Panel object:

pn['avg_close'] = pandas.rolling_mean(pn['close'], 5)
print pn

Out[93]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 12 (major_axis) x 3 (minor_axis)
Items axis: close to avg_close
Major_axis axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
Minor_axis axis: AAPL to GS

print pn['avg_close']

Out[94]: 
ticker                   AAPL      GOOG        GS
2009-03-01 06:29:59       NaN       NaN       NaN
2009-03-02 06:29:59       NaN       NaN       NaN
2009-03-03 06:29:59       NaN       NaN       NaN
2009-03-04 06:29:59       NaN       NaN       NaN
2009-03-05 06:29:59  0.303719 -0.129300 -0.037954
2009-03-06 06:29:59 -0.006839  0.206331  0.336467
2009-03-07 06:29:59  0.128299  0.174935  0.698275
2009-03-08 06:29:59  0.471010 -0.137343  0.671049
2009-03-09 06:29:59 -0.279855 -0.033427  0.848610
2009-03-10 06:29:59 -0.516032  0.260944  0.373046
2009-03-11 06:29:59 -0.456213  0.164710  0.910448
2009-03-12 06:29:59 -0.799156  0.544132  0.862764

I actually have some other questions about Panel objects, but I will save them for another post.
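Panel was removed from pandas in version 0.25, so the steps above no longer run on current releases. A rough equivalent is to keep the field level outermost in the columns, so that selecting a field yields a plain dates-by-tickers frame, much like a Panel item. The sketch below assumes seeded random data and the hypothetical name by_field; it is one possible layout, not the approach from the original post.

```python
import numpy as np
import pandas as pd

# Field is the outer column level, ticker the inner one (illustrative data).
index = pd.date_range("2009-03-01 06:29:59", periods=12)
tickers = ["AAPL", "GOOG", "GS"]
columns = pd.MultiIndex.from_product(
    [["close", "rate"], tickers], names=["field", "ticker"]
)
by_field = pd.DataFrame(
    np.random.default_rng(0).standard_normal((12, 6)), index=index, columns=columns
)

# by_field["close"] is an ordinary 12 x 3 frame, like pn['close'] above.
avg_close = by_field["close"].rolling(5).mean()

# Label the result with the new outer key and append it as a block of columns.
avg_close.columns = pd.MultiIndex.from_product(
    [["avg_close"], avg_close.columns], names=["field", "ticker"]
)
by_field = pd.concat([by_field, avg_close], axis=1)
```

After this, by_field["avg_close"] returns the rolling means for all tickers at once, which is the broadcasting behaviour the question was after.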

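You can also do this as a workaround, since there is no real API for what you want. If you don't want to use a Panel, consider the following approach. I wouldn't recommend it on huge data sets, though; use a Panel for those.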

In [30]: df = dftst.stack(0)

In [31]: df['close_avg'] = pd.rolling_mean(df.close.unstack(), 5).stack()

In [32]: df
Out[32]: 
field                          close      rate  close_avg
                    ticker                               
2009-03-01 06:29:59 AAPL   -0.223042  0.554996        NaN
                    GOOG    0.060127 -0.333992        NaN
                    GS      0.117626 -1.256790        NaN
2009-03-02 06:29:59 AAPL   -0.513743 -0.402661        NaN
                    GOOG    0.059828 -0.125288        NaN
                    GS     -0.336196 -0.510595        NaN
2009-03-03 06:29:59 AAPL    0.142202 -1.038470        NaN
                    GOOG   -1.099251 -0.892581        NaN
                    GS      1.698086  0.885023        NaN
2009-03-04 06:29:59 AAPL   -1.125821  0.413005        NaN
                    GOOG    0.424290  1.106983        NaN
                    GS      0.047158  0.680714        NaN
2009-03-05 06:29:59 AAPL    0.470050  1.845354  -0.250071
                    GOOG    0.132956 -0.488800  -0.084410
                    GS      0.129190  0.208077   0.331173
2009-03-06 06:29:59 AAPL   -0.087360 -2.102512  -0.222934
                    GOOG    0.165100 -0.134886  -0.063415
                    GS      0.167720  0.082480   0.341192
2009-03-07 06:29:59 AAPL   -0.768542 -0.176076  -0.273894
                    GOOG    0.417694  2.257074   0.008158
                    GS     -1.744730 -1.850185   0.059485
2009-03-08 06:29:59 AAPL   -0.297363 -0.633828  -0.361807
                    GOOG   -1.096703 -0.572138   0.008667
                    GS      0.890016 -2.621563  -0.102129
2009-03-09 06:29:59 AAPL    1.038579  0.053330   0.071073
                    GOOG   -0.614050  0.607944  -0.199001
                    GS     -0.882848  0.596801  -0.288130
2009-03-10 06:29:59 AAPL   -0.255226  0.058178  -0.073982
                    GOOG    1.761861  1.841751   0.126780
                    GS     -0.549998 -1.551281  -0.423968
2009-03-11 06:29:59 AAPL    0.413522  0.149089   0.026194
                    GOOG   -2.964163  1.825312  -0.499072
                    GS     -0.373303  1.137001  -0.532173
2009-03-12 06:29:59 AAPL   -0.924776  1.238546  -0.005053
                    GOOG   -0.985956 -0.906590  -0.779802
                    GS     -0.320400  1.239681  -0.247307
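The same stack/unstack round trip can be written with the .rolling() accessor on newer pandas. This is a sketch under assumptions: the data is seeded random data standing in for dftst, the names long_df and wide_close are made up for illustration, and some pandas 2.x releases may emit a FutureWarning from stack.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for dftst.
index = pd.date_range("2009-03-01 06:29:59", periods=12)
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
df = pd.DataFrame(
    np.random.default_rng(0).standard_normal((12, 6)), index=index, columns=columns
)

# Move the ticker level from the columns into the row index (long format).
long_df = df.stack(level="ticker")

# Go back to wide for the rolling window, then to long again for assignment.
wide_close = long_df["close"].unstack(level="ticker")
long_df["close_avg"] = wide_close.rolling(5).mean().stack()
```

The assignment aligns on the (timestamp, ticker) row index, so rows missing from the stacked result simply stay NaN.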

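This question is a third of a decade old, but I had the same problem. Here is a simple way to do what you want. pandas 0.18 has since been released, so the rolling mean is written a bit differently now, but you get the idea: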

avg_close = dftst.xs('close', axis=1, level=1).rolling(5).mean()   
dftst[zip(avg_close.columns, ['avg_close']*len(avg_close.columns))] = avg_close

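Thanks to this post, I worked out a way to do it with a Panel object. However, there seem to be a few key things I cannot accomplish with Panel objects. I will ask some Panel-specific questions in another post. Thanks again!

A third of a decade! Do you mean the other answers no longer work? (I think using zip like this may not work in Python 3. I would have thought you could just use
dftst[avg_close.columns,'avg_close']=avg_close
or something along those lines?)

@Andy Hayden Python 3's zip is a bit different; you can use
list(zip(avg_close.columns,['avg_close']*len(avg_close.columns)))
rolling_mean has been deprecated in pandas and will stop working sooner or later.

Ah, I see.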
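To make the zip discussion concrete, here is a sketch for Python 3 that materialises the zipped pairs with list() and uses them as column labels. Rather than relying on assignment with a list of new column keys, whose behaviour may differ between pandas versions, it attaches the new columns with pd.concat; that substitution, the seeded data, and the names df, new_keys, and result are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for dftst.
index = pd.date_range("2009-03-01 06:29:59", periods=12)
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
df = pd.DataFrame(
    np.random.default_rng(0).standard_normal((12, 6)), index=index, columns=columns
)

# Cross-section of every ticker's close column, then a 5-row rolling mean.
avg_close = df.xs("close", axis=1, level="field").rolling(5).mean()

# In Python 3, zip is lazy, so build the (ticker, 'avg_close') pairs explicitly.
new_keys = list(zip(avg_close.columns, ["avg_close"] * len(avg_close.columns)))
avg_close.columns = pd.MultiIndex.from_tuples(new_keys, names=df.columns.names)

# Attach the new columns and regroup them by ticker.
result = pd.concat([df, avg_close], axis=1).sort_index(axis=1)
```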