Python: How can I process a DataFrame without a for loop?
My DataFrame is:
Date Open High Low Close Adj Close Volume
5932 2016-08-18 218.339996 218.899994 218.210007 218.860001 207.483215 52989300
5933 2016-08-19 218.309998 218.750000 217.740005 218.539993 207.179825 75443000
5934 2016-08-22 218.259995 218.800003 217.830002 218.529999 207.170364 61368800
5935 2016-08-23 219.250000 219.600006 218.899994 218.970001 207.587479 53399200
5936 2016-08-24 218.800003 218.910004 217.360001 217.850006 206.525711 71728900
5937 2016-08-25 217.399994 218.190002 217.220001 217.699997 206.383514 69224800
5938 2016-08-26 217.919998 219.119995 216.250000 217.289993 205.994827 122506300
5939 2016-08-29 217.440002 218.669998 217.399994 218.360001 207.009201 68606100
5940 2016-08-30 218.259995 218.589996 217.350006 218.000000 206.667908 58114500
5941 2016-08-31 217.610001 217.750000 216.470001 217.380005 206.080124 85269500
5942 2016-09-01 217.369995 217.729996 216.029999 217.389999 206.089645 97844200
5943 2016-09-02 218.389999 218.869995 217.699997 218.369995 207.018692 79293900
5944 2016-09-06 218.699997 219.119995 217.860001 219.029999 207.644394 56702100
5945 2016-09-07 218.839996 219.220001 218.300003 219.009995 207.625412 76554900
5946 2016-09-08 218.619995 218.940002 218.149994 218.509995 207.151398 73011600
5947 2016-09-09 216.970001 217.029999 213.250000 213.279999 202.193268 221589100
5948 2016-09-12 212.389999 216.809998 212.309998 216.339996 205.094223 168110900
5949 2016-09-13 214.839996 215.149994 212.500000 213.229996 202.145859 182828800
5950 2016-09-14 213.289993 214.699997 212.500000 213.149994 202.070023 134185500
5951 2016-09-15 212.960007 215.729996 212.750000 215.279999 204.089294 134427900
5952 2016-09-16 213.479996 213.690002 212.570007 213.369995 203.300430 155236400
Currently, I am doing this:
# Most recent open price (a scalar; get_values() was removed in pandas 1.0)
state['open_price'] = lookback.Open.iloc[-1]
# Walk the lookback window and store one entry per row and column
for ind, row in lookback.reset_index().iterrows():
    if ind < self.LOOKBACK_DAYS:
        state['close_' + str(self.LOOKBACK_DAYS - ind)] = row.Close
        state['open_' + str(self.LOOKBACK_DAYS - ind)] = row.Open
        state['volume_' + str(self.LOOKBACK_DAYS - ind)] = row.Volume
One approach is to use .values.
I'll also add some steps for creating an equivalent example:
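To make the idea concrete, here is a minimal sketch of what .values returns and how flattening orders the result. The three-row frame and its numbers are made up for illustration and are not the question's data.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the question's data.
df = pd.DataFrame(
    {"Open": [1.0, 2.0, 3.0], "Close": [1.5, 2.5, 3.5], "Volume": [10, 20, 30]}
)

# Selecting columns and taking .values yields a 2D NumPy array:
# one array row per DataFrame row, columns in the order selected.
arr = df[["Close", "Open", "Volume"]].values
print(arr.shape)

# flatten() reads the array row by row, so the result is
# Close, Open, Volume of row 0, then Close, Open, Volume of row 1, and so on.
flat = arr.flatten()
print(flat)
```

Because the selection mixes float and integer columns, the array is upcast to a single float dtype, which is why Volume shows up as a float in the final series.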
import pandas as pd
from itertools import product
initial = ['cash', 'num_shares', 'somethingsomething']
initial_series = pd.Series([1, 2, 3], index = initial)
print(initial_series)
#Output:
cash 1
num_shares 2
somethingsomething 3
dtype: int64
OK, those are just some mock values for the entries at the start of the series in your output.
df = pd.read_clipboard(sep=r'\s\s+') #pure magic
print(df.head())
#Output:
Date Open ... Adj Close Volume
5932 2016-08-18 218.339996 ... 207.483215 52989300
5933 2016-08-19 218.309998 ... 207.179825 75443000
5934 2016-08-22 218.259995 ... 207.170364 61368800
5935 2016-08-23 219.250000 ... 207.587479 53399200
5936 2016-08-24 218.800003 ... 206.525711 71728900
[5 rows x 7 columns]
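df is now essentially the DataFrame you provided in your example. The clipboard trick comes from a post about pandas that is a good read on writing pandas MCVEs.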
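If you would rather not depend on the clipboard, a roughly equivalent sketch is to parse the same text from a string. This assumes the columns are separated by two or more spaces (so that "Adj Close" survives as one column); the variable name raw and the choice of three rows are mine.

```python
import io

import pandas as pd

# First three rows of the question's data, with fields separated by two spaces.
raw = """\
Date  Open  High  Low  Close  Adj Close  Volume
5932  2016-08-18  218.339996  218.899994  218.210007  218.860001  207.483215  52989300
5933  2016-08-19  218.309998  218.750000  217.740005  218.539993  207.179825  75443000
5934  2016-08-22  218.259995  218.800003  217.830002  218.529999  207.170364  61368800
"""

# A regex separator requires the python engine. The header has one field fewer
# than the data rows, so pandas uses the first data column as the index.
df = pd.read_csv(io.StringIO(raw), sep=r"\s\s+", engine="python")
print(df.head())
```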
to_select = ['Close', 'Open', 'Volume']
SOMELOOKBACK = 6000 #mocked
final_index = [f"{name}_{index}" for index, name in product((SOMELOOKBACK - df.index), to_select)]
This prepares the index, which looks like this:
['Close_68',
'Open_68',
'Volume_68',
'Close_67',
'Open_67',
'Volume_67',
...
]
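The ordering of those labels is what makes the flattening step line up, so here is a small sketch of it in isolation. The two row labels are taken from the question's index; the names idx, offset and labels are just for illustration.

```python
from itertools import product

import pandas as pd

# Two row labels from the question's frame and the mocked lookback constant.
idx = pd.Index([5932, 5933])
SOMELOOKBACK = 6000
to_select = ["Close", "Open", "Volume"]

# product() varies its last argument fastest, so all three column names are
# emitted for one row offset before it moves on to the next row.
labels = [f"{name}_{offset}" for offset, name in product(SOMELOOKBACK - idx, to_select)]
print(labels)
```

That row-by-row order is the same order in which flatten() reads a 2D array, so label and value end up paired correctly.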
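Now just select the relevant columns from the DataFrame, use .values to get a 2D array, then flatten it to get the final series:
final_series = pd.Series(df[to_select].values.flatten(), index=final_index)
# Series.append was removed in pandas 2.0; pd.concat does the same job
result = pd.concat([initial_series, final_series])
print(result)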
#Output:
cash 1.000000e+00
num_shares 2.000000e+00
somethingsomething 3.000000e+00
Close_68 2.188600e+02
Open_68 2.183400e+02
Volume_68 5.298930e+07
Close_67 2.185400e+02
Open_67 2.183100e+02
Volume_67 7.544300e+07
Close_66 2.185300e+02
Open_66 2.182600e+02
Volume_66 6.136880e+07
...
Close_48 2.133700e+02
Open_48 2.134800e+02
Volume_48 1.552364e+08
Length: 66, dtype: float64
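As a sanity check, the sketch below runs the question's loop and a vectorized version side by side on a tiny made-up frame. Note that it numbers entries by row position, as the question's loop does, rather than by SOMELOOKBACK - df.index as above. LOOKBACK_DAYS = 3, the rounded prices, and the names window, offsets and state_vec are assumptions for the example.

```python
from itertools import product

import pandas as pd

# Hypothetical miniature of the question's data.
lookback = pd.DataFrame(
    {
        "Open": [218.34, 218.31, 218.26],
        "Close": [218.86, 218.54, 218.53],
        "Volume": [52989300, 75443000, 61368800],
    },
    index=[5932, 5933, 5934],
)
LOOKBACK_DAYS = 3  # assumed value for this example

# Loop version, following the question's code.
state_loop = {}
for ind, row in lookback.reset_index().iterrows():
    if ind < LOOKBACK_DAYS:
        state_loop[f"close_{LOOKBACK_DAYS - ind}"] = row.Close
        state_loop[f"open_{LOOKBACK_DAYS - ind}"] = row.Open
        state_loop[f"volume_{LOOKBACK_DAYS - ind}"] = row.Volume

# Vectorized version: keep the same rows the loop would visit,
# build the labels once, then flatten the values row by row.
cols = ["Close", "Open", "Volume"]
window = lookback.iloc[:LOOKBACK_DAYS]
offsets = range(LOOKBACK_DAYS, LOOKBACK_DAYS - len(window), -1)
labels = [f"{name.lower()}_{k}" for k, name in product(offsets, cols)]
state_vec = pd.Series(window[cols].to_numpy().flatten(), index=labels)

print(state_vec)
```

Both versions store Volume as a float here, because iterrows() and the flattened array each upcast the mixed columns to one dtype.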
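Please define the word "transform".
"Transform" is the wrong word. I want to convert it into a Series?
It would help to understand what the code is trying to do and what the expected output is.
@coldspeed Updated the post to reflect the expected output.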