Python: getting the previous value in a DataFrame

python pandas dataframe

Suppose I have data like this:

id   X  Y  Z
-----------------
0    1  2  10
0    1  2  20
0    1  3  30
0    1  4  40
0    2  2  50
0    2  2  60
0    2  2  70
0    2  3  80
0    2  3  90
0    2  3  100
0    2  3  110
0    2  4  120

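For each (X, Y) pair, I want to compute the previous value of Z and an "index" (the row's position within that pair). The final result should look like this: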

id   X  Y  Z    Z_previous   Z_index
---------------------------------------
0    1  2  10       0          0
0    1  2  20      10          1
0    1  3  30       0          0
0    1  4  40       0          0
0    2  2  50       0          0
0    2  2  60      50          1
0    2  2  70      60          2
0    2  3  80       0          0
0    2  3  90      80          1      
0    2  3  100     90          2
0    2  3  110    100          3
0    2  4  120      0          0

So I created 3 new columns using shift:

# value from the row directly above, for each column
df['Z_previous'] = df.Z.shift(1)
df['X_previous'] = df.X.shift(1)
df['Y_previous'] = df.Y.shift(1)

Now I want to do something like this:

if X != X_previous or Y != Y_previous:
    Z_previous = 0

I don't know how to implement this with a DataFrame.

Is there a better way?

You can group by (X, Y) and shift within each group:

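import pandas as pd

# work on a copy so the original df is left untouched
df2 = df.copy()

# 1-based row position within each (X, Y) group
df2['index'] = df2.groupby(['X', 'Y']).cumcount() + 1

# shift Z and the position within each group; the first row of a group
# has no predecessor, so its NaN is filled with 0
xf = (df2.groupby(['X', 'Y'])[['Z', 'index']].shift()
         .fillna(0)
         .rename(columns={'Z': 'Z_previous', 'index': 'Z_index'}))

# join the result
df2 = pd.concat([df2.drop(columns='index'), xf], axis=1)

print(df2)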

    id  X  Y    Z  Z_previous  Z_index
0    0  1  2   10         0.0      0.0
1    0  1  2   20        10.0      1.0
2    0  1  3   30         0.0      0.0
3    0  1  4   40         0.0      0.0
4    0  2  2   50         0.0      0.0
5    0  2  2   60        50.0      1.0
6    0  2  2   70        60.0      2.0
7    0  2  3   80         0.0      0.0
8    0  2  3   90        80.0      1.0
9    0  2  3  100        90.0      2.0
10   0  2  3  110       100.0      3.0
11   0  2  4  120         0.0      0.0
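
A shorter variant of the same idea, as a sketch rather than the answer's own code: it rebuilds the sample data from the question and assumes a pandas version in which the group-wise `shift` accepts `fill_value`. Passing `fill_value=0` avoids the intermediate NaN, so both new columns stay integers instead of becoming floats as in the output above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [0] * 12,
    "X": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2],
    "Y": [2, 2, 3, 4, 2, 2, 2, 3, 3, 3, 3, 4],
    "Z": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
})

groups = df.groupby(["X", "Y"])

# Previous Z inside each (X, Y) group; fill_value=0 covers the first row
# of every group and keeps the column as integers.
z_previous = groups["Z"].shift(1, fill_value=0)

# 0-based position of each row inside its group.
z_index = groups.cumcount()

df = df.assign(Z_previous=z_previous, Z_index=z_index)
print(df)
```

Note that `groupby` puts all rows sharing an (X, Y) pair into one group even when they are not adjacent, whereas the row-by-row comparison sketched in the question resets whenever X or Y differs from the row directly above. On the sample data the two give the same result because each pair occupies one contiguous run.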