Python: how do I group data by id and find the difference between adjacent rows?

python pandas math dataframe

I have the following data:

id    starting_point      ending_point
A        2525                 6565
B        5656                 8989
A        1234                 5656
A        4562                 6245
B        6496                 9999
B        1122                 2211

The same data as code:

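import pandas as pd
df = pd.DataFrame()
df['id'] = ['A', 'B', 'A', 'A', 'B', 'B']
df['starting_point'] = [2525, 5656, 1234, 4562, 6496, 1122]  # integers, so sums and differences are numeric
df['ending_point'] = [6565, 8989, 5656, 6245, 9999, 2211]    # last value is 2211, matching the table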

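I want to write Python code that groups the rows by id (A, B, ...) and, for each id, finds the difference between the sum of its first and second ending points and the sum of its second and third ending points. For A this is (6565 + 5656) - (5656 + 6245).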

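IIUC, you can skip the second ending point, because it cancels out:

(6565 + 5656) - (5656 + 6245)

is the same as

6565 - 6245

so within each group you can subtract the value two rows later using shift: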

In [15]: df.groupby('id')['ending_point'].apply(lambda x: x - x.shift(-2))
Out[15]:
0     320.0
1    6778.0
2       NaN
3       NaN
4       NaN
5       NaN
Name: ending_point, dtype: float64
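
The same shift idea can also be written with groupby(...).shift, which returns a result aligned to the original rows without going through apply. This is only a minimal sketch, assuming the integer data from the question; the names diff and out are illustrative and not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'A', 'B', 'B'],
    'ending_point': [6565, 8989, 5656, 6245, 9999, 2211],
})

# Within each id, subtract the value two rows later from the current value.
diff = df['ending_point'] - df.groupby('id')['ending_point'].shift(-2)

# Rows without a partner two rows later are NaN; drop them.
out = df.assign(diff=diff).dropna(subset=['diff'])
print(out[['id', 'diff']])
```

Only the first row of each three-row group survives the dropna, so out holds one difference per id.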

You can use groupby with a custom function that selects rows by position with iloc and takes the difference of the two sums:

df = df.groupby('id')['ending_point'] \
       .apply(lambda x: x.iloc[:2].sum() - x.iloc[1:4].sum()).reset_index()
print (df)
  id  ending_point
0  A           320
1  B          6778

Another possible solution:

group = df.groupby('id')
group['ending_point'].first() - group['ending_point'].last()
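
Note that first() minus last() equals the requested difference only when every group has exactly three rows. If a group may be longer, one option is to trim each group to its first three rows before applying the shortcut. The sketch below assumes that reading of the question; the extra fourth row for A is hypothetical and added only to show the effect.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'A', 'B', 'B', 'A'],
    # The final value (7000) is a hypothetical extra row for id A.
    'ending_point': [6565, 8989, 5656, 6245, 9999, 2211, 7000],
})

# Shortcut applied directly: uses the fourth A row as "last".
g_all = df.groupby('id')['ending_point']
naive = g_all.first() - g_all.last()

# Keep at most the first three rows per id, then apply the shortcut.
first3 = df.groupby('id').head(3)
g3 = first3.groupby('id')['ending_point']
guarded = g3.first() - g3.last()

print(naive)
print(guarded)
```

With the extra row, naive gives a different value for A, while guarded still returns the first-minus-third difference.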

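Comments:

Have you tried writing any actual code?

Thank you! Can it also be done using the column names?

I'm not sure I understand. Do you need

df = df.groupby('id')[['starting_point', 'ending_point']].apply(lambda x: x.iloc[:2].sum() - x.iloc[1:4].sum()).reset_index()

?

Sorry, that was my mistake! Thank you very much, it works!

No problem, have a nice day!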
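
For reference, the two-column variant discussed in the comments can be run end to end as follows. This is a sketch assuming the integer data from the question; cols and result are illustrative names, and the slice iloc[1:3] is used here to pick exactly the second and third rows.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'A', 'B', 'B'],
    'starting_point': [2525, 5656, 1234, 4562, 6496, 1122],
    'ending_point': [6565, 8989, 5656, 6245, 9999, 2211],
})

cols = ['starting_point', 'ending_point']

# Per id and per column: (row1 + row2) - (row2 + row3).
result = (
    df.groupby('id')[cols]
      .apply(lambda x: x.iloc[:2].sum() - x.iloc[1:3].sum())
      .reset_index()
)
print(result)
```

Selecting the columns with a list (double brackets) is what lets the same lambda run over both columns at once.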