Pandas: problem using diff with groupby when the index is not unique


pandas python


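I am using pandas (version 0.20.3) and I want to apply the diff() method together with groupby(), but instead of a DataFrame the result is an "underscore".

The code is as follows: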

import numpy as np
import pandas as pd

# creating the DataFrame
data = np.random.random(18).reshape(6,3)
indexes = ['B']*3 + ['A']*3
columns = ['x', 'y', 'z']
df = pd.DataFrame(data, index=indexes, columns=columns)
df.index.name = 'chain_id'

# Now I want to apply the diff method within each chain_id group
df.groupby('chain_id').diff()

The result is an underscore.

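Note that df.loc['A'].diff() and df.loc['B'].diff() do return the expected results, so I don't understand why it doesn't work with groupby().

IIUC, the error is: cannot reindex from a duplicate axis

That happens because you have a non-unique index! Your index contains duplicates: ['B']*3 + ['A']*3.

Resetting the index before grouping, and restoring it afterwards, works: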

df.reset_index().groupby('chain_id').diff().set_index(df.index)
Out[859]: 
                 x         y         z
chain_id                              
B              NaN       NaN       NaN
B        -0.468771  0.192558 -0.443570
B         0.323697  0.288441  0.441060
A              NaN       NaN       NaN
A        -0.198785  0.056766  0.081513
A         0.138780  0.563841  0.635097
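
The workaround above can also be written out step by step, which makes it easier to check. The following is only a minimal sketch, assuming a pandas version where selecting columns on a groupby object and calling diff() is available. The names rng, flat, result and expected are illustrative and not part of the original question. It first confirms that the index is not unique, then takes the diff on a frame whose row index is unique, and finally compares the outcome with the per-label diffs that the question reports as working.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question: a non-unique 'chain_id' index.
rng = np.random.default_rng(0)  # seeded only so the run is repeatable
df = pd.DataFrame(
    rng.random((6, 3)),
    index=pd.Index(['B'] * 3 + ['A'] * 3, name='chain_id'),
    columns=['x', 'y', 'z'],
)

# The root cause named in the answer: the index labels repeat.
print(df.index.is_unique)  # False

# Move chain_id into a column so that the row index (0..5) is unique,
# take the diff within each group, then put the original labels back.
flat = df.reset_index()
result = flat.groupby('chain_id')[['x', 'y', 'z']].diff()
result.index = df.index

# Compare with the per-label diffs from the question.
expected = pd.concat([df.loc['B'].diff(), df.loc['A'].diff()])
pd.testing.assert_frame_equal(result, expected)
```

Because the diff is computed on a frame with a unique row index, this version does not depend on how a particular pandas release handles duplicate labels inside groupby().diff().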