Python 2.7: grouping a pandas DataFrame by two criteria

python python-2.7

Suppose I have a pandas DataFrame:

import pandas as pd

df = pd.DataFrame(columns=['name','time'])
df = df.append({'name':'Waren', 'time': '20:15'}, ignore_index=True)
df = df.append({'name':'Waren', 'time': '20:12'}, ignore_index=True)
df = df.append({'name':'Waren', 'time': '20:11'}, ignore_index=True)
df = df.append({'name':'Waren', 'time': '01:29'}, ignore_index=True)
df = df.append({'name':'Waren', 'time': '02:15'}, ignore_index=True)
df = df.append({'name':'Waren', 'time': '02:16'}, ignore_index=True)

df = df.append({'name':'Kim', 'time': '20:11'}, ignore_index=True)
df = df.append({'name':'Kim', 'time': '01:29'}, ignore_index=True)
df = df.append({'name':'Kim', 'time': '02:15'}, ignore_index=True)
df = df.append({'name':'Kim', 'time': '01:49'}, ignore_index=True)
df = df.append({'name':'Kim', 'time': '01:49'}, ignore_index=True)
df = df.append({'name':'Kim', 'time': '02:15'}, ignore_index=True)
df = df.append({'name':'Mary', 'time': '22:15'}, ignore_index=True)
df = df.drop(df.index[2])
df = df.drop(df.index[7])
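
For readers on a recent pandas release: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the snippet above only runs on older versions. A minimal sketch of building the same frame in one step follows; the `rows` list is just an illustrative name. Note that the second `drop` in the original acts on the already shortened frame, so position 7 there corresponds to label 8.

```python
import pandas as pd

# Same rows as in the question, built in one call instead of repeated append
rows = [
    ("Waren", "20:15"), ("Waren", "20:12"), ("Waren", "20:11"),
    ("Waren", "01:29"), ("Waren", "02:15"), ("Waren", "02:16"),
    ("Kim", "20:11"), ("Kim", "01:29"), ("Kim", "02:15"),
    ("Kim", "01:49"), ("Kim", "01:49"), ("Kim", "02:15"),
    ("Mary", "22:15"),
]
df = pd.DataFrame(rows, columns=["name", "time"])

# The two positional drops in the question remove labels 2 and 8,
# leaving gaps in the index that the grouping relies on
df = df.drop(index=[2, 8])

print(df.index.tolist())
```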

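I want to group this frame by name, and then by runs of consecutive index values.

The desired output would be groups like the following:

So the rows are grouped by name, and for each run of consecutively increasing rows only the first and last elements are used as the index.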

I tried it like this:

df.groupby(['name']).groupby(df.index.to_series().diff().ne(1).cumsum()).group

This only raises an error:

AttributeError: Cannot access callable attribute 'groupby' of 'DataFrameGroupBy' objects, try using the 'apply' method

Any help is welcome.

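You are chaining the calls incorrectly. df.groupby(['name']) returns a DataFrameGroupBy object, and that object has no callable groupby attribute, so a second groupby cannot be chained onto it. Instead, pass both grouping keys to a single groupby call: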


df.groupby(['name', df.index.to_series().diff().ne(1).cumsum()]).groups

Out: 
{('Kim', 2): [6, 7],
 ('Kim', 3): [9, 10, 11],
 ('Mary', 3): [12],
 ('Waren', 1): [0, 1],
 ('Waren', 2): [3, 4, 5]}
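
To get only the first and last index label of each run, as the question asks, one possible approach is to aggregate the index itself over the same two keys. This is a sketch rather than part of the original answer; the names `run_id`, `idx` and `bounds` are illustrative, and the `time` column is left out for brevity.

```python
import pandas as pd

# Rebuild a frame with the same names and the same index gaps (labels 2 and 8 removed)
names = ["Waren"] * 6 + ["Kim"] * 6 + ["Mary"]
df = pd.DataFrame({"name": names}).drop(index=[2, 8])

# The index as a Series, so it can be differenced and aggregated
idx = df.index.to_series()

# A new run starts wherever the step to the previous label is not exactly 1;
# the cumulative sum turns those start flags into a run number
run_id = idx.diff().ne(1).cumsum()

# First and last index label of every (name, run) group
bounds = idx.groupby([df["name"], run_id]).agg(["first", "last"])

print(bounds)
```

Each row of `bounds` is keyed by the same (name, run number) pair that appears in the `.groups` output above, with the run's first and last index label as its two columns.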