Pandas: how to find the ids that have 6 consecutive months in a column?
df
id date
A 201901
A 201902
A 201903
A 201904
A 201905
A 201906
A 202006
A 202007
A 202008
B 202008
B 202009
B 202109
B 202110
B 202111
C 201901
C 201902
C 201903
C 201904
C 201905
C 201906
C 202006
C 202007
C 202008
C 202009
C 202010
C 202011
The dates are sorted within each id.
Expected
I want to find the ids that have 6 consecutive months. For id A that is 201901-201906, and for id C it is 202006-202011.
expected_id=['A','C']
The date column has dtype object.
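For anyone who wants to reproduce the problem, the sample frame can be rebuilt roughly as follows. This is only a sketch: the `months` dictionary is a helper name introduced here for illustration, and `date` is kept as YYYYMM strings to match the description above.

```python
import pandas as pd

# Sample data from the question; `date` is kept as YYYYMM strings.
# `months` is an illustrative helper, not part of the original post.
months = {
    'A': ['201901', '201902', '201903', '201904', '201905', '201906',
          '202006', '202007', '202008'],
    'B': ['202008', '202009', '202109', '202110', '202111'],
    'C': ['201901', '201902', '201903', '201904', '201905', '201906',
          '202006', '202007', '202008', '202009', '202010', '202011'],
}

# One row per (id, date) pair, in the same order as the listing above.
df = pd.DataFrame(
    [(key, m) for key, values in months.items() for m in values],
    columns=['id', 'date'],
)
```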
Try
I don't know how to get this.

You can modify the approach to aggregate a count for each run of consecutive months, then filter the ids whose count equals 6:
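import pandas as pd

# Parse the YYYYMM strings into monthly periods
df['date'] = pd.to_datetime(df['date'], format='%Y%m').dt.to_period('M')
# Start a new run label whenever the gap to the previous row of the same id is not exactly one month
new = df.groupby('id', group_keys=False)['date'].diff().ne(pd.offsets.MonthEnd()).cumsum()
# Count the rows in each run of consecutive months
df = df.groupby(['id',new]).size().reset_index(name='count')
print (df)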
  id  date  count
0  A     1      6
1  A     2      3
2  B     3      2
3  B     4      3
4  C     5      6
5  C     6      6
expected_id = df.loc[df['count'].eq(6), 'id'].unique().tolist()
print (expected_id)
['A', 'C']
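
Note that filtering with `eq(6)` keeps only runs of exactly 6 months, so an id whose longest run is 7 months would be dropped. If "at least 6 consecutive months" is what is wanted, one possible variant is sketched below. It is not the method from the answer above: `ids_with_consecutive_months` and its `n` parameter are hypothetical names, and it assumes `date` still holds YYYYMM strings or integers (that is, before the conversion to periods). It turns each date into a running month number so that consecutive months differ by exactly 1.

```python
import pandas as pd

def ids_with_consecutive_months(df, n=6):
    """Return ids with at least n consecutive months (hypothetical helper)."""
    # Assumes `date` is YYYYMM as str or int; year*12 + month gives a
    # running month number, so consecutive months differ by exactly 1.
    ym = df['date'].astype(str)
    month_no = ym.str[:4].astype(int) * 12 + ym.str[4:6].astype(int)
    tmp = pd.DataFrame({'id': df['id'], 'month_no': month_no})
    tmp = tmp.drop_duplicates().sort_values(['id', 'month_no'])
    # The first row of each id has a NaN gap, so it always starts a new run;
    # any gap other than 1 month starts a new run as well.
    gap = tmp.groupby('id')['month_no'].diff()
    run = gap.ne(1).cumsum()
    run_len = tmp.groupby(['id', run]).size()
    # Keep the ids that own at least one run of length >= n.
    return run_len[run_len.ge(n)].index.get_level_values('id').unique().tolist()

# Small illustrative frame: A has a run of 7 months, B a run of 3
# that crosses a year boundary.
sample = pd.DataFrame({
    'id': ['A'] * 7 + ['B'] * 3,
    'date': ['201901', '201902', '201903', '201904', '201905', '201906',
             '201907', '202011', '202012', '202101'],
})
print(ids_with_consecutive_months(sample))  # ['A']
```

Replacing `ge(n)` with `eq(n)` in the last line of the function restores the "exactly n" behaviour of the answer above.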