Pandas in Python: select only the rows whose group-by count is 1
I have filtered my data as suggested here: Now I only want the authors that appear exactly once in this data frame. I wrote this, but it doesn't work:
def where_just_one_exists(group):
    return group.loc[group.count() == 1]

most_expensive_single_category = most_expensive_for_each_model.groupby('author', as_index = False).apply(where_just_one_exists).reset_index(drop = True)
print most_expensive_single_category
Error:
  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1659, in check_bool_indexer
    raise IndexingError('Unalignable boolean Series key provided')
pandas.core.indexing.IndexingError: Unalignable boolean Series key provided
My expected output is:
    author        cat  val
0  author1  category2   15
1  author2  category4    9
2  author3  category1    7
3  author3  category3    7
Easy:
My solution is a bit more complicated, but it still works:
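import pandas as pd

def groupbyOneOccurrence(df):
    # Iterating a GroupBy yields (key, sub-frame) pairs; keep the
    # sub-frames that contain exactly one row.
    singles = [group for _, group in df.groupby("author") if len(group) == 1]
    # Return an empty frame with the same columns if no author occurs once.
    return pd.concat(singles) if singles else df.iloc[0:0]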
    author        cat  val
0  author1  category2   15
1  author2  category4    9
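What output do you want? I have added the desired output. Padraic's solution seems to be exactly what someone suggested below. After applying count and mean, how would I sort the result?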
df.groupby('author').filter(lambda x: len(x)==1)
     author        cat  val
id
0   author1  category2   15
1   author2  category4    9
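As for why the original attempt fails: group.count() returns one count per column, so group.count() == 1 is a boolean Series indexed by column names, and .loc cannot align it with the row index. Below is a minimal, self-contained sketch of the filter approach next to two vectorized alternatives that should give the same rows. The sample frame is an assumption modeled on the table in the question, the variable names are made up for illustration, and the code uses Python 3 print syntax rather than the Python 2.7 shown in the traceback.

```python
import pandas as pd

# Assumed sample data, modeled on the table shown in the question.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author3"],
    "cat": ["category2", "category4", "category1", "category3"],
    "val": [15, 9, 7, 7],
})

# Option 1: keep only the groups that contain exactly one row.
singles_filter = df.groupby("author").filter(lambda g: len(g) == 1)

# Option 2: broadcast each group's size back onto its rows and mask.
group_sizes = df.groupby("author")["author"].transform("size")
singles_mask = df[group_sizes == 1]

# Option 3: drop every row whose author occurs more than once.
singles_dedup = df[~df.duplicated(subset="author", keep=False)]

print(singles_filter)
```

On large frames the mask-based variants are usually faster than filter with a Python lambda, since they avoid calling a function once per group.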