Pandas: how to group by and count under different conditions?
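Input data:

is_correct  question_id
t           1
t           1
f           1
f           1
t           2
t           2

Expected result:

correct_count  incorrect_count  question_id
2              2                1
2              0                2

This is what I have so far, but it only gives me the count of correct answers:

df[df["is_correct"]].groupby("question_id")["question_id"].count()

You can use the pivot_table function after creating an extra column to count on: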
In [26]: import io
In [27]: import pandas as pd
In [28]: data = """\
   ....: is_correct question_id
   ....: t 1
   ....: t 1
   ....: f 1
   ....: f 1
   ....: t 2
   ....: t 2
   ....: """
In [29]: df = pd.read_csv(io.StringIO(data), sep=r'\s+')  # whitespace-separated columns
In [30]: df['count'] = 0  # helper column; only its presence matters, not its value
In [31]: df
Out[31]:
  is_correct  question_id  count
0          t            1      0
1          t            1      0
2          f            1      0
3          f            1      0
4          t            2      0
5          t            2      0
In [32]: df.pivot_table(index='question_id', columns='is_correct',
   ....:                values='count', aggfunc='count', fill_value=0)\
   ....:   .reset_index()
Out[32]:
is_correct  question_id  f  t
0                     1  2  2
1                     2  0  2
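The pivot_table call above counts the rows for each (question_id, is_correct) pair. As a sketch of the same idea without the helper column, pd.crosstab can build that table directly. This assumes is_correct holds the strings 't' and 'f' as in the sample data; the names counts and result are only illustrative.

```python
import pandas as pd

# Same sample data as in the question; 't'/'f' are plain strings here.
df = pd.DataFrame({
    "is_correct": ["t", "t", "f", "f", "t", "t"],
    "question_id": [1, 1, 1, 1, 2, 2],
})

# crosstab counts the rows for every (question_id, is_correct) pair
# and puts 0 where a pair never occurs.
counts = pd.crosstab(df["question_id"], df["is_correct"])

# Rename the 't'/'f' columns to the names used in the expected result,
# drop the leftover 'is_correct' columns label, and order the columns.
result = (
    counts.rename(columns={"t": "correct_count", "f": "incorrect_count"})
    .rename_axis(columns=None)
    .reset_index()[["correct_count", "incorrect_count", "question_id"]]
)
print(result)
```

Note that a value that never appears anywhere in the frame (for example, no 'f' rows at all) gets no column, so the final column selection would need adjusting in that case.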
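With groupby, add a helper column to sum, group, and then rearrange the grouped data so that it fits the required columns:

df = pd.DataFrame({'is_correct': ['t','t','f','f','t','t'], 'question_id': [1,1,1,1,2,2]})
df['to_sum_up'] = 1

  is_correct  question_id  to_sum_up
0          t            1          1
1          t            1          1
2          f            1          1
3          f            1          1
4          t            2          1
5          t            2          1

df2 = df.groupby(['question_id','is_correct'], as_index=False).sum()

Then, to get a tidy DataFrame as output:

# .loc replaces the removed .ix indexer; the summed helper column is 'to_sum_up'
df2['correct_count'] = df2.loc[df2['is_correct']=='t', 'to_sum_up']
df2['incorrect_count'] = df2.loc[df2['is_correct']=='f', 'to_sum_up']
# rows of the other kind are NaN after the assignments above, so set them to 0
df2.loc[df2['correct_count'].isnull(), 'correct_count'] = 0
df2.loc[df2['incorrect_count'].isnull(), 'incorrect_count'] = 0
# collapse to one row per question, then drop the helper columns
df2 = df2.groupby('question_id', as_index=False).max()
df2 = df2.drop(['to_sum_up','is_correct'], axis=1)

   question_id  correct_count  incorrect_count
0            1            2.0              2.0
1            2            2.0              0.0

Comments: This looks like a possible duplicate. Although MaxU's answer here is better and more interesting than the answers on the other question, please mark the other question as a duplicate of this one so that everything points here. @samol, did this help?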
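The rearranging step can also be done by counting group sizes and unstacking the is_correct level into columns. This is a minimal sketch rather than the answerer's method; it assumes is_correct holds the strings 't' and 'f', and the names counts and result are only illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "is_correct": ["t", "t", "f", "f", "t", "t"],
    "question_id": [1, 1, 1, 1, 2, 2],
})

# size() counts the rows in each (question_id, is_correct) group;
# unstack() turns the is_correct level into columns, filling gaps with 0.
counts = (
    df.groupby(["question_id", "is_correct"])
    .size()
    .unstack(fill_value=0)
)

# reindex guarantees both columns exist even if 't' or 'f' never occurs.
counts = counts.reindex(columns=["t", "f"], fill_value=0)
counts.columns = ["correct_count", "incorrect_count"]

result = counts.reset_index()
print(result)
```

Because no NaN values are introduced along the way, the counts stay integers and no helper column is needed.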