Python pandas: show the total only once per group

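I have the following data:

Description  Card Member  Cost
"apple"      "adam"       2
"apple"      "adam"       2
"pear"       "bob"        7
"orange"     "alice"      8
"orange"     "alice"      8
"orange"     "alice"      8

I am trying to add a Total column, like this:

Description  Card Member  Cost  Total
"apple"      "adam"       2
"apple"      "adam"       2     4
"pear"       "bob"        7
"orange"     "alice"      8
"orange"     "alice"      8
"orange"     "alice"      8     24

I tried using

df["Total"] = df.groupby('Card Member')['Cost'].transform('sum')

While this puts the total on every row, I only want the total to appear once, on the last row for each member.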

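This is what it produces:

Description  Card Member  Cost  Total
"apple"      "adam"       2     4
"apple"      "adam"       2     4
"pear"       "bob"        7     7
"orange"     "alice"      8     24
"orange"     "alice"      8     24
"orange"     "alice"      8     24

As you can see, the totals repeat on every row, which makes my data less readable. I want each total to appear only once, at the end of each member's rows, instead of repeating.

I was thinking about looping and removing values that are not equal to the value in the next iteration, but that would cause problems if different members have the same total.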
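For anyone who wants to reproduce the problem, here is a minimal sketch that rebuilds the sample data and runs the attempt from the question. The column names are taken from the question's header; the string values are written without the literal quote characters, which is an assumption about how the data was loaded.

```python
import pandas as pd

# Sample data from the question (quote characters omitted from the strings).
df = pd.DataFrame({
    "Description": ["apple", "apple", "pear", "orange", "orange", "orange"],
    "Card Member": ["adam", "adam", "bob", "alice", "alice", "alice"],
    "Cost": [2, 2, 7, 8, 8, 8],
})

# transform('sum') returns a result with the same index as df: each group's
# sum is broadcast back onto every row of that group, so the total repeats.
df["Total"] = df.groupby("Card Member")["Cost"].transform("sum")
print(df)
```

Because the result has one value per original row, the task reduces to blanking every row except the last one of each member.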

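You can use duplicated to pick out the last row of each group:

# True only on the last row of each (Description, CardMember) combination
s = ~df.duplicated(['Description','CardMember'], keep='last')

# Assign the group sums on those rows only; the other rows stay NaN
df.loc[s,'total'] = df.groupby(['Description', 'CardMember'], sort=False)['Cost'].transform('sum')

Output: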

  Description CardMember  Cost  total
0     "apple"     "adam"     2    NaN
1     "apple"     "adam"     2    4.0
2      "pear"      "bob"     7    7.0
3    "orange"    "alice"     8    NaN
4    "orange"    "alice"     8    NaN
5    "orange"    "alice"     8   24.0
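A self-contained sketch of the same approach, adapted to the question's column name 'Card Member' (the answer above writes it as CardMember). The names keys and is_last are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Description": ["apple", "apple", "pear", "orange", "orange", "orange"],
    "Card Member": ["adam", "adam", "bob", "alice", "alice", "alice"],
    "Cost": [2, 2, 7, 8, 8, 8],
})

keys = ["Description", "Card Member"]

# duplicated(keep='last') is False only for the last occurrence of each key
# combination, so negating it selects exactly one row per group: the last.
is_last = ~df.duplicated(keys, keep="last")

# The right-hand side has a value for every row; .loc aligns it on the index
# and writes only the selected rows. All other rows of 'total' are NaN.
df.loc[is_last, "total"] = df.groupby(keys, sort=False)["Cost"].transform("sum")
print(df)
```

Since the mask is built from the key columns and not from the sums, two members with the same total are still handled separately.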

This should work:

df["total"] = 0

for name in df["Card Member"].unique():
    # Index labels of this member's rows (assumes the index labels are unique)
    rows = df.index[df["Card Member"] == name]
    # Write the member's sum on their last row only
    df.loc[rows[-1], "total"] = df.loc[rows, "Cost"].sum()
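Here you can use duplicated with the keep parameter set to 'last':

s = df.groupby('Card Member')['Cost'].transform('sum')
# Mask on the member column rather than on the sums themselves, so that
# equal totals belonging to different members are not hidden
df.assign(total=s.mask(df['Card Member'].duplicated(keep='last')))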

       Desc      mem  cost  total
0   "apple"   "adam"     2    NaN
1   "apple"   "adam"     2    4.0
2    "pear"    "bob"     7    7.0
3  "orange"  "alice"     8    NaN
4  "orange"  "alice"     8    NaN
5  "orange"  "alice"     8   24.0
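One thing to watch with this approach is which Series duplicated is called on. Calling it on the sums can hide a total when two members happen to share the same sum, which is the concern raised in the question. A small sketch with hypothetical data (not the question's) showing the difference:

```python
import pandas as pd

# Hypothetical data: adam and bob both total 4.
df = pd.DataFrame({
    "Card Member": ["adam", "adam", "bob", "alice"],
    "Cost": [2, 2, 4, 8],
})

s = df.groupby("Card Member")["Cost"].transform("sum")  # 4, 4, 4, 8

# Masking on the sums: adam's rows are both hidden, because bob's equal
# total appears later and counts as the "last" occurrence of 4.
by_value = s.mask(s.duplicated(keep="last"))

# Masking on the member column: each member keeps a total on its last row.
by_member = s.mask(df["Card Member"].duplicated(keep="last"))

print(pd.DataFrame({"by_value": by_value, "by_member": by_member}))
```

In this sketch by_value has no total for adam at all, while by_member shows one total per member.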

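np.where version:

import numpy as np

# Group sum on the last row of each member, NaN everywhere else
df["Total"] = np.where(~df['Card Member'].duplicated(keep='last'),
                       df.groupby('Card Member')['Cost'].transform('sum'),
                       np.nan)

df['Card Member'].duplicated(keep='last') marks the last row of each group of duplicates as False, so ~df['Card Member'].duplicated(keep='last') marks those rows as True, and the groupby result is written only on those rows.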
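np.where returns a plain NumPy array, so the fill value used for the other rows determines the dtype of the new column. A small sketch on the question's sample data comparing None and np.nan as the fill value; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Card Member": ["adam", "adam", "bob", "alice", "alice", "alice"],
    "Cost": [2, 2, 7, 8, 8, 8],
})

is_last = ~df["Card Member"].duplicated(keep="last")
totals = df.groupby("Card Member")["Cost"].transform("sum")

# None forces an object array; np.nan gives a numeric float array.
with_none = pd.Series(np.where(is_last, totals, None), index=df.index)
with_nan = pd.Series(np.where(is_last, totals, np.nan), index=df.index)

print(with_none.dtype)  # object
print(with_nan.dtype)   # float64
```

A float column keeps later numeric operations (sums, comparisons, formatting) straightforward, whereas an object column is mainly useful for display.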

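Let's use apply:

# Each group yields a one-element Series: its sum, indexed by the group's last row label
s = (df.groupby(['Description', 'Card'], as_index=False).MemberCost
       .apply(lambda x: pd.Series(x.sum(), index=[x.index[-1]]))
       .reset_index(level=0, drop=True))
df['New'] = s
df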
Out[103]: 
  Description     Card  MemberCost   New
0     "apple"   "adam"           2   NaN
1     "apple"   "adam"           2   4.0
2      "pear"    "bob"           7   7.0
3    "orange"  "alice"           8   NaN
4    "orange"  "alice"           8   NaN
5    "orange"  "alice"           8  24.0
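To see what the apply call is doing, here is a hand-rolled sketch of the same idea that builds the per-group pieces explicitly and concatenates them. It is not the answer's code: the helper name last_row_total is made up for illustration, and the column names (Description, Card, MemberCost) follow this answer's output rather than the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Description": ["apple", "apple", "pear", "orange", "orange", "orange"],
    "Card": ["adam", "adam", "bob", "alice", "alice", "alice"],
    "MemberCost": [2, 2, 7, 8, 8, 8],
})

def last_row_total(costs: pd.Series) -> pd.Series:
    """Hypothetical helper: the group's sum, labelled with the group's last index."""
    return pd.Series(costs.sum(), index=[costs.index[-1]])

# Iterating over a grouped column yields (key, Series) pairs, one per group.
groups = df.groupby(["Description", "Card"], sort=False)["MemberCost"]
pieces = [last_row_total(costs) for _, costs in groups]

# The pieces keep the original row labels, so the assignment aligns on the
# index and every row without a piece becomes NaN.
df["New"] = pd.concat(pieces)
print(df)
```

The key step is reusing the last row label of each group as the index of the one-element result, which is what lets the assignment land on the right rows.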
