Python: counting the frequency of float64 or int64 values using not-equal (!=)

I know there are a lot of posts about this, but none of them solves my problem.

I have a data frame like this:

import numpy as np
import pandas as pd

df1 = [{"Customer Number": "AFIMBN01000BCA17030001177", "Account Name": "Sunarto","Debit/Credit Indicator" : "k","Money" : 100},
    {"Customer Number": "AFIMBN01000BCA17030001177", "Account Name": "Sunarto","Debit/Credit Indicator": "k","Money" : 200},
    {"Customer Number": "AFIMBN01000BCA17030001177", "Account Name": "Sunarto","Debit/Credit Indicator" : "D", "Money" : 0}]
df1 = pd.DataFrame(df1)
df1

Account Name    Customer Number           Debit/Credit Indicator         Money
Sunarto      AFIMBN01000BCA17030001177       k                            100
Sunarto      AFIMBN01000BCA17030001177       k                            200
Sunarto      AFIMBN01000BCA17030001177       D                             0

Account Name              object
Customer Number           object
Debit/Credit Indicator    object
Money                      int64 (or let's say float64)

I want to count the frequency based on "Money".

If Money is 0, the row should not be counted.

I tried using df1["Money"].value_counts(), but it does not work:

df1.loc[df1["Money"] != 0, "Per item"] = df1["Money"].value_counts()
df1

Account Name    Customer Number           Debit/Credit Indicator         Money   Per item
Sunarto      AFIMBN01000BCA17030001177       k                            100     1
Sunarto      AFIMBN01000BCA17030001177       k                            200    NaN
Sunarto      AFIMBN01000BCA17030001177       D                             0   NaN

But what I expect is:

Account Name    Customer Number           Debit/Credit Indicator         Money   Per item
Sunarto      AFIMBN01000BCA17030001177       k                            100     1
Sunarto      AFIMBN01000BCA17030001177       k                            200    1
Sunarto      AFIMBN01000BCA17030001177       D                             0   0

So when I apply a pivot, I expect to get the number of items that have a value in "Money".

My expected output:

gdf = pd.pivot_table(df1, index = ["Account Name","Customer Number"],values = ["Money", "Per item"],aggfunc = np.sum)

gdf.head()

                                                Money              Per item
Account Name      Customer Number
Sunarto           AFIMBN01000BCA17030001177     300                2.0


You need to assign 1 to every row that matches the condition:

df1.loc[df1["Money"] != 0, "Per item"] = 1

Or convert the boolean mask to integers:

df1["Per item"] = (df1["Money"] != 0).astype(int)
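
To see how the mask approach feeds into the pivot from the question, here is a minimal sketch. It rebuilds the sample data column by column (an assumption made for brevity) and passes the string "sum" as aggfunc instead of np.sum, which should behave the same for this data.

```python
import pandas as pd

# Sample data from the question, rebuilt column by column.
df1 = pd.DataFrame({
    "Account Name": ["Sunarto"] * 3,
    "Customer Number": ["AFIMBN01000BCA17030001177"] * 3,
    "Debit/Credit Indicator": ["k", "k", "D"],
    "Money": [100, 200, 0],
})

# 1 where Money is non-zero, 0 otherwise.
df1["Per item"] = (df1["Money"] != 0).astype(int)

# Summing the 0/1 flags counts the rows with non-zero Money per group.
gdf = pd.pivot_table(
    df1,
    index=["Account Name", "Customer Number"],
    values=["Money", "Per item"],
    aggfunc="sum",
)
print(gdf)
```

Because the flag is an integer column with no missing values, "Per item" stays an integer here rather than becoming 2.0.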

Another solution, using groupby with aggregation instead of pivot_table:

gdf = (df1.groupby(["Account Name","Customer Number"])['Money']
          .agg([('Money','sum'), ('Per item', lambda x: x.ne(0).sum())]))
print (gdf)
                                        Money  Per item
Account Name Customer Number                           
Sunarto      AFIMBN01000BCA17030001177    300         2
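
The same aggregation can probably also be written with named aggregation, which some readers find easier to follow than a list of tuples. This is a sketch assuming pandas 0.25 or later; the dictionary unpacking is only needed because "Per item" contains a space and cannot be a keyword argument.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Account Name": ["Sunarto"] * 3,
    "Customer Number": ["AFIMBN01000BCA17030001177"] * 3,
    "Debit/Credit Indicator": ["k", "k", "D"],
    "Money": [100, 200, 0],
})

# Each output column is given as (input column, aggregation function).
gdf = df1.groupby(["Account Name", "Customer Number"]).agg(
    **{
        "Money": ("Money", "sum"),
        "Per item": ("Money", lambda x: x.ne(0).sum()),
    }
)
print(gdf)
```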
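EDIT:

    May I know why my code is not working?

The problem is that value_counts returns a Series of counts whose index is built from the values of the original Series, here 100 and 200 (and 0), not from the row labels 0, 1, 2. On assignment pandas aligns by index, so the labels do not match and you get missing values; only row label 0 happens to match, and it picks up the count of the value 0. The solution is to use map: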

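But if there are duplicated values, the problem is that you assign not 1 but the count of each value, so the output is wrong; here the two 200 values wrongly give 4 instead of 2: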

df1 = [{"Customer Number": "AFIMBN01000BCA17030001177", "Account Name": "Sunarto","Debit/Credit Indicator" : "k","Money" : 200},
    {"Customer Number": "AFIMBN01000BCA17030001177", "Account Name": "Sunarto","Debit/Credit Indicator": "k","Money" : 200},
    {"Customer Number": "AFIMBN01000BCA17030001177", "Account Name": "Sunarto","Debit/Credit Indicator" : "D", "Money" : 0}]
df1 = pd.DataFrame(df1)


df1.loc[df1["Money"] != 0, "Per item"] = df1["Money"].map(df1["Money"].value_counts())
print (df1)
  Account Name            Customer Number Debit/Credit Indicator  Money  \
0      Sunarto  AFIMBN01000BCA17030001177                      k    200   
1      Sunarto  AFIMBN01000BCA17030001177                      k    200   
2      Sunarto  AFIMBN01000BCA17030001177                      D      0   

   Per item  
0       2.0  
1       2.0  
2       NaN  

gdf = pd.pivot_table(df1, index = ["Account Name","Customer Number"],values = ["Money", "Per item"],aggfunc = np.sum)

print (gdf)
                                        Money  Per item
Account Name Customer Number                           
Sunarto      AFIMBN01000BCA17030001177    400       4.0
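
The two effects described above can be reproduced on a bare Series. This is a sketch using the duplicated data (200, 200, 0); the names counts, aligned, and mapped are made up for illustration, and reindex is used only to imitate the label lookup that the assignment performs.

```python
import pandas as pd

money = pd.Series([200, 200, 0], name="Money")

# Indexed by the Money values (200 and 0), not by the row labels 0, 1, 2.
counts = money.value_counts()

# Looking the counts up by row label: only label 0 exists in counts
# (it is the count of the value 0), so labels 1 and 2 become NaN.
aligned = counts.reindex(money.index)

# map looks each Money value up in counts, so every row gets its own count.
mapped = money.map(counts)

# Summing the mapped counts over non-zero rows counts each duplicate twice.
per_item_sum = mapped.where(money != 0).sum()

# Summing the boolean mask counts each non-zero row exactly once.
nonzero = (money != 0).sum()

print(aligned)
print(per_item_sum, nonzero)
```

So map fixes the alignment, but for counting rows the 0/1 flag is still the right thing to sum.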

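Wow, thanks for the solution, silly me, hehe... May I know why my code is not working, sir? And what does "lambda x: x.ne(0).sum()" mean? I don't know what this statement does.

@charismabathara - Sure, I will explain. First, ne means "not equal"; here it works like !=. Then sum is used only to count the True values, because .ne(0) returns True and False values.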
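
A small sketch of what that lambda does, step by step, on the Money values from the question (the variable names are only for illustration):

```python
import pandas as pd

money = pd.Series([100, 200, 0])

# ne(0) is the method form of != 0 and returns a boolean Series.
mask = money.ne(0)
print(mask.tolist())  # [True, True, False]

# True counts as 1 and False as 0, so sum() counts the non-zero rows.
print(mask.sum())  # 2
```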