Python pandas: after aggregating two columns with groupby, divide one column by the other


I have the code below, which generates the four columns I want:

df['revenue'] = pd.to_numeric(df['revenue']) #not exactly sure what this does
df['Date'] = pd.to_datetime(df['Date'], unit='s')
df['Year'] = df['Date'].dt.year
df['First Purchase Date'] = pd.to_datetime(df['First Purchase Date'], unit='s')


df['number_existing_customers'] = df.groupby(df['Year'])[['Existing Customer']].sum()
df['number_new_customers'] = df.groupby(df['Year'])[['New Customer']].sum()
df['Rate'] = df['number_new_customers']/df['number_existing_customers']

Table = df.groupby(df['Year'])[['New Customer', 'Existing Customer', 'Rate', 'revenue']].sum()

print(Table)

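I want to divide one column by the other (new customers divided by existing customers), but when I create the new column I seem to get zeros (see the output below).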

You just need to define the column and then use the appropriate operator, in this case /:

Table['Rate'] = Table['New customer']/Table['Existing customer']

In this example, I reproduce your Table output and apply the code I posted:

import pandas as pd
import numpy as np
data = {'Year':[2014,2015,2016,2017,2018,2019],'New customer':[7,1,5,9,12,16],'Existing customer':[2,3,3,3,7,10],'revenue':[1000,1000,1000,1001,1100,1200]}
Table = pd.DataFrame(data).set_index('Year')
Table['Rate'] = Table['New customer']/Table['Existing customer']
print(Table)

Output:

      New customer  Existing customer  revenue      Rate
Year
2014             7                  2     1000  3.500000
2015             1                  3     1000  0.333333
2016             5                  3     1000  1.666667
2017             9                  3     1001  3.000000
2018            12                  7     1100  1.714286
2019            16                 10     1200  1.600000
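For the original question, where the yearly totals come from a groupby on row-level data, the same division can be applied to the aggregated table rather than to df. A minimal sketch, assuming the flag columns hold 0/1 integers; the rows below are made up for illustration and are not the asker's data:

```python
import pandas as pd

# Hypothetical row-level data: one row per purchase, with 0/1 flags.
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2015, 2015],
    "New Customer": [1, 1, 0, 1, 0],
    "Existing Customer": [0, 0, 1, 0, 1],
    "revenue": [100, 150, 200, 120, 80],
})

# Aggregate first: one row per year, indexed by Year.
Table = df.groupby("Year")[["New Customer", "Existing Customer", "revenue"]].sum()

# Then divide the aggregated columns; both sides share the Year index.
Table["Rate"] = Table["New Customer"] / Table["Existing Customer"]

print(Table)
```

The key point is the order: the ratio is computed from the yearly sums, not summed from per-row ratios.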

df['column1']/df['column2']

If you want a new column:

df['new_col'] = df['column1']/df['column2']
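A commenter on this thread reports that this division raised "NotImplementedError: operator '/' not implemented for bool dtypes", which suggests the flag columns were boolean. One possible workaround, sketched here under that assumption with made-up rows, is to cast the flags to integers before aggregating and dividing:

```python
import pandas as pd

# Hypothetical data where the customer flags are stored as booleans.
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2015, 2015, 2015],
    "New Customer": [True, True, False, True, False, False],
    "Existing Customer": [False, False, True, False, True, True],
})

# Cast the boolean flags to integers so sums and division are numeric.
df = df.astype({"New Customer": int, "Existing Customer": int})

cols = ["New Customer", "Existing Customer"]
Table = df.groupby("Year")[cols].sum()
Table["Rate"] = Table["New Customer"] / Table["Existing Customer"]

print(Table)
```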

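Comments:

I'm not entirely sure that assigning the returned DataFrame works the way you expect. On the left you have your original DataFrame, whatever its index is. On the right, the result of groupby is a DataFrame indexed by the group key, in this case 'Year'. So when you assign it back, values are only assigned where the labels overlap with the original DataFrame's index, which, given a RangeIndex and reasonable dates, might be the rows numbered somewhere around 1990-2017. Either way, it's probably not what you want.

Thank you. The output is as required and the code works as I expected.

Quang Hoang, the code you proposed gives me an error that says "NotImplementedError: operator '/' not implemented for bool dtypes".

Ivan, I tried this, but I seem to get zeros in the data. I have modified my original code to include the above.

If you try this with the DataFrame named Table, you should not run into any problems. I have edited my answer to fit the DataFrame Table.

I'm still getting all zeros since you removed what I had mistakenly put in the code.

Also, if you do print(Table.info()), are the columns 'New customer' and 'Existing customer' of int/float type?

Got it. The problem was that I added the Table['Rate'] calc before the Table = df.groupby line, but it should come after.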
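The index-alignment point raised in the comments can be seen in a small sketch. The rows are hypothetical, and transform is shown only as one possible way to get per-row group totals, not as what the asker ultimately needed:

```python
import pandas as pd

# Hypothetical row-level data with the default RangeIndex (0..4).
df = pd.DataFrame({
    "Year": [2014, 2014, 2015, 2015, 2015],
    "New Customer": [1, 0, 1, 1, 0],
})

# The groupby result is indexed by Year (2014, 2015), not by 0..4.
yearly_new = df.groupby("Year")["New Customer"].sum()

# Column assignment aligns on index labels; none of 0..4 match
# 2014 or 2015, so every row ends up NaN.
df["misaligned"] = yearly_new

# Summing an all-NaN column per group skips the NaNs and yields 0.0,
# which is consistent with the zeros reported in the question.
zeros = df.groupby("Year")["misaligned"].sum()

# transform broadcasts each group's total back onto the original rows.
df["new_per_year"] = df.groupby("Year")["New Customer"].transform("sum")

print(df)
print(zeros)
```

This is why computing Rate on df before the final groupby produced zeros, while computing it on Table after the groupby gives the expected ratios.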