Python: percentage values for monthly data with pandas


I have a data sample:

date        Product  Sales
2020-01-01.  Dell.    4
2020-01-01.  Apple.   6
2020-01-01.  Lenovo.  5
2020-01-02.  Dell.    2
2020-01-02.  Apple.   4
2020-01-02.  Lenovo.  3
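I want to create another column called "Percentage_monthly_sale", computed as (monthly sales of a product / total sales of all products in that month) * 100.

The output should look like this: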

date         Product  Sales  Percentage_monthly_sale
2020-01-01.  Dell.    4      26.6  (4/15 * 100)
2020-01-01.  Apple.   6      40.0  (6/15 * 100)
2020-01-01.  Lenovo.  5      33.3  (5/15 * 100)
2020-01-02.  Dell.    2      22.2  (2/9 * 100)
2020-01-02.  Apple.   4      44.4  (4/9 * 100)
2020-01-02.  Lenovo.  3      33.3  (3/9 * 100)
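For anyone reproducing this, here is a minimal sketch that builds the sample frame and checks the denominators used in the parenthesised arithmetic above. It assumes the trailing dots in the sample are typing artifacts and drops them; the variable names `df` and `daily_totals` are only illustrative.

```python
import pandas as pd

# Sample data from the question (trailing dots dropped: an assumption)
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})

# Total sales per date: these are the 15 and 9 used as denominators
daily_totals = df.groupby("date")["Sales"].sum()
print(daily_totals)
```

Note that 15 and 9 are per-date totals, while the total for the whole month of January is 24.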
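Use groupby with transform to get the totals, then divide the Series and multiply by 100.

The percentage of daily sales matches the desired output shown in the question.


The percentage of monthly sales for each product matches the behavior the question describes.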


You can use groupby with transform and a lambda function:

df['Percentage_daily_sale'] = df.groupby(
    ['date'])['Sales'].transform(lambda x: (x/x.sum()) * 100)
Output:

          date  Product  Sales  Percentage_daily_sale
0  2020-01-01.    Dell.      4                  26.67
1  2020-01-01.   Apple.      6                  40.00
2  2020-01-01.  Lenovo.      5                  33.33
3  2020-01-02.    Dell.      2                  22.22
4  2020-01-02.   Apple.      4                  44.44
5  2020-01-02.  Lenovo.      3                  33.33
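The same per-date percentage can also be written without a lambda, by broadcasting the group total with `transform("sum")` and dividing. This is a sketch on a cleaned copy of the sample data (dots removed, which is an assumption); the rounding to two decimals is only there to match the display above.

```python
import pandas as pd

# Cleaned copy of the sample data (assumption: trailing dots are typos)
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})

# Per-date total, repeated on every row of that date
daily_total = df.groupby("date")["Sales"].transform("sum")

# Each row's share of its date's total, as a percentage
df["Percentage_daily_sale"] = (df["Sales"] / daily_total * 100).round(2)
print(df)
```

Passing the string `"sum"` lets pandas use its built-in aggregation rather than calling a Python function once per group, which tends to matter on larger frames.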

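Your explanation and your expected output differ considerably; can you clarify the question? The expected output is the percentage of daily sales, but your description asks for each product's total over the total for the month.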
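Percentage of monthly sales for each product:

import pandas as pd

# Parse the date strings so they can be grouped by calendar month
df['date'] = pd.to_datetime(df['date'])

# Sales of each product within each month, broadcast back to every row
# (on pandas 2.2+ use freq='ME'; the 'M' alias is deprecated there)
monthly_product_total = df.groupby(
    [pd.Grouper(key='date', freq='1M'), 'Product']
)['Sales'].transform('sum')

# Total sales of all products within each month
monthly_total = df.groupby(
    pd.Grouper(key='date', freq='1M')
)['Sales'].transform('sum')

df['Percentage_Monthly_sale'] = monthly_product_total / monthly_total * 100

Output:
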
        date  Product  Sales  Percentage_Monthly_sale
0 2020-01-01    Dell.      4                25.000000
1 2020-01-01   Apple.      6                41.666667
2 2020-01-01  Lenovo.      5                33.333333
3 2020-01-02    Dell.      2                25.000000
4 2020-01-02   Apple.      4                41.666667
5 2020-01-02  Lenovo.      3                33.333333
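As a cross-check of the monthly figures, here is a sketch of the same calculation that groups on a monthly period key from `dt.to_period("M")` instead of `pd.Grouper`. This is a swapped-in alternative, not the approach shown above, and it runs on a cleaned copy of the sample data (dots removed, an assumption); the name `month` is illustrative.

```python
import pandas as pd

# Cleaned copy of the sample data (assumption: trailing dots are typos)
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})
df["date"] = pd.to_datetime(df["date"])

# Calendar month of each row, e.g. Period('2020-01', 'M')
month = df["date"].dt.to_period("M")

# Product total within the month, and overall total within the month
monthly_product_total = df.groupby([month, "Product"])["Sales"].transform("sum")
monthly_total = df.groupby(month)["Sales"].transform("sum")

df["Percentage_Monthly_sale"] = monthly_product_total / monthly_total * 100
print(df)
```

With this data the January total is 24, so Dell is (4 + 2) / 24 = 25%, Apple is (6 + 4) / 24 ≈ 41.67%, and Lenovo is (5 + 3) / 24 ≈ 33.33%, and each product's value repeats on both of its rows.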