Python: percentage values of monthly data with pandas


python pandas dataframe numpy


I have a data sample:

date        Product  Sales
2020-01-01.  Dell.    4
2020-01-01.  Apple.   6
2020-01-01.  Lenovo.  5
2020-01-02.  Dell.    2
2020-01-02.  Apple.   4
2020-01-02.  Lenovo.  3

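I want to create another column called "Percentage_monthly_sale", computed as (monthly sales of a product / total sales of all products in that month) * 100.

The output should look like this: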

date        Product  Sales. Percentage_monthly_sale
2020-01-01.  Dell.    4.      26.6 (4/15 *100)
2020-01-01.  Apple.   6.      40.0. (6/15*100)
2020-01-01.  Lenovo.  5.      33.3.  (5/15 *100)
2020-01-02.  Dell.    2.      22.2 (2/9 *100)
2020-01-02.  Apple.   4.      44.4 (4/9 *100)
2020-01-02.  Lenovo.  3.      33.3 (3/9 *100)

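Use groupby with transform to get the totals, then divide the series and multiply by 100:

(shows the desired output)

To get each product's percentage of monthly sales, do the following:

(explains the expected behavior)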
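A minimal sketch of the transform-then-divide approach, reproducing the desired output from the question. The DataFrame is rebuilt here without the trailing dots, and the column name `Percentage_daily_sale` and the rounding to two decimals are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question (trailing dots dropped for readability).
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})

# Total sales per date, broadcast back to every row of that date.
daily_total = df.groupby("date")["Sales"].transform("sum")

# Divide each row by its group total and scale to a percentage.
df["Percentage_daily_sale"] = (df["Sales"] / daily_total * 100).round(2)
print(df)
```

Because `transform("sum")` returns a Series aligned with the original index, the division is a plain vectorized operation and needs no lambda.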

You can use groupby transform with a lambda function:

df['Percentage_daily_sale'] = df.groupby(
    ['date'])['Sales'].transform(lambda x: (x/x.sum()) * 100)

Output:

          date  Product  Sales  Percentage_daily_sale
0  2020-01-01.    Dell.      4                  26.67
1  2020-01-01.   Apple.      6                  40.00
2  2020-01-01.  Lenovo.      5                  33.33
3  2020-01-02.    Dell.      2                  22.22
4  2020-01-02.   Apple.      4                  44.44
5  2020-01-02.  Lenovo.      3                  33.33

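Your explanation and your expected output seem to differ quite a bit; can you clarify the question? The expected output is the percentage of daily sales, but your wording says you want each product's total over the month's total.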

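import pandas as pd

# pd.Grouper with a frequency needs a real datetime column
df['date'] = pd.to_datetime(df['date'])

# sales of each product within its calendar month
# (pandas >= 2.2 prefers freq='ME'; '1M' emits a FutureWarning there)
monthly_product_total = df.groupby(
    [pd.Grouper(key='date', freq='1M'), 'Product']
)['Sales'].transform('sum')

# sales of all products within the same calendar month
monthly_total = df.groupby(
    pd.Grouper(key='date', freq='1M')
)['Sales'].transform('sum')

df['Percentage_Monthly_sale'] = monthly_product_total / monthly_total * 100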

        date  Product  Sales  Percentage_Monthly_sale
0 2020-01-01    Dell.      4                25.000000
1 2020-01-01   Apple.      6                41.666667
2 2020-01-01  Lenovo.      5                33.333333
3 2020-01-02    Dell.      2                25.000000
4 2020-01-02   Apple.      4                41.666667
5 2020-01-02  Lenovo.      3                33.333333
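As a hedged alternative sketch, the same monthly percentages can be computed by grouping on a month period derived with `dt.to_period("M")` instead of `pd.Grouper`, which sidesteps the choice of frequency alias. The DataFrame below is rebuilt without the trailing dots, and the helper Series name `month` is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01"] * 3 + ["2020-01-02"] * 3),
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})

# Calendar month of each row, e.g. Period('2020-01', 'M').
month = df["date"].dt.to_period("M").rename("month")

# Sales of each product within its month, and of all products in that month.
monthly_product_total = df.groupby([month, "Product"])["Sales"].transform("sum")
monthly_total = df.groupby(month)["Sales"].transform("sum")

df["Percentage_Monthly_sale"] = monthly_product_total / monthly_total * 100
print(df)
```

Both January dates fall into the same month, so each product gets one value repeated on all of its rows (for example Dell: (4 + 2) / 24 * 100 = 25).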
