Python: create a new column with the index of the largest values based on another column's values

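I have a DataFrame with two columns: "Goods" and "Sales". I need to add another column that ranks the sales as 1, 2, 3, ..., where 1 is the largest value, 2 is the second largest, and so on.

I hope you can help me.

My DataFrame: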

import pandas as pd

lst = [['Keyboard1', 1860], ['Keyboard2', 1650], ['Keyboard3', 900],
       ['Keyboard4', 1230], ['Keyboard5', 1150], ['Keyboard6', 1345],
       ['Mouse1', 3100], ['Mouse2', 2900], ['Mouse3', 3050],
       ['Mouse4', 2750], ['Mouse5', 4100], ['Mouse6', 3910]]

df = pd.DataFrame(lst, columns=['Goods', 'Sales'])

        Goods  Sales
0   Keyboard1   1860
1   Keyboard2   1650
2   Keyboard3    900
3   Keyboard4   1230
4   Keyboard5   1150
5   Keyboard6   1345
6      Mouse1   3100
7      Mouse2   2900
8      Mouse3   3050
9      Mouse4   2750
10     Mouse5   4100
11     Mouse6   3910

I am trying the following code:

import pandas as pd
import numpy as np

df = df.sort_values('Sales', ascending = False)
df['Largest'] = np.arange(len(df))+1

But this ranks the sales across all goods together, whereas I need the rank within each type of goods separately. My result:

        Goods  Sales  Largest
10     Mouse5   4100        1
11     Mouse6   3910        2
6      Mouse1   3100        3
8      Mouse3   3050        4
7      Mouse2   2900        5
9      Mouse4   2750        6
0   Keyboard1   1860        7
1   Keyboard2   1650        8
5   Keyboard6   1345        9
3   Keyboard4   1230       10
4   Keyboard5   1150       11
2   Keyboard3    900       12

Here is the output I need:

        Goods  Sales  Largest
10     Mouse5   4100        1
11     Mouse6   3910        2
6      Mouse1   3100        3
8      Mouse3   3050        4
7      Mouse2   2900        5
9      Mouse4   2750        6
0   Keyboard1   1860        1
1   Keyboard2   1650        2
5   Keyboard6   1345        3
3   Keyboard4   1230        4
4   Keyboard5   1150        5
2   Keyboard3    900        6

Just do:

# strip the trailing digits to get the goods type (e.g. "Mouse5" -> "Mouse")
df['goods_group'] = df['Goods'].str.replace(r'\d+$', '', regex=True)

# sort by the new column and sales
df = df.sort_values(['goods_group', 'Sales'], ascending=False)

# create largest column
df['largest'] = df.groupby('goods_group').cumcount() + 1

# drop the new column
res = df.drop(columns='goods_group')
print(res)

Output:

        Goods  Sales  largest
10     Mouse5   4100        1
11     Mouse6   3910        2
6      Mouse1   3100        3
8      Mouse3   3050        4
7      Mouse2   2900        5
9      Mouse4   2750        6
0   Keyboard1   1860        1
1   Keyboard2   1650        2
5   Keyboard6   1345        3
3   Keyboard4   1230        4
4   Keyboard5   1150        5
2   Keyboard3    900        6
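
As a possible alternative to sorting and then counting, the within-group rank can also be computed directly with groupby and rank, which leaves the original row order untouched. This is only a sketch, not part of the answer above; it assumes goods names always end in digits, and the column name Largest is taken from the question.

```python
import pandas as pd

lst = [['Keyboard1', 1860], ['Keyboard2', 1650], ['Keyboard3', 900],
       ['Keyboard4', 1230], ['Keyboard5', 1150], ['Keyboard6', 1345],
       ['Mouse1', 3100], ['Mouse2', 2900], ['Mouse3', 3050],
       ['Mouse4', 2750], ['Mouse5', 4100], ['Mouse6', 3910]]
df = pd.DataFrame(lst, columns=['Goods', 'Sales'])

# Group key: the goods name with trailing digits stripped
# (assumption: names look like "Mouse5")
group = df['Goods'].str.replace(r'\d+$', '', regex=True)

# Rank sales within each group, 1 = largest.
# method='first' breaks ties by row order, so ranks stay unique.
df['Largest'] = (df.groupby(group)['Sales']
                   .rank(method='first', ascending=False)
                   .astype(int))
print(df)
```

If the rows should also be displayed from largest to smallest, a sort_values call on the group and Sales can still be applied afterwards.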

You can groupby Goods with the digits removed:

>>> df = df.sort_values('Sales', ascending=False)
>>> df
        Goods  Sales
10     Mouse5   4100
11     Mouse6   3910
6      Mouse1   3100
8      Mouse3   3050
7      Mouse2   2900
9      Mouse4   2750
0   Keyboard1   1860
1   Keyboard2   1650
5   Keyboard6   1345
3   Keyboard4   1230
4   Keyboard5   1150
2   Keyboard3    900
>>> df['Largest'] = df.groupby(df['Goods'].replace(r'\d+', '', regex=True)).cumcount() + 1
>>> df
        Goods  Sales  Largest
10     Mouse5   4100        1
11     Mouse6   3910        2
6      Mouse1   3100        3
8      Mouse3   3050        4
7      Mouse2   2900        5
9      Mouse4   2750        6
0   Keyboard1   1860        1
1   Keyboard2   1650        2
5   Keyboard6   1345        3
3   Keyboard4   1230        4
4   Keyboard5   1150        5
2   Keyboard3    900        6
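
Note that sorting by Sales alone keeps the two groups contiguous here only because every Mouse outsells every Keyboard. The sketch below uses made-up numbers (hypothetical data, not from the question) to illustrate that cumcount still numbers each group correctly when the sales overlap, although the rows of the two groups then interleave.

```python
import pandas as pd

# Hypothetical data in which Keyboard and Mouse sales overlap
df = pd.DataFrame({'Goods': ['Keyboard1', 'Keyboard2', 'Mouse1', 'Mouse2'],
                   'Sales': [500, 300, 400, 200]})

# Sort by sales only, largest first
df = df.sort_values('Sales', ascending=False)

# Number the rows within each goods type in their sorted order
key = df['Goods'].replace(r'\d+', '', regex=True)
df['Largest'] = df.groupby(key).cumcount() + 1
print(df)
```

If the groups must appear as contiguous blocks in the output, sorting by the group key first and by Sales second is one way to get that.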

Try adding the following lines at the end of your code:

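# drop the last character to get the goods type (assumes a single-digit suffix)
df['new'] = df['Goods'].str[:-1]
# number the rows within each goods type (df is already sorted by Sales)
df['Largest'] = df.groupby('new').cumcount() + 1
# remove the helper column
df = df.drop('new', axis=1)
print(df)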

Output:

        Goods  Sales  Largest
10     Mouse5   4100        1
11     Mouse6   3910        2
6      Mouse1   3100        3
8      Mouse3   3050        4
7      Mouse2   2900        5
9      Mouse4   2750        6
0   Keyboard1   1860        1
1   Keyboard2   1650        2
5   Keyboard6   1345        3
3   Keyboard4   1230        4
4   Keyboard5   1150        5
2   Keyboard3    900        6
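
One caveat with slicing off the last character: it only works while every name ends in exactly one digit. A small sketch with hypothetical names (not from the question) compares it with stripping all trailing digits via str.rstrip:

```python
import pandas as pd

# Hypothetical names, including two-digit suffixes
goods = pd.Series(['Mouse9', 'Mouse10', 'Keyboard12'])

# Dropping one character leaves a stray digit on multi-digit names
sliced = goods.str[:-1].tolist()

# Stripping every trailing digit gives the same key regardless of suffix length
stripped = goods.str.rstrip('0123456789').tolist()

print(sliced)    # ['Mouse', 'Mouse1', 'Keyboard1']
print(stripped)  # ['Mouse', 'Mouse', 'Keyboard']
```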

Do you expect goods of the same type to always be contiguous?
Thank you very much for your effort. (It works.)