Python: get the top 10 most-purchased items as a list


python pandas dataframe


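I want to get a list of the items that were purchased most often, but so far I only get the counts. How do I get the item IDs?

I want a list like this:

[element 1, element 2, ..., element n-1, element n]

How do I get the top 10 most popular items?

My approach is to count how often each itemid occurs; the IDs that occur most often are the most-purchased items.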

import pandas as pd
d = {'userid': [0, 0, 0, 1, 2, 2, 3, 3, 4, 4, 4],
     'itemid': [715, 845, 98, 12324, 85, 715, 2112, 85, 2112, 852, 102]}
df = pd.DataFrame(data=d)
print(df.head(10))

df_new = df.groupby('itemid').count().head(10) # get the top 10 products
print(df_new)

print(df_new.values.tolist())

This is the current output:

# the dataframe
   userid  itemid
0       0     715
1       0     845
2       0      98
3       1   12324
4       2      85
5       2     715
6       3    2112
7       3      85
8       4    2112
9       4     852

# the counts
        userid
itemid        
85           2
98           1
102          1
715          2
845          1
852          1
2112         2
12324        1

# the list
[[2], [1], [1], [2], [1], [1], [2], [1]]

# what I want
[85, 98, 102, 715, 845, 852, 2112, 12324]

Sort by 'userid' (the count column), then return the index of the first 10 rows:

new_df = df.groupby('itemid').count().sort_values(by=['userid', 'itemid'], ascending=[False, True])
print(new_df[:10].index.tolist())

We can also rename userid to user_counts, in which case we sort by user_counts:

top_10 = df.groupby('itemid').count().rename(columns = {"userid": "user_counts"}).sort_values(by=['user_counts', 'itemid'], ascending=[False, True])[:10].index.tolist()
print(top_10)

If "most bought" is based on the number of rows, then it is just value_counts, which returns a result sorted by count; then slice the index:

df['itemid'].value_counts().index[0:10]
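As a minimal runnable sketch of the value_counts approach on the sample data from the question: the variable names counts and top_items are illustrative and not part of the original answer, and .tolist() is added to turn the sliced index into a plain Python list.

```python
import pandas as pd

d = {'userid': [0, 0, 0, 1, 2, 2, 3, 3, 4, 4, 4],
     'itemid': [715, 845, 98, 12324, 85, 715, 2112, 85, 2112, 852, 102]}
df = pd.DataFrame(data=d)

# value_counts() counts the rows per itemid and sorts by count, descending
counts = df['itemid'].value_counts()

# the index holds the item IDs; keep at most the first 10 as a plain list
top_items = counts.index[:10].tolist()
print(top_items)
```

The order among items with the same count is not something this sketch controls, so sort explicitly on the ID as well if ties need a fixed order.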

However, if you want to rank items by the number of distinct people who bought them, you need groupby with nunique.
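A possible sketch of the groupby and nunique variant, assuming the same sample data as in the question. The column name unique_buyers and the tie-break on itemid are assumptions made for illustration.

```python
import pandas as pd

d = {'userid': [0, 0, 0, 1, 2, 2, 3, 3, 4, 4, 4],
     'itemid': [715, 845, 98, 12324, 85, 715, 2112, 85, 2112, 852, 102]}
df = pd.DataFrame(data=d)

# number of distinct users per item (repeat purchases by one user count once)
unique_buyers = (df.groupby('itemid')['userid']
                   .nunique()
                   .rename('unique_buyers')   # hypothetical, clearer column name
                   .reset_index())

# most distinct buyers first; ties broken by ascending itemid
ranked = unique_buyers.sort_values(by=['unique_buyers', 'itemid'],
                                   ascending=[False, True])
top_by_buyers = ranked['itemid'].head(10).tolist()
print(top_by_buyers)  # [85, 715, 2112, 98, 102, 845, 852, 12324]
```

In this sample no user buys the same item twice, so the ranking matches the row-count ranking; the two differ only when repeat purchases exist.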

It looks like what you want is the index of df_new:

df_new.index.tolist()

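Thanks for your answer. Why do I have to sort by userid?

Because after count() that column actually holds the userid count for each itemid; it just keeps the column name userid.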
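To illustrate the naming point, here is a small sketch showing that count() keeps the original column name, plus one way to avoid the confusion with groupby(...).size(). The name purchase_count is an assumption chosen for readability, not something from the answers.

```python
import pandas as pd

d = {'userid': [0, 0, 0, 1, 2, 2, 3, 3, 4, 4, 4],
     'itemid': [715, 845, 98, 12324, 85, 715, 2112, 85, 2112, 852, 102]}
df = pd.DataFrame(data=d)

# count() counts non-null values in every remaining column,
# so the counts end up in a column that is still called 'userid'
count_columns = df.groupby('itemid').count().columns.tolist()
print(count_columns)  # ['userid']

# size() returns the rows per group as a Series, which can be named explicitly
purchases = (df.groupby('itemid')
               .size()
               .rename('purchase_count')   # hypothetical column name
               .reset_index())

# most purchases first; ties broken by ascending itemid
ranked = purchases.sort_values(by=['purchase_count', 'itemid'],
                               ascending=[False, True])
top_10 = ranked['itemid'].head(10).tolist()
print(top_10)  # [85, 715, 2112, 98, 102, 845, 852, 12324]
```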