Python 3.x: rank apps by download count within each genre and keep only the top 2 per genre
I'm trying to get the top 2 most-downloaded apps in each genre from the dataset below. This is the dataset I'm using:
import pandas as pd
df = pd.DataFrame(data={'genre_id': ['tools', 'tools', 'VIDEO_PLAYERS',
                                     'VIDEO_PLAYERS', 'PHOTOGRAPHY'],
                        'app_id': ['MP3Cutter', 'PhotoCutter', 'VLC', 'MXPlayer', 'Picasa'],
                        'min_installs': [10, 100, 10, 20, 1000]})
df
This is what I tried:
df['default_rank'] = df.groupby(['genre_id']).agg(['rank'])
df.sort_values(by='default_rank')
The output I get is:
        genre_id       app_id  min_installs  default_rank
0          tools    MP3Cutter            10           1.0
2  VIDEO_PLAYERS          VLC            10           1.0
4    PHOTOGRAPHY       Picasa          1000           1.0
1          tools  PhotoCutter           100           2.0
3  VIDEO_PLAYERS     MXPlayer            20           2.0
But I want something like this:
        genre_id       app_id  min_installs  default_rank
4    PHOTOGRAPHY       Picasa          1000           1.0
1          tools  PhotoCutter           100           1.0
0          tools    MP3Cutter            10           2.0
3  VIDEO_PLAYERS     MXPlayer            20           1.0
2  VIDEO_PLAYERS          VLC            10           2.0
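I'm new to pandas. Can pandas do this kind of advanced data manipulation, the way we would in SQL?
I believe you need to rank within each group and then sort by each group's maximum value, using: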
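# Largest min_installs of each genre, broadcast back to every row
s = df.groupby('genre_id')['min_installs'].transform('max')
# Rank within each genre, highest install count first
df['default_rank'] = (df.groupby('genre_id')['min_installs']
                        .rank(method='max', ascending=False))
# Order genres by their largest value, then rows by rank inside each genre
df = (df.assign(m=s)
        .sort_values(by=['m', 'default_rank', 'genre_id'],
                     ascending=[False, True, True])
        .drop('m', axis=1))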
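The `method` argument of `rank` only matters when two apps in the same genre have the same install count. As a minimal sketch on a made-up Series (the values below are illustrative and not taken from the question's data), this is how the tie-handling options differ:

```python
import pandas as pd

# Hypothetical install counts with a tie at the top.
installs = pd.Series([100, 100, 10])

# ascending=False gives the largest value the best (lowest) rank.
rank_max = installs.rank(method='max', ascending=False).tolist()
rank_min = installs.rank(method='min', ascending=False).tolist()
rank_dense = installs.rank(method='dense', ascending=False).tolist()
rank_first = installs.rank(method='first', ascending=False).tolist()

print(rank_max)    # [2.0, 2.0, 3.0]  tied rows share the worst rank of the tie
print(rank_min)    # [1.0, 1.0, 3.0]  tied rows share the best rank of the tie
print(rank_dense)  # [1.0, 1.0, 2.0]  like 'min', but no gaps after a tie
print(rank_first)  # [1.0, 2.0, 3.0]  ties broken by order of appearance
```

With `method='max'`, a tie for first place yields no row ranked 1, so `method='first'` or `method='dense'` may be a better fit if a later step filters on the rank value.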
If you need to keep only the top 2 values per group:
df = df.groupby('genre_id').head(2)
print(df)
        genre_id       app_id  min_installs  default_rank
4    PHOTOGRAPHY       Picasa          1000           1.0
1          tools  PhotoCutter           100           1.0
0          tools    MP3Cutter            10           2.0
3  VIDEO_PLAYERS     MXPlayer            20           1.0
2  VIDEO_PLAYERS          VLC            10           2.0
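For the SQL comparison in the question, one possible alternative is to filter on the rank column directly, which roughly mirrors `ROW_NUMBER() OVER (PARTITION BY genre_id ORDER BY min_installs DESC) <= 2` and does not depend on the frame already being sorted. This is only a sketch: the `Zipper` row is a hypothetical addition so that the filter actually drops something, and the result is ordered alphabetically by genre rather than by each genre's largest value.

```python
import pandas as pd

# Question's data plus one hypothetical row ('Zipper') so 'tools' has three apps.
df = pd.DataFrame({
    'genre_id': ['tools', 'tools', 'tools',
                 'VIDEO_PLAYERS', 'VIDEO_PLAYERS', 'PHOTOGRAPHY'],
    'app_id': ['MP3Cutter', 'PhotoCutter', 'Zipper',
               'VLC', 'MXPlayer', 'Picasa'],
    'min_installs': [10, 100, 1, 10, 20, 1000],
})

# Rank inside each genre, highest install count first.
# method='first' breaks ties by row order, similar to ROW_NUMBER() in SQL.
df['default_rank'] = (df.groupby('genre_id')['min_installs']
                        .rank(method='first', ascending=False))

# Keep ranks 1 and 2 only, then order by genre and rank for readability.
top2 = (df[df['default_rank'] <= 2]
        .sort_values(['genre_id', 'default_rank']))

print(top2)
```

Filtering on the rank keeps the "top 2" rule explicit, whereas `groupby(...).head(2)` returns the first two rows of each group in whatever order the frame currently has.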