Warning: file_get_contents(/data/phpspider/zhask/data//catemap/2/python/288.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
从groupby对象Python创建字典_Python_Dictionary_Pandas_Group By - Fatal编程技术网

从groupby对象Python创建字典

从groupby对象Python创建字典,python,dictionary,pandas,group-by,Python,Dictionary,Pandas,Group By,假设我有一个数据帧: df = pd.DataFrame({'Type' : ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],'Name' : ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip','Pigeon', 'Mudkip', 'Jerry', 'Pigeon']}) 我根据类型对其进行分组: print df.groupby(['Typ

假设我有一个数据帧:

df = pd.DataFrame({'Type' : ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],'Name' : ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip','Pigeon', 'Mudkip', 'Jerry', 'Pigeon']})  
我根据类型对其进行分组:

print df.groupby(['Type','Name'])['Type'].agg({'Frequency':'count'})

                           Frequency
Type    Name                  
Bird    Flappy Bird          1
        Pigeon               2
Pokemon Jerry                3
        Mudkip               2
我可以从上面的组中创建词典吗??
“Bird”
将有一个列表值,其中包含
['bike',Flappy Bird']
请注意,更高频率的名称应首先出现在值列表

预期输出:

dict1 = { 'Bird':['Pigeon','Flappy Bird'] , 'Pokemon':['Jerry','Mudkip'] }

您可以使用字典理解创建字典,如下所示

df = pd.DataFrame({'Type' : ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],'Name' : ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip','Pigeon', 'Mudkip', 'Jerry', 'Pigeon']})  
f = df.groupby(['Type','Name'])['Type'].agg({'Frequency':'count'})
f.sort('Frequency',ascending=False, inplace=True)

d = {k:list(f.ix[k].index) for k in f.index.levels[0]}
print(d)
# {'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}
字典理解将遍历外部索引(“Bird”、“Pokemon”),然后将该值设置为字典的内部索引

必须首先按
频率
列对
多索引
进行排序,以获得所需的顺序。

这里有一行

df.groupby(['Type'])['Name'].apply(lambda grp: list(grp.value_counts().index)).to_dict()

# output
#{'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}
value\u counts
函数按计数隐式分组
Name
字段,默认情况下返回降序

奖金:如果您想包括计数,您可以执行以下操作

df.groupby(['Type']).apply(lambda grp: grp.groupby('Name')['Type'].count().to_dict()).to_dict()

# {'Bird': {'Flappy Bird': 1, 'Pigeon': 2}, 'Pokemon': {'Jerry': 3, 'Mudkip': 2}}

DataFrame.sort()
已被弃用,现在已被删除。现在使用
f.对值进行排序()
Hi@DanDy:你的奖金部分帮助了我。你能详细说明一下吗?我想知道它是怎么工作的。谢谢