Counting the frequency of specific lists in a Python dictionary
Suppose I have a dictionary:
thisdict = {
"1": ['Vanilla','Chocolate']
"2": ['Vanilla']
"7": ['Chocolate']
"8": ['Chocolate','Vanilla']
}
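(Note: the keys are ID numbers.)
I want to know how many times each specific list occurs, regardless of the order of its elements. So I would like my result to be: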
['Chocolate','Vanilla'] = 2
['Chocolate'] = 1
['Vanilla'] = 1
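How can I do this?
Here is what I have tried so far. chief is the name of the dictionary, and I want to find the frequency of its values: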
track = {}
for key,value in chief.items():
    if value not in track:
        track[value]=0
    else:
        track[value]+=1
print(track)
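But lists are unhashable, so this doesn't work.
Thanks very much for your help.
It looks like you want to count the values in a way where order doesn't matter. You could convert each list to a set, but sets are unhashable, which makes that a bit awkward. You can use a frozenset instead, which is hashable and lets ['Chocolate','Vanilla'] and ['Vanilla','Chocolate'] count as the same: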
from collections import Counter
thisdict = {
"1": ['Vanilla','Chocolate'],
"2": ['Vanilla'],
"7": ['Chocolate'],
"8": ['Chocolate','Vanilla']
}
counts = Counter(map(frozenset, thisdict.values()))
counts will be a Counter instance like:
Counter({frozenset({'Chocolate', 'Vanilla'}): 2,
frozenset({'Vanilla'}): 1,
frozenset({'Chocolate'}): 1})
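As a quick illustration of why frozenset fits here, the sketch below checks which of the three container types can be hashed, and therefore used as a Counter or dict key. The is_hashable helper and the flavors variable are hypothetical names introduced only for this example.

```python
# Hypothetical helper, for illustration only: True if obj can be a dict key.
def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True

flavors = ['Vanilla', 'Chocolate']

list_ok = is_hashable(flavors)                  # lists are mutable, so unhashable
set_ok = is_hashable(set(flavors))              # sets are mutable too
frozenset_ok = is_hashable(frozenset(flavors))  # frozensets are immutable

# Two frozensets with the same elements compare equal regardless of order.
same_key = frozenset(['Vanilla', 'Chocolate']) == frozenset(['Chocolate', 'Vanilla'])

print(list_ok, set_ok, frozenset_ok, same_key)  # False False True True
```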
Since order doesn't matter, you can convert the lists to frozensets so that collections.Counter can count the frequency of each set of values:
from collections import Counter
for combination, count in Counter(map(frozenset, thisdict.values())).items():
    print(f'{list(combination)} = {count}')
This produces the following (the order of elements within each list may vary, since sets are unordered):
['Vanilla', 'Chocolate'] = 2
['Vanilla'] = 1
['Chocolate'] = 1
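If a reproducible output is wanted, one option is to sort each combination and the rows themselves before printing. This is only a sketch: the sort key (highest count first, then alphabetical) and the lines variable are choices made for this example, not something the approach requires.

```python
from collections import Counter

thisdict = {
    "1": ['Vanilla', 'Chocolate'],
    "2": ['Vanilla'],
    "7": ['Chocolate'],
    "8": ['Chocolate', 'Vanilla'],
}

counts = Counter(map(frozenset, thisdict.values()))

# Order rows by descending count, then alphabetically by their sorted elements,
# so the output does not depend on set iteration order.
rows = sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
lines = [f'{sorted(combination)} = {count}' for combination, count in rows]

for line in lines:
    print(line)
# ['Chocolate', 'Vanilla'] = 2
# ['Chocolate'] = 1
# ['Vanilla'] = 1
```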
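How about first making a sorted copy of every list in the dict, then converting each one to a tuple, and then hashing each tuple? This should handle duplicate values better than a set does:
test = [tuple(sorted(x)) for x in thisdict.values()]
Create a table of hashes of the sorted tuples for lookup:
mytable = [hash(x) for x in test]
Then run a loop over it: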
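result = []
for item in test:
    count = 0
    for i, value in enumerate(mytable):
        if hash(item) == value and count == 0:
            # First occurrence: start a new [tuple, count] entry.
            count += 1
            result.append([item, count])
        elif hash(item) == value and count >= 1:
            # Repeat occurrence: bump the entry just added (result[-1], not result[0]).
            result[-1][1] += 1
            # Mark the duplicate so the outer loop does not count it again.
            test[i] = '(None)'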
This gives:
[[('Chocolate', 'Vanilla'), 2], [('Vanilla',), 1], [('Chocolate',), 1]]
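The same sorted-tuple idea can also be fed straight into collections.Counter, which avoids the manual hash table and nested loop. The sketch below uses a hypothetical orders dictionary with one extra entry containing a repeated flavor, to show where sorted tuples and frozensets differ: a frozenset collapses ['Vanilla', 'Vanilla'] into the same key as ['Vanilla'], while a sorted tuple keeps them apart.

```python
from collections import Counter

# Hypothetical data: entry "2" repeats a flavor on purpose.
orders = {
    "1": ['Vanilla', 'Chocolate'],
    "2": ['Vanilla', 'Vanilla'],
    "3": ['Vanilla'],
    "4": ['Chocolate', 'Vanilla'],
}

# Sorted tuples ignore order but preserve repeated elements.
by_tuple = Counter(tuple(sorted(v)) for v in orders.values())

# Frozensets ignore order and also discard repeated elements.
by_frozenset = Counter(frozenset(v) for v in orders.values())

print(by_tuple)
print(by_frozenset)
```

Which of the two is appropriate depends on whether a repeated element should make a list count as a different combination.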
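FYI: your dict is malformed (the entries are missing the commas between them), but once that is fixed you can also do this fairly simply with pandas: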
import pandas as pd
df = pd.DataFrame({'vals':[*thisdict.values()],'keys':[*thisdict.keys()]})
df
                   vals keys
0  [Vanilla, Chocolate]    1
1             [Vanilla]    2
2           [Chocolate]    7
3  [Chocolate, Vanilla]    8
out = df['vals'].apply(lambda x: tuple(sorted(x))).value_counts()
out
(Chocolate, Vanilla)    2
(Chocolate,)            1
(Vanilla,)              1
Name: vals, dtype: int64
You could convert the lists to tuples, which are hashable. Or better, convert them to sets, so that
set(['Chocolate','Vanilla'])==set(['Vanilla','Chocolate'])
except that sets aren't hashable either...
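Following the tuple suggestion, here is a minimal sketch of how the loop from the question might be adapted: each list is turned into a sorted tuple before being used as a dictionary key. The chief data below is assumed to match the example dictionary from the question, and hashable_value is a name introduced for this example. Note that the sketch counts the first sighting as 1, whereas the original loop initialised new keys to 0.

```python
# Assumed to hold the same data as the question's example dictionary.
chief = {
    "1": ['Vanilla', 'Chocolate'],
    "2": ['Vanilla'],
    "7": ['Chocolate'],
    "8": ['Chocolate', 'Vanilla'],
}

track = {}
for key, value in chief.items():
    # A sorted tuple is hashable and identical for any ordering of the list.
    hashable_value = tuple(sorted(value))
    # dict.get supplies 0 for unseen keys, so the first sighting counts as 1.
    track[hashable_value] = track.get(hashable_value, 0) + 1

print(track)
# {('Chocolate', 'Vanilla'): 2, ('Vanilla',): 1, ('Chocolate',): 1}
```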