Frequency of elements in pairwise comparisons of lists within a list in Python


I have a list of lists like this:

my_list_of_lists = [
    ['sparrow', 'sparrow', 'sparrow', 'junco', 'jay', 'robin'],
    ['sparrow', 'sparrow', 'junco', 'sparrow', 'robin', 'robin'],
    ['sparrow', 'sparrow', 'sparrow', 'sparrow', 'jay', 'robin'],
]
I would like to do pairwise comparisons of all the lists at each position, like so:

#1 with 2
['sparrow','sparrow','sparrow','junco','jay','robin']
['sparrow','sparrow','junco', 'sparrow','robin','robin']

#1 with 3
['sparrow','sparrow','sparrow','junco','jay','robin']
['sparrow','sparrow','sparrow','sparrow','jay','robin']

#2 with 3
['sparrow','sparrow','junco', 'sparrow','robin','robin']
['sparrow','sparrow','sparrow','sparrow','jay','robin']
So, for lists 1 and 2, the pairs are:

pairs = [('sparrow', 'sparrow'), ('sparrow', 'sparrow'), ('sparrow', 'junco'), ('junco', 'sparrow'), ('jay', 'robin'), ('robin', 'robin')]
I would like to get the counts and frequencies of the pairs in each comparison:

pairs = [('sparrow', 'sparrow'), ('sparrow', 'sparrow'), ('sparrow', 'junco'), ('junco', 'sparrow'), ('jay', 'robin'), ('robin', 'robin')]

sparrowsparrow_counts = 2
juncosparrow_counts = 2
jayrobin_counts = 1
robinrobin_counts = 1

frequency_of_combos = {('sparrow', 'sparrow'): .333, ('sparrow', 'junco'): .333, ('jay', 'robin'): .167, ('robin', 'robin'): .167}
I have tried zipping, but I end up zipping all of the lists together (not pairwise) into tuples, and I am stumped on the rest.


I think this is somewhat related, but I can't figure out how to apply it to my data.

Zip the two lists, filter out the pairs that do not match, and count the rest with collections.Counter:

from collections import Counter

a = ['sparrow','sparrow','sparrow','junco','jay','robin']
b = ['sparrow','sparrow','junco', 'sparrow','robin','robin']
c = Counter([ i for i in zip(a,b) if i[0] == i[1]])
print(c)


Counter({('sparrow', 'sparrow'): 2, ('robin', 'robin'): 1})

You seem to have the frequency part figured out already, but this should clear up how to use zip and Counter.
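The answer above counts only the positions where both lists agree. To get the counts and frequencies the question asks for, one option is to keep every zipped pair and divide each count by the number of positions. This is a minimal sketch, assuming that order within a pair does not matter, so ('sparrow', 'junco') and ('junco', 'sparrow') are counted together as in the question's expected output. pair_frequencies is a hypothetical helper name, not part of the original answer.

```python
from collections import Counter

def pair_frequencies(a, b):
    """Count position-wise pairs of a and b, ignoring order within a pair."""
    # Sorting each pair maps ('sparrow', 'junco') and ('junco', 'sparrow')
    # to the same key (assumption: mirrored pairs should be merged).
    counts = Counter(tuple(sorted(pair)) for pair in zip(a, b))
    total = sum(counts.values())
    freqs = {pair: n / total for pair, n in counts.items()}
    return counts, freqs

a = ['sparrow', 'sparrow', 'sparrow', 'junco', 'jay', 'robin']
b = ['sparrow', 'sparrow', 'junco', 'sparrow', 'robin', 'robin']

counts, freqs = pair_frequencies(a, b)
print(counts)
print(freqs)
```

Because each pair is sorted, the mixed sparrow/junco pair is stored under the key ('junco', 'sparrow'). Drop the sorted() call if ('a', 'b') and ('b', 'a') should be counted separately.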

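You zip a pair of dicts; use a collections.Counter structure to count the pairs. Divide by the total. Now repeat this for each of the three list pairings.

Thanks for the quick reply. To make sure I understand you correctly: I would create a dictionary of these lists (that is, a dict of lists), then use collections.Counter to count the pairs for each pairwise comparison. Dividing by the total gives the frequency. As for the repetition, in the full dataset I have to iterate over about 75 of these lists, so I would loop through the dictionary. Is that a correct interpretation?

Sorry, that should be: zip a pair of the three lists, not dicts. If you have 75 lists, then I suggest using itertools.combinations(list_of_lists, 2) to generate the list pairs.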
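The suggestion in the last comment could be sketched as follows. This is only an illustration under assumptions: the results dictionary and its (i, j) keys, which use 1-based list numbers to match the "#1 with 2" labels in the question, are hypothetical names chosen here. Pairs are kept in the order zip produces them, so ('sparrow', 'junco') and ('junco', 'sparrow') are counted separately.

```python
from collections import Counter
from itertools import combinations

my_list_of_lists = [
    ['sparrow', 'sparrow', 'sparrow', 'junco', 'jay', 'robin'],
    ['sparrow', 'sparrow', 'junco', 'sparrow', 'robin', 'robin'],
    ['sparrow', 'sparrow', 'sparrow', 'sparrow', 'jay', 'robin'],
]

results = {}
# enumerate(..., start=1) labels the lists 1, 2, 3; combinations(..., 2)
# yields every unordered pair of lists exactly once.
for (i, a), (j, b) in combinations(enumerate(my_list_of_lists, start=1), 2):
    counts = Counter(zip(a, b))      # position-wise pairs for lists i and j
    total = sum(counts.values())     # number of positions compared
    results[(i, j)] = {pair: n / total for pair, n in counts.items()}

for key, freqs in results.items():
    print(key, freqs)
```

With 75 lists the same loop visits 75 * 74 / 2 = 2775 pairings, so no dictionary of lists is needed to drive the iteration.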