Python - Intersecting a list of dictionaries


I have the following list of dictionaries:

artist_and_tags = [{u'Yo La Tengo': ['indie', 'indie rock', 'seen live', 'alternative', 'indie pop', 'rock', 'post-rock', 'dream pop', 'shoegaze', 'noise pop', 'folk', 'experimental', 'alternative rock', 'american', 'lo-fi', 'pop', 'new jersey', 'yo la tengo', 'usa', 'noise rock', '90s', 'noise', '00s', 'ambient', 'post-punk', '80s', 'mellow', 'psychedelic', 'hoboken', 'experimental rock', 'singer-songwriter', 'post rock', 'electronic', 'female vocalists', 'alt-country', 'dreamy', 'matador', 'chillout', 'instrumental', 'favorites', 'punk', 'electronica', 'slowcore', 'folk rock', 'new wave', 'jazz', 'eclectic', 'new york', 'emo']}, {u'Radiohead': ['alternative', 'alternative rock', 'rock', 'indie', 'electronic', 'seen live', 'british', 'britpop', 'indie rock', 'experimental', 'radiohead', 'progressive rock', '90s', 'electronica', 'art rock', 'experimental rock', 'post-rock', 'psychedelic', 'uk', 'male vocalists', 'pop', '00s', 'ambient', 'chillout', 'progressive', 'favorites', 'melancholic', 'awesome', 'overrated', 'english', 'beautiful', 'classic rock', 'genius', 'melancholy', 'better than radiohead', 'trip-hop', 'idm', 'indie pop', 'emo']}, {u'Portishead': ['trip-hop', 'electronic', 'female vocalists', 'chillout', 'trip hop', 'alternative', 'electronica', 'seen live', 'downtempo', 'british', 'indie', 'portishead', 'experimental', 'ambient', 'female vocalist', 'alternative rock', '90s', 'lounge', 'mellow', 'bristol', 'jazz', 'psychedelic', 'chill', 'melancholic', 'triphop', 'uk', 'rock', 'bristol sound', 'acid jazz', 'lo-fi']}]
I use it to establish relationships between artists.

To do that, I am doing:

tags0 = set(artist_and_tags[0].values()[0])
tags1 = set(artist_and_tags[1].values()[0])
tags2 = set(artist_and_tags[2].values()[0])
Then I compare these sets pairwise, and as a result:

I find that "Yo La Tengo" is closer to "Radiohead" than to "Portishead", with 20 intersecting tags.
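
The pairwise comparison described here can be sketched roughly as follows. This is only an illustration: the three short tag sets are hypothetical stand-ins for tags0, tags1 and tags2, not the full data above, and the names common_01, common_02 and common_12 are made up for the example.

```python
# Hypothetical, shortened tag sets standing in for tags0, tags1 and tags2.
tags0 = {'indie', 'rock', 'shoegaze', 'ambient'}
tags1 = {'indie', 'rock', 'britpop', 'ambient'}
tags2 = {'trip-hop', 'ambient', 'rock'}

# One intersection per pair of artists; its size is the similarity score.
common_01 = len(tags0 & tags1)
common_02 = len(tags0 & tags2)
common_12 = len(tags1 & tags2)

print(common_01, common_02, common_12)
```

With three artists this takes three hand-written lines; with n artists it would take n * (n - 1) / 2 of them, which is where the redundancy comes from.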

This code works, but it seems somewhat redundant.

Question:

Is there a way to use this logic in a for loop (or wrapped in a simple function) so that it works with a list containing n artists?

You can use itertools.combinations:

import itertools
import collections

ArtistTags = collections.namedtuple('ArtistTags', ('name', 'tags'))
tags = (ArtistTags(artist, set(tags))
        for artists_dict in artist_and_tags
        for artist, tags in artists_dict.items())
artist_pairings = itertools.combinations(tags, 2)
intersections = ((len(a.tags & b.tags), a, b) for a, b in artist_pairings)
for n, a, b in sorted(intersections, reverse=True):
    print(n, a.name, b.name)
Output:

20 Yo La Tengo Radiohead
16 Yo La Tengo Portishead
16 Radiohead Portishead
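
Since the question also asks about wrapping the logic in a function, here is one possible sketch of that. The function name rank_artist_pairs and the tiny sample data are hypothetical, chosen only for illustration; the approach is the same itertools.combinations idea as above, with plain tuples in place of the namedtuple.

```python
import itertools

def rank_artist_pairs(artist_and_tags):
    """Return (shared_tag_count, name_a, name_b) tuples, most similar first."""
    # Flatten the list of single-key dicts into (name, tag_set) pairs.
    artists = [(name, set(tags))
               for artist_dict in artist_and_tags
               for name, tags in artist_dict.items()]
    # Every unordered pair of artists, for any number of artists.
    pairs = itertools.combinations(artists, 2)
    scored = [(len(tags_a & tags_b), name_a, name_b)
              for (name_a, tags_a), (name_b, tags_b) in pairs]
    # Sort on the count only, so ties keep their original pairing order.
    return sorted(scored, key=lambda item: item[0], reverse=True)

# Hypothetical sample data in the same shape as artist_and_tags.
sample = [
    {'A': ['rock', 'indie', 'pop']},
    {'B': ['rock', 'indie', 'jazz']},
    {'C': ['jazz', 'pop']},
]
ranking = rank_artist_pairs(sample)
for count, name_a, name_b in ranking:
    print(count, name_a, name_b)
```

Sorting on the count alone avoids comparing the tag sets themselves when two pairs share the same number of tags.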

You should store sets as the values, shouldn't you? Unless you need ordering, or possibly duplicate items?
tags0 = set(artist_and_tags[0].values()[0]) --> TypeError: 'dict_values' object does not support indexing
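
That error is what Python 3 raises, because dict.values() returns a view there rather than a list. One possible workaround, sketched here with shortened hypothetical data, is to pull the first value through an iterator:

```python
# Shortened, hypothetical data in the same shape as the question's list.
artist_and_tags = [{'Radiohead': ['rock', 'indie']}]

# dict.values() is a view in Python 3, so take its first item via an iterator.
tags0 = set(next(iter(artist_and_tags[0].values())))
print(tags0)
```

Wrapping the view in list(...) before indexing would also work; the iterator form just avoids building an intermediate list.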
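Given n artists, do you want to find the two that match best, or all of the matches?
All of the matches, yes.
It's a nested comprehension. It is equivalent to: tags = []; for artists_dict in artist_and_tags: for artist, tags in artists_dict.items(): tags.append(ArtistTags(artist, set(tags)))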