Python: group lists by the values they contain


I am trying to group lists by the values they contain:

a = [1,2]
b = [0,1,2]
c = [0,1]
d = [3,4]
e = [3,4,5]
f = [4,5]
g = [2,6]
h = [7,8]
So if a value in one list also appears in another list, I want to put those lists into the same group. In my case, the expected output is:

out = ([a,b,c,g],[d,e,f],[h])
To achieve this, I tried the following:

lists = IN[0] 
values = IN[1] 
out = [] 
out1=[] 
def check(valeur , lists): 
    if valeur in lists: 
        result = True 
    else: 
        result = False 
    return result 

for list in lists: 
    out1=[] 
    for i in values: 
        out1.append(check(i,list)) 
    out.append(out1) 
OUT = out 
My poor brain is about to give up.

Thanks for your help.

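This is easy to view as a graph problem. Each list is a node, two nodes have an edge between them if they share any common element, and you want the connected components of the graph, i.e., each set of nodes in which a path exists between any two nodes. Using the popular graph library networkx: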

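import networkx
from itertools import combinations
from pprint import pprint

a = [1, 2]
b = [0, 1, 2]
c = [0, 1]
d = [3, 4]
e = [3, 4, 5]
f = [4, 5]
g = [2, 6]
h = [7, 8]

# Lists are not hashable, so each node is a frozenset of the list's values
nodes = list(map(frozenset, [a, b, c, d, e, f, g, h]))
G = networkx.Graph()
G.add_nodes_from(nodes)
# Add an edge between every pair of nodes that share at least one value
for n1, n2 in combinations(nodes, 2):
    if n1.intersection(n2):
        G.add_edge(n1, n2)
pprint(list(networkx.connected_components(G)))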
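
If networkx is not available (the comments mention an IronPython environment where the import failed), the same connected-components idea can be sketched with the standard library only. This is a minimal sketch, not the code from the answer above: the function name `group_lists` and the `owners` map are assumptions introduced for illustration, and the search is a plain breadth-first traversal.

```python
from collections import defaultdict, deque

def group_lists(lists):
    """Group lists that share values, directly or through other lists."""
    # Map each value to the indices of the lists that contain it.
    owners = defaultdict(list)
    for idx, lst in enumerate(lists):
        for value in lst:
            owners[value].append(idx)

    seen = set()
    groups = []
    for start in range(len(lists)):
        if start in seen:
            continue
        # Breadth-first search over lists reachable through shared values.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            idx = queue.popleft()
            component.append(idx)
            for value in lists[idx]:
                for other in owners[value]:
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
        # Keep the original input order inside each group.
        groups.append([lists[i] for i in sorted(component)])
    return groups

a, b, c, d = [1, 2], [0, 1, 2], [0, 1], [3, 4]
e, f, g, h = [3, 4, 5], [4, 5], [2, 6], [7, 8]
for group in group_lists([a, b, c, d, e, f, g, h]):
    print(group)
```

Unlike the networkx version, this returns the original lists rather than frozensets, so the groups can be used directly as `[a, b, c, g]`, `[d, e, f]`, and `[h]`.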

Demo here:

This can be solved with a graph.

The nodes of the graph are the following lists:

a = [1,2]
b = [0,1,2]
c = [0,1]
d = [3,4]
e = [3,4,5]
f = [4,5]
g = [2,6]
h = [7,8]
Two nodes are connected if they share a common element (i.e., the two lists intersect).
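
To make the edge rule concrete, here is a small sketch that lists which pairs of lists end up connected. The `named` dictionary and the `edges` variable exist only for this illustration.

```python
from itertools import combinations

# Hypothetical name -> list mapping, used only to print readable edges.
named = {"a": [1, 2], "b": [0, 1, 2], "c": [0, 1], "d": [3, 4],
         "e": [3, 4, 5], "f": [4, 5], "g": [2, 6], "h": [7, 8]}

# Two lists are connected when their value sets are not disjoint.
edges = [(x, y) for x, y in combinations(named, 2)
         if not set(named[x]).isdisjoint(named[y])]
print(edges)
```

Note that `c` and `g` share no value and so have no direct edge, yet they land in the same group because both are connected to `a` and `b`. That is why connected components are needed rather than a pairwise check alone.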

Optimized union-find code:

def find(data, i):
  """Find the root of i, compressing the path along the way.

  Combined with union by rank, this runs in near-constant amortized time.
  """
  if i != data[i]:
      data[i] = find(data, data[i])
  return data[i]

# A function that does union of two sets of x and y 
# (uses union by rank) 
def union(data, rank, x, y): 
  xroot = find(data, x) 
  yroot = find(data, y) 

  # Attach smaller rank tree under root of  
  # high rank tree (Union by Rank) 
  if rank[xroot] < rank[yroot]: 
      data[xroot] = yroot 
  elif rank[xroot] > rank[yroot]: 
      data[yroot] = xroot 

  # If ranks are same, then make one as root  
  # and increment its rank by one 
  else : 
      data[yroot] = xroot 
      rank[xroot] += 1
# Aggregate lists
lst = [a, b, c, d, e, f, g, h]
n = len(lst)
# Elements of lists as node vertices (0, 1, 2, etc.)
# With: 0 -> a, 1 -> b, 2 -> c, 3 -> d, etc. of our sublists
data = [i for i in range(n)]  # list of vertices
# initialize ranks
rank = [0]*n

# Find pairs of vertices with connections (create generator)
connections = ((i, j) for i in range(n) for j in range(i+1, n) if len(set(lst[i]).intersection(lst[j])) > 0)

# Create union of all connections
for i, j in connections:
  union(data, rank, i, j)

# Aggregate all connected components (i.e. vertices with the same root)
# Named `groups` rather than `d` so it does not overwrite the list d above
groups = {}
for i in range(n):
  # find(data, i) => root of i
  # Append all vertices with the same root to the same list
  groups.setdefault(find(data, i), []).append(lst[i])

print(*groups.values(), sep='\n')
Output

[[1, 2], [0, 1, 2], [0, 1], [2, 6]]
[[3, 4], [3, 4, 5], [4, 5]]
[[7, 8]]
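
The code above compares every pair of lists, which is fine for eight lists but grows quadratically. As a hedged variant, the sketch below links each list to the first list that contained the same value, so no pairwise comparison is needed. The names `group_by_shared_values` and `first_owner` are assumptions for illustration. For brevity it uses path halving without union by rank, which is a simplification of the code above and not a drop-in replacement.

```python
def group_by_shared_values(lists):
    """Union-find over list indices, linking lists through shared values."""
    parent = list(range(len(lists)))

    def find(i):
        # Iterative find with path halving (avoids recursion).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}  # value -> index of the first list that contained it
    for idx, lst in enumerate(lists):
        for value in lst:
            if value in first_owner:
                root_a, root_b = find(idx), find(first_owner[value])
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_owner[value] = idx

    # Collect lists by root; dicts keep insertion order in Python 3.7+.
    groups = {}
    for idx, lst in enumerate(lists):
        groups.setdefault(find(idx), []).append(lst)
    return list(groups.values())

sample = [[1, 2], [0, 1, 2], [0, 1], [3, 4], [3, 4, 5], [4, 5], [2, 6], [7, 8]]
print(*group_by_shared_values(sample), sep='\n')
```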

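What have you tried so far?
I tried checking whether each value (from 0 to 8) is in each list. But after that, I don't know what to do with that information.
Can you add the code you used to the question? Also, how should values that would appear in more than one set be handled?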
lists = IN[0] values = IN[1] out = [] out1 = [] def check(valeur, lists): if valeur in lists: result = True else: result = False return result for list in lists: out1 = [] for i in values: out1.append(check(i, list)) out.append(out1) OUT = out
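I'm not an expert, so my code may not be perfect :D I write my code in IronPython inside another piece of software, but when I tried the demo I got a ModuleNotFound error.
@Jan You most likely need to pip install networkx.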