Efficiently searching a list of lists of strings for a list of strings in Python

I have a list of lists of strings and a list of strings. For example:

L1=[["cat","dog","apple"],["orange","green","red"]]
L2=["cat","red"]
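If L1[i] contains any item from L2, I need to produce pairs (used to create edges in a graph). In my example, I need the pairs
("cat","dog"), ("cat","apple"), ("red","orange"), ("red","green")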


What approach should I use to make this as efficient as possible? (My list L1 is large.)

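I suggest converting them all to sets and using set operations (intersection) to work out which items from L2 are in each L1 sublist. You can then use set subtraction to get the items that need to be paired with each match.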

edges = []
L2set = set(L2)
for L1item in L1:
    L1set = set(L1item)
    items_in_L1item = L1set & L2set
    for item in items_in_L1item:
        items_to_pair = L1set - set([item])
        edges.extend((item, i) for i in items_to_pair)
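
If the full edge list is too large to hold in memory, the same set-based loop can be written as a generator. This is a minimal sketch rather than part of the answer above; the function name `iter_edges` is an assumption made for illustration.

```python
def iter_edges(L1, L2):
    """Lazily yield (matched_item, other_item) pairs for each sublist of L1."""
    L2set = set(L2)  # built once so membership tests are O(1) on average
    for L1item in L1:
        L1set = set(L1item)
        # Items of this sublist that also appear in L2.
        for item in L1set & L2set:
            # Pair the matched item with every other item in the sublist.
            for other in L1set - {item}:
                yield (item, other)


L1 = [["cat", "dog", "apple"], ["orange", "green", "red"]]
L2 = ["cat", "red"]

# Set iteration order is arbitrary, so collect the edges into a set.
edges = set(iter_edges(L1, L2))
```

Because the pairs are produced one at a time, they can be fed straight into a graph builder without materialising the whole list first.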

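To keep this code efficient even when L1 and L2 are large, produce the pairs with a generator instead of building a huge list of tuples. If you are working in Python 3, just use …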

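The code is easy to follow, almost plain English! First, loop over each sublist and its elements, then ask whether the element is in the list; if it is, emit every pair except the pair (x, x).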

Output:

[('cat', 'dog'), ('cat', 'apple'), ('red', 'orange'), ('red', 'green')]
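
The code for this answer is not preserved here. A minimal sketch of the loop it describes, assuming every sublist is checked against all of L2 (so L1 and L2 need not have the same length), could look like the following. The function name `pairs_in_order` is hypothetical.

```python
def pairs_in_order(L1, L2):
    """Return (x, y) pairs where x is in L2 and y is another item of x's sublist."""
    wanted = set(L2)  # set lookup avoids scanning L2 for every element
    edges = []
    for sublist in L1:
        for x in sublist:
            if x in wanted:
                # Every pair starting at x, except the self-pair (x, x).
                edges.extend((x, y) for y in sublist if y != x)
    return edges


L1 = [["cat", "dog", "apple"], ["orange", "green", "red"]]
L2 = ["cat", "red"]
result = pairs_in_order(L1, L2)
print(result)
# [('cat', 'dog'), ('cat', 'apple'), ('red', 'orange'), ('red', 'green')]
```

Iterating the sublists directly, rather than converting them to sets, keeps the pairs in the order the items appear in L1.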

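Assuming an L1 sublist may contain more than one "control" item (an item that also appears in L2), I would use … and …:

For example: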

>>> L1 = [["cat","dog","apple"],
...       ["orange","green","red"],
...       ["hand","cat","red"]]
>>> L2 = ["cat","red"]
>>> generate_edges(L1, L2)
[('apple', 'cat'),
 ('dog', 'cat'),
 ('orange', 'red'),
 ('green', 'red'),
 ('hand', 'red'),
 ('hand', 'cat')]
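
The definition of `generate_edges` is not shown above. One shape that is consistent with the example output is sketched below, assuming sets plus `itertools.product`; both choices are assumptions, not something the answer confirms, and the name `generate_edges_sketch` is hypothetical. The order of the returned pairs depends on set iteration order and may differ from the listing.

```python
from itertools import product


def generate_edges_sketch(L1, L2):
    """Pair every non-control item of a sublist with each control item in it."""
    all_controls = set(L2)
    edges = []
    for sublist in L1:
        items = set(sublist)
        controls = items & all_controls  # items of this sublist found in L2
        others = items - controls        # remaining items of the sublist
        # Cartesian product gives (other, control) for every combination.
        edges.extend(product(others, controls))
    return edges


L1 = [["cat", "dog", "apple"],
      ["orange", "green", "red"],
      ["hand", "cat", "red"]]
L2 = ["cat", "red"]
edges = generate_edges_sketch(L1, L2)
```

Note that in this sketch two control items in the same sublist (here "cat" and "red") are not paired with each other, which matches the six pairs in the example.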

If L1 is very large, you may want to consider using bisect. It requires you to flatten and sort L1 first. You could do something like this:

from bisect import bisect_left, bisect_right
from itertools import chain

L1=[["cat","dog","apple"],["orange","green","red","apple"]]
L2=["apple", "cat","red"]

M1 = [[i]*len(j) for i, j in enumerate(L1)]
M1 = list(chain(*M1))
L1flat = list(chain(*L1))
I = sorted(range(len(L1flat)), key=L1flat.__getitem__)
L1flat = [L1flat[i] for i in I]
M1 = [M1[i] for i in I]

for item in L2:
    s = bisect_left(L1flat, item)
    e = bisect_right(L1flat, item)
    print(item, M1[s:e])  # indices of the L1 sublists that contain item

#apple [0, 1]
#cat [0]
#red [1]
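
The bisect lookup above only reports which sublists contain each L2 item. A minimal sketch of how those indices could be turned back into edges follows; the function name `edges_via_bisect` and the (string, index) tuple layout are assumptions made for illustration, not part of the answer.

```python
from bisect import bisect_left, bisect_right


def edges_via_bisect(L1, L2):
    """Find each L2 item in a sorted, flattened copy of L1, then emit edges."""
    # (string, sublist index) pairs, sorted by string and then by index.
    tagged = sorted((s, i) for i, sub in enumerate(L1) for s in sub)
    keys = [s for s, _ in tagged]    # sorted strings, searched with bisect
    owners = [i for _, i in tagged]  # sublist index for each sorted string
    edges = []
    for item in L2:
        lo = bisect_left(keys, item)
        hi = bisect_right(keys, item)
        # owners[lo:hi] lists every sublist that contains item.
        for idx in owners[lo:hi]:
            edges.extend((item, other) for other in L1[idx] if other != item)
    return edges


L1 = [["cat", "dog", "apple"], ["orange", "green", "red", "apple"]]
L2 = ["apple", "cat", "red"]
edges = edges_via_bisect(L1, L2)
```

Sorting costs O(n log n) once, after which each L2 lookup is O(log n), so this layout mainly pays off when the same L1 is queried many times.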

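Have you tried the straightforward approach (which may be less efficient)?
Your code does not work for the general case the OP is asking about, because any element of L2 may appear in any sublist of L1, and the number of sublists in L1 may differ from the number of elements in L2.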