Multiple non-conjoined conditions in a Python list comprehension


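I have a list of 4-tuples. I want to check whether at least one tuple has its third element equal to 'JJ' and at least one tuple has its fourth element equal to 'nsubj'. However, they do not have to be the same tuple.

So doing something like this: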

if (any([tup for tup in parse_tree[i] if (tup[2] == 'JJ' and tup[3] == 'nsubj')])):

is wrong, because it requires a single tuple to satisfy both conditions.

If you instead write

if (any([tup for tup in parse_tree[i] if (tup[2] == 'JJ' or tup[3] == 'nsubj')])):

then lists that satisfy only one of the two conditions will also pass.

I think the only way around this is to do the following:

if any([tup for tup in parse_tree[i] if tup[2] == 'JJ']) and any([tup for tup in parse_tree[i] if tup[3] == 'nsubj']):

Is there a way to do this with just one list comprehension?
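To make the difference between the three conditions above concrete, here is a minimal sketch. The sample list `rows` is invented for illustration and is not taken from the original parse tree; it places 'JJ' and 'nsubj' in different tuples.

```python
# Hypothetical sample data: 'JJ' and 'nsubj' appear in different tuples.
rows = [
    (1, 'big', 'JJ', 'amod'),
    (2, 'dog', 'NN', 'nsubj'),
    (3, 'barks', 'VBZ', 'root'),
]

# "and" in one comprehension: a single tuple must match both conditions.
same_tuple = any(tup[2] == 'JJ' and tup[3] == 'nsubj' for tup in rows)

# "or" in one comprehension: true as soon as either condition matches anywhere.
either = any(tup[2] == 'JJ' or tup[3] == 'nsubj' for tup in rows)

# Two separate any() calls: each condition may be met by a different tuple.
both_somewhere = (any(tup[2] == 'JJ' for tup in rows)
                  and any(tup[3] == 'nsubj' for tup in rows))

print(same_tuple, either, both_somewhere)  # False True True

# With only the 'JJ' tuple present, the "or" version still passes,
# while the two-any() version correctly reports False.
only_jj = rows[:1]
either_only = any(tup[2] == 'JJ' or tup[3] == 'nsubj' for tup in only_jj)
both_only = (any(tup[2] == 'JJ' for tup in only_jj)
             and any(tup[3] == 'nsubj' for tup in only_jj))
print(either_only, both_only)  # True False
```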

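You can use zip to group all the third elements and all the fourth elements of the tuples into new tuples, and then check for the values directly in those new tuples: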

# say lst is your original list

new_lst = list(zip(*lst))  # list() is needed on Python 3, where zip returns an iterator
if 'JJ' in new_lst[2] and 'nsubj' in new_lst[3]:
     # your code

This assumes, of course, that you are allowed to create a new list from the original one.
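As a hedged sketch of the zip approach, the helper below wraps the check in a function and guards against an empty input, where the transposed result has no columns to index. The name `has_both` and the sample data are assumptions made for illustration.

```python
def has_both(lst):
    """Return True if some tuple has 'JJ' at index 2 and some tuple
    (possibly a different one) has 'nsubj' at index 3."""
    # zip(*lst) transposes rows into columns; list() makes the result indexable.
    columns = list(zip(*lst))
    if len(columns) < 4:  # empty input, or tuples shorter than 4 elements
        return False
    return 'JJ' in columns[2] and 'nsubj' in columns[3]


sample = [(1, 'big', 'JJ', 'amod'), (2, 'dog', 'NN', 'nsubj')]
print(has_both(sample))  # True
print(has_both([]))      # False
```

Note that this always builds every column in full, so it does not stop early the way `any()` with a generator does.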

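I can't think of a solution that uses only boolean operators plus any/all. Without such a primitive, you can solve this with a custom comparison object that keeps state: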

class MultiComp(object):
    def __init__(self, *conditions):
        self._conditions = conditions
        self._results = [False] * len(conditions)

    def __bool__(self):
        return all(self._results)

    __nonzero__ = __bool__

    def digest(self, elem):
        for idx, condition in enumerate(self._conditions):
            if not self._results[idx] and condition(elem):
                self._results[idx] = True
        return self

comp = MultiComp(lambda tup: tup[2] == 'JJ', lambda tup: tup[3] == 'nsubj')
any(tup for tup in ttuple if bool(comp.digest(tup)))

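Note that this is several times slower than simply evaluating the two conditions separately (about 7x in the timings below).

With the comparison class: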

In [215]: %%timeit
   .....: comp = MultiComp(lambda tup: tup[2] == 'JJ', lambda tup: tup[3] == 'nsubj')
   .....: any(tup for tup in ttuple if bool(comp.digest(tup)))
   .....:
10000 loops, best of 3: 30.6 µs per loop

Using generators properly:

In [216]: %%timeit
   .....: any(tup[2] == 'JJ' for tup in ttuple) and any(tup[3] == 'nsubj' for tup in ttuple)
   .....:
100000 loops, best of 3: 4.26 µs per loop

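Perhaps you could do something like the following, using generators (which are more efficient) instead of lists.

Is there a particular reason you need to do it with only one list?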

if any(True for tuple_list in list_of_tuple_lists if any(True for t in tuple_list if t[2]=='JJ') and any(True for t in tuple_list if t[3] == 'nsubj'))

To match your variable names:

if any(True for tuple_list in parse_tree[i] if any(True for t in tuple_list if t[2]=='JJ') and any(True for t in tuple_list if t[3] == 'nsubj'))

This code works in all cases. A simple test:

import random
import time

none = [(1,2,3,4),(1,2,3,4),(1,2,3,4),(1,2,3,4)] # none of the conditions
first = [(1,2,3,4),(1,2,'JJ',4),(1,2,3,4),(1,2,3,4)]# only first of the conditions
second = [(1,2,3,4),(1,2,'test',4),(1,2,4,'nsubj'),(1,2,3,4)]# only second of the conditions
both = [(1,2,'JJ',4),(1,2,'test',4),(1,2,3,'nsubj'),(1,2,3,4)]# both of the conditions in different tuples
same = [(1,2,'JJ','nsubj'),(1,2,'test',4),(1,2,2,4),(1,2,3,4)]# both of the conditions in same tuple
possible_tuples=[none,first,second,both,same]


def our_check(list_of_tuple_lists):
    if any(True for tuple_list in list_of_tuple_lists if any(True for t in tuple_list if t[2]=='JJ') and any(True for t in tuple_list if t[3] == 'nsubj')):
        return True
    else:
        return False


def our_check_w_lists(list_of_tuple_lists):
    if any([True for tuple_list in list_of_tuple_lists
           if any([True for t in tuple_list if t[2]=='JJ']) and any([True for t in tuple_list if t[3] == 'nsubj'])]):
        return True
    else:
        return False


def third_snippet(list_of_tuple_lists):
    if any([tup for tup in list_of_tuple_lists if tup[2] == 'JJ']) and any([tup for tup in list_of_tuple_lists if tup[3] == 'nsubj']):
        return True
    else:
        return False

def test(func):
    test_cases = []
    for n in range(100000):
        test_list=[]
        for i in range(10):
            test_list.append(random.choice(possible_tuples))
        expected = False
        if both in test_list or same in test_list:
            expected = True

        test_case = expected, test_list
        test_cases.append(test_case)
    start = time.perf_counter()  # time.clock() was removed in Python 3.8
    for expected, case_list in test_cases:
        if expected != func(case_list):
            print('%s, Fail for: %s'%(func,case_list))
            return False
    end = time.perf_counter()
    print('function:%s, %f'%(func, end-start))

test(our_check)
test(our_check_w_lists)
test(third_snippet)

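The test results show the difference in execution time between generators and list comprehensions, even for lists that are only 10 items long: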

function:<function our_check at 0x00000000028CE7B8>, 0.378369
function:<function our_check_w_lists at 0x00000000031472F0>, 1.270924
<function third_snippet at 0x00000000031E0840>, Fail for: [[...
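The timing gap is consistent with how `any()` consumes its argument: with a generator it stops at the first true value, while a list comprehension evaluates every element before `any()` sees anything. The sketch below counts predicate calls to show this; the helper `is_jj` and the data are illustrative assumptions, not part of the test above.

```python
calls = 0


def is_jj(tup):
    """Check the third element and count how many tuples were inspected."""
    global calls
    calls += 1
    return tup[2] == 'JJ'


# The match is the very first element, followed by 999 non-matching tuples.
data = [(1, 2, 'JJ', 4)] + [(1, 2, 3, 4)] * 999

calls = 0
any(is_jj(t) for t in data)    # generator: stops at the first match
gen_calls = calls

calls = 0
any([is_jj(t) for t in data])  # list: every element is evaluated first
list_calls = calls

print(gen_calls, list_calls)  # 1 1000
```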


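Although it doesn't solve your problem, using generator expressions will reduce the running time. Remove the [] from your last statement: any will stop as soon as an element is True, but the [] forces all elements to be created first.

Using a list comprehension with state is a bad idea. The simplest approach is a plain loop that sets a flag for each thing you are looking for.

@jonrsharpe Isn't that rather unpythonic?

No more so than trying to cram it into one unreadable line!

Thanks for your answer, but isn't this basically what I am already doing, just placed inside a function?

@nikhiprabhu I only put it in a function for testing. You can use the short snippet above, where the list would be parse_tree[i], to match your code. It doesn't do the same thing, because it uses only generators and creates no lists. An any() call with a generator as its argument stops as soon as it becomes true. Using a [] list comprehension builds the complete list, no matter when or where the condition is met. Also, your third snippet fails in my test.

@Nikhiprabhu I updated my answer to show the performance difference between generators and list comprehensions.

Wow, that's a big difference. I must admit I didn't notice that you had removed the list object from the any() function in your answer. However, my main question was about writing what I want in one line, as a single condition. Anyway, thanks!

@nikhiprabhu It is in one line, as a single if: if any(True for tuple_list in list_of_tuple_lists if any(True for t in tuple_list if t[2]=='JJ') and any(True for t in tuple_list if t[3] == 'nsubj')). I just formatted it as a more readable snippet on the site and moved it into a function to test the performance. You can simply inline it into your statement.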
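The plain loop with flags suggested in the comments could look roughly like this. It is a sketch, not code from the thread: the function name `found_both` and the sample data are assumptions. It makes a single pass and returns as soon as both conditions have been seen.

```python
def found_both(tuples):
    """Return True once some tuple has 'JJ' at index 2 and some tuple
    (possibly a different one) has 'nsubj' at index 3."""
    seen_jj = False
    seen_nsubj = False
    for tup in tuples:
        if tup[2] == 'JJ':
            seen_jj = True
        if tup[3] == 'nsubj':
            seen_nsubj = True
        if seen_jj and seen_nsubj:
            return True  # both found: no need to scan the rest
    return False


different = [(1, 2, 'JJ', 4), (1, 2, 3, 'nsubj')]
only_first = [(1, 2, 'JJ', 4), (1, 2, 3, 4)]
print(found_both(different))   # True
print(found_both(only_first))  # False
```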