Remove duplicate arrays from an array in Python

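I want to remove the duplicates from an array that contains multiple arrays, at the lowest possible running cost. For example,

A = [['1','2'],['3','4'],['5','6'],['1','2'],['3','4'],['7','8']]
The expected output should look like this:

Output = [['1','2'],['3','4'],['5','6'],['7','8']]
Is it possible to compare arrays inside an array? This is what I tried:

A = [['1','2'],['3','4'],['5','6'],['1','2'],['3','4'],['7','8']]
output = set()
for x in A:
    output.add(x)
print(output)
But it raises

TypeError: unhashable type: 'list'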

A simple way is:

uniques = set()
output = []
for x in A:
    val = '-'.join([str(key) for key in x])
    if val not in uniques:
        output.append(x)
        uniques.add(val)
print(output)
Output:

[['1', '2'], ['3', '4'], ['5', '6'], ['7', '8']]
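
One caveat with the joined-string key: two different rows can produce the same string, for example ['1-2', '3'] and ['1', '2-3'] both join to '1-2-3'. A minimal sketch of the same loop using a tuple as the key instead, assuming every inner item is hashable (the name dedupe_by_tuple is only illustrative):

```python
def dedupe_by_tuple(rows):
    """Remove duplicate rows, keeping the first occurrence of each."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row)  # a tuple is hashable as long as its items are
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


A = [['1', '2'], ['3', '4'], ['5', '6'], ['1', '2'], ['3', '4'], ['7', '8']]
print(dedupe_by_tuple(A))
# [['1', '2'], ['3', '4'], ['5', '6'], ['7', '8']]

# These two rows differ, but '-'.join(...) gives '1-2-3' for both.
tricky = [['1-2', '3'], ['1', '2-3']]
print(dedupe_by_tuple(tricky))
# [['1-2', '3'], ['1', '2-3']]
```

This keeps the single pass and the original order, and it also avoids building a string for every row.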

Keep it simple, e.g.:

B = list(map(list, set(map(tuple, A))))
Here's my "bakeoff" -- let me know if I misinterpreted your solution:

import timeit
from random import choice

DIGITS = list("123456789")

# one million elements in list
A = [[choice(DIGITS), choice(DIGITS)] for _ in range(1000000)]

def elena(A):  # MrName's solution is identical
    B = []

    for i in A:
        if i not in B:
            B.append(i)
    return B

def cdlane(A):

    return list(map(list, set(map(tuple, A))))

def VikashSingh(A):
    uniques = set()
    B = []

    for x in A:
        val = '-'.join([str(key) for key in x])
        if val not in uniques:
            B.append(x)
            uniques.add(val)
    return B

def AbhilekhSingh(A):
    def unique_elements(l):
        last = object()
        for item in l:
            if item == last:
                continue
            yield item
            last = item

    return list(unique_elements(sorted(A)))

# sanity check to make sure everyone agrees on the answer
B = sorted(elena(A))
assert B == sorted(cdlane(A))
assert B == sorted(VikashSingh(A))
assert B == sorted(AbhilekhSingh(A))

print("elena:", format(timeit.timeit('B = elena(A)', number=10, globals=globals()), ".3"))

print("cdlane:", format(timeit.timeit('B = cdlane(A)', number=10, globals=globals()), ".3"))

print("VikashSingh:", format(timeit.timeit('B = VikashSingh(A)', number=10, globals=globals()), ".3"))

print("AbhilekhSingh:", format(timeit.timeit('B = AbhilekhSingh(A)', number=10, globals=globals()), ".3"))
Results

elena: 17.5
cdlane: 2.04
VikashSingh: 10.0
AbhilekhSingh: 8.83
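
The set-based one-liner does not keep the rows in their original order. If order matters, one possible variant (not part of the timings above, and assuming Python 3.7+ where dict preserves insertion order) is to use dict.fromkeys in place of set. The function name dedupe_keep_order is made up for this sketch:

```python
def dedupe_keep_order(rows):
    """Drop duplicate rows while keeping first-seen order."""
    # dict.fromkeys keeps one key per distinct tuple, in insertion order.
    return [list(key) for key in dict.fromkeys(map(tuple, rows))]


A = [['1', '2'], ['3', '4'], ['5', '6'], ['1', '2'], ['3', '4'], ['7', '8']]
print(dedupe_keep_order(A))
# [['1', '2'], ['3', '4'], ['5', '6'], ['7', '8']]
```

Like the set version, this needs the inner items to be hashable, and it returns new list objects rather than the original rows.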

Here is a simple solution:

In [27]: A = [['1','2'],['3','4'],['5','6'],['1','2'],['3','4'], ['7','8']]

In [28]: new_list = []

In [29]: for i in A:
    ...:     if i not in new_list:
    ...:         new_list.append(i)
    ...:         

In [30]: new_list
Out[30]: [['1', '2'], ['3', '4'], ['5', '6'], ['7', '8']]

You can sort the list and compare each element with the one before it.

List length: n
Element length: m
Complexity: sorting O(n * log(n) * m) + comparison O(n * m) = total O(n * log(n) * m)
Try this:

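def unique_elements(l):
    # Yield each item that differs from the one before it; assumes l is sorted.
    last = object()  # sentinel that compares unequal to any real item
    for item in l:
        if item == last:
            continue
        yield item
        last = item

def remove_duplicates(l):
    # Sorting puts equal items next to each other, so one pass removes them.
    return list(unique_elements(sorted(l)))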
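
The same sort-then-skip-neighbours idea can probably be written with itertools.groupby from the standard library, which groups consecutive equal items. A short sketch, assuming the rows can be compared with each other (true for lists of strings); remove_duplicates_sorted is just an illustrative name:

```python
from itertools import groupby


def remove_duplicates_sorted(rows):
    """Sort the rows, then keep one representative per run of equal rows."""
    # groupby yields (key, group) pairs; with no key function the key is the row itself.
    return [key for key, _group in groupby(sorted(rows))]


A = [['1', '2'], ['3', '4'], ['5', '6'], ['1', '2'], ['3', '4'], ['7', '8']]
print(remove_duplicates_sorted(A))
# [['1', '2'], ['3', '4'], ['5', '6'], ['7', '8']]
```

As with the generator version, the result comes back in sorted order, not in the order the rows first appeared.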

Another possibly simple solution, though I'm not sure how its "cost" compares with the other proposed solutions:

A = [['1','2'],['3','4'],['5','6'],['1','2'],['3','4'],['7','8']]

res = []
for entry in A:
    if entry not in res:
        res.append(entry)
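
On the cost question: the membership test on a list scans the kept rows one by one, so the work grows with the number of distinct rows. A rough sketch that spells out that scan and counts the row comparisons (the helper dedupe_with_scan_count is hypothetical, written only to make the count visible):

```python
def dedupe_with_scan_count(rows):
    """Order-preserving dedupe by linear scan; also returns the comparison count."""
    result = []
    comparisons = 0
    for row in rows:
        found = False
        for kept in result:  # this loop is what "row in result" does internally
            comparisons += 1
            if kept == row:
                found = True
                break
        if not found:
            result.append(row)
    return result, comparisons


A = [['1', '2'], ['3', '4'], ['5', '6'], ['1', '2'], ['3', '4'], ['7', '8']]
print(dedupe_with_scan_count(A))
# ([['1', '2'], ['3', '4'], ['5', '6'], ['7', '8']], 9)

# With n rows that are all different, the count is n * (n - 1) / 2.
print(dedupe_with_scan_count([[i] for i in range(100)])[1])
# 4950
```

So the approach is quadratic in the worst case, but stays cheap when there are only a few distinct rows, since the kept list never grows beyond that number.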

You can't put lists in a set because lists are not hashable. Even if that worked, you would lose the ordering of the items.
@MosesKoledoye Order is not important, but cost matters.
Also, these are lists, not [...].
Please also mention the running cost. I think it is O(n). Is that right?
O(n) running time.
This will change the order of the list elements, which may or may not be a problem.
What is the running cost?
@sphericalcowboy, the OP already addressed that in the comments on their post.
I think log(n) is not possible. If log(n) were possible, we could reverse-engineer it and use it to sort the list, and we know that cannot be done in log(n).
Isn't "if i not in new_list:" an expensive step that makes the process n^2?
Added the running cost; let me know if you have any doubts about it.
Your solution on a list of 1 million gives CPU times: user 1.21 s, sys: 17.9 ms, total: 1.23 s
For the input data provided, this solution is perfectly adequate, preserves order, and is very simple. For a list of 10,000 lists of 1,000 numbers each, it would be expensive.