Pad empty lists with zeros to get fixed-size lists of 5 tuples
I have 1000 examples. Each example contains a list of 18 lists. These lists have variable length, and some of them are empty. Here is one example:
len(My_list)
18
print(My_list)
array([list([(17, 163, 0.11258018, 15),(78, 193, 0.99713018, 17),(478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4),(456, 140, 0.65013665, 7), (61, 301, 0.7433308, 8)]),
list([]),
list([]),
list([]),
list([]),
list([]),
list([]),
list([]),
list([(63, 176, 0.18713018, 0),(199, 185, 0.88743243, 79), (282, 75, 0.752135, 84)]),
list([(62, 185, 0.13743243, 1)]),
list([]),
list([(67, 156, 0.14346971, 2)]),
list([(2, 15, 0.00639179, 3)]),
list([]),
list([]),
list([]),
list([]),
list([])],
dtype=object)
What do I want to do?
For each list:
1- Keep the first 5 tuples.
2- If the list is empty, create a list of five tuples as follows:
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]).
3- If the list is not empty but has fewer than 5 elements, pad it until it has 5 elements. For example, My_list[11]
contains only one element, list([(67, 156, 0.14346971, 2)]),
so:
My_list[11]=list([(67, 156, 0.14346971, 2),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)])
Expected output:
array([list([(17, 163, 0.11258018, 15),(78, 193, 0.99713018, 17),(478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(63, 176, 0.18713018, 0),(199, 185, 0.88743243, 79), (282, 75, 0.752135, 84),(0,0,0,0),(0,0,0,0)]),
list([(62, 185, 0.13743243, 1),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(67, 156, 0.14346971, 2),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(2, 15, 0.00639179, 3),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)])],
dtype=object)
What have I tried?
My_list=np.asarray(My_list)
My_list = [joint if len(joint) != 0 else [(0, 0, 0,0)] for joint in My_list]
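However, this does not do the job. It only replaces each empty list with a single (0, 0, 0, 0), and it skips lists that already contain one or more elements. What I want is to pad every list that is empty or has fewer than five elements with (0, 0, 0, 0), so that each list ends up with five elements.
Any hints?
Here is one approach: append the 5 zero tuples to every list, then trim: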
>>> ml
array([list([(17, 163, 0.11258018, 15), (78, 193, 0.99713018, 17), (478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4), (456, 140, 0.65013665, 7), (61, 301, 0.7433308, 8)]),
list([]), list([]), list([]), list([]), list([]), list([]),
list([]),
list([(63, 176, 0.18713018, 0), (199, 185, 0.88743243, 79), (282, 75, 0.752135, 84)]),
list([(62, 185, 0.13743243, 1)]), list([]),
list([(67, 156, 0.14346971, 2)]), list([(2, 15, 0.00639179, 3)]),
list([]), list([]), list([]), list([]), list([])], dtype=object)
>>>
>>> z = np.array([None, 5*[4*(0,)]], dtype=object)[[1]]
>>> z
array([list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)])],
dtype=object)
>>>
>>> res = np.frompyfunc(list.__getitem__, 2, 1)(ml + z, slice(5))
>>> res
array([list([(17, 163, 0.11258018, 15), (78, 193, 0.99713018, 17), (478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(63, 176, 0.18713018, 0), (199, 185, 0.88743243, 79), (282, 75, 0.752135, 84), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(62, 185, 0.13743243, 1), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(67, 156, 0.14346971, 2), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(2, 15, 0.00639179, 3), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)])],
dtype=object)
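Explanation: arrays of object dtype delegate operations such as addition to their elements. Therefore ml + z concatenates each original list with a copy of the 5x4 zeros.
Next, we only need to cut each list down to 5 elements. The operation somelist[:5] can also be written as somelist.__getitem__(slice(5)) or list.__getitem__(somelist, slice(5)). The last form is the one we "vectorize" with np.frompyfunc.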
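The explanation above can be tried on a small stand-in array. This is only a sketch: the three rows are made-up data, and the object arrays are built with np.empty plus element assignment (an assumption of this example, not part of the answer) so that NumPy keeps them one-dimensional.

```python
import numpy as np

# Made-up stand-in for the question's data: lists of 4-tuples, some empty.
rows = [
    [(17, 163, 0.11, 15), (78, 193, 0.99, 17)],
    [],
    [(1, 2, 0.5, 3)] * 7,
]

# Build 1-D object arrays element by element so each cell holds one list.
ml = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    ml[i] = row

z = np.empty(1, dtype=object)
z[0] = 5 * [4 * (0,)]          # one list of five (0, 0, 0, 0) tuples

# Object arrays delegate "+" to their elements, so this is list + list per cell.
padded = ml + z

# somelist[:5] spelled as list.__getitem__(somelist, slice(5)), applied per cell.
trim = np.frompyfunc(list.__getitem__, 2, 1)
res = trim(padded, slice(5))
```

Every element of res is then a list of exactly five tuples, and the original lists in ml are left unchanged because list + list builds a new list.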
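This is a variation on @PaulP's answer (and @Eir's comment). It is close enough that I would not have posted it, except that it is faster (and possibly clearer).
Define a function that operates on one list at a time, using the same idea of adding the pad and stripping off the unwanted elements: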
In [209]: z = [4*(0,) for _ in range(5)]
In [210]: def foo(alist):
     ...:     return (alist + z)[:5]
This can be applied to each list with a list comprehension:
In [211]: [foo(row) for row in arr]
Out[211]:
[[(17, 163, 0.11258018, 15),
(78, 193, 0.99713018, 17),
(478, 94, 0.7299528, 2),
(63, 268, 0.77531445, 3),
(169, 279, 0.7947326, 4)],
[(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)],
....
[(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]]
But if you want an object array, @Paul's approach of using frompyfunc works well:
In [212]: np.frompyfunc(foo,1,1)(arr)
Out[212]:
array([list([(17, 163, 0.11258018, 15), (78, 193, 0.99713018, 17), (478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
.... dtype=object)
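One reason the choice between the two results matters: once every list has five 4-tuples, passing the plain list-of-lists to np.array gives a regular numeric array rather than an object array. The sketch below uses made-up rows and a hypothetical helper name, pad_to_five, that mirrors foo.

```python
import numpy as np

ZERO_PAD = [(0, 0, 0, 0)] * 5

def pad_to_five(alist):
    """Append the zero pad, then keep only the first five tuples."""
    return (alist + ZERO_PAD)[:5]

# Made-up rows: short, empty, and longer than five.
rows = [[(17, 163, 0.11, 15)], [], [(2, 15, 0.006, 3)] * 6]

# List comprehension: a plain Python list of equal-length lists.
padded_lists = [pad_to_five(row) for row in rows]

# Equal lengths mean np.array produces a dense 3-D float array.
dense = np.array(padded_lists)          # shape (3, 5, 4)

# frompyfunc keeps a 1-D object array whose cells are lists.
arr = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    arr[i] = row
obj = np.frompyfunc(pad_to_five, 1, 1)(arr)   # shape (3,), dtype=object
```

Which form is preferable depends on what happens next: the dense array suits numeric processing, while the object array preserves the list-of-tuples layout shown in the question.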
Timings:
In [176]: timeit np.frompyfunc(list.__getitem__, 2, 1)(arr + z, slice(5))
14.8 µs ± 18.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [184]: timeit [foo(row) for row in arr]
7.6 µs ± 26.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [213]: timeit np.frompyfunc(foo,1,1)(arr)
8.49 µs ± 27.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
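To repeat the comparison outside IPython, something like the following could be used. It is a sketch under assumptions: the rows are made up (only the list lengths follow the question's example), the function names via_getitem, via_listcomp and via_frompyfunc are hypothetical, and the numbers it prints will differ by machine and NumPy version.

```python
import timeit
import numpy as np

pad = [4 * (0,) for _ in range(5)]

def foo(alist):
    return (alist + pad)[:5]

# 18 lists with the same lengths as the question's example; values are made up.
lengths = (7, 0, 0, 0, 0, 0, 0, 0, 3, 1, 0, 1, 1, 0, 0, 0, 0, 0)
rows = [[(1, 2, 0.5, 3)] * n for n in lengths]

arr = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    arr[i] = row

# One-element object array holding the pad list, for the "arr + z" variant.
z = np.empty(1, dtype=object)
z[0] = pad

def via_getitem():
    return np.frompyfunc(list.__getitem__, 2, 1)(arr + z, slice(5))

def via_listcomp():
    return [foo(row) for row in arr]

def via_frompyfunc():
    return np.frompyfunc(foo, 1, 1)(arr)

for fn in (via_getitem, via_listcomp, via_frompyfunc):
    # Best of 5 repeats, 2000 calls each, reported per call.
    best = min(timeit.repeat(fn, number=2000, repeat=5)) / 2000
    print(f"{fn.__name__}: {best * 1e6:.2f} µs per call")
```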
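You may want to add a constant list of 5 zero elements to each list and then take only the first 5 elements. This may cost some memory, but it does the job.
Careful: your z replicates the tuples. Since they are tuples it may not matter, though. But it is indeed faster.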
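The point about tuples in the last comment comes down to object sharing. The sketch below is an illustration, not code from the thread, and foo_mutable is a hypothetical name: a pad built with 5 * [...] holds the same object five times, which is harmless for immutable tuples but would leak edits if the filler were a mutable list.

```python
# A pad built by list repetition holds five references to ONE tuple object.
pad = 5 * [4 * (0,)]
all_same_tuple = all(t is pad[0] for t in pad)      # True

def foo(alist):
    return (alist + pad)[:5]

a, b = foo([]), foo([])
lists_are_distinct = a is not b     # True: each call builds a new outer list
tuples_are_shared = a[0] is b[0]    # True: but the zero tuple is the same object

# With a mutable filler the sharing becomes visible.
mutable_pad = 5 * [[0, 0, 0, 0]]
c = ([] + mutable_pad)[:5]
c[0][0] = 99
leaked = mutable_pad[4][0]          # 99: every slot is the same inner list

# If mutable fillers are needed, build fresh ones on every call.
def foo_mutable(alist):
    return (alist + [[0, 0, 0, 0] for _ in range(5)])[:5]
```

Since tuples cannot be modified in place, sharing them only saves memory, which fits the comment's remark that it may not matter here.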