Python: 2D list to numpy array, padding the remaining values of shorter sublists with -1

python numpy

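I have a 2D list whose sublists have different lengths. I need to convert it to a numpy array in which every shorter sublist is padded with -1 up to the length of the longest one, and I am looking for an efficient way to do this.

For example, given the 2D list x: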

x = [
    [0,2,3],
    [],
    [4],
    [5,6]]

I want to get a numpy array like this:

>>> array_x
array([[ 0,  2,  3],
       [-1, -1, -1],
       [ 4, -1, -1],
       [ 5,  6, -1]])

The basic approach is to create an array of -1 and then loop over the 2D list to fill in the actual values, as follows:

import numpy as np

n_rows = len(x)
n_cols = max(len(ele) for ele in x)

# start from an array of -1, then overwrite the cells that have data
new_array = np.ones((n_rows, n_cols)) * -1

for i, row in enumerate(x):
    for j, ele in enumerate(row):
        new_array[i, j] = ele

But is there a more efficient solution?

Some speed improvements over the original solution:

n_rows = len(x)
n_cols = max(map(len, x))

new_array = np.empty((n_rows, n_cols))
new_array.fill(-1)
for i, row in enumerate(x):
    for j, ele in enumerate(row):
        new_array[i, j] = ele
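
A possible variant of the same idea, offered as a sketch and not as part of the original answer: np.full creates the filled array in one call, dtype=int keeps the integers shown in the question's expected output (the versions above produce a float array), and each row is copied with a single slice assignment instead of an inner loop. The name pad_rows and its fill parameter are made up for illustration, and max(..., default=0) assumes Python 3.4 or later.

```python
import numpy as np

def pad_rows(x, fill=-1):
    """Pad a ragged list of lists into a 2D integer array (hypothetical helper)."""
    n_rows = len(x)
    # default=0 avoids a ValueError from max() when x itself is empty
    n_cols = max(map(len, x), default=0)
    out = np.full((n_rows, n_cols), fill, dtype=int)
    for i, row in enumerate(x):
        # copy the whole row at once; the remaining cells keep the fill value
        out[i, :len(row)] = row
    return out

x = [[0, 2, 3], [], [4], [5, 6]]
print(pad_rows(x))
```

Whether this beats the element-wise loop depends on how long the rows are, so it is worth timing on data that resembles the real input.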

Timings:

import numpy as np
from timeit import timeit
from itertools import zip_longest  # izip_longest in Python 2; not used by f1/f2

# Built-ins are bound as default arguments so that lookups inside the
# functions are local.
def f1(x, enumerate=enumerate, max=max, len=len):
    n_rows = len(x)
    n_cols = max(len(ele) for ele in x)

    new_array = np.ones((n_rows, n_cols)) * -1
    for i, row in enumerate(x):
        for j, ele in enumerate(row):
            new_array[i, j] = ele
    return new_array

def f2(x, enumerate=enumerate, max=max, len=len, map=map):
    n_rows = len(x)
    n_cols = max(map(len, x))

    new_array = np.empty((n_rows, n_cols))
    new_array.fill(-1)
    for i, row in enumerate(x):
        for j, ele in enumerate(row):
            new_array[i, j] = ele

    return new_array

setup = '''x = [[0,2,3],
    [],
    [4],
    [5,6]]
from __main__ import f1, f2'''

print(timeit(stmt='f1(x)', setup=setup, number=100000))
print(timeit(stmt='f2(x)', setup=setup, number=100000))

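Using

new_array = np.empty((n_rows, n_cols))
new_array.fill(-1)

can make it 100% faster.

@jamylak, I see. Thanks for your answer. So I guess there is no obvious way to get rid of the for loop.

I don't know of any fast way, which is not to say that one doesn't exist. I tried

np.array(tuple(izip_longest(*x, fillvalue=-1)), dtype=np.int).T

but it was slow. You could put a bounty on this question to draw more attention and find a faster way that eliminates the loop.

I see, thanks again.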
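
For readers on Python 3, where izip_longest is named zip_longest and np.int is no longer available in recent NumPy releases, the attempt from the comments can be written as in the first function below. The second function is a different technique that nobody in the thread proposed: it builds a boolean mask of the cells that hold data and assigns all values in one step, so the only Python-level iteration left is flattening the input. Both function names are hypothetical, both assume x contains at least one sublist, and no speed claim is made for either.

```python
from itertools import zip_longest

import numpy as np

def pad_with_zip_longest(x, fill=-1):
    # zip_longest pads the columns; transposing restores one row per sublist
    return np.array(list(zip_longest(*x, fillvalue=fill)), dtype=int).T

def pad_with_mask(x, fill=-1):
    lengths = np.array([len(row) for row in x])
    # mask[i, j] is True where row i actually has a j-th element
    mask = np.arange(lengths.max()) < lengths[:, None]
    out = np.full(mask.shape, fill, dtype=int)
    # boolean assignment fills the True cells in row-major order,
    # which matches the order of the flattened input
    out[mask] = [ele for row in x for ele in row]
    return out

x = [[0, 2, 3], [], [4], [5, 6]]
print(pad_with_zip_longest(x))
print(pad_with_mask(x))
```

Timing these against f1 and f2 on realistic input sizes would show whether either one helps; for a list as small as the example, the plain loop may well remain competitive.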