Python: removing NaN rows in the X array and the corresponding rows in the Y array

I have an X array with NaN values, and I can remove the rows with NaN like this:

import numpy as np
x = x[~np.isnan(x)]
But I have a corresponding Y array:

assert len(x) == len(y) # True
x = x[~np.isnan(x)]
assert len(x) == len(y) # False and breaks
How do I remove the corresponding rows from the Y array?

My X array looks like this:

>>> x
[[ 2.67510434  2.67521927  3.49296989  3.80100625  4.          2.83631844]
 [ 3.47538057  3.4752436   3.62245715  4.0720535   5.          3.7773169 ]
 [ 2.6157049   2.61583852  3.48335887  3.78088813  0.          2.78791096]
 ..., 
 [ 3.60408952  3.60391203  3.64328267  4.1156462   5.          3.77933333]
 [ 2.66773792  2.66785516  3.49177798  3.7985113   4.          2.83631844]
 [ 3.26622238  3.26615124  3.58861468  4.00121327  5.          3.49693169]]
But strangely:

indexes = ~np.isnan(x)
print(indexes)
[out]:

[[ True  True  True  True  True  True]
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]
 ..., 
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]]
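One possible reason the printout shows only True is that NumPy abbreviates large arrays with "...", so rows containing NaN can be hidden in the elided middle. Below is a minimal sketch of counting the affected rows instead of reading the printout. The array shape and the NaN positions are made up for illustration and are not the data from the question.

```python
import numpy as np

# Hypothetical data: 2000 rows, 6 columns, with NaN planted in two middle rows.
x = np.ones((2000, 6))
x[[500, 1200], [1, 4]] = np.nan   # sets x[500, 1] and x[1200, 4]

nan_mask = np.isnan(x)

# Is there any NaN at all?
has_nan = bool(nan_mask.any())

# Which rows contain at least one NaN, and how many are there?
bad_rows = np.flatnonzero(nan_mask.any(axis=1))
n_bad_rows = int(nan_mask.any(axis=1).sum())

print(has_nan, n_bad_rows, bad_rows)
print(~nan_mask)  # abbreviated printout: only the first and last rows are shown
```

Reducing the mask with any() or sum() gives an answer that does not depend on which rows happen to be printed.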

You are removing the NaN items, not the rows that contain NaN. The proper way would be:

mask = ~np.any(np.isnan(x), axis=1)
x = x[mask]
y = y[mask]
To see the different behavior of the two approaches:

>>> x = np.random.rand(4, 5)
>>> x[[0, 2], [1, 4]] = np.nan
>>> x
array([[ 0.37499461,         nan,  0.51254549,  0.5253203 ,  0.3955948 ],
       [ 0.73817831,  0.70381481,  0.45222295,  0.68540433,  0.76113544],
       [ 0.1651173 ,  0.41594257,  0.66327842,  0.86836192,         nan],
       [ 0.70538764,  0.31702821,  0.04876226,  0.53867849,  0.58784935]])
>>> x[~np.isnan(x)]  # 1D array with NaNs removed
array([ 0.37499461,  0.51254549,  0.5253203 ,  0.3955948 ,  0.73817831,
        0.70381481,  0.45222295,  0.68540433,  0.76113544,  0.1651173 ,
        0.41594257,  0.66327842,  0.86836192,  0.70538764,  0.31702821,
        0.04876226,  0.53867849,  0.58784935])
>>> x[~np.any(np.isnan(x), axis=1)]  # 2D array with rows with NaN removed
array([[ 0.73817831,  0.70381481,  0.45222295,  0.68540433,  0.76113544],
       [ 0.70538764,  0.31702821,  0.04876226,  0.53867849,  0.58784935]])
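
For completeness, here is a small self-contained sketch of the row mask applied to both arrays at once. The values of x and y are invented for illustration, and y is assumed to hold one entry per row of x.

```python
import numpy as np

# Made-up data: 4 samples with 3 features each, plus one label per sample.
x = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0],
              [7.0, 8.0, 9.0],
              [np.nan, 11.0, 12.0]])
y = np.array([10, 20, 30, 40])

# One boolean per row: True where the row contains no NaN at all.
mask = ~np.isnan(x).any(axis=1)

# Index both arrays with the same row mask so they stay aligned.
x_clean = x[mask]
y_clean = y[mask]

assert len(x_clean) == len(y_clean)
print(x_clean)
print(y_clean)
```

Computing the mask once, before either array is overwritten, avoids the situation where x has already been filtered and no longer matches y.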

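Do you mean y = y[~np.isnan(x)] as above? Don't forget to call x = x[~np.isnan(x)] after that statement.

@xnx, yes, right, silly me...

Try np.mat(x)[~np.isnan(x)]. np.array(x)[~np.isnan(x)] will return a 1D array, while np.mat will preserve its dimensions.

It still gives IndexError: too many indices for array

I get IndexError: too many indices for array for your answer and for @xnx's method.

Are you sure x and y have the same length?

See e.g. the Oxford dictionary.

@Bart, I appreciate the citation and therefore accept "indexes"; however, the citation leaves the question open, and since I am a scientist I will stick with "indices" ;)

@Chris8447 ~ is the invert operator, i.e. ~np.array([True, False]) == np.array([False, True]).

~np.any(np.isnan(x,axis=1)) returns an error: TypeError: 'axis' is an invalid keyword to ufunc 'isnan'.

I messed up the position of the parentheses, it should be ~np.any(np.isnan(x), axis=1)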
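
The IndexError discussed in the comments can be reproduced with a small sketch. It assumes x is 2D and y is 1D with one entry per row of x; the values are made up for illustration.

```python
import numpy as np

# Made-up data: 3 rows in x, one label per row in y.
x = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [5.0, 6.0]])
y = np.array([0, 1, 2])

# Element-wise mask: same 2D shape as x, so it cannot index the 1D array y.
elementwise = ~np.isnan(x)
try:
    y[elementwise]
    raised = False
except IndexError:
    raised = True

# Row-wise mask: one boolean per row, so it can index both x and y.
row_mask = ~np.any(np.isnan(x), axis=1)
y_kept = y[row_mask]

print(raised)
print(y_kept)
```

Note that axis=1 is passed to np.any, not to np.isnan, which matches the corrected parenthesis placement above.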