Python: finding the correlation between rows of a numpy array using R-squared

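I am trying to find the correlation between rows of a numpy array and, if the correlation is greater than or equal to 0.85, delete the row with the lowest index. An example of the numpy array: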

array = np.array([[-0.90068117,  1.01900435, -1.34022653, -1.3154443 ],
                [-1.14301691, -0.13197948, -1.34022653, -1.3154443 ],
                [-1.38535265,  0.32841405, -1.39706395, -1.3154443 ],
                [-1.50652052,  0.09821729, -1.2833891 , -1.3154443 ],
                [-1.02184904,  1.24920112, -1.34022653, -1.3154443 ],
                [-0.53717756,  1.93979142, -1.16971425, -1.05217993],
                [-1.50652052,  0.78880759, -1.34022653, -1.18381211],
                [-1.02184904,  0.78880759, -1.2833891 , -1.3154443 ],
                [-1.74885626, -0.36217625, -1.34022653, -1.3154443 ],
                [-1.14301691,  0.09821729, -1.2833891 , -1.44707648],
                [-0.53717756,  1.47939788, -1.2833891 , -1.3154443 ],
                [-1.26418478,  0.78880759, -1.22655167, -1.3154443 ],
                [-1.26418478, -0.13197948, -1.34022653, -1.44707648],
                [-1.87002413, -0.13197948, -1.51073881, -1.44707648],
                [-0.05250608,  2.16998818, -1.45390138, -1.3154443 ],
                [-0.17367395, 2.9       , -1.2833891 , -1.05217993],
                [-0.53717756,  1.93979142, -1.39706395, -1.05217993],
                [-0.90068117,  1.01900435, -1.34022653, -1.18381211],
                [-0.17367395,  1.70959465, -1.16971425, -1.18381211],
                [-0.90068117,  1.70959465, -1.2833891 , -1.18381211]])
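So I want to check the correlation between rows 1->2, rows 2->3, and rows 3->4, and if the R-squared is >= 0.85, delete the row at the lower index. To do that, I wrote the following code: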

import numpy as np
from scipy import stats

raise2 = lambda element: element**2

def check_corr(array):
    array = np.rot90(array)
    r_value_list = []
    for i in range(len(array)):
        if i < 3:
            a = stats.linregress(array[i], array[i+1])
            r_value_list.append(a.rvalue)
            i += 1
    r_squared_list = list(map(raise2, r_value_list))
    for i in r_squared_list:
        if i >= 0.85:
            b = r_squared_list.index(i)

    array = np.delete(array, b, 0)
    array = np.rot90(array)
    array = np.rot90(array)
    array = np.rot90(array)
    return array

clean_DATA = check_corr(no_outliers_DATA)
print(clean_DATA)
An example of the output I want to get:

array = np.array([[-0.90068117,  1.01900435, -1.3154443 ],
                [-1.14301691, -0.13197948, -1.3154443 ],
                [-1.38535265,  0.32841405, -1.3154443 ],
                [-1.50652052,  0.09821729, -1.3154443 ],
                [-1.02184904,  1.24920112, -1.3154443 ],
                [-0.53717756,  1.93979142, -1.05217993],
                [-1.50652052,  0.78880759, -1.18381211],
                [-1.02184904,  0.78880759, -1.3154443 ],
                [-1.74885626, -0.36217625, -1.3154443 ],
                [-1.14301691,  0.09821729, -1.44707648],
                [-0.53717756,  1.47939788, -1.3154443 ],
                [-1.26418478,  0.78880759, -1.3154443 ],
                [-1.26418478, -0.13197948, -1.44707648],
                [-1.87002413, -0.13197948, -1.44707648],
                [-0.05250608,  2.16998818, -1.3154443 ],
                [-0.17367395, 2.9       , -1.05217993],
                [-0.53717756,  1.93979142, -1.05217993],
                [-0.90068117,  1.01900435, -1.18381211],
                [-0.17367395,  1.70959465, -1.18381211],
                [-0.90068117,  1.70959465, -1.18381211]])
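Here row 2 has been deleted because it is correlated with row 3. I would also like the function to work for arrays with more than 4 rows.
Thanks for your help.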

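Please provide the expected MRE (minimal, reproducible example). We should be able to copy and paste one contiguous block of your code, run that file, and reproduce your problem along with tracing output for the problem points. This lets us test our suggestions against your test data and desired output. Show where the intermediate results differ from what you expected. We expect you to perform basic diagnosis and include it in your post. At a minimum, print the suspect values at the point of error and trace them back to their sources. In many cases, doing this basic diagnosis will show you what the problem is, and you won't need Stack Overflow at all.

I think all your problems are in the indexing and in the i += 1 line. Remove i += 1, because Python increments i automatically. Also, the for loop will end at i = 3, which is the largest index of the array, so the line a = stats.linregress(array[i], array[i+1]) will fail. It is better to use for i in range(len(array) - 1).

@Prune Sorry, this is all brand new to me. I edited it differently.

@OliverMohrBonometti That did work! Thanks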
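For later readers, here is a minimal sketch of what the comment above suggests, under a few assumptions that are not in the original post. It works on columns directly (data[:, i]) instead of rotating with np.rot90, since the desired output drops a column. It also swaps stats.linregress for np.corrcoef; the off-diagonal entry of np.corrcoef is the same Pearson r that linregress reports as rvalue. The names drop_correlated_columns, threshold, and demo are made up for illustration, and the demo data is synthetic rather than the array from the question.

```python
import numpy as np

def drop_correlated_columns(data, threshold=0.85):
    """Drop each column whose R^2 with the next column is >= threshold.

    Hypothetical helper: only adjacent column pairs are compared, and the
    lower-index column of a correlated pair is the one removed.
    """
    data = np.asarray(data, dtype=float)
    n_cols = data.shape[1]
    to_drop = []
    # range(n_cols - 1) stops one short, so data[:, i + 1] always exists,
    # and it works for any number of columns, not just 4.
    for i in range(n_cols - 1):
        r = np.corrcoef(data[:, i], data[:, i + 1])[0, 1]  # Pearson r
        if r ** 2 >= threshold:
            to_drop.append(i)
    # Delete all flagged columns in one call so the indices stay valid.
    return np.delete(data, to_drop, axis=1)

# Synthetic demo: column 1 is roughly 2 * column 0, column 2 is unrelated.
demo = np.array([[1.0, 2.0, 5.0],
                 [2.0, 4.1, 1.0],
                 [3.0, 6.0, 4.0],
                 [4.0, 8.2, 2.0]])
result = drop_correlated_columns(demo)
print(result.shape)  # (4, 2): column 0 was dropped
```

Collecting the indices first also avoids two problems in the posted code: b is overwritten on every match, so at most one row is ever deleted, and b is undefined (NameError) when no pair reaches the threshold. With an empty to_drop list, np.delete simply returns the data unchanged.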