Python 2.7 numpy: look up a value in a third array based on the matching position in arrays 1 and 2


python-2.7 pandas numpy


I am looking for a fast way to do this. I have 3 arrays of the same size that hold x, y, z coordinates, for example:

In[85]: xxn
Out[85]: 
array([ 0.08333333,  0.08333333,  0.08333333,  0.08333333,  0.08333333,
        0.08333333,  0.08333333,  0.08333333,  0.08333333,  0.25      ,
        0.25      ,  0.25      ,  0.25      ,  0.25      ,  0.25      ,
        0.25      ,  0.25      ,  0.25      ,  0.5       ,  0.5       ,
        0.5       ,  0.5       ,  0.5       ,  0.5       ,  0.5       ,
        0.5       ,  0.5       ,  1.        ,  1.        ,  1.        ,
        1.        ,  1.        ,  1.        ,  1.        ,  1.        ,
        1.        ,  2.        ,  2.        ,  2.        ,  2.        ,
        2.        ,  2.        ,  2.        ,  2.        ,  2.        ,
        3.        ,  3.        ,  3.        ,  3.        ,  3.        ,
        3.        ,  3.        ,  3.        ,  3.        ,  4.        ,
        4.        ,  4.        ,  4.        ,  4.        ,  4.        ,
        4.        ,  4.        ,  4.        ,  5.        ,  5.        ,
        5.        ,  5.        ,  5.        ,  5.        ,  5.        ,
        5.        ,  5.        ])
In[86]: yyn
Out[86]: 
array([ 1306.89 ,  1524.705,  1742.52 ,  1960.335,  2178.15 ,  2395.965,
        2613.78 ,  2831.595,  3049.41 ,  1306.89 ,  1524.705,  1742.52 ,
        1960.335,  2178.15 ,  2395.965,  2613.78 ,  2831.595,  3049.41 ,
        1306.89 ,  1524.705,  1742.52 ,  1960.335,  2178.15 ,  2395.965,
        2613.78 ,  2831.595,  3049.41 ,  1306.89 ,  1524.705,  1742.52 ,
        1960.335,  2178.15 ,  2395.965,  2613.78 ,  2831.595,  3049.41 ,
        1306.89 ,  1524.705,  1742.52 ,  1960.335,  2178.15 ,  2395.965,
        2613.78 ,  2831.595,  3049.41 ,  1306.89 ,  1524.705,  1742.52 ,
        1960.335,  2178.15 ,  2395.965,  2613.78 ,  2831.595,  3049.41 ,
        1306.89 ,  1524.705,  1742.52 ,  1960.335,  2178.15 ,  2395.965,
        2613.78 ,  2831.595,  3049.41 ,  1306.89 ,  1524.705,  1742.52 ,
        1960.335,  2178.15 ,  2395.965,  2613.78 ,  2831.595,  3049.41 ])

In[87]: zzn
Out[87]: 
array([ 0.4837052 ,  0.3976288 ,  0.3076519 ,  0.2105963 ,  0.1015546 ,
        0.1162558 ,  0.1723646 ,  0.2173536 ,  0.2547635 ,  0.3767569 ,
        0.3196527 ,  0.2606447 ,  0.1983554 ,  0.1291423 ,  0.09786849,
        0.1277448 ,  0.1560009 ,  0.1802875 ,  0.3420683 ,  0.2938885 ,
        0.2452067 ,  0.1958042 ,  0.144459  ,  0.1026045 ,  0.1086459 ,
        0.1256328 ,  0.1419562 ,  0.3090272 ,  0.2726449 ,  0.236535  ,
        0.200679  ,  0.1647521 ,  0.1310315 ,  0.1132389 ,  0.1129602 ,
        0.118809  ,  0.284265  ,  0.257173  ,  0.2310047 ,  0.205817  ,
        0.18154   ,  0.1586908 ,  0.1393701 ,  0.1264879 ,  0.1204383 ,
        0.2760804 ,  0.2540095 ,  0.2330927 ,  0.2133592 ,  0.1947658 ,
        0.1775263 ,  0.1622754 ,  0.1498286 ,  0.1407699 ,  0.274541  ,
        0.2560495 ,  0.2387175 ,  0.222547  ,  0.2075007 ,  0.1936717 ,
        0.1812974 ,  0.1706293 ,  0.1618527 ,  0.2802191 ,  0.2641784 ,
        0.2491889 ,  0.2352521 ,  0.2223443 ,  0.2105051 ,  0.199825  ,
        0.1903785 ,  0.1822064 ])

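I want the fastest way to get the value of zzn at the position where xxn and yyn both match. For example, [1, 2395.965] should return 0.1310315, which is the value of zzn at the position where xxn is 1 and yyn is 2395.965.

In pandas I would do zz[(xx == 1) & (yy == 2395.965)], which gives 0.1310315, but unfortunately this sits inside a huge loop and it is slow.

Thanks for your help.

Edit:

My current loop uses pandas: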

for coordinate in df.itertuples():
    sTL = zz[(xx == x_match) & (yy == y_match)].values
    sBL = zz[(xx == x_match) & (yy == sB)].values
    sTR = zz[(xx == sR) & (yy == y_match)].values
    sBR = zz[(xx == sR) & (yy == sB)].values

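where each coordinate row holds the x_match, y_match, sR and sB values, and there are 100k rows.

You can stack xxn and yyn into one array, search this new array, and use the result to get the value from zzn: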

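import numpy

# Pair up the coordinates: one (x, y) row per point
a = numpy.vstack((xxn, yyn)).T

# Boolean mask of the rows where both x and y match the target
idx = numpy.all(a == numpy.array([1.0, 2395.965]), axis=1)
print(zzn[idx])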

After some investigation, I came up with a simple approach:

np.where((xxn == x_match) & (yyn ==y_match), zzn, 0).sum()

This appears to be much faster than pandas:

 %timeit np.where((xxn == x_match) & (yyn ==y_match), zzn, 0).sum()
The slowest run took 8.72 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.19 µs per loop

 %timeit zz[(xx == x_match) & (yy == y_match)].values
1000 loops, best of 3: 1.43 ms per loop
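The timings above are for a single lookup, so a 100k-row loop still pays that cost 100k times. If the points form a complete regular grid stored with x varying slowest and y fastest (as the sample xxn and yyn appear to), all lookups can be done in one vectorized call. The following is only a sketch under that assumption; the stand-in data and the helper name `lookup` are hypothetical, not part of the original post.

```python
import numpy as np

# Small stand-in grid (hypothetical values, not the question's data):
# 3 x-values by 4 y-values, flattened with x varying slowest, like xxn/yyn.
xs = np.array([0.25, 0.5, 1.0])
ys = np.array([1306.89, 1524.705, 1742.52, 1960.335])
xxn = np.repeat(xs, ys.size)
yyn = np.tile(ys, xs.size)
zzn = np.arange(xxn.size, dtype=float) / 10.0

# Sorted unique axes and a 2-D view of z (rows follow x, columns follow y).
# Assumption: the grid is complete and ordered as described above.
x_axis = np.unique(xxn)
y_axis = np.unique(yyn)
z_grid = zzn.reshape(x_axis.size, y_axis.size)


def lookup(x_query, y_query):
    """Return z for many (x, y) pairs at once; every pair must lie on the grid."""
    x_query = np.asarray(x_query, dtype=float)
    y_query = np.asarray(y_query, dtype=float)
    # Position of each query value along its axis, clipped to stay in range.
    ix = np.clip(np.searchsorted(x_axis, x_query), 0, x_axis.size - 1)
    iy = np.clip(np.searchsorted(y_axis, y_query), 0, y_axis.size - 1)
    # Reject queries that do not coincide with a grid point.
    on_grid = np.isclose(x_axis[ix], x_query) & np.isclose(y_axis[iy], y_query)
    if not on_grid.all():
        raise ValueError("some query points are not on the grid")
    return z_grid[ix, iy]


result = lookup([1.0, 0.25], [1742.52, 1960.335])
print(result)
```

The four corner lookups in the loop (sTL, sBL, sTR, sBR) would then become four calls to such a function, each taking whole columns of query values instead of one row at a time.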

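Here is how I would do it in pandas:

import pandas as pd

# Index the known points by their (x, y) coordinates
xyz = pd.DataFrame({'x':xxn, 'y':yyn, 'z':zzn})
xyz.set_index(['x', 'y'], inplace=True)

# df is assumed here to be a 2-D array with one (x, y) pair per row
hunt = pd.DataFrame({'x':df[:,0], 'y':df[:,1]}) # coords to look for
print(hunt.join(xyz, ['x', 'y']))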

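I don't think you need a loop with the arrays either. How is the loop implemented?

It is a custom interpolation: inside a loop I look up several different xxn and yyn coordinates and need the corresponding zzn value.

Could you add some sample data?

@Divakar I added an example to the main question, which may help.

I would consider (for the OP) replacing a==np.array([...]) with np.isclose(a, np.array([...])), because the == operator does not handle floating-point numbers well.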
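As a rough illustration of that suggestion, the sketch below builds the mask with np.isclose instead of ==. The four points are taken from the sample arrays in the question; the target y value is deliberately computed rather than typed, which is the kind of situation where an exact comparison can fail. The default tolerances of np.isclose are assumed to be acceptable here, which holds as long as neighbouring grid values are much further apart than the tolerance.

```python
import numpy as np

# Four points taken from the question's sample arrays.
xxn = np.array([0.5, 0.5, 1.0, 1.0])
yyn = np.array([2178.15, 2395.965, 2178.15, 2395.965])
zzn = np.array([0.144459, 0.1026045, 0.1647521, 0.1310315])

a = np.vstack((xxn, yyn)).T

# The y coordinate is computed, so it may differ from 2395.965 in the last bits.
target = np.array([1.0, 2178.15 + 217.815])

# Exact comparison: may or may not match, depending on rounding.
exact = np.all(a == target, axis=1)

# Tolerant comparison: matches any row within np.isclose's default tolerances.
close = np.all(np.isclose(a, target), axis=1)

print(zzn[close])
```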