Python: assigning a numpy array of points to a 2D square grid

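Because of speed issues, I am going beyond my previous question. I have an array of Lat/Lon coordinates of points, and I want to assign each point an index code derived from a 2D square grid of equally sized cells. Here is an example. Let's call my first array `points`; it contains the coordinates of six points (as [x y] pairs):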

points = [[ 1.5  1.5]
 [ 1.1  1.1]
 [ 2.2  2.2]
 [ 1.3  1.3]
 [ 3.4  1.4]
 [ 2.   1.5]]
Then I have another array containing the vertex coordinates of two grid cells, in the form [minx, miny, maxx, maxy]; let's call it `bounds`:

bounds = [[ 0.  0.  2.  2.]
 [ 2.  2.  3.  3.]]
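I want to find out which points lie inside which bounds, and then assign a code derived from the index into the `bounds` array (in this case the first cell has code 0, the second has code 1, and so on). Since the cells are squares, the simplest way to test whether a point is inside a cell is to evaluate: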

x > minx & x < maxx & y > miny & y < maxy
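For the example above, the expected result is
result = [0 0 1 0 NaN NaN]
where NaN means the point lies outside every cell. In my real case the sizes are on the order of 10^6 points to be located in 10^4 cells. Is there a fast way to do this kind of work with numpy arrays?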


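Edit: to clarify, the expected `result` array means that the first point is inside the first cell (index 0 of the `bounds` array), so is the second point, the third point is inside the second cell of the `bounds` array, and so on.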

You can check the condition with nested loops and produce the results with a generator:

points = [[1.5, 1.5],
          [1.1, 1.1],
          [2.2, 2.2],
          [1.3, 1.3],
          [3.4, 1.4],
          [2.0, 1.5]]

bounds = [[0., 0., 2., 2.],
          [2., 2., 3., 3.]]

def pos(p, b):
    # for every point, yield the index of the cell containing it, or 'NaN'
    for x, y in p:
        flag = False
        for index, dis in enumerate(b):
            minx, miny, maxx, maxy = dis
            if x > minx and x < maxx and y > miny and y < maxy:
                flag = True
                yield index
                break  # a point lies in at most one cell, so stop searching
        if not flag:
            yield 'NaN'


print(list(pos(points, bounds)))

I would do it like this:

import numpy as np

points = np.random.rand(10,2)

xmin = [0.25,0.5]
ymin = [0.25,0.5]

results = np.zeros(len(points))

for i in range(len(xmin)):
    # element-wise test of (x, y) against the i-th lower corner
    bool_index_array = np.greater(points, [xmin[i], ymin[i]])
    print("boolean index of (x,y) greater (xmin, ymin): ", bool_index_array)
    # rows where both x and y exceed the lower corner
    indicies_of_true_true = np.where(bool_index_array[:,0]*bool_index_array[:,1]==1)[0]
    print("indices of [True,True]: ", indicies_of_true_true)
    results[indicies_of_true_true] += 1

print("results: ", results)

[out]: [ 1.  1.  1.  2.  0.  0.  1.  1.  1.  1.]
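Using only the lower bounds, this sorts your points into the following groups:

  • 1 if x > xmin[0] and y > ymin[0], but not both x > xmin[1] and y > ymin[1]
  • 2 if x > xmin[1] and y > ymin[1]
  • 0 if none of the above conditions is met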
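The same lower-bound counting can be written without the Python loop by broadcasting. This is only a sketch: the sample `points` are made up for illustration, and `count_lower_bounds` is a hypothetical helper name rather than part of the answer above.

```python
import numpy as np

def count_lower_bounds(points, xmin, ymin):
    """For each point, count the (xmin[i], ymin[i]) corners it exceeds in both x and y."""
    lower = np.column_stack([xmin, ymin])            # shape (k, 2)
    # compare every point with every lower corner: shape (n, k, 2)
    above = points[:, None, :] > lower[None, :, :]
    # a corner counts only when both x and y are above it
    return above.all(axis=2).sum(axis=1)

# illustrative points (assumption, not taken from the answer)
points = np.array([[0.1, 0.9],
                   [0.3, 0.4],
                   [0.6, 0.3],
                   [0.7, 0.8]])
counts = count_lower_bounds(points, [0.25, 0.5], [0.25, 0.5])
print(counts)
```

Like the loop version, this only looks at lower bounds, so it yields a count per point rather than a cell index.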

Here is a vectorized approach to the problem. It should speed things up considerably:

      import numpy as np
      def findCells(points, bounds):
          # make sure points is n by 2 (pool.map might send us 1D arrays)
          points = points.reshape((-1,2))
      
          # check for each point if all coordinates are in bounds
          # dimension 0 is bound
          # dimension 1 is point
          allInBounds = (points[:,0] > bounds[:,None,0])
          allInBounds &= (points[:,1] > bounds[:,None,1])
          allInBounds &= (points[:,0] < bounds[:,None,2])
          allInBounds &= (points[:,1] < bounds[:,None,3])
      
      
          # now find out the positions of all nonzero (i.e. true) values
          # nz[0] contains the indices along dim 0 (bound)
          # nz[1] contains the indices along dim 1 (point)
          nz = np.nonzero(allInBounds)
      
          # initialize the result with all nan
          r = np.full(points.shape[0], np.nan)
          # now use nz[1] to index point position and nz[0] to tell which cell the
          # point belongs to
          r[nz[1]] = nz[0]
          return r
      
      def findCellsParallel(points, bounds, chunksize=100):
          import multiprocessing as mp
          from functools import partial
      
          func = partial(findCells, bounds=bounds)
      
          # using python3 you could also do 'with mp.Pool() as p:'  
          p = mp.Pool()
          try:
              return np.hstack(p.map(func, points, chunksize))
          finally:
              p.close()
      
      def main():
          nPoints = int(1e6)  # array shapes must be integers
          nBounds = int(1e4)
      
          # points = np.array([[ 1.5, 1.5],
          #                    [ 1.1, 1.1],
          #                    [ 2.2, 2.2],
          #                    [ 1.3, 1.3],
          #                    [ 3.4, 1.4],
          #                    [ 2. , 1.5]])
      
          points = np.random.random([nPoints, 2])
      
          # bounds = np.array([[0,0,2,2],
          #                    [2,2,3,3]])
      
          # bounds = np.array([[0,0,1.4,1.4],
          #                    [1.4,1.4,2,2],
          #                    [2,2,3,3]])
      
          bounds = np.sort(np.random.random([nBounds, 2, 2]), 1).reshape(nBounds, 4)
      
          r = findCellsParallel(points, bounds)
          print(points[:10])
          for bIdx in np.unique(r[:10]):
              if np.isnan(bIdx):
                  continue
              print("{}: {}".format(bIdx, bounds[int(bIdx)]))
          print(r[:10])
      
      if __name__ == "__main__":
          main()
      
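Comparing all 1e6 points against all 1e4 cells in one broadcast would need a boolean array of 1e10 elements (roughly 10 GB at one byte each), which is one reason to split the points up. The sketch below does this in a single process, walking through the points in chunks; `find_cells_chunked` and its `chunk` parameter are hypothetical names, and the function is a variation on `findCells`, not the answer's own code.

```python
import numpy as np

def find_cells_chunked(points, bounds, chunk=1000):
    """Return, for each point, the index of the cell containing it, or NaN."""
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    bounds = np.asarray(bounds, dtype=float)
    result = np.full(len(points), np.nan)
    for start in range(0, len(points), chunk):
        block = points[start:start + chunk]
        x = block[:, 0]
        y = block[:, 1]
        # inside[b, p] is True when point p lies strictly inside cell b
        inside = ((x > bounds[:, None, 0]) & (x < bounds[:, None, 2]) &
                  (y > bounds[:, None, 1]) & (y < bounds[:, None, 3]))
        cell_idx, point_idx = np.nonzero(inside)
        # offset the chunk-local point indices back into the full array
        result[start + point_idx] = cell_idx
    return result

# the small example from the question
points = np.array([[1.5, 1.5], [1.1, 1.1], [2.2, 2.2],
                   [1.3, 1.3], [3.4, 1.4], [2.0, 1.5]])
bounds = np.array([[0, 0, 2, 2],
                   [2, 2, 3, 3]])
result = find_cells_chunked(points, bounds, chunk=4)
print(result)
```

On the question's data this gives the expected codes 0, 0, 1, 0, NaN, NaN. If cells overlap, the cell with the highest index wins, as in `findCells`.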

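Is [0 0 1 0 NaN NaN] the result for the `bounds` and `points` above? Can you explain how you use `bounds`?

Yes, intersect each point with the two cells and get the corresponding cell code. If `bounds` is an array of [minx miny maxx maxy] values, then the problem is to implement the test `x > minx & x < maxx & y > miny & y < maxy` to determine, for example, that the first point lies in the first cell of the `bounds` array. Hope that helps.

I see, but `bounds` has 2 items! How do you use these two items?

Just iterate over them. I mean, the problem here is to implement a search over the `bounds` array, which has two items because it contains two cells. The first item has index 0 and the second has index 1, so I want to assign those indices to each point.

Thanks Kasra, but could there be a typo? The length of the result list should equal the length of the input `points` array.

@Fiabetto You're welcome. Yes :) that was just a typo!

Thanks for your help, I am timing your approach against my old method! Thanks again!

Thanks. Indeed, a point can only be in one single bound, so it should work fine.

@Fiabetto You might want to take another look at my answer. I am now down to 24 seconds on my machine with 1e6 points and 1e4 bounds.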
      >time python test.py
      [[ 0.69083585  0.19840985]
       [ 0.31732711  0.80462512]
       [ 0.30542996  0.08569184]
       [ 0.72582609  0.46687164]
       [ 0.50534322  0.35530554]
       [ 0.93581095  0.36375539]
       [ 0.66226118  0.62573407]
       [ 0.08941219  0.05944215]
       [ 0.43015872  0.95306899]
       [ 0.43171644  0.74393729]]
      9935.0: [ 0.31584562  0.18404152  0.98215445  0.83625487]
      9963.0: [ 0.00526106  0.017255    0.33177741  0.9894455 ]
      9989.0: [ 0.17328876  0.08181912  0.33170444  0.23493507]
      9992.0: [ 0.34548987  0.15906761  0.92277442  0.9972481 ]
      9993.0: [ 0.12448765  0.5404578   0.33981119  0.906822  ]
      9996.0: [ 0.41198261  0.50958195  0.62843379  0.82677092]
      9999.0: [ 0.437169    0.17833114  0.91096133  0.70713434]
      [ 9999.  9993.  9989.  9999.  9999.  9935.  9999.  9963.  9992.  9996.]
      
      real 0m 24.352s
      user 3m  4.919s
      sys  0m  1.464s
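As an aside to the timings above: if the cells really do form a regular grid of equal squares, as the question's title suggests (the two-cell example does not, so this is an assumption), the cell code can be computed directly from the coordinates without comparing against every cell. This is a different technique from the answers above. In the sketch below, `grid_cell_codes`, `x0`, `y0`, `cell_size`, `nx`, `ny` and the row-major numbering are all hypothetical choices. Note that points on a cell's lower or left edge count as inside here, unlike the strict inequalities used in the question.

```python
import numpy as np

def grid_cell_codes(points, x0, y0, cell_size, nx, ny):
    """Code = row * nx + col for points inside the grid, NaN otherwise."""
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    # column and row of the cell each point falls into
    col = np.floor((points[:, 0] - x0) / cell_size)
    row = np.floor((points[:, 1] - y0) / cell_size)
    # keep only points that land on the nx-by-ny grid
    inside = (col >= 0) & (col < nx) & (row >= 0) & (row < ny)
    codes = np.full(len(points), np.nan)
    codes[inside] = row[inside] * nx + col[inside]
    return codes

# illustrative 2 x 2 grid of unit cells starting at the origin
points = np.array([[0.5, 0.5], [1.5, 0.5], [0.5, 1.5],
                   [1.5, 1.5], [2.5, 0.5]])
codes = grid_cell_codes(points, x0=0.0, y0=0.0, cell_size=1.0, nx=2, ny=2)
print(codes)
```

The work grows with the number of points only, not with the number of cells, which is why it may be worth checking whether the real grid is regular.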