Python: summing data over specific (multiple) ranges

python numpy

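I'm sure there is a neat way to do this, but I haven't hit on the right Google search terms, so I'll ask here. My problem is this:

I have two 2D arrays with the same dimensions. One array (array 1) holds the accumulated precipitation at each (x, y) point. The other (array 2) holds the terrain height on the same (x, y) grid. I want to sum array 1 over specific height ranges of array 2 and create a bar chart with terrain height on the x axis and total accumulated precipitation on the y axis.

So I want to be able to declare a list of heights (say [0, 100, 200, ..., 1000]) and, for each bin, sum all the precipitation that fell within that bin.

I can think of some convoluted ways to do this, but I suspect there is a simpler way that I haven't thought of. My gut feeling is to loop through my list of heights, mask out anything outside the current range, sum the remaining values, add that sum to a new array, and repeat.

I'd like to know whether there is a built-in numpy function, or a similar library, that does this more efficiently.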

This code does what you're asking for, with some explanation in the comments:

import numpy as np


def in_range(x, lower_bound, upper_bound):
    # returns whether x is between lower_bound (inclusive) and upper_bound (exclusive);
    # a plain comparison also works for non-integer heights, unlike `x in range(...)`
    return lower_bound <= x < upper_bound


# vectorize allows you to easily 'map' the function to a numpy array
vin_range = np.vectorize(in_range)

# representing your rainfall
rainfall = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# representing your height map
height = np.array([[1, 2, 1], [2, 4, 2], [3, 6, 3]])
# the bands of height you're looking to sum
bands = [[0, 2], [2, 4], [4, 6], [6, 8]]

# computing the actual results you'd want to chart
result = [(band, sum(rainfall[vin_range(height, *band)])) for band in bands]

print(result)

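The second-to-last line is where the magic happens. vin_range(height, *band) uses the vectorized function to create a numpy array of booleans with the same dimensions as height, which is True where the value of height falls in the given range and False otherwise.

Indexing the array of target values (rainfall) with that boolean array gives you an array containing only the values whose height is in the target range. After that, all that is left is to add them up.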

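The one-liner result = [(band, sum(rainfall[vin_range(height, *band)])) for band in bands] can also be broken into multiple steps (build the mask, index with it, sum the selection) with the same result.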
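As a rough sketch of what those separate steps could look like, the loop below reuses the sample rainfall, height and bands arrays from above. Note one substitution: it builds the mask with plain array comparisons rather than the vectorized in_range helper, and the names mask and selected are introduced here only for illustration.

```python
import numpy as np

rainfall = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
height = np.array([[1, 2, 1], [2, 4, 2], [3, 6, 3]])
bands = [[0, 2], [2, 4], [4, 6], [6, 8]]

result = []
for lower, upper in bands:
    # step 1: boolean mask, True where lower <= height < upper
    mask = (height >= lower) & (height < upper)
    # step 2: boolean indexing keeps only the rainfall cells in this band
    selected = rainfall[mask]
    # step 3: add up what is left and store it with its band
    result.append(((lower, upper), int(selected.sum())))

print(result)
# [((0, 2), 4), ((2, 4), 28), ((4, 6), 5), ((6, 8), 8)]
```

Array comparisons run in compiled code, whereas np.vectorize is documented as a convenience that is essentially a Python-level loop, so this form is likely to be faster on large grids.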

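Here is an example using np.ma.masked_where, which lets you build masked arrays. From the documentation:

A masked array is the combination of a standard numpy.ndarray and a mask. A mask is either nomask, indicating that no value of the associated array is invalid, or an array of booleans that determines for each element of the associated array whether the value is valid or not.

That seems to be what you need in this case.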

import numpy as np

pr = np.random.randint(0, 1000, size=(100, 100)) #precipitation map
he = np.random.randint(0, 1000, size=(100, 100)) #height map

bins = np.arange(0, 1001, 200)

values = []
for vmin, vmax in zip(bins[:-1], bins[1:]):
    #creating the masked array, here minimum included inside bin, maximum excluded.
    maskedpr = np.ma.masked_where((he < vmin) | (he >= vmax), pr)
    values.append(maskedpr.sum())


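values is the list of sums, one per bin, which you can then plot.

The function returns an array that is masked wherever the condition is True, so the condition has to be True outside the bin. Calling sum() on the masked array then adds up only the unmasked elements.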
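To make the direction of the condition concrete, here is a small sketch with fixed numbers instead of random data. The 2x2 arrays and the single bin [0, 200) are made up for illustration; only np.ma.masked_where and np.ma.getmaskarray come from numpy.

```python
import numpy as np

he = np.array([[50, 250], [450, 150]])  # toy height map
pr = np.array([[1, 2], [3, 4]])         # toy precipitation map
vmin, vmax = 0, 200                     # one bin: vmin included, vmax excluded

# the condition is True OUTSIDE the bin, so those cells get hidden
masked = np.ma.masked_where((he < vmin) | (he >= vmax), pr)

hidden = np.ma.getmaskarray(masked)     # True where a cell is masked
total = masked.sum()                    # sums only the visible cells: 1 + 4

print(hidden)
print(total)
```

One thing worth checking in real data: if a bin contains no cells at all, every element is masked and sum() returns the masked constant rather than 0, which may need handling before plotting.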

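np.digitize creates an array of bin indices from the height array height and the bin boundaries bins. np.bincount then uses those bin indices to sum the data in the array rain.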

import numpy as np

# set up
rain  = np.random.randint(0,100,(5,5))/10
height = np.random.randint(0,10000,(5,5))/10
bins = [0,250,500,750,10000]

# compute
sums = np.bincount(np.digitize(height.ravel(),bins),rain.ravel(),len(bins)+1)

# result
sums
# array([ 0. , 37. , 35.6, 14.6, 22.4,  0. ])

# check against direct method
[rain[(height>=bins[i]) & (height<bins[i+1])].sum() for i in range(len(bins)-1)]
# [37.0, 35.6, 14.600000000000001, 22.4]
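Because the example above uses random data, its printed numbers cannot be reproduced. The sketch below repeats the same digitize and bincount steps on small hand-picked arrays (invented for illustration) so each intermediate value can be followed, and cross-checks the result against np.histogram with its weights argument, which is another built-in way to get per-bin sums.

```python
import numpy as np

height = np.array([[10.0, 260.0], [510.0, 240.0], [990.0, 760.0]])
rain = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
bins = [0, 250, 500, 750, 1000]

# bin index per cell: 0 means below bins[0], len(bins) means at or above bins[-1]
idx = np.digitize(height.ravel(), bins)

# sum rain per index; minlength reserves the two out-of-range slots
sums = np.bincount(idx, weights=rain.ravel(), minlength=len(bins) + 1)

# drop the out-of-range slots to keep one total per bin
per_bin = sums[1:-1]

# cross-check: a weighted histogram of the heights gives the same totals
hist, _ = np.histogram(height, bins=bins, weights=rain)

print(idx)      # [1 2 3 1 4 4]
print(per_bin)  # [ 5.  2.  3. 11.]
print(hist)     # [ 5.  2.  3. 11.]
```

The two approaches differ at the top edge: np.histogram counts a value exactly equal to the last boundary in the last bin, while np.digitize places it in the out-of-range slot.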
