Distance between two sets of values in a Python numpy array


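I have a very basic problem that is, in theory, easy to solve (in ArcGIS, with few points and a lot of manual labour), but I cannot work out how to even start coding it (I am also new to more complex Python coding).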

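I have two variables, 'Root zone' (`RTZ`, referred to as `RZS` below) and 'Tree cover' (`TC`). Both are arrays of 250x186 values (basically grids, where each grid cell has a specific value). The values in `TC` range from 0 to 100. Each grid cell is 0.25 degrees in size (which may help in understanding the distances).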

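My problem is: "For each `TC` value between 50 and 100 (that is, each `TC` value greater than 50 at each `lat` and `lon`), I want to calculate the distance to the nearest point where `TC` is in the range 0-30 (less than 30)."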

Keep in mind that the `np.nan` part of `TC` is not considered. The white part in `TC` is therefore also white in `RZS`.

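What I want to do is create a two-dimensional scatter plot in which the X axis denotes the "distance of the 50-100 `TC` points from the 0-30 values" and the Y axis denotes the "`RZS` of those 50-100 `TC` points". The figure above might make things clearer.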

I wish I could provide some code for this, but I cannot even get started on the distance. Any suggestion on how I should proceed would be welcome.

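Let's consider an example:

If you look at x: 70 and y: 70, you can see that there are many points with tree cover values of 0-30 across the whole dataset. But I only need the distance from my point to the nearest value that lies between 0 and 30.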

The following code might work, shown here with random example data:

import numpy as np
import matplotlib.pyplot as plt
# Create some completely random data, and include an area of NaNs as well.
# The shape matches the 250x186 grids described in the question.
rzs = np.random.uniform(0, 100, size=(250, 186))
tc = np.random.lognormal(3.0, size=(250, 186))
tc = np.clip(tc, 0, 100)
rzs[60:80,:] = np.nan
tc[60:80,:] = np.nan

# Show both grids side by side
plt.subplot(2,2,1)
plt.imshow(rzs)
plt.colorbar()
plt.subplot(2,2,2)
plt.imshow(tc)
plt.colorbar()

Now for the real work:

# Select the low- and high-valued points.
# NaNs compare as False in every comparison, so they end up in neither
# mask; depending on the NumPy version, the comparisons may emit warnings.
low = (tc >= 0) & (tc <= 30)
high = (tc >= 50) & (tc <= 100)
# Get the coordinates for the low- and high-valued points,
# combine and transpose them into arrays of shape (n_points, 2)
y, x = np.where(low)
low_coords = np.array([x, y]).T
y, x = np.where(high)
high_coords = np.array([x, y]).T

# We now calculate the distances between *all* high-valued points and *all* low-valued points.
# Both the run time and the memory cost of the output scale as O(n_high * n_low),
# so be wary when using it with large input sizes.
from scipy.spatial.distance import cdist
distances = cdist(high_coords, low_coords)  # shape (n_high, n_low)

# For each high-valued point, take the minimum distance over the low-valued
# points, which here is the second axis.
# Distances are in grid cells; multiply by 0.25 to express them in degrees.
mindistances = distances.min(axis=1)
# RZS values at the high-valued points, in the same order as high_coords
# (y and x still hold the row and column indices of the high-valued points)
high_rzs = rzs[y, x]

plt.scatter(mindistances, high_rzs)
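
Since the full distance matrix grows with the product of the number of low- and high-valued points, a nearest-neighbour tree can be a lighter alternative. The following is only a sketch of that idea using `scipy.spatial.cKDTree`, not part of the original answer; the function name `nearest_low_distance` and its parameters are made up for illustration, and the 0.25 cell size is taken from the question. It treats the grid as flat, so the result is a distance in degrees on the lat/lon grid, not a true geographic distance.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_low_distance(tc, rzs, low_max=30, high_min=50, cell_size=0.25):
    """For every high-TC cell, return the distance to the nearest low-TC cell.

    Returns (distances, rzs_at_high, tc_at_high). Distances are measured in
    grid cells and multiplied by `cell_size` (degrees in the question).
    Hypothetical helper: name and parameters are assumptions for this sketch.
    """
    # NaN compares as False, so NaN cells fall into neither mask
    with np.errstate(invalid="ignore"):
        low = (tc >= 0) & (tc <= low_max)
        high = (tc >= high_min) & (tc <= 100)

    # (row, col) coordinates of each class, in row-major order
    low_coords = np.argwhere(low)
    high_coords = np.argwhere(high)

    # Build the tree on the low-valued cells, then query it with the
    # high-valued cells; k=1 asks for the single nearest neighbour
    tree = cKDTree(low_coords)
    dist_cells, _ = tree.query(high_coords, k=1)

    # Boolean indexing is also row-major, so it lines up with high_coords
    return dist_cells * cell_size, rzs[high], tc[high]


# Random stand-in data with the same shape as the grids in the question
rng = np.random.default_rng(0)
tc = np.clip(rng.lognormal(3.0, size=(250, 186)), 0, 100)
rzs = rng.uniform(0, 100, size=(250, 186))
tc[60:80, :] = np.nan
rzs[60:80, :] = np.nan

dist_deg, high_rzs, high_tc = nearest_low_distance(tc, rzs)
```

The returned `dist_deg` and `high_rzs` can be passed straight to `plt.scatter` to get the plot described in the question. The tree never materialises the full pairwise matrix, which is the main reason to prefer it on larger grids.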

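When calculating the distance, do you want to cross-correlate all 50-100 points with all 0-30 points? Because then each 50-100 point has a large number of values, not a single one.

@9769953 Sorry for the confusion. It is the nearest value between 0-30. This is also mentioned in the last line of the third paragraph. I have added an example, which may help.

So for a given point with a value between 50 and 100, you want to know the distance to the nearest point with a value between 0 and 30? Correct? And, of course, for each point with a value between 50 and 100.

@9769953 ... Correct.

In case this data, or similar data, is publicly available somewhere, I would appreciate a pointer to it, so that I can make my answer (and example) more representative and, overall, better.

Unfortunately, I have no easy way to upload the notebook I created this in, which would be slightly better than the current combination of code, comments and figures.

Thank you for the detailed explanation. I tried this and it works well. The only problem is that I cannot visualize which class these pixels belong to. Can we add the tree cover classes to the plot in different colours? For example, one colour for TC between 50-60, and so on. I am not able to accumulate all the datasets in one go.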
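
One possible way to approach the colouring asked about in the last comment is to bin the tree cover values with `np.digitize` and draw one scatter series per class. This is a sketch only: the helper names `tc_class_index` and `scatter_by_tc_class`, the class edges, and the random stand-in arrays are all assumptions made for illustration. In practice the three arrays would be the distances, the RZS values and the TC values of the 50-100 points, in matching order and without NaNs.

```python
import numpy as np
import matplotlib.pyplot as plt


def tc_class_index(tc_values, edges):
    """Return the index of the class [edges[i], edges[i+1]) for each TC value.

    Hypothetical helper; assumes all values lie within edges[0]..edges[-1].
    """
    # np.digitize gives 1-based bin numbers for values inside the edges
    idx = np.digitize(tc_values, edges) - 1
    # Values equal to the last edge (e.g. TC == 100) go into the last class
    return np.clip(idx, 0, len(edges) - 2)


def scatter_by_tc_class(ax, distances, rzs_values, tc_values, edges):
    """Draw one scatter series per TC class, each with its own colour."""
    n_classes = len(edges) - 1
    classes = tc_class_index(tc_values, edges)
    colors = plt.cm.viridis(np.linspace(0, 1, n_classes))
    for i in range(n_classes):
        sel = classes == i
        ax.scatter(distances[sel], rzs_values[sel], s=8, color=colors[i],
                   label=f"TC {edges[i]}-{edges[i + 1]}")
    ax.set_xlabel("Distance to nearest TC 0-30 point")
    ax.set_ylabel("RZS")
    ax.legend(title="Tree cover class")


# Random stand-in values, used only to make the sketch runnable
rng = np.random.default_rng(0)
tc_high = rng.uniform(50, 100, size=500)
dist = rng.uniform(0, 20, size=500)
rzs_high = rng.uniform(0, 100, size=500)

fig, ax = plt.subplots()
scatter_by_tc_class(ax, dist, rzs_high, tc_high, edges=[50, 60, 70, 80, 90, 100])
```

Calling `plt.show()` afterwards displays the figure. Drawing a separate series per class, instead of passing a single `c=` array, gives a discrete legend entry for each tree cover range.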