Pandas: find the distance from each point to all the other points, in a loop
I am new to Python. I have a CSV file that contains 400 pairs of x and y in two columns. I want to loop over the data so that it starts from one pair (x_i, y_i) and finds the distance between that pair and the remaining 399 points. I want to repeat the process for every pair (x_i, y_i) and append the results to a list Dist_i.
import numpy as np
import pandas as pd

x_y_data = pd.read_csv("x_y_points400_labeled_csv.csv")
x = x_y_data.loc[:, 'x']
y = x_y_data.loc[:, 'y']
i = 0
j = 0
while i < len(x):
    # Dist is overwritten on every pass, so nothing is collected
    Dist = np.sqrt((x[i] - x)**2 + (y[j] - y)**2)
    i = 1 + i
    j = 1 + j
print(Dist)
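This is as far as I got, but it is not what I am trying to get. My goal is to obtain something like the attached picture. Thanks in advance for your help.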
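You can use broadcasting (arr[:, None]) to do this calculation in one step. This gives you the full matrix, including the repeated calculations you asked for. Otherwise, scipy.spatial.distance.pdist gives you only the upper triangle of the calculations.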
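To make the broadcasting idea concrete, here is a minimal sketch that compares an explicit loop with the broadcast version. The three points and the function names `pairwise_loop` and `pairwise_broadcast` are made up for illustration; they are not from the question's CSV.

```python
import numpy as np

# Hypothetical sample coordinates (not the asker's data).
x = np.array([0.0, 3.0, 0.0])
y = np.array([0.0, 4.0, 1.0])


def pairwise_loop(x, y):
    """Distance from each point i to every point, one row per point."""
    dist_rows = []
    for i in range(len(x)):
        # x[i] - x subtracts point i from all points at once.
        dist_rows.append(np.sqrt((x[i] - x) ** 2 + (y[i] - y) ** 2))
    return np.vstack(dist_rows)


def pairwise_broadcast(x, y):
    """Same matrix in one step: x[:, None] has shape (n, 1), x has shape (n,)."""
    return np.sqrt((x[:, None] - x) ** 2 + (y[:, None] - y) ** 2)


loop_result = pairwise_loop(x, y)
broadcast_result = pairwise_broadcast(x, y)
print(np.allclose(loop_result, broadcast_result))
```

Both versions return an n-by-n matrix whose row i holds the distances from point i to every point, with zeros on the diagonal.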
Sample data
With scipy
from scipy.spatial.distance import pdist
pdist(df[['X', 'Y']])
array([2.8532972 , 0.82759587, 1.95770875, 3.00078036, 1.16534282,
       3.27316125, 2.91598992, 1.17270443, 1.70814458, 2.78266933,
       3.1214628 , 1.74902298, 3.7184812 , 1.77945856, 2.09245472])
Converting this to the square DataFrame form:
L = len(df)
arr = np.zeros((L, L))
arr[np.triu_indices(L, 1)] = pdist(df[['X', 'Y']])
arr = arr + arr.T # Lower triangle b/c symmetric
pd.DataFrame(arr, index=df.index, columns=df.index)
          point0    point1    point2    point3    point4    point5
point0  0.000000  2.853297  0.827596  1.957709  3.000780  1.165343
point1  2.853297  0.000000  3.273161  2.915990  1.172704  1.708145
point2  0.827596  3.273161  0.000000  2.782669  3.121463  1.749023
point3  1.957709  2.915990  2.782669  0.000000  3.718481  1.779459
point4  3.000780  1.172704  3.121463  3.718481  0.000000  2.092455
point5  1.165343  1.708145  1.749023  1.779459  2.092455  0.000000
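As a possible shortcut for the conversion step, `scipy.spatial.distance.squareform` expands the condensed output of `pdist` into the full symmetric matrix. The sketch below assumes a small made-up DataFrame with columns X and Y; the coordinates are hypothetical and not the sample data used above.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Hypothetical points, chosen so the distances are easy to check by hand.
df = pd.DataFrame({"X": [0.0, 3.0, 0.0], "Y": [0.0, 4.0, 1.0]},
                  index=["point0", "point1", "point2"])

# Condensed form: one entry per unordered pair, n * (n - 1) / 2 values.
condensed = pdist(df[["X", "Y"]].to_numpy())

# squareform fills both triangles and leaves zeros on the diagonal.
square = pd.DataFrame(squareform(condensed), index=df.index, columns=df.index)
print(square)
```

This should give the same matrix as filling `np.triu_indices` by hand and adding the transpose, with less bookkeeping.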
Thank you very much for your answer. Next I tried to get the values smaller than 2; I tried adding result1=result
# Broadcasting: x[:, None] - x gives every pairwise difference at once
x = df['X'].to_numpy()
y = df['Y'].to_numpy()
result = pd.DataFrame(np.sqrt((x[:, None] - x)**2 + (y[:, None] - y)**2),
                      index=df.index,
                      columns=df.index)
          point0    point1    point2    point3    point4    point5
point0  0.000000  2.853297  0.827596  1.957709  3.000780  1.165343
point1  2.853297  0.000000  3.273161  2.915990  1.172704  1.708145
point2  0.827596  3.273161  0.000000  2.782669  3.121463  1.749023
point3  1.957709  2.915990  2.782669  0.000000  3.718481  1.779459
point4  3.000780  1.172704  3.121463  3.718481  0.000000  2.092455
point5  1.165343  1.708145  1.749023  1.779459  2.092455  0.000000
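For the follow-up about keeping only the distances smaller than 2, one possible approach is a boolean mask on the square matrix. This is only a sketch: the points are made up, and the names `threshold`, `mask`, `close` and `neighbors` are hypothetical, since the comment above does not show the full attempt.

```python
import numpy as np
import pandas as pd

# Hypothetical points (the thread's sample data is not reproduced here).
df = pd.DataFrame({"X": [0.0, 1.0, 4.0], "Y": [0.0, 1.0, 0.0]},
                  index=["point0", "point1", "point2"])
x = df["X"].to_numpy()
y = df["Y"].to_numpy()
result = pd.DataFrame(np.sqrt((x[:, None] - x) ** 2 + (y[:, None] - y) ** 2),
                      index=df.index, columns=df.index)

threshold = 2  # cutoff mentioned in the follow-up comment

# True where the distance is below the cutoff, ignoring each point's
# zero distance to itself on the diagonal.
mask = (result.to_numpy() < threshold) & ~np.eye(len(result), dtype=bool)

# Same shape as result; entries that fail the test become NaN.
close = result.where(mask)

# For each point, the labels of the points closer than the cutoff.
neighbors = {name: list(result.columns[row])
             for name, row in zip(result.index, mask)}
print(neighbors)
```

Masking the diagonal explicitly, rather than testing for distances greater than 0, keeps genuinely coincident points in the result.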