Python: I need to get the (x, y) coordinates of the densest points in a DataFrame


python machine-learning statistics


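I have a DataFrame with (X, Y) coordinates, and I need a list of those coordinates ordered by density, with the points in the densest region first.

I took the mean of the (X, Y) coordinates, computed the distance from that mean point to every other point, and sorted the points by that distance, but the mean does not always lie in the densest region. With gaussian_kde I can visualize the densest points, but I don't know how to extract those points into a list.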

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import random
from scipy.stats import gaussian_kde
from scipy.spatial.distance import cdist
from scipy.spatial import distance

def closest_point(point, points):
    """ Find the nearest point. """
    return points[cdist([point], points).argmin()]

x = [random.randint(0, 100) for _ in range(50)]
y = [random.randint(0, 100) for _ in range(50)]
fr = pd.DataFrame({'x':x,'y':y})

mx = fr['x'].mean()
my = fr['y'].mean()
fr2 = pd.DataFrame({'x':[mx],'y':[my]})

fr['Punto'] =  [(x, y) for x,y in zip(fr['x'], fr['y'])]
fr2['Punto'] = [(x, y) for x,y in zip(fr2['x'], fr2['y'])]
fr2['Cercano'] = [closest_point(x, list(fr['Punto'])) for x in fr2['Punto']]

lista = fr['Punto'].tolist()
media = fr2['Punto'].tolist()

distancia_numpy = distance.cdist(lista, media, 'euclidean')
# Add a column with the distance from each point to the mean point.
# cdist returns shape (n, 1), so flatten it to get one scalar per row.
fr['Distancia'] = distancia_numpy.ravel()
ordenado = fr.sort_values('Distancia', ascending = True)

xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
fig, ax = plt.subplots()
ax.scatter(x, y, s=50, c=z, edgecolor='none')
# The mean of the points, in red
ax.scatter(mx, my, s=100, c='red', edgecolor='none')

plt.show()
print(ordenado)

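The result should be a list or a sorted DataFrame with the densest points first. I do get results, but they are not correct, because the mean point does not lie where the density is highest.

Any help is welcome.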
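To see why the mean can miss the densest region, here is a small sketch on synthetic data. The two-cluster sample is an assumption made up for illustration and is not the data from the question. With two separated clusters, the mean lands in the gap between them, where the estimated density is lower than at the best sample point.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical data: two well-separated clusters of 25 points each.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=(20, 20), scale=5, size=(25, 2))
cluster_b = rng.normal(loc=(80, 80), scale=5, size=(25, 2))
points = np.vstack([cluster_a, cluster_b])      # shape (50, 2)

kde = gaussian_kde(points.T)                    # expects shape (n_dims, n_points)
density_at_points = kde(points.T)               # one density value per sample point

mean_point = points.mean(axis=0)                # falls between the two clusters
density_at_mean = kde(mean_point.reshape(2, 1))[0]

densest_point = points[density_at_points.argmax()]
print("mean point:", mean_point, "density:", density_at_mean)
print("densest sample point:", densest_point, "density:", density_at_points.max())
```

Ranking by distance to the mean therefore favours points near the gap, not points inside either cluster.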

It sounds like you need to sort the points by the estimated pdf: using

z.evaluate(xy)

as the (reverse) sort key will give the most probable points first.
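A minimal sketch of that suggestion, assuming a setup like the one in the question: it builds the estimator, evaluates it at the input points, and sorts by the result in descending order. The names kde, density, ranked and densest are chosen here for illustration. Note that in the question's code z already holds the evaluated densities, because gaussian_kde(xy)(xy) builds the estimator and calls it in one step.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)              # seeded so the run is repeatable
fr = pd.DataFrame({
    'x': rng.integers(0, 101, size=50),      # integers in [0, 100], as in the question
    'y': rng.integers(0, 101, size=50),
})

# Shape (2, 50): one column per point. Cast to float for the estimator.
xy = np.vstack([fr['x'], fr['y']]).astype(float)
kde = gaussian_kde(xy)                       # the density estimator
fr['density'] = kde.evaluate(xy)             # estimated pdf at each input point

# Reverse sort: highest density first.
ranked = fr.sort_values('density', ascending=False).reset_index(drop=True)
densest = list(zip(ranked['x'], ranked['y']))    # plain list of (x, y) tuples

print(ranked.head(10))
```

Here ranked is the ordered DataFrame and densest is the same ordering as a plain list, which covers both output forms the question asks for.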

Thanks a lot! This code does the job:

point_gaus = pd.DataFrame({'x':x,'y':y,'gauss':list(z)})
point_gaus_order = point_gaus.sort_values('gauss', ascending = False)
point_gaus_order_10 = point_gaus_order[:10]
ax.scatter(point_gaus_order_10['x'], point_gaus_order_10['y'], s=25, c='red', edgecolor='none')

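Can you explain a bit more? I can't find an example of how to use the evaluate method correctly.

From what you describe, you need to sort the points by density in descending order. You constructed z as the density estimate (KDE), so z is your density. You need to evaluate it at the input points and sort the points by the result. See the documentation for scipy.stats.gaussian_kde.evaluate.

Thanks! Your help put me on the right track.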
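For reference, a short sketch of how evaluate is called. It takes an array of shape (n_dims, n_points) and returns one density value per column. Calling the estimator object directly does the same thing, which is why z = gaussian_kde(xy)(xy) in the question already yields the densities. The sample coordinates below are made up for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Made-up sample: a tight group near (1, 1) plus two points farther away.
x = np.array([1.0, 1.2, 0.9, 1.1, 5.0, 9.0])
y = np.array([1.0, 0.8, 1.1, 1.3, 5.5, 2.0])
xy = np.vstack([x, y])                   # shape (2, 6)

kde = gaussian_kde(xy)
via_evaluate = kde.evaluate(xy)          # explicit method call
via_call = kde(xy)                       # calling the object evaluates it too

order = np.argsort(via_evaluate)[::-1]   # indices from highest to lowest density
sorted_points = xy[:, order].T           # rows are (x, y), densest first
print(sorted_points)
```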