Python: an efficient way to connect the k nearest neighbors in a scatter plot using matplotlib
I am trying to build a nearest-neighbor graph, i.e. a scatter plot in which every data point is connected to its k nearest neighbors. My current solution works, but it is obviously not efficient. Here is what I have so far:
import numpy as np
from scipy.spatial.distance import pdist, squareform
from matplotlib import pyplot as plt

X = np.random.random(500).reshape((250, 2))
k = 4

# matrix of pairwise Euclidean distances
distmat = squareform(pdist(X, 'euclidean'))

# select the kNN for each datapoint
# note: column 0 of the argsort is normally the point itself (distance 0),
# so [:, 0:k] keeps the point plus its k - 1 nearest others
neighbors = np.sort(np.argsort(distmat, axis=1)[:, 0:k])

plt.figure(figsize = (8, 8))
plt.scatter(X[:,0], X[:,1], c = 'black')
# draw one line per (point, neighbor) pair: N * k separate plot calls
for i in np.arange(250):
    for j in np.arange(k):
        x1 = np.array([X[i,:][0], X[neighbors[i, j], :][0]])
        x2 = np.array([X[i,:][1], X[neighbors[i, j], :][1]])
        plt.plot(x1, x2, color = 'black')
plt.show()
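Is there a more efficient way to build this plot?

Use a LineCollection to draw all the edges at once, rather than drawing them one by one: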
import numpy as np
from scipy.spatial.distance import pdist, squareform
from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection

N = 250
X = np.random.rand(N, 2)
k = 4

# matrix of pairwise Euclidean distances
distmat = squareform(pdist(X, 'euclidean'))

# select the kNN for each datapoint
neighbors = np.sort(np.argsort(distmat, axis=1)[:, 0:k])

# get edge coordinates
coordinates = np.zeros((N, k, 2, 2))
for i in np.arange(N):
    for j in np.arange(k):
        coordinates[i, j, :, 0] = np.array([X[i,:][0], X[neighbors[i, j], :][0]])
        coordinates[i, j, :, 1] = np.array([X[i,:][1], X[neighbors[i, j], :][1]])

# create line artists
lines = LineCollection(coordinates.reshape((N*k, 2, 2)), color='black')

fig, ax = plt.subplots(1,1,figsize = (8, 8))
ax.scatter(X[:,0], X[:,1], c = 'black')
ax.add_artist(lines)
plt.show()
On my machine, your code takes about 1 second to run; my version takes 65 ms.
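The two nested loops that fill `coordinates` in the LineCollection version can probably be replaced by NumPy indexing as well. The following is only a minimal sketch, assuming the same `X`, `k`, and `neighbors` arrays as in the code above; the helper name `knn_segments` is hypothetical and not part of the original answer, and a seeded generator is used here only to make the example reproducible.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def knn_segments(X, neighbors):
    """Return an (N*k, 2, 2) array of segments, one per (point, neighbor) pair.

    X is an (N, 2) array of points; neighbors is an (N, k) integer array of
    row indices into X. Hypothetical helper, for illustration only.
    """
    N, k = neighbors.shape
    starts = np.repeat(X, k, axis=0)          # (N*k, 2): each point repeated k times
    ends = X[neighbors].reshape(N * k, 2)     # (N*k, 2): the matching neighbor of each copy
    return np.stack([starts, ends], axis=1)   # (N*k, 2, 2): [start, end] for every edge


rng = np.random.default_rng(0)
X = rng.random((250, 2))
k = 4

# same neighbor selection as in the code above
distmat = squareform(pdist(X, 'euclidean'))
neighbors = np.sort(np.argsort(distmat, axis=1)[:, 0:k])

segments = knn_segments(X, neighbors)
```

The resulting array has the layout that `coordinates.reshape((N*k, 2, 2))` has in the code above, so it can be passed to `LineCollection(segments, color='black')` in the same way.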
Thanks, this is exactly what I was looking for!
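For larger point sets, building the full N x N distance matrix may become the expensive step before plotting does. One possible alternative, sketched here under the assumption that SciPy's `cKDTree` is acceptable, is to query a KD-tree for the neighbors directly. This is not what the answer above does, and the helper name `knn_indices` is hypothetical. Unlike `np.argsort(distmat, axis=1)[:, 0:k]`, this sketch drops the query point itself and assumes all points are distinct.

```python
import numpy as np
from scipy.spatial import cKDTree


def knn_indices(X, k):
    """Return an (N, k) array with the indices of the k nearest neighbors of each row of X.

    The point itself is excluded. Assumes the rows of X are distinct, so that
    the closest hit of every query is the query point itself (distance 0).
    Hypothetical helper, for illustration only.
    """
    tree = cKDTree(X)
    # ask for k + 1 hits per point, then drop the first column (the point itself)
    _, idx = tree.query(X, k=k + 1)
    return idx[:, 1:]


rng = np.random.default_rng(0)
X = rng.random((250, 2))
neighbors = knn_indices(X, 4)
```

The returned index array has the same (N, k) shape as `neighbors` in the code above, so the edge coordinates can be built from it in the same way.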