Python: How to visualize the test samples of a k-nearest-neighbors classifier?

python scikit-learn

I want to visualize four test samples of a k-NN classifier. I have searched but found nothing. Can you help me implement the code?

This is my code so far:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split  # was missing in the original snippet
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

X, y = make_moons(n_samples=100, noise=0.3)
# test_size=0.04 of 100 samples leaves 4 test points, one per axis
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.04, random_state=42)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

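The result should be a figure with 1×4 axes. In each axis I want to show the training samples, the corresponding test sample (drawn with a "+" marker), and the k nearest neighbors of that sample (drawn with a green edge). The title of each axis should state the predicted class.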

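One way to do this is to reimplement the neighbor search of the KNN algorithm yourself, because predict returns only the class labels and not the neighbors that were used for a given sample.

How to do this depends on the distance metric the KNN algorithm uses.

For example, you can define a function that extracts the nearest neighbors according to the L1 (Manhattan) distance as follows: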

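def get_neighbors(xs, sample, k=5):
    # Pair every training point with its L1 (Manhattan) distance to the sample
    neighbors = [(x, np.sum(np.abs(x - sample))) for x in xs]
    # Sort by distance, closest first
    neighbors = sorted(neighbors, key=lambda x: x[1])
    # Keep only the coordinates of the k closest points
    return np.array([x for x, _ in neighbors[:k]])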

where xs is your training set and sample is the point you want to make a prediction for.
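The helper above hard-codes the L1 distance. As a rough sketch, here is a vectorized variant with a configurable Minkowski exponent. The name get_neighbors_minkowski and the parameter p are illustrative and do not come from the original answer. scikit-learn's KNeighborsClassifier defaults to the Minkowski metric with p=2 (Euclidean), so p=2 is probably the closer match for the classifier fitted in the question.

```python
import numpy as np

def get_neighbors_minkowski(xs, sample, k=5, p=2):
    """Return the k rows of xs closest to sample under the Minkowski-p distance.

    Hypothetical helper: p=1 is Manhattan (L1), p=2 is Euclidean (L2).
    """
    xs = np.asarray(xs, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Distance from every training point to the sample, computed in one pass
    dists = np.sum(np.abs(xs - sample) ** p, axis=1) ** (1.0 / p)
    # Indices of the k smallest distances, closest first
    nearest = np.argsort(dists, kind="stable")[:k]
    return xs[nearest]

# Toy data where the two metrics disagree about the single nearest point
toy_points = np.array([[2.0, 2.0], [0.0, 3.0], [5.0, 5.0]])
origin = np.array([0.0, 0.0])
nearest_l1 = get_neighbors_minkowski(toy_points, origin, k=1, p=1)
nearest_l2 = get_neighbors_minkowski(toy_points, origin, k=1, p=2)
```

In the toy data the point (0, 3) is nearest under L1, while (2, 2) is nearest under L2, which is why the highlighted neighbors should use the same metric as the fitted classifier.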

Now you can easily visualize this by scattering the training points, the test point, and its neighbors on a plot:

_, ax = plt.subplots(nrows=1, ncols=4, figsize=(15, 5))
for i in range(4):
    sample = X_test[i]
    neighbors = get_neighbors(X_train, sample, k=5)
    # All training points
    ax[i].scatter(X_train[:, 0], X_train[:, 1], c="skyblue")
    # The k nearest neighbors, outlined in green
    ax[i].scatter(neighbors[:, 0], neighbors[:, 1], edgecolor="green")
    # The test sample itself
    ax[i].scatter(sample[0], sample[1], marker="+", c="red", s=100)
    ax[i].set(xlim=(-2, 2), ylim=(-2, 2))

plt.tight_layout()

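You can style the plot by passing the appropriate arguments to the scatter method. Note that I clip the view here by setting xlim and ylim. You can change them too, but take care to keep a 1:1 ratio between the x-axis and the y-axis, otherwise the highlighted neighbors may not look like the closest points.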
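The question also asks for the predicted class in each axis title. The following sketch adds that and, instead of a hand-written distance function, asks the fitted classifier for its neighbors through its kneighbors method, which returns distances and training-set indices under whatever metric the model was fitted with. Some choices here are assumptions made for illustration: the fixed random_state, the Agg backend (so the script runs outside a notebook), test_size=4 in place of 0.04, the colors, and the use of set_aspect("equal") to enforce the 1:1 axis ratio.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# random_state on make_moons is an assumption, added for reproducibility
X, y = make_moons(n_samples=100, noise=0.3, random_state=0)
# test_size=4 holds out exactly four points (the question uses 0.04 of 100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=4, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = knn.predict(X_test)

# kneighbors returns (distances, indices); the indices refer to rows of X_train
_, neighbor_idx = knn.kneighbors(X_test)

fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(16, 4))
for ax, sample, idx, pred in zip(axes, X_test, neighbor_idx, y_pred):
    # Training samples, colored by their true class
    ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap="coolwarm", s=25)
    # The k nearest neighbors, marked with a hollow green ring
    ax.scatter(X_train[idx, 0], X_train[idx, 1], facecolors="none",
               edgecolors="green", s=120, linewidths=2)
    # The test sample itself
    ax.scatter(sample[0], sample[1], marker="+", c="black", s=150)
    ax.set_title(f"Predicted class: {pred}")
    # Equal scaling on both axes so distances on screen are not distorted
    ax.set_aspect("equal")

fig.tight_layout()
```

Calling fig.savefig with a file name of your choice writes the figure to disk; in a notebook the figure is displayed directly once the Agg line is removed.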

Thank you very much! Your answer is very clear. I learned a lot.