How to develop the KNN algorithm from scratch using Python loops


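I am learning Python and practicing by developing KNN without using any libraries.

Below are the 3 main steps I want to take, but my code is full of errors.

The data I am using has 4 features and two classes.

Please look at what I am trying to do below and help me improve it. The main error I get is:

TypeError: only size-1 arrays can be converted to Python scalars

The plan has 3 stages: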

  • Prepare the data and split it (for evaluation):

  • Measure all the distances for KNN
    from math import sqrt
    from collections import Counter
    # new_measure = (X_new)
    # X_new = [1, 2, 3, 4]
    distance = []
    for group in X_train:
        for features in X_train:
            Eu_dis = sqrt((X_new[0] - X_train[0])**2 + (X_new[1] - X_train[1])**2
                          + (X_new[2] - X_train[2])**2 + (X_new[3] - X_train[3])**2)
    
  • Identify the k nearest neighbors and the most likely class

  • How do I continue from here?
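
On the TypeError: `math.sqrt` accepts only a single number. The loop in stage 2 indexes `X_train[0]` instead of the loop variable, so if `X_train` is a 2-D NumPy array (an assumption, since the question does not show how it was built), each difference is a whole row and `sqrt` receives an array. A minimal sketch of the loop producing one distance per training row, using made-up data in plain lists:

```python
from math import sqrt

# Hypothetical data: four features per row, as described in the question.
X_train = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 8.0],
]
X_new = [1.0, 2.0, 3.0, 4.0]

distance = []
for features in X_train:
    # Compare X_new with the current row, feature by feature.
    squared = sum((a - b) ** 2 for a, b in zip(X_new, features))
    distance.append(sqrt(squared))

print(distance)  # [0.0, 1.0, 4.0]
```

A single loop is enough here because each training row yields exactly one distance.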

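    Here are all the functions you need: 1. calculate the Euclidean distance between two vectors; 2. find the most similar neighbors; 3. make a classification prediction from the neighbors.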

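    from math import sqrt

    # calculate the Euclidean distance between two vectors
    # (each row is the features followed by the class label)
    def euclidean_distance(row1, row2):
        distance = 0.0
        for i in range(len(row1)-1):  # skip the last element, the label
            distance += (row1[i] - row2[i])**2
        return sqrt(distance)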
    
    # Locate the most similar neighbors
    def get_neighbors(train, test_row, num_neighbors):
        distances = list()
        for train_row in train:
            dist = euclidean_distance(test_row, train_row)
            distances.append((train_row, dist))
        distances.sort(key=lambda tup: tup[1])
        neighbors = list()
        for i in range(num_neighbors):
            neighbors.append(distances[i][0])
        return neighbors
    
    # Make a classification prediction with neighbors
    def predict_classification(train, test_row, num_neighbors):
        neighbors = get_neighbors(train, test_row, num_neighbors)
        output_values = [row[-1] for row in neighbors]
        prediction = max(set(output_values), key=output_values.count)
        return prediction
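
A short usage sketch of the three functions above, covering the split and evaluation step from the question. The functions are repeated in compact form so the block runs on its own. The dataset, the `split_rows` helper, and the 0/1 labels are made up for illustration. Each row holds four features followed by the class label, which is the layout the functions assume.

```python
from math import sqrt

def euclidean_distance(row1, row2):
    distance = 0.0
    for i in range(len(row1) - 1):  # the last element is the label
        distance += (row1[i] - row2[i]) ** 2
    return sqrt(distance)

def get_neighbors(train, test_row, num_neighbors):
    distances = [(row, euclidean_distance(test_row, row)) for row in train]
    distances.sort(key=lambda tup: tup[1])
    return [row for row, _ in distances[:num_neighbors]]

def predict_classification(train, test_row, num_neighbors):
    neighbors = get_neighbors(train, test_row, num_neighbors)
    output_values = [row[-1] for row in neighbors]
    return max(set(output_values), key=output_values.count)

def split_rows(rows, test_every=4):
    """Hypothetical helper: every `test_every`-th row goes to the test set."""
    train = [r for i, r in enumerate(rows) if i % test_every != test_every - 1]
    test = [r for i, r in enumerate(rows) if i % test_every == test_every - 1]
    return train, test

# Made-up data: two well-separated classes, four features plus a label.
dataset = [
    [1.0, 1.1, 0.9, 1.0, 0],
    [1.2, 0.9, 1.0, 1.1, 0],
    [0.8, 1.0, 1.2, 0.9, 0],
    [1.1, 1.0, 1.0, 1.0, 0],
    [5.0, 5.1, 4.9, 5.0, 1],
    [5.2, 4.9, 5.0, 5.1, 1],
    [4.8, 5.0, 5.2, 4.9, 1],
    [5.1, 5.0, 5.0, 5.0, 1],
]

train, test = split_rows(dataset)
predictions = [predict_classification(train, row, 3) for row in test]
# Fraction of test rows whose predicted label matches the true label.
accuracy = sum(p == row[-1] for p, row in zip(predictions, test)) / len(test)
print(predictions, accuracy)  # [0, 1] 1.0
```

The split here is deterministic to keep the example reproducible; with real data the rows would usually be shuffled first.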
    

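    Thanks, very helpful. I used your hints and tried to apply them to the iris data. I still get an error; maybe you can spot the fix. This is the error. What do you think is happening?

         10     distance = 0.0
         11     for i in range(len(row1)-1):
    ---> 12         distance += (row1[i] - row2[i])**2
         13     return sqrt(distance)
         14
    IndexError: index 4 is out of bounds for axis 0 with size 4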
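
The message says that an array of size 4 was indexed at position 4, which suggests the two arguments passed to `euclidean_distance` did not have the same shape. The functions in the answer expect every row, training and test alike, to be the four features followed by the label. A minimal sketch of building such rows, assuming the iris features and labels are held separately (the names `X`, `y`, and `make_rows` and the sample values are made up here):

```python
def make_rows(X, y):
    """Hypothetical helper: append each label to its feature list."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    return [list(features) + [label] for features, label in zip(X, y)]

# Made-up measurements standing in for the real data.
X = [[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4]]
y = [0, 1]

rows = make_rows(X, y)
# Every row now has 5 elements: 4 features and the label.
assert all(len(row) == 5 for row in rows)
print(rows[0])  # [5.1, 3.5, 1.4, 0.2, 0]
```

With rows in this layout, one row at a time would be passed as `test_row`, not the whole test set.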