Python: replacing a nested loop


python python-3.x numpy machine-learning


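I have just started learning Python, and I am having trouble understanding how to achieve the following (I am a Java programmer).

Here is the initial code: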

  def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the 
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """

    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))

    for i in range(num_test):
      for j in range(num_train):
        #####################################################################
        # TODO:                                                             #
        # Compute the l2 distance between the ith test point and the jth    #
        # training point, and store the result in dists[i, j]. You should   #
        # not use a loop over dimension.                                    #
        #####################################################################
        dists[i, j] = np.sum(np.square(X[i] - self.X_train[j]))
        #####################################################################
        #                       END OF YOUR CODE                            #
        #####################################################################
    return dists

Here is the code that is supposed to produce the same array with one fewer nested loop:

  def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))

    for i in range(num_test):
      tmp = '%s %d' % ("\nfor i:", i)
      print(tmp)

      print(X[i])
      print("end of X[i]")
      print(self.X_train[:]) # all the thing [[ ... ... ]]
      print(": before, i after")
      print(self.X_train[i]) # just a row
      print(self.X_train[i, :])

      #######################################################################
      # TODO:                                                               #
      # Compute the l2 distance between the ith test point and all training #
      # points, and store the result in dists[i, :].                        #
      #######################################################################
      dists[i, :] = np.sum(np.square(X[i] - self.X_train[i, :]))
      print(dists[i])
      #######################################################################
      #                         END OF YOUR CODE                            #
      #######################################################################
    return dists

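This seems like it should help me, but I still cannot figure it out.

As you can see, one of my pitfalls is that I do not understand well enough how exactly ":" works.

I have spent hours trying to figure this out, but it seems I really lack some core knowledge. Can anyone help me? This exercise comes from Stanford's visual recognition course. It is the first assignment, but it is not really homework for me, since I am only taking the course on my own for fun.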

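Currently, my code outputs the correct value for the diagonal of the two_loops result, but repeated across the whole row. I do not understand how I should sync the ":" in dists[i, :] with the - self.X_train[i, :] part. How do I compute X[i] minus an iteration that goes through the whole of self.X_train?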

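Note: num_test is 500x3072 and num_train is 5000x3072. The 3072 comes from 32x32x3, the RGB values of a 32x32 picture. dists[i, j] is a 500x5000 matrix that holds the L2 distance between the i-th element of num_test and the j-th element of num_train.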
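To make the indexing expressions from the question concrete, here is a small sketch on toy arrays. The sizes (4 test points, 6 training points, 3 features) and the standalone names X and X_train are assumptions for illustration only, not the real 500x3072 and 5000x3072 data.

```python
import numpy as np

# Toy stand-ins for the real data (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
X = rng.random((4, 3))        # 4 test points, 3 features each
X_train = rng.random((6, 3))  # 6 training points, 3 features each

row = X_train[1]              # a single row, shape (3,)
same_row = X_train[1, :]      # the same single row, written with an explicit ":"
everything = X_train[:]       # the whole array, shape (6, 3)

# Subtracting the whole matrix from one row broadcasts over all rows:
# the result holds one difference vector per training point.
diff = X[0] - X_train         # shape (6, 3)

shapes = (row.shape, same_row.shape, everything.shape, diff.shape)
print(shapes)
```

Indexing with a single row number, as in X_train[i, :], always yields one row, so subtracting it from X[i] gives a single difference vector rather than one per training point.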

def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))

    for i in range(num_test):
      tmp = '%s %d' % ("\nfor i:", i)
      print(tmp)

      #######################################################################
      # TODO:                                                               #
      # Compute the l2 distance between the ith test point and all training #
      # points, and store the result in dists[i, :].                        #
      #######################################################################
      dists[i] = np.sum(np.square(X[i] - self.X_train), axis=1)
      print(dists[i])
      #######################################################################
      #                         END OF YOUR CODE                            #
      #######################################################################
    return dists

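I removed the prints that index self.X_train inside the loop, because the lengths differ (IndexOutOfRangeException; in Python this is an IndexError). I am not sure whether this counts as removing the second loop, but it is a working solution.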
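One way to check that the single-loop version really reproduces the nested-loop result is to run both on small random data and compare. The sketch below rewrites the two methods as standalone functions (distances_two_loops and distances_one_loop are hypothetical names, and X_train is passed as an argument instead of being read from self); like the code in this thread, both compute the squared distance.

```python
import numpy as np

def distances_two_loops(X, X_train):
    """Squared L2 distances using a loop over test and training points."""
    num_test, num_train = X.shape[0], X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            dists[i, j] = np.sum(np.square(X[i] - X_train[j]))
    return dists

def distances_one_loop(X, X_train):
    """Squared L2 distances using a single loop over test points."""
    num_test, num_train = X.shape[0], X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        # X[i] has shape (D,), X_train has shape (num_train, D);
        # broadcasting subtracts every training row from X[i].
        dists[i] = np.sum(np.square(X[i] - X_train), axis=1)
    return dists

# Small random inputs (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
X = rng.random((5, 8))
X_train = rng.random((7, 8))

two = distances_two_loops(X, X_train)
one = distances_one_loop(X, X_train)
matches = np.allclose(two, one)
print(matches)
```

The explicit Python loop over j is gone; the iteration over training points still happens, but inside NumPy's compiled code.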

One more remark: I think your understanding of the Euclidean distance formula is wrong. You are missing the sqrt at the end.
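As a rough sketch of what adding the square root looks like, the loop below stores both the squared and the true Euclidean distances and cross-checks one entry against np.linalg.norm. The array sizes and the names squared, euclidean, and reference are assumptions for illustration.

```python
import numpy as np

# Small random inputs (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
X = rng.random((5, 8))
X_train = rng.random((7, 8))

squared = np.zeros((X.shape[0], X_train.shape[0]))
euclidean = np.zeros_like(squared)

for i in range(X.shape[0]):
    # Sum of squared differences per training point (no square root yet).
    squared[i] = np.sum(np.square(X[i] - X_train), axis=1)
    # The Euclidean (L2) distance is the square root of that sum.
    euclidean[i] = np.sqrt(squared[i])

# Cross-check a single entry against NumPy's own vector norm.
reference = np.linalg.norm(X[0] - X_train[3])
print(np.isclose(euclidean[0, 3], reference))
```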

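Seen from another angle, this clearly looks like just another for loop. I do not think this is the solution. Regarding your comment about sqrt: "in a practical nearest neighbor application we could leave out the square root operation because square root is a monotonic function. That is, it scales the absolute sizes of the distances but it preserves the ordering, so the nearest neighbors with or without it are identical."

I think this is what you were looking for. I fixed the code, indeed.

Could you please explain in your answer how it works?

X[i] - self.X_train returns an array holding the difference between the two-component array X[i] (the actual point, same as X[i, :]) and each two-component point in self.X_train (same as self.X_train[:]), keeping one result point per training point. np.square then squares every number, and the axis argument of np.sum tells NumPy to sum each row of that array (only the components within a row). The result is the array we need, which is stored in dists for that test point. I apologize for the very general explanation and for my English.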
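The explanation above can be traced step by step on a tiny example with two-component points. The values below are made up for illustration, and the intermediate names (diff, squared, row_sums, distances) are hypothetical.

```python
import numpy as np

x_i = np.array([1.0, 2.0])            # one test point, shape (2,)
X_train = np.array([[1.0, 2.0],
                    [4.0, 6.0],
                    [0.0, 0.0]])      # three training points, shape (3, 2)

# Broadcasting: x_i is subtracted against every row of X_train.
diff = x_i - X_train                  # [[0, 0], [-3, -4], [1, 2]]

# Element-wise square of every number.
squared = np.square(diff)             # [[0, 0], [9, 16], [1, 4]]

# axis=1 sums the components within each row, giving one value per training point.
row_sums = np.sum(squared, axis=1)    # [0, 25, 5]

# Optional square root for the true Euclidean distance.
distances = np.sqrt(row_sums)         # [0, 5, sqrt(5)]

# The square root is monotonic, so the ranking of neighbors does not change.
order_squared = np.argsort(row_sums)  # [0, 2, 1]
order_true = np.argsort(distances)    # [0, 2, 1]
print(row_sums, order_squared, order_true)
```

Without axis=1, np.sum would collapse the whole array into a single number, which is why the question's version filled each row of dists with one repeated value.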