Python: replacing a nested loop
I have just started learning Python and I am struggling to understand how to achieve the following (I am a Java programmer). Here is the initial code:
def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.
    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.
    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            #####################################################################
            # TODO:                                                             #
            # Compute the l2 distance between the ith test point and the jth    #
            # training point, and store the result in dists[i, j]. You should   #
            # not use a loop over dimension.                                    #
            #####################################################################
            dists[i, j] = np.sum(np.square(X[i] - self.X_train[j]))
            #####################################################################
            #                       END OF YOUR CODE                            #
            #####################################################################
    return dists
Below is a piece of code that is supposed to output the same array while using one less nested loop:
def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.
    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        tmp = '%s %d' % ("\nfor i:", i)
        print(tmp)
        print(X[i])
        print("end of X[i]")
        print(self.X_train[:])  # all the thing [[ ... ... ]]
        print(": before, i after")
        print(self.X_train[i])  # just a row
        print(self.X_train[i, :])
        #######################################################################
        # TODO:                                                               #
        # Compute the l2 distance between the ith test point and all training #
        # points, and store the result in dists[i, :].                        #
        #######################################################################
        dists[i, :] = np.sum(np.square(X[i] - self.X_train[i, :]))
        print(dists[i])
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
    return dists
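This seems like it should help me, but I still don't get it.
As you can see, one of my pitfalls is my poor understanding of how exactly ":" works.
I have spent hours trying to figure this out, but it seems I really lack some core knowledge. Can anyone help me? This exercise is for Stanford's visual recognition course. It is the first assignment, but it is not really homework for me, since I am taking the course on my own just for fun.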
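To make the ":" question concrete, here is a minimal sketch of basic NumPy slicing on a small made-up array. The name A and its 3x4 shape are assumptions chosen for illustration; A simply stands in for self.X_train.

```python
import numpy as np

# Hypothetical 3x4 array standing in for self.X_train.
A = np.arange(12).reshape(3, 4)

print(A[:].shape)     # (3, 4): a bare ":" keeps every row, so this is the whole array
print(A[1].shape)     # (4,): a single row
print(A[1, :].shape)  # (4,): the same row; here ":" means "all columns"
print(A[:, 1].shape)  # (3,): a single column; here ":" means "all rows"

# A[1] and A[1, :] select exactly the same data.
print(np.array_equal(A[1], A[1, :]))  # True
```

In other words, ":" does not iterate over anything by itself; it only selects a whole axis of the array it is applied to.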
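Currently, my code outputs the correct values for the diagonal of the two_loops result, but repeated across the entire row. I don't understand how I am supposed to synchronize the ":" in dists[i, :] with the "- self.X_train[i, :]" part. How do I compute X[i] minus an iteration that goes over the whole of self.X_train?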
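The symptom described above can be reproduced on small made-up data. The sketch below is only an illustration under assumed shapes (3 test points, 5 training points, 4 features); X_train is a plain array standing in for self.X_train.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((3, 4))        # 3 hypothetical test points, 4 features each
X_train = rng.random((5, 4))  # 5 hypothetical training points

i = 1
# X_train[i, :] selects ONE training row, so the sum collapses to a single number.
one_value = np.sum(np.square(X[i] - X_train[i, :]))

# Assigning a scalar to dists[i, :] copies that number into every column of row i.
dists = np.zeros((3, 5))
dists[i, :] = one_value

# The number is the value for test point i and training point i,
# which is the diagonal entry dists[i, i] of the two-loop result.
expected_diagonal = np.sum(np.square(X[i] - X_train[i]))
print(np.ndim(one_value))                        # 0, i.e. a scalar
print(np.allclose(dists[i], expected_diagonal))  # True
```

So the ":" on the left-hand side is not paired with the one on the right; the right-hand side has to produce one value per training point by itself.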
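Note: the test data X is 500x3072 and the training data self.X_train is 5000x3072. The 3072 comes from 32x32x3, the RGB values of a 32x32 picture. dists is a 500x5000 matrix in which dists[i, j] is the L2 distance between the ith test point and the jth training point.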
def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.
    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        tmp = '%s %d' % ("\nfor i:", i)
        print(tmp)
        #######################################################################
        # TODO:                                                               #
        # Compute the l2 distance between the ith test point and all training #
        # points, and store the result in dists[i, :].                        #
        #######################################################################
        dists[i] = np.sum(np.square(X[i] - self.X_train), axis=1)
        print(dists[i])
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
    return dists
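I removed the print statements that index self.X_train inside the loop, because its length differs from that of X (IndexOutOfRangeException).
I am not sure whether this really removes the second loop, but it is a working solution.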
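One way to check that the one-loop version matches the nested-loop version is to run both on small random data. The sketch below rewrites the two methods as standalone functions (two_loops and one_loop are names invented here, taking X_train as an argument instead of self.X_train) and uses made-up shapes.

```python
import numpy as np

def two_loops(X, X_train):
    """Nested Python loops over test and training points."""
    dists = np.zeros((X.shape[0], X_train.shape[0]))
    for i in range(X.shape[0]):
        for j in range(X_train.shape[0]):
            dists[i, j] = np.sum(np.square(X[i] - X_train[j]))
    return dists

def one_loop(X, X_train):
    """Single Python loop; each row of dists is filled in one step."""
    dists = np.zeros((X.shape[0], X_train.shape[0]))
    for i in range(X.shape[0]):
        # X[i] has shape (D,), X_train has shape (num_train, D).
        dists[i] = np.sum(np.square(X[i] - X_train), axis=1)
    return dists

rng = np.random.default_rng(0)
X = rng.random((4, 6))        # 4 hypothetical test points
X_train = rng.random((7, 6))  # 7 hypothetical training points

same = np.allclose(two_loops(X, X_train), one_loop(X, X_train))
print(same)  # True
```

The explicit Python loop over j is gone in one_loop; the per-row work still happens, but inside NumPy's compiled routines rather than in the interpreter.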
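One more remark: I think your understanding of the Euclidean distance formula is wrong. You are missing the sqrt at the end.

Looked at from another angle, this clearly looks like another for loop. I don't think this is the solution.

Regarding your comment about sqrt: "In a practical nearest neighbor application we could leave out the square root operation because square root is a monotonic function. That is, it scales the absolute sizes of the distances but it preserves the ordering, so the nearest neighbors with or without it are identical." I think that is what you were looking for.

I fixed the code, really! Could you explain in your answer how it works?

X[i] - self.X_train returns an array in which every point (row) of self.X_train (the same as self.X_train[:]) has been subtracted, component by component, from the single point X[i] (the same as X[i, :]), keeping one result row per training point. np.square then squares every number, and the axis argument of np.sum tells numpy to sum within each row (only the components of that row), which returns the array we need to store in dists[i]. I apologize for the very general explanation and for my English.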
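The explanation above can be followed step by step on a tiny example. This is only a sketch with made-up numbers: x stands in for X[i], X_train for self.X_train, and the sqrt is applied at the end to show that it changes the values but not which training point is nearest.

```python
import numpy as np

X_train = np.array([[0.0, 0.0],
                    [3.0, 4.0],
                    [1.0, 1.0]])  # 3 hypothetical training points, 2 features
x = np.array([1.0, 2.0])          # one test point, playing the role of X[i]

diff = x - X_train                  # broadcasting: shape (3, 2), one row per training point
squared = np.square(diff)           # element-wise square, still shape (3, 2)
sq_dists = np.sum(squared, axis=1)  # sum inside each row, giving shape (3,)
dists = np.sqrt(sq_dists)           # true Euclidean (L2) distances

print(diff.shape)  # (3, 2)
print(sq_dists)    # [5. 8. 1.]

# sqrt is monotonic, so the nearest neighbor is the same with or without it.
print(np.argmin(sq_dists) == np.argmin(dists))  # True
```

If the actual Euclidean distance is required rather than just the ordering, wrapping the sum in np.sqrt as in the last step is enough.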