Matlab: Understanding the standardisation process in KNN
I'm having some difficulty understanding the standardisation process in a KNN classifier. Basically, I need to know what happens during standardisation. I'd appreciate it if someone could help. I know that the mean and standard deviation variables are computed from the train examples, but what actually happens after that is what I'm struggling with.

Tags: matlab, machine-learning, knn
classdef myknn
    methods(Static)
        % fit takes the train examples, the train labels
        % and the no. of nearest neighbours, and returns the model m.
        function m = fit(train_examples, train_labels, k)
            % start of standardisation process
            m.mean = mean(train_examples{:,:}); % per-feature mean
            m.std = std(train_examples{:,:}); % per-feature standard deviation
            for i=1:size(train_examples,1)
                train_examples{i,:} = train_examples{i,:} - m.mean;
                train_examples{i,:} = train_examples{i,:} ./ m.std;
            end
            % end of standardisation process
            m.train_examples = train_examples;
            m.train_labels = train_labels;
            m.k = k;
        end
        function predictions = predict(m, test_examples)
            predictions = categorical;
            for i=1:size(test_examples,1)
                fprintf('classifying example %i/%i\n', i, size(test_examples,1));
                this_test_example = test_examples{i,:};
                % start of standardisation process
                this_test_example = this_test_example - m.mean;
                this_test_example = this_test_example ./ m.std;
                % end of standardisation process
                this_prediction = myknn.predict_one(m, this_test_example);
                predictions(end+1) = this_prediction;
            end
        end
        function prediction = predict_one(m, this_test_example)
            distances = myknn.calculate_distances(m, this_test_example);
            neighbour_indices = myknn.find_nn_indices(m, distances);
            prediction = myknn.make_prediction(m, neighbour_indices);
        end
        function distances = calculate_distances(m, this_test_example)
            distances = [];
            for i=1:size(m.train_examples,1)
                this_training_example = m.train_examples{i,:};
                this_distance = myknn.calculate_distance(this_training_example, this_test_example);
                distances(end+1) = this_distance;
            end
        end
        % Euclidean distance between two row vectors p and q
        function distance = calculate_distance(p, q)
            differences = q - p;
            squares = differences .^ 2;
            total = sum(squares);
            distance = sqrt(total);
        end
        function neighbour_indices = find_nn_indices(m, distances)
            [~, indices] = sort(distances);
            neighbour_indices = indices(1:m.k);
        end
        % majority vote over the labels of the k nearest neighbours
        function prediction = make_prediction(m, neighbour_indices)
            neighbour_labels = m.train_labels(neighbour_indices);
            prediction = mode(neighbour_labels);
        end
    end
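end

Standardisation is the process of rescaling each feature in the training examples so that each feature has a mean of zero and a standard deviation of one. The procedure is to find the mean and the standard deviation of each feature. We then take each feature, subtract its corresponding mean, and divide by its corresponding standard deviation. This can be seen clearly in this code: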
m.mean = mean(train_examples{:,:}); % per-feature mean
m.std = std(train_examples{:,:}); % per-feature standard deviation
for i=1:size(train_examples,1)
    train_examples{i,:} = train_examples{i,:} - m.mean;
    train_examples{i,:} = train_examples{i,:} ./ m.std;
end
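As a quick sanity check of the claim that each feature ends up with zero mean and unit standard deviation, here is a minimal sketch on a small made-up matrix. The names X, mu, sigma and Z and the numbers themselves are assumptions for illustration only; they are not part of the original class.

```matlab
% Toy training data: 4 examples (rows), 2 features (columns).
% These numbers are made up purely for illustration.
X = [1 10; 2 20; 3 30; 4 40];

mu    = mean(X);   % per-column mean
sigma = std(X);    % per-column sample standard deviation

% Same row-by-row loop as in fit, applied to a plain matrix.
Z = zeros(size(X));
for i = 1:size(X, 1)
    Z(i, :) = (X(i, :) - mu) ./ sigma;
end

disp(mean(Z));  % each entry is (numerically) zero
disp(std(Z));   % each entry is (numerically) one
```

The two features start on very different scales, but after the loop both columns of Z are on the same scale, which is what stops one feature from dominating the Euclidean distance.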
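m.mean stores the mean of each feature and m.std stores the standard deviation of each feature. Note that you must keep both of these for when you want to classify at test time. This can be seen in your predict method, which takes each test example, subtracts the training mean of each feature, and divides by the training standard deviation of each feature: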
function predictions = predict(m, test_examples)
    predictions = categorical;
    for i=1:size(test_examples,1)
        fprintf('classifying example %i/%i\n', i, size(test_examples,1));
        this_test_example = test_examples{i,:};
        % start of standardisation process
        this_test_example = this_test_example - m.mean;
        this_test_example = this_test_example ./ m.std;
        % end of standardisation process
        this_prediction = myknn.predict_one(m, this_test_example);
        predictions(end+1) = this_prediction;
    end
end
Note that m.mean and m.std are applied to the test examples, and that these quantities come from the training examples.
My post on standardisation should provide more context. It also achieves the same effect as the code you provided, but in a more vectorised way.
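The post itself is not reproduced here, so the following is only a rough sketch of what a vectorised version might look like, not the post's actual code. It assumes train_examples and test_examples are tables (as the {:,:} indexing in the question suggests) and relies on implicit expansion, which needs MATLAB R2016b or later; on older releases bsxfun(@minus, ...) and bsxfun(@rdivide, ...) would be needed instead. The toy data and the names X_train, mu, sigma, Z_train and Z_test are hypothetical.

```matlab
% Hypothetical toy data standing in for train_examples / test_examples.
train_examples = array2table([1 10; 2 20; 3 30; 4 40]);
test_examples  = array2table([2.5 25; 5 50]);

% Pull the numeric contents out of the table as a matrix.
X_train = train_examples{:,:};

% Per-feature statistics, computed from the training data only.
% The dimension argument (1) is given explicitly so that a single-row
% input is still treated column by column.
mu    = mean(X_train, 1);
sigma = std(X_train, 0, 1);

% Subtract the row vector mu from every row and divide every row by
% sigma in one step, replacing the for loop in fit.
Z_train = (X_train - mu) ./ sigma;

% Test data is standardised with the TRAINING mu and sigma,
% mirroring what predict does one example at a time.
Z_test = (test_examples{:,:} - mu) ./ sigma;
```

One thing to watch in either version: if a feature is constant in the training set, its sigma is zero and the division produces NaN or Inf, so such columns would need separate handling.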