Matlab: Logistic regression by composing linear regression with a sigmoid function


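I am trying to implement a logistic regression algorithm without calling any of the functions matlab provides, and afterwards I call matlab's own logistic regression function, mnrfit, to cross-check that my algorithm works well.

The procedure I am implementing is as follows. I first create a vector x holding the input data and a vector y, with values 0 or 1, holding the class of each data point in x. I run linear regression on these data using gradient descent, and once I have the coefficients I pass the fitted line through the sigmoid function. Then I make a prediction for x=10 to find the probability of class 1 for that input. As simple as that.

After that, I call the matlab function mnrfit and extract the logistic regression coefficients. To make the same prediction I call the function mnrval with the argument 10, since I want to predict for the input x=10 as before. My results are not the same, and I do not know why.

Finally, two plots are produced, showing the probability density function obtained in each case.

I also attach the implementation code.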

% x is the continuous input and y is the class of every sample [1 or 0]
x = (1:100)';   % independent variables x(s)
y(1:10)  = 0;    % Dependent variables y(s) -- class 0
y(11:100) = 1;    % Dependent variables y(s) -- class 1
y=y';
y = y(randperm(length(y))); % Random order of y array
x=[ones(length(x),1) x]; % This is done for vectorized code

%% Initialize Linear regression parameters

m = length(y); % number of training examples
% initialize fitting parameters - all zeros
Alpha = 0; % slope of the line
Beta = 0;  % offset (intercept)
% Some gradient descent settings
% iterations must be a big number because we are taking very small steps .
iterations = 100000;
% Learning step must be small because the line must fit the data between 
% [0 and 1]
Learning_step_a = 0.0005;  % step parameter

%% Run Gradient descent 

fprintf('Running Gradient Descent ...\n')
for iter = 1:iterations
% In every iteration evaluate the linear hypothesis 
h= Alpha.*x(:,2)+ Beta.*x(:,1);
% Update line variables
Alpha=Alpha - Learning_step_a * (1/m)* sum((h-y).* x(:,2));
Beta=Beta - Learning_step_a * (1/m) *  sum((h-y).*x(:,1)); 
end

% This is my linear Model
LinearModel=Alpha.*x(:,2)+ Beta.*x(:,1);
% I pass it through a sigmoid !
LogisticRegressionPDF = 1 ./ (1 + exp(-LinearModel));
% Make a prediction for p(y==1|x==10)
Prediction1=LogisticRegressionPDF(10);

%% Confirmation with matlab function mnrfit

B=mnrfit(x(:,2),y+1); % Find Logistic Regression Coefficients
mnrvalPDF = mnrval(B,x(:,2));
% Make a prediction .. p(y==1|x==10)
Prediction2=mnrvalPDF(10,2);

%% Plotting Results 

% Plot Logistic Regression Results ...
figure;
plot(x(:,2),y,'g*');
hold on
plot(x(:,2),LogisticRegressionPDF,'k--');
hold off
title('My Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');

% Plot Logistic Regression Results (mnrfit) ...      
figure,plot(x(:,2),y,'g*');
hold on   
plot(x(:,2),mnrvalPDF(:,2),'--k') 
hold off   
title('mnrval Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function')

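Why are my plots, and the predictions in each case, not the same?

The output you get may differ on every run, because the order of the 1s and 0s in the y vector is random.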
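If repeatable runs are wanted while comparing the two predictions, one option is to fix the random seed before the shuffle. This is only a minimal sketch using MATLAB's rng; the seed value 0 and the names y1 and y2 are illustrative and not part of the original code.

```matlab
% Same labels as in the question: 10 zeros followed by 90 ones
y = [zeros(10,1); ones(90,1)];

rng(0);                          % fix the seed (any constant works)
y1 = y(randperm(numel(y)));      % first shuffle

rng(0);                          % reset to the same seed
y2 = y(randperm(numel(y)));      % second shuffle, identical to the first

disp(isequal(y1, y2))            % the two shuffles match
```

With the seed fixed, differences between Prediction1 and Prediction2 come from the two fitting procedures rather than from a different shuffle.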

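I developed my own logistic regression algorithm using gradient descent. For "good" training data, my algorithm had no choice but to converge to the same solution as mnrfit. For "less good" training data, my algorithm did not get close to mnrfit. The coefficients and the associated model could predict the outcomes well enough, but not as well as mnrfit. Plotting the residuals showed that mnrfit's residuals were virtually zero (around 0.9x10^-200), whereas mine were merely close to zero (around 0.00001). I tried changing alpha, the number of steps and the initial theta guess, but doing so only produced different theta results. When I tuned these parameters on a good data set, my thetas began to agree better with mnrfit.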
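One way to see how far a gradient descent fit is from the optimum, without relying on mnrfit, is to compare it against a second solver on the same data. The sketch below is an illustration under my own assumptions: the data set (a threshold at x = 40 with four flipped labels), the damped Newton iteration used as the reference, and all variable names are hypothetical and do not come from the answer above. It is not meant to reproduce what mnrfit does internally.

```matlab
% Illustrative data: classes overlap, so a finite optimum exists
x = (1:100)';
y = double(x > 40);
y([35 38 45 50]) = 1 - y([35 38 45 50]);     % flip a few labels
X = [ones(numel(x),1) x];                    % column of ones for the offset
m = numel(y);

sigmoid = @(z) 1 ./ (1 + exp(-z));
% Log-likelihood written as y*z - log(1+exp(z)) to avoid log(0)
loglik  = @(t) sum(y .* (X*t) - log(1 + exp(X*t)));

% Batch gradient descent with the step size and iteration count of the question
theta_gd = zeros(2,1);
step = 0.0005;
for iter = 1:100000
    p = sigmoid(X*theta_gd);
    theta_gd = theta_gd + step * (X' * (y - p)) / m;
end

% Reference fit: Newton's method with step halving
theta_nt = zeros(2,1);
for iter = 1:50
    p = sigmoid(X*theta_nt);
    g = X' * (y - p);                        % gradient of the log-likelihood
    H = X' * diag(p .* (1 - p)) * X;         % negative Hessian
    d = H \ g;                               % Newton direction
    t = 1;
    while loglik(theta_nt + t*d) < loglik(theta_nt) && t > 1e-8
        t = t / 2;                           % shrink the step if the fit got worse
    end
    theta_nt = theta_nt + t*d;
end
g_final = X' * (y - sigmoid(X*theta_nt));    % should be close to zero at the optimum

fprintf('GD:     offset %.4f  slope %.4f  loglik %.6f\n', theta_gd(1), theta_gd(2), loglik(theta_gd));
fprintf('Newton: offset %.4f  slope %.4f  loglik %.6f\n', theta_nt(1), theta_nt(2), loglik(theta_nt));
```

Comparing the two log-likelihood values, and the gradient norm at each solution, gives a more direct measure of "how converged" the gradient descent fit is than comparing plots by eye.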

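Many thanks to user3779062 for the information. The PDF file in there is what I was looking for. I had already implemented stochastic gradient descent, so the only changes needed to implement logistic regression were to pass the hypothesis function through the sigmoid inside the for loop, and to swap the order of the terms (together with the sign) in the theta update rule. The results are now the same as mnrval. I ran the code on many examples and in most cases the results are the same, especially when the data set is good and contains plenty of information in both classes. I attach the final code and one random result from the result set.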

% Machine Learning : Logistic Regression

% Logistic regression works like linear regression, but its output
% is the probability of belonging to one class or the other.
% First we create a well-defined data set that can easily
% be fitted by a sigmoid function.

clear all; close all; clc;

% This example runs many times to compare a lot of results
for examples=1:10:100
clearvars -except examples

%%  Create Training Data 

% x is the continuous input and y is the class of every sample [1 or 0]
x = (1:100)';   % independent variables x(s)
y(1:examples)  = 0;    % Dependent variables y(s) -- class 0
y(examples+1:100) = 1;    % Dependent variables y(s) -- class 1
y=y';
y = y(randperm(length(y))); % Random order of y array
x=[ones(length(x),1) x]; % This is done for vectorized code

%% Initialize Linear regression parameters

m = length(y); % number of training examples
% initialize fitting parameters - all zeros
Alpha = 0; % slope of the line
Beta = 0;  % offset (intercept)
% Some gradient descent settings
% iterations must be a big number because we are taking very small steps .
iterations = 100000;
% Learning step must be small because the line must fit the data between 
% [0 and 1]
Learning_step_a = 0.0005;  % step parameter

%% Run Gradient descent 

fprintf('Running Gradient Descent ...\n')
for iter = 1:iterations

% Linear hypothesis function 
h= Alpha.*x(:,2)+ Beta.*x(:,1);

% Non - Linear hypothesis function
hx = 1 ./ (1 + exp(-h));

% Update coefficients
Alpha=Alpha + Learning_step_a * (1/m)* sum((y-hx).* x(:,2));
Beta=Beta + Learning_step_a * (1/m) *  sum((y-hx).*x(:,1));

end

% Make a prediction for p(y==1|x==10)
Prediction1=hx(10)

%% Confirmation with matlab function mnrfit

B=mnrfit(x(:,2),y+1); % Find Logistic Regression Coefficients
mnrvalPDF = mnrval(B,x(:,2));
% Make a prediction .. p(y==1|x==10)
Prediction2=mnrvalPDF(10,2)

%% Plotting Results 

% Plot Logistic Regression Results ...
figure;
subplot(1,2,1),plot(x(:,2),y,'g*');
hold on
subplot(1,2,1),plot(x(:,2),hx,'k--');
hold off
title('My Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');

% Plot Logistic Regression Results (mnrfit) ...      
subplot(1,2,2),plot(x(:,2),y,'g*');
hold on   
subplot(1,2,2),plot(x(:,2),mnrvalPDF(:,2),'--k') 
hold off   
title('mnrval Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function')    
end

Results

Thank you very much.
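The loop in the code above always runs a fixed number of iterations. A variant that stops once the gradient is small, and then reads off the coefficients, could look like the sketch below. Several things here are my own assumptions rather than part of the answer: the input is standardized first so that a larger step can be used, the tolerance and step size are arbitrary choices, the data set is illustrative, and the names xs, tol, maxIter, slope and offset are hypothetical.

```matlab
% Illustrative data with overlapping classes
x = (1:100)';
y = double(x > 40);
y([35 38 45 50]) = 1 - y([35 38 45 50]);
m = numel(y);

% Standardize the input (assumption: done only to speed up convergence)
mu = mean(x);
sd = std(x);
xs = (x - mu) / sd;

Alpha = 0;            % slope on the standardized input
Beta  = 0;            % offset on the standardized input
step    = 4;          % step size
tol     = 1e-6;       % stop when both gradient components are below this
maxIter = 200000;     % safety cap on the number of iterations

for iter = 1:maxIter
    hx = 1 ./ (1 + exp(-(Alpha*xs + Beta)));   % sigmoid of the linear hypothesis
    gradAlpha = sum((y - hx) .* xs) / m;
    gradBeta  = sum(y - hx) / m;
    if max(abs([gradAlpha gradBeta])) < tol
        break                                  % converged: keep current coefficients
    end
    Alpha = Alpha + step * gradAlpha;
    Beta  = Beta  + step * gradBeta;
end

% Map the coefficients back to the original x scale
slope  = Alpha / sd;
offset = Beta - Alpha * mu / sd;

% Prediction for p(y==1|x==10)
p10 = 1 / (1 + exp(-(slope*10 + offset)));
fprintf('stopped after %d iterations, slope %.4f, offset %.4f\n', iter, slope, offset);
```

The coefficients are simply the values of Alpha and Beta when the loop exits, whether through the stopping test or the iteration cap.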

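Any suggestions?

Your comment did not notify anyone. You can comment on my answer to let me know that you edited the question; in fact, I did not notice it until now. You ask why the plots differ. But logistic regression is not the same as linear regression composed with a sigmoid. Mathematically, there is no reason to expect the two procedures to give the same result.

Yes, I saw that my comment did not notify anyone, and I was surprised, because this is a simple question. If my question is wrong, then someone has to tell me that I am asking something incomprehensible. Anyway, can you explain to me why logistic regression is not the same as linear regression passed through a sigmoid? That is what I understood from the examples on the internet. How do I correctly implement logistic regression in non-vectorized matlab code? Could you please post code, so I can see what I should do inside the for loop regarding convergence and extracting the coefficients?

Your real questions seem to be: 1) what is the update rule, and 2) how do I get the coefficients? The following reference answers both questions: page 5 shows the update rule. After the update rule has been executed about 1500 times, the coefficients are the last recorded values; see line 101 in the code below. Get familiar with the linear regression solution to see what to do in the logistic regression case. See: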
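For readers without the linked references, the distinction drawn in these comments can be written out as follows. The notation is mine (step size \eta, sigmoid \sigma, linear hypothesis h_i = \alpha x_i + \beta); the update rules themselves are the ones used in the two code listings of this thread.

```latex
% Procedure 1: linear regression, sigmoid applied only after fitting
J_{\mathrm{lin}}(\alpha,\beta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_i - y_i\bigr)^2,
\qquad h_i = \alpha x_i + \beta
\\
\alpha \leftarrow \alpha - \eta\,\frac{1}{m}\sum_{i=1}^{m}\bigl(h_i - y_i\bigr)\,x_i,
\qquad
\beta \leftarrow \beta - \eta\,\frac{1}{m}\sum_{i=1}^{m}\bigl(h_i - y_i\bigr)

% Procedure 2: logistic regression, sigmoid inside the objective
\ell(\alpha,\beta) = \frac{1}{m}\sum_{i=1}^{m}
\Bigl[\,y_i\log\sigma(h_i) + (1-y_i)\log\bigl(1-\sigma(h_i)\bigr)\Bigr],
\qquad \sigma(z) = \frac{1}{1+e^{-z}}
\\
\alpha \leftarrow \alpha + \eta\,\frac{1}{m}\sum_{i=1}^{m}\bigl(y_i - \sigma(h_i)\bigr)\,x_i,
\qquad
\beta \leftarrow \beta + \eta\,\frac{1}{m}\sum_{i=1}^{m}\bigl(y_i - \sigma(h_i)\bigr)
```

The first procedure picks the line that is closest to the 0/1 labels in the least-squares sense and only then squashes it, so the sigmoid never influences the coefficients. The second procedure chooses the coefficients that maximize the likelihood of the labels under the sigmoid model, which is the objective a logistic regression fit optimizes. The two objectives have different optima in general, so the two sets of coefficients need not agree.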