10-fold cross-validation for polynomial regression in MATLAB

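I want to use 10-fold cross-validation to test which polynomial form (first, second or third order) fits better. I want to divide my data set into 10 subsets and remove 1 of the 10 subsets, derive a regression model without this subset, predict the output values for this subset using the derived model, and compute the residuals. Finally, I want to repeat the calculation for each subset and sum the squares of the resulting residuals. I have written the following code in Matlab 2013b, which samples the data and tests the regression on the training data. I am stuck on how to repeat this for each subset, and on how to compare which polynomial form fits better.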

% Sample the data
parm = [AT];
n = length(parm);
k = 10;                 % how many parts to use
allix = randperm(n);    % all data indices, randomly ordered
numineach = ceil(n/k);  % at least one part must have this many data points
allix = reshape([allix NaN(1,k*numineach-n)],k,numineach);
for p=1:k
    testix = allix(p,:);            % indices to use for testing
    testix(isnan(testix)) = [];     % remove NaNs if necessary
    trainix = setdiff(1:n,testix);  % indices to use for training
    %train = parm(trainix); %gives the training data
    %test = parm(testix);  %gives the testing data
end

% Derive regression on the training data 
Sal = Salinity(trainix);
Temp = Temperature(trainix);
At = parm(trainix);

xyz =[Sal Temp At];
% Fit a Polynomial Surface
surffit = fit([xyz(:,1), xyz(:,2)],xyz(:,3), 'poly11');
% Shows equation, rsquare, rmse 
[b,bint,r] = fit([xyz(:,1), xyz(:,2)],xyz(:,3), 'poly11');

To run the code for every subset, you can put the fit inside the loop and store the results, for example:

% Sample the data
parm = [AT];
n = length(parm);
k = 10;                 % how many parts to use
allix = randperm(n);    % all data indices, randomly ordered
numineach = ceil(n/k);  % at least one part must have this many data points
allix = reshape([allix NaN(1,k*numineach-n)],k,numineach);

bAll = []; bintAll = []; rAll = [];

for p=1:k
    testix = allix(p,:);            % indices to use for testing
    testix(isnan(testix)) = [];     % remove NaNs if necessary
    trainix = setdiff(1:n,testix);  % indices to use for training
    %train = parm(trainix); %gives the training data
    %test = parm(testix);  %gives the testing data

    % Derive regression on the training data 
    Sal = Salinity(trainix);
    Temp = Temperature(trainix);
    At = parm(trainix);

    xyz =[Sal Temp At];
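    % Fit a polynomial surface (degree 1 in both Sal and Temp).
    % fit returns the fit object, a goodness-of-fit struct (sse, rsquare,
    % rmse, ...) and an output struct (residuals, ...); the names b, bint
    % and r are kept from the question.
    [b,bint,r] = fit([xyz(:,1), xyz(:,2)],xyz(:,3), 'poly11');

    % Keep only the coefficients of the fit object, plus the two structs
    bAll = [bAll, coeffvalues(b)]; bintAll = [bintAll,bint]; rAll = [rAll,r];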
end 
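
The loop above fits on the training indices but does not yet score the left-out subset, which is the part the question asks about. A minimal sketch of that step, assuming the Curve Fitting Toolbox is installed and that the data are column vectors; the synthetic Salinity, Temperature and AT values and the names foldId, sseFold and cvSSE are placeholders for illustration, not from the original post:

```matlab
% Sketch: 10-fold CV of a 'poly11' surface, scored on the held-out fold.
% Synthetic stand-ins for the real data (assumed to be column vectors).
rng(1);                                   % reproducible shuffling
n = 200;
Salinity    = 30 + 5*rand(n,1);
Temperature = 5 + 20*rand(n,1);
AT          = 600 + 50*Salinity - 2*Temperature + 5*randn(n,1);

k = 10;
foldId = mod(randperm(n), k)' + 1;        % fold label 1..k for every sample

sseFold = zeros(k,1);                     % held-out SSE, one entry per fold
for p = 1:k
    isTest = (foldId == p);
    % Fit on the other k-1 folds only
    f = fit([Salinity(~isTest), Temperature(~isTest)], AT(~isTest), 'poly11');
    % Predict the held-out fold and take the residuals there
    pred = f(Salinity(isTest), Temperature(isTest));
    res  = AT(isTest) - pred;
    sseFold(p) = sum(res.^2);
end
cvSSE = sum(sseFold);                     % total cross-validated SSE
fprintf('poly11: CV SSE = %.4g\n', cvSSE);
```

Labelling each sample with a fold number avoids the NaN padding and setdiff bookkeeping; the reshape approach from the question should work just as well.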

As for the best fit, you could probably pick the fit with the lowest rmse.
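
One caveat: the rmse that fit reports is computed on the data it was fitted to, so it may favour the higher-order surfaces. The question asks for the residuals on the left-out subset, and those can be summed per degree and compared directly. A rough sketch under stated assumptions: it uses plain least squares with backslash instead of fit (so it runs without the toolbox), the data are synthetic, and the names design, ex and cvSSE are made up for illustration. Degree d here means all terms x^i*y^j with i+j <= d, which is my understanding of what 'poly11', 'poly22' and 'poly33' contain.

```matlab
% Sketch: choose the polynomial degree (1, 2 or 3) by 10-fold CV SSE.
rng(1);
n = 200;
x = 30 + 5*rand(n,1);                             % stand-in for Salinity
y = 5 + 20*rand(n,1);                             % stand-in for Temperature
z = 600 + 50*x - 2*y + 0.3*y.^2 + 5*randn(n,1);   % stand-in for AT

% Centre and scale the predictors; this only improves conditioning of the
% cubic terms and does not change the fitted polynomial surface.
xs = (x - mean(x)) / std(x);
ys = (y - mean(y)) / std(y);

% Design matrix: one column per exponent pair in the rows of ex
design = @(u, v, ex) bsxfun(@power, u, ex(:,1)') .* bsxfun(@power, v, ex(:,2)');

k = 10;
foldId = mod(randperm(n), k)' + 1;                % same folds for all degrees
degrees = 1:3;
cvSSE = zeros(size(degrees));

for d = degrees
    [I, J] = meshgrid(0:d, 0:d);
    keep = (I + J) <= d;
    ex = [I(keep), J(keep)];                      % exponent pairs with i+j <= d
    for p = 1:k
        isTest = (foldId == p);
        coef = design(xs(~isTest), ys(~isTest), ex) \ z(~isTest);  % train
        res  = z(isTest) - design(xs(isTest), ys(isTest), ex) * coef;
        cvSSE(d) = cvSSE(d) + sum(res.^2);        % held-out squared residuals
    end
end

[~, best] = min(cvSSE);
fprintf('CV SSE for degrees 1-3: %s, lowest for degree %d\n', mat2str(cvSSE, 5), best);
```

Using the same fold assignment for every degree keeps the comparison fair, since all three models are then scored on identical held-out points.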

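This gives an error from fittype/horzcat (line 7) saying that concatenating double objects is not allowed.
Oh, I see. From b you probably only need to save the coefficients. If so, you can try bAll=[bAll,coeffvalues(b)] instead of bAll=[bAll,b]. Sorry, I can't test it myself because I don't have the Curve Fitting Toolbox installed.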
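
For reference, a small sketch of two ways to keep per-fold results without hitting that error: a cell array for the fit objects themselves, and a numeric matrix with one row of coefficients per fold. It assumes the Curve Fitting Toolbox is available; the data and the every-k-th-point split are placeholders, not the real cross-validation partition.

```matlab
% Sketch: storing fit objects and their coefficients for each fold.
rng(1);
x = rand(50,1);
y = rand(50,1);
z = 1 + 2*x - 3*y + 0.1*randn(50,1);

k = 10;
fits  = cell(k,1);        % a cell array can hold the fit objects
coefs = zeros(k,3);       % 'poly11' has three coefficients: p00, p10, p01
rmse  = zeros(k,1);
for p = 1:k
    idx = true(50,1);
    idx(p:k:end) = false;                 % placeholder split: drop every k-th point
    [f, gof] = fit([x(idx), y(idx)], z(idx), 'poly11');
    fits{p}    = f;                       % no concatenation with doubles needed
    coefs(p,:) = coeffvalues(f);          % one row of coefficients per fold
    rmse(p)    = gof.rmse;                % training RMSE reported by fit
end
disp(coefs);
```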