Finding the best trade-off point on a curve

Tags: algorithm, matlab, data-modeling, model-fitting

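Say I have some data to which I want to fit a parametrized model. My goal is to find the best value for this model parameter.

I am doing model selection with an information-theoretic type of criterion, which rewards models with low error but also penalizes models with high complexity (we are looking for the simplest yet most convincing explanation of the data, so to speak).

Following the above, this is an example of what I get for three different criteria (two to be minimized, one to be maximized):

Visually, you can easily see the elbow shape and you would pick a value for the parameter somewhere in that region. The problem is that I am doing this for a large number of experiments, and I need a way to find this value without intervention.

My first intuition was to draw a line at a 45-degree angle from the corner and keep moving it until it intersects the curve, but that is easier said than done :) It can also miss the region of interest if the curve is somewhat skewed.

Any thoughts on how to implement this, or better ideas?

Here are the samples needed to reproduce one of the plots above: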

curve = [8.4663 8.3457 5.4507 5.3275 4.8305 4.7895 4.6889 4.6833 4.6819 4.6542 4.6501 4.6287 4.6162 4.585 4.5535 4.5134 4.474 4.4089 4.3797 4.3494 4.3268 4.3218 4.3206 4.3206 4.3203 4.2975 4.2864 4.2821 4.2544 4.2288 4.2281 4.2265 4.2226 4.2206 4.2146 4.2144 4.2114 4.1923 4.19 4.1894 4.1785 4.178 4.1694 4.1694 4.1694 4.1556 4.1498 4.1498 4.1357 4.1222 4.1222 4.1217 4.1192 4.1178 4.1139 4.1135 4.1125 4.1035 4.1025 4.1023 4.0971 4.0969 4.0915 4.0915 4.0914 4.0836 4.0804 4.0803 4.0722 4.065 4.065 4.0649 4.0644 4.0637 4.0616 4.0616 4.061 4.0572 4.0563 4.056 4.0545 4.0545 4.0522 4.0519 4.0514 4.0484 4.0467 4.0463 4.0422 4.0392 4.0388 4.0385 4.0385 4.0383 4.038 4.0379 4.0375 4.0364 4.0353 4.0344];
plot(1:100, curve)

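EDIT: I have accepted the solution given by Jonas. Basically, for each point p on the curve, we find the one with the maximum distance d from the straight line joining the first and last points of the curve.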
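Written out, this is the standard vector formula for the distance from a point to a line. As a sketch, the symbols a (first point of the curve), b (last point) and n (unit vector along the line) are labels assumed here for illustration; they mirror what the vector-projection code in the answers computes:

```latex
\mathbf{n} = \frac{\mathbf{b} - \mathbf{a}}{\lVert \mathbf{b} - \mathbf{a} \rVert},
\qquad
d(\mathbf{p}) = \bigl\lVert (\mathbf{p} - \mathbf{a}) - \bigl( (\mathbf{p} - \mathbf{a}) \cdot \mathbf{n} \bigr)\, \mathbf{n} \bigr\rVert
```

The elbow is then the point p of the curve for which d(p) is largest.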

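First, a quick calculus review: the first derivative f' of each graph represents the rate at which the function f being graphed is changing. The second derivative f'' represents the rate at which f' is changing. If f'' is small, the graph is changing direction at a modest pace. But if f'' is large, the graph is changing direction rapidly.

You want to isolate the points at which f'' is largest over the domain of the graph. These are the candidate points to select for your optimal model. Which point you pick is up to you, since you have not specified exactly how much you value fitness versus complexity.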
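On sampled data, f'' can be approximated with a second difference. The following is only a minimal Python sketch of that idea, assuming evenly spaced samples and taking "largest" to mean largest in magnitude; the function name `elbow_by_second_difference` is made up for illustration. On noisy curves the second difference is dominated by the noise, so some smoothing would probably be needed first.

```python
def elbow_by_second_difference(ys):
    """Return (index, second_differences) for evenly spaced samples ys.

    The second difference y[i+1] - 2*y[i] + y[i-1] approximates f'' up to a
    constant factor when the x spacing is uniform (an assumption here).
    """
    if len(ys) < 3:
        raise ValueError("need at least three samples")
    # second[k] belongs to the interior sample with index k + 1
    second = [ys[i + 1] - 2 * ys[i] + ys[i - 1] for i in range(1, len(ys) - 1)]
    # candidate elbow: interior sample where |f''| is largest
    k = max(range(len(second)), key=lambda i: abs(second[i]))
    return k + 1, second


if __name__ == "__main__":
    idx, _ = elbow_by_second_difference([10, 5, 2, 1.5, 1.4, 1.3])
    print("elbow candidate at 0-based index", idx)
```

Only the single largest value is returned here; keeping the few largest values would give the list of candidate points the answer talks about.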

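So one way of solving this would be to fit two lines to the L of your elbow. But since there are only a few points in one portion of the curve (as I mentioned in the comment), line fitting would suffer unless you detect which points are spaced out and interpolate between them to produce a more uniform series, and then use RANSAC to find the two lines that fit the L. A little convoluted, but not impossible.
So here is a simpler solution: the graphs you have posted look the way they do because of MATLAB's axis scaling (obviously). So all I did was minimize the distance from the "origin" to your points, using the scale information.
Please note: the origin estimation can be improved significantly, but I will leave that to you.
Here is the code: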

%% Order
curve = [8.4663 8.3457 5.4507 5.3275 4.8305 4.7895 4.6889 4.6833 4.6819 4.6542 4.6501 4.6287 4.6162 4.585 4.5535 4.5134 4.474 4.4089 4.3797 4.3494 4.3268 4.3218 4.3206 4.3206 4.3203 4.2975 4.2864 4.2821 4.2544 4.2288 4.2281 4.2265 4.2226 4.2206 4.2146 4.2144 4.2114 4.1923 4.19 4.1894 4.1785 4.178 4.1694 4.1694 4.1694 4.1556 4.1498 4.1498 4.1357 4.1222 4.1222 4.1217 4.1192 4.1178 4.1139 4.1135 4.1125 4.1035 4.1025 4.1023 4.0971 4.0969 4.0915 4.0915 4.0914 4.0836 4.0804 4.0803 4.0722 4.065 4.065 4.0649 4.0644 4.0637 4.0616 4.0616 4.061 4.0572 4.0563 4.056 4.0545 4.0545 4.0522 4.0519 4.0514 4.0484 4.0467 4.0463 4.0422 4.0392 4.0388 4.0385 4.0385 4.0383 4.038 4.0379 4.0375 4.0364 4.0353 4.0344];
x_axis = 1:numel(curve);
points = [x_axis ; curve ]';

%% Get the scaling info
f = figure(1);
plot(points(:,1),points(:,2));
ax = get(f,'CurrentAxes');
% numeric tick values; reading 'YTick' avoids parsing the 'YTickLabel'
% strings, which newer MATLAB releases return as a cell array
ticks = get(ax,'YTick');
aspect = get(ax,'DataAspectRatio');
aspect = [aspect(2) aspect(1)];
close(f);

%% Get the "origin"
O = [x_axis(1) ticks(1)];

%% Scale the data - now the scaled values look like MATLAB's idea of
% what a good plot should look like
scaled_O = O.*aspect;
scaled_points = bsxfun(@times,points,aspect);

%% Find the closest point
del = sum((bsxfun(@minus,scaled_points,scaled_O).^2),2);
[~, ind] = min(del);
best_ROC = [ind curve(ind)];

%% Display
plot(x_axis,curve,'.-');
hold on;
plot(O(1),O(2),'r*');
plot(best_ROC(1),best_ROC(2),'k*');
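Result: [plot of the curve with the "origin" marked by a red star and the selected point by a black star]

Also, for the Fit (maximize) curve you will have to change the origin to [x_axis(1) ticks(end)].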
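For readers without MATLAB, here is a rough Python sketch of the same "closest to the corner" idea. It is not a port of the code above: instead of MATLAB's `DataAspectRatio` it rescales both axes to the unit square (min-max normalization), which is a swapped-in assumption, and the function name and the `maximize` flag are invented for illustration.

```python
def closest_to_corner(xs, ys, maximize=False):
    """Index of the point nearest the lower-left corner of the unit square
    (upper-left if maximize=True) after min-max scaling both axes.

    Assumes xs and ys are not constant, so the ranges are non-zero.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    corner_y = 1.0 if maximize else 0.0
    best_index, best_dist = None, float("inf")
    for i, (x, y) in enumerate(zip(xs, ys)):
        nx = (x - x_min) / (x_max - x_min)   # scaled to [0, 1]
        ny = (y - y_min) / (y_max - y_min)   # scaled to [0, 1]
        dist = nx ** 2 + (ny - corner_y) ** 2  # squared distance is enough
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]
    print(closest_to_corner(xs, [10, 5, 2, 1.5, 1.4, 1.3]))
    print(closest_to_corner(xs, [0, 5, 8, 8.5, 8.6, 8.7], maximize=True))
```

The choice of scaling matters: any rescaling of one axis relative to the other changes which point is closest, so the result is only as meaningful as the scaling is.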
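The point of information-theoretic model selection is that it already accounts for the number of parameters. Therefore there is no need to find an elbow; you only need to find the minimum.
Finding the elbow of the curve is only relevant when using fit. Even then, the method you choose to select the elbow is in a sense setting a penalty for the number of parameters. To select the elbow you would minimize the distance from the origin to the curve. The relative weighting of the two dimensions in that distance calculation creates an inherent penalty term. Information-theoretic criteria set this metric based on the number of parameters and the number of data samples used to estimate the model.

Bottom-line recommendation: use BIC and take the minimum.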
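As a small illustration of "take the minimum", here is a Python sketch using the usual definition BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of samples and L the maximized likelihood. The candidate list and its log-likelihood values are made-up numbers, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def best_by_bic(candidates, n_samples):
    """candidates: list of (n_params, log_likelihood) pairs.

    Returns (n_params of the model with the lowest BIC, list of BIC scores).
    """
    scores = [bic(ll, k, n_samples) for k, ll in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return candidates[best][0], scores


if __name__ == "__main__":
    # Illustrative values only: the fit improves quickly, then flattens out.
    models = [(1, -120.0), (2, -100.0), (3, -99.5), (4, -99.4)]
    best_k, scores = best_by_bic(models, n_samples=100)
    print(best_k, [round(s, 2) for s in scores])
```

No elbow search is involved: the penalty term k ln(n) already makes the flat tail of the curve rise again, so the minimum is well defined.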
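A quick way of finding the elbow is to draw a line from the first to the last point of the curve and then find the data point that is farthest from that line.
This of course depends somewhat on the number of points in the flat part of the curve, but if you test the same number of parameters each time, the result should come out reasonably well.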

curve = [8.4663 8.3457 5.4507 5.3275 4.8305 4.7895 4.6889 4.6833 4.6819 4.6542 4.6501 4.6287 4.6162 4.585 4.5535 4.5134 4.474 4.4089 4.3797 4.3494 4.3268 4.3218 4.3206 4.3206 4.3203 4.2975 4.2864 4.2821 4.2544 4.2288 4.2281 4.2265 4.2226 4.2206 4.2146 4.2144 4.2114 4.1923 4.19 4.1894 4.1785 4.178 4.1694 4.1694 4.1694 4.1556 4.1498 4.1498 4.1357 4.1222 4.1222 4.1217 4.1192 4.1178 4.1139 4.1135 4.1125 4.1035 4.1025 4.1023 4.0971 4.0969 4.0915 4.0915 4.0914 4.0836 4.0804 4.0803 4.0722 4.065 4.065 4.0649 4.0644 4.0637 4.0616 4.0616 4.061 4.0572 4.0563 4.056 4.0545 4.0545 4.0522 4.0519 4.0514 4.0484 4.0467 4.0463 4.0422 4.0392 4.0388 4.0385 4.0385 4.0383 4.038 4.0379 4.0375 4.0364 4.0353 4.0344];

% get coordinates of all the points
nPoints = length(curve);
allCoord = [1:nPoints;curve]';

% pull out first point
firstPoint = allCoord(1,:);

% get vector between first and last point - this is the line
lineVec = allCoord(end,:) - firstPoint;

% normalize the line vector
lineVecN = lineVec / sqrt(sum(lineVec.^2));

% find the distance from each point to the line:
% vector between all points and first point
vecFromFirst = bsxfun(@minus, allCoord, firstPoint);

% To calculate the distance to the line, we split vecFromFirst into two
% components, one that is parallel to the line and one that is perpendicular
% Then, we take the norm of the part that is perpendicular to the line and
% get the distance.
% We find the vector parallel to the line by projecting vecFromFirst onto
% the line. The perpendicular vector is vecFromFirst - vecFromFirstParallel
% We project vecFromFirst by taking the scalar product of the vector with
% the unit vector that points in the direction of the line (this gives us
% the length of the projection of vecFromFirst onto the line). If we
% multiply the scalar product by the unit vector, we have vecFromFirstParallel
scalarProduct = dot(vecFromFirst, repmat(lineVecN,nPoints,1), 2);
vecFromFirstParallel = scalarProduct * lineVecN;
vecToLine = vecFromFirst - vecFromFirstParallel;

% distance to line is the norm of vecToLine
distToLine = sqrt(sum(vecToLine.^2,2));

% plot the distance to the line
figure('Name','distance from curve to line'), plot(distToLine)

% now all you need is to find the maximum
[maxDist,idxOfBestPoint] = max(distToLine);

% plot
figure, plot(curve)
hold on
plot(allCoord(idxOfBestPoint,1), allCoord(idxOfBestPoint,2), 'or')

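Put simply and intuitively:
if we draw two lines from any point on the curve to the two endpoints of the curve, the point at which these two lines form the smallest angle (in degrees) is the desired point.

Here the two lines can be visualized as the arms and the point as the elbow.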
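A minimal Python sketch of this angle criterion follows, assuming the curve is given as two equally long lists of coordinates; the function name `elbow_by_min_angle` is made up for illustration. Note that the angle depends on how the two axes are scaled relative to each other, so it may be worth normalizing both axes to a comparable range first.

```python
import math


def elbow_by_min_angle(xs, ys):
    """Return (index, angle_in_degrees) of the interior point whose two
    'arms' to the curve's endpoints enclose the smallest angle."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points")
    best_index, best_angle = None, math.inf
    for i in range(1, n - 1):
        # arm from the candidate point to the first endpoint
        ax, ay = xs[0] - xs[i], ys[0] - ys[i]
        # arm from the candidate point to the last endpoint
        bx, by = xs[-1] - xs[i], ys[-1] - ys[i]
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        # atan2(|cross|, dot) is the unsigned angle between the arms, in [0, pi]
        angle = math.degrees(math.atan2(abs(cross), dot))
        if angle < best_angle:
            best_index, best_angle = i, angle
    return best_index, best_angle


if __name__ == "__main__":
    print(elbow_by_min_angle([0, 1, 2, 3, 4], [4, 1, 0.5, 0.2, 0]))
```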
In case someone needs a Python version of the Matlab code posted above:
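What follows is not the original poster's listing but a minimal pure-Python sketch of the farthest-from-the-line method, written to mirror the steps of the MATLAB code (vector from the first point, projection onto the line, perpendicular remainder). The function name and the default 1-based x values are assumptions made for this example.

```python
import math


def elbow_by_chord_distance(ys, xs=None):
    """Return (index, distances): the 0-based index of the point farthest
    from the line through the first and last points, and all distances.

    If xs is omitted, x = 1, 2, ..., n is assumed, as in plot(1:n, curve).
    The first and last points must differ, or the line is undefined.
    """
    n = len(ys)
    if xs is None:
        xs = list(range(1, n + 1))
    x0, y0 = xs[0], ys[0]
    dx, dy = xs[-1] - x0, ys[-1] - y0
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length        # unit vector along the line
    distances = []
    for x, y in zip(xs, ys):
        vx, vy = x - x0, y - y0              # vector from the first point
        t = vx * ux + vy * uy                # length of projection on the line
        px, py = vx - t * ux, vy - t * uy    # perpendicular component
        distances.append(math.hypot(px, py))
    best = max(range(n), key=distances.__getitem__)
    return best, distances


if __name__ == "__main__":
    idx, _ = elbow_by_chord_distance([10, 5, 2, 1.5, 1.4, 1.3])
    print("elbow at 0-based index", idx)
```

The returned index is 0-based, so add 1 to compare it with the MATLAB result.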

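The double-derivative method. It does not seem to work well for noisy data, however. For the output, simply find the maximum of d2 to identify the elbow. This implementation is in R.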

elbow_finder <- function(x_values, y_values) {
  i_max <- length(x_values) - 1

  # First and second derivatives
  first_derived <- list()
  second_derived <- list()

  # First derivative: average of the forward and backward slopes
  for (i in 2:i_max) {
    slope1 <- (y_values[i+1] - y_values[i]) / (x_values[i+1] - x_values[i])
    slope2 <- (y_values[i] - y_values[i-1]) / (x_values[i] - x_values[i-1])
    slope_avg <- (slope1 + slope2) / 2
    first_derived[[i]] <- slope_avg
  }
  first_derived[[1]] <- NA
  first_derived[[i_max+1]] <- NA
  first_derived <- unlist(first_derived)

  # Second derivative
  # Parentheses are required: 3:i_max-1 parses as (3:i_max) - 1 in R
  for (i in 3:(i_max - 1)) {
    d1 <- (first_derived[i+1] - first_derived[i]) / (x_values[i+1] - x_values[i])
    d2 <- (first_derived[i] - first_derived[i-1]) / (x_values[i] - x_values[i-1])
    d_avg <- (d1 + d2) / 2
    second_derived[[i]] <- d_avg
  }
  second_derived[[1]] <- NA
  second_derived[[2]] <- NA
  second_derived[[i_max]] <- NA
  second_derived[[i_max+1]] <- NA
  second_derived <- unlist(second_derived)

  return(list(d1 = first_derived, d2 = second_derived))
}

Here is Jonas's solution implemented in R:

elbow_finder <- function(x_values, y_values) {
  # Max values to create line
  max_x_x <- max(x_values)
  max_x_y <- y_values[which.max(x_values)]
  max_y_y <- max(y_values)
  max_y_x <- x_values[which.max(y_values)]
  max_df <- data.frame(x = c(max_y_x, max_x_x), y = c(max_y_y, max_x_y))

  # Creating straight line between the max values
  fit <- lm(max_df$y ~ max_df$x)

  # Distance from point to line
  distances <- c()
  for (i in 1:length(x_values)) {
    distances <- c(distances, abs(coef(fit)[2]*x_values[i] - y_values[i] + coef(fit)[1]) / sqrt(coef(fit)[2]^2 + 1^2))
  }

  # Max distance point
  x_max_dist <- x_values[which.max(distances)]
  y_max_dist <- y_values[which.max(distances)]

  return(c(x_max_dist, y_max_dist))
}

I have been working on knee/elbow point detection for some time. I am by no means an expert. Some methods that may be relevant to this problem:

DFDT stands for Dynamic First Derivative Threshold. It computes the first derivative and uses a thresholding algorithm to detect the knee/elbow point. DSDT is similar but uses the second derivative; my evaluation shows that the two have similar performance.

The S-method is an extension of the L-method. The L-method fits two straight lines to the curve, and the intersection of the two lines is the knee/elbow point. The best fit is found by looping over all the points, fitting the lines and evaluating the mean squared error (MSE). The S-method fits three straight lines, which improves accuracy but also requires more computation.

All of my code is publicly available. There is also a related document that can help you find more information on the topic; it is only four pages long, so it should be easy to read. You are free to use the code, and I am happy to discuss it.

elbow.point = function(x){
  elbow.curve = c(x)
  nPoints = length(elbow.curve)
  allCoord = cbind(c(1:nPoints), c(elbow.curve))
  # pull out first point
  firstPoint = allCoord[1,]
  # get vector between first and last point - this is the line
  lineVec = allCoord[nPoints,] - firstPoint
  # normalize the line vector
  lineVecN = lineVec / sqrt(sum(lineVec^2))
  # find the distance from each point to the line:
  # vector between all points and first point
  vecFromFirst = lapply(c(1:nPoints), function(x){
    allCoord[x,] - firstPoint
  })
  vecFromFirst = do.call(rbind, vecFromFirst)
  rep.row <- function(x, n){
    matrix(rep(x, each = n), nrow = n)
  }
  scalarProduct = matrix(nrow = nPoints, ncol = 2)
  scalarProduct[,1] = vecFromFirst[,1] * rep.row(lineVecN, nPoints)[,1]
  scalarProduct[,2] = vecFromFirst[,2] * rep.row(lineVecN, nPoints)[,2]
  scalarProduct = as.matrix(rowSums(scalarProduct))
  vecFromFirstParallel = matrix(nrow = nPoints, ncol = 2)
  vecFromFirstParallel[,1] = scalarProduct * lineVecN[1]
  vecFromFirstParallel[,2] = scalarProduct * lineVecN[2]
  vecToLine = lapply(c(1:nPoints), function(x){
    vecFromFirst[x,] - vecFromFirstParallel[x,]
  })
  vecToLine = do.call(rbind, vecToLine)
  # distance to line is the norm of vecToLine
  distToLine = as.matrix(sqrt(rowSums(vecToLine^2)))
  # index of the point farthest from the line
  which.max(distToLine)
}
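The L-method mentioned earlier (fit one straight line to the left part of the curve and one to the right part, and keep the split with the lowest error) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the answerer's published code: it uses ordinary least squares, scores each split by the combined mean squared error over all points, requires at least two points per segment and distinct x values within each segment, and the function names are hypothetical.

```python
def _line_sse(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def l_method_split(xs, ys):
    """Return (split, mse): points [0, split) go to the left line and
    points [split, n) to the right line, with the lowest combined MSE."""
    n = len(xs)
    if n < 4:
        raise ValueError("need at least four points (two per line)")
    best_split, best_mse = None, float("inf")
    for split in range(2, n - 1):
        sse = _line_sse(xs[:split], ys[:split]) + _line_sse(xs[split:], ys[split:])
        mse = sse / n
        if mse < best_mse:
            best_split, best_mse = split, mse
    return best_split, best_mse


if __name__ == "__main__":
    xs = list(range(10))
    ys = [10, 7, 4, 1, 1.9, 1.8, 1.7, 1.6, 1.5, 1.4]
    print(l_method_split(xs, ys))
```

Refitting both lines for every split costs O(n) per split, so O(n^2) overall; that is fine for a hundred points, and fitting a third line in the style of the S-method would add another loop over the second split.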