R: difference between two geom_smooth() lines



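I made a plot of my data, and now I would like to get the difference in y, for every x, between the curves estimated by geom_smooth(). There is a similar question, but unfortunately it has no answers. For example, how can I get the differences for the following plot (data below):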

EDIT

Two suggestions were made, but I still don't know how to calculate the differences.

The first suggestion was to access the data from the ggplot object. I did so with

    pb <- ggplot_build(p)
    pb[["data"]][[1]]
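For illustration, here is a minimal sketch of what this extraction gives and of one way the two curves could then be subtracted. It is not the asker's code: the data frame `df`, the loess settings and the name `smooth_gap` are hypothetical, and `stats::approx()` is assumed here as the interpolation step because each group's curve is evaluated on its own x grid.

```r
library(ggplot2)

set.seed(1)  # hypothetical example data, not the asker's
df <- data.frame(
  x = c(runif(100, 0, 8), runif(100, 2, 10)),
  g = factor(rep(c("0", "1"), each = 100))
)
df$y <- sin(df$x) + ifelse(df$g == "1", 0.5, 0) + rnorm(200, sd = 0.3)

p <- ggplot(df, aes(x = x, y = y, colour = g)) +
  geom_smooth(method = "loess", formula = y ~ x)

# Layer 1 holds the fitted curves: one row per evaluation point and group
pb <- ggplot_build(p)
sm <- pb[["data"]][[1]]
curves <- split(sm[, c("x", "y")], sm$group)

# Each group is evaluated on its own x grid, so interpolate both curves
# onto a common grid restricted to the overlapping x range
lo <- max(sapply(curves, function(d) min(d$x)))
hi <- min(sapply(curves, function(d) max(d$x)))
grid <- seq(lo, hi, length.out = 100)

y1 <- approx(curves[[1]]$x, curves[[1]]$y, xout = grid)$y
y2 <- approx(curves[[2]]$x, curves[[2]]$y, xout = grid)$y

# Difference between the two smoothed curves at each grid point
smooth_gap <- data.frame(x = grid, diff = y1 - y2)
head(smooth_gap)
```

This only subtracts the plotted curves. It gives no standard error for the difference, because the two fits are treated as fixed lines.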

I can think of two ways: work out the formula that geom_smooth() uses to compute the curves for the given data, or save the ggplot object to a variable and try to access the curve data (I am not sure whether that is possible).

In addition to @sindri_baldur's comment: save the plot as p, then call pb

@markus: That approach works, but after using pb

@schwantke: With the data provided in pb[["data"]][[1]], it only takes a careful read of the ggplot documentation and a manual calculation. I would still follow the lead markus gave. I think you have everything you need. If it does not work with groups, plot each group separately and extract the data in two steps.

I may get a chance tonight to show you how to do this with the example data you provided. If I haven't added an answer below with a gam()-based solution, ping me in a few days.

What I like about this approach is that it does not matter which method geom_smooth uses internally (e.g. loess or gam). Thanks a lot.

Hi, welcome to Stack Overflow. The first suggestion is a good one. To make the x sequences match, you can interpolate the values in between using an interpolation function from the stats package.

    library("ggplot2")  # load ggplot2
    set.seed(1)         # make the example reproducible
    n

As I mentioned in the comments above, it is better to do this outside ggplot and instead fit a full model containing the two smooths, from which you can compute the uncertainty of the difference, and so on. This is essentially a short blog post I wrote about a year ago.

With the OP's example data, we then use the prediction data to generate the Xp matrix, which maps values of the covariates to values of the basis expansion for the smooths. We can manipulate this matrix to get the difference between smooths that we want:

    xp <- predict(m, newdata = pdat, type = "lpmatrix")

    ## which cols of xp relate to splines of interest?
    c1 <- grepl('g0', colnames(xp))
    c2 <- grepl('g1', colnames(xp))
    ## which rows of xp relate to sites of interest?
    r1 <- with(pdat, g == 0)
    r2 <- with(pdat, g == 1)

Now we can difference the rows of xp for the pair of levels we are comparing, zero out the columns that do not belong to the two smooths, and multiply by the model coefficients:

    ## difference rows of xp for data from comparison
    X <- xp[r1, ] - xp[r2, ]
    ## zero out cols of X related to splines for other lochs
    X[, ! (c1 | c2)] <- 0
    ## zero out the parametric cols
    X[, !grepl('^s\\(', colnames(xp))] <- 0
    ## difference between smooths
    dif <- X %*% coef(m)

Now dif contains the difference between the two smooths. We can use X again, together with the covariance matrix of the model coefficients, to compute the standard error of that difference and from it a 95% (in this case) confidence interval on the estimated difference:

    ## se of difference
    se <- sqrt(rowSums((X %*% vcov(m)) * X))
    ## confidence interval on difference
    crit <- qt(.975, df.residual(m))
    upr <- dif + (crit * se)
    lwr <- dif - (crit * se)

    res <- data.frame(x = with(df, seq(min(x), max(x), length = 200)),
                      dif = dif, upr = upr, lwr = lwr)

    ggplot(res, aes(x = x, y = dif)) +
      geom_ribbon(aes(ymin = lwr, ymax = upr, x = x), alpha = 0.2) +
      geom_line()

This produces a plot of the estimated difference with its confidence band. The result is consistent with an assessment showing that the model with group-level smooths does not fit substantially better than a model with different group means but only a single common smoother in x:

    r$> m0 <- gam(y ~ g + s(x), data = df, method = "REML")
    r$> AIC(m0, m)
             df      AIC
    m0  9.68355 30277.93
    m  14.70675 30285.02
    r$> anova(m0, m, test = 'F')
    Analysis of Deviance Table

    Model 1: y ~ g + s(x)
    Model 2: y ~ g + s(x, by = g)
      Resid. Df Resid. Dev     Df Deviance      F Pr(>F)
    1    4990.1     124372
    2    4983.9     124298 6.1762   73.591 0.4781 0.8301

The steps can be wrapped in a function:

    smooth_diff <- function(model, newdata, f1, f2, var, alpha = 0.05,
                            unconditional = FALSE) {
        xp <- predict(model, newdata = newdata, type = 'lpmatrix')
        ## NOTE: f1 and f2 are matched as regular expressions against
        ## the column names of xp
        c1 <- grepl(f1, colnames(xp))
        c2 <- grepl(f2, colnames(xp))
        r1 <- newdata[[var]] == f1
        r2 <- newdata[[var]] == f2
        ## difference rows of xp for data from comparison
        X <- xp[r1, ] - xp[r2, ]
        ## zero out cols of X related to splines for other lochs
        X[, ! (c1 | c2)] <- 0
        ## zero out the parametric cols
        X[, !grepl('^s\\(', colnames(xp))] <- 0
        dif <- X %*% coef(model)
        se <- sqrt(rowSums((X %*% vcov(model, unconditional = unconditional)) * X))
        crit <- qt(alpha/2, df.residual(model), lower.tail = FALSE)
        upr <- dif + (crit * se)
        lwr <- dif - (crit * se)
        data.frame(pair = paste(f1, f2, sep = '-'),
                   diff = dif,
                   se = se,
                   upper = upr,
                   lower = lwr)
    }

Using this function, we can repeat the entire analysis and plot the difference:

    out <- smooth_diff(m, pdat, '0', '1', 'g')
    out <- cbind(x = with(df, seq(min(x), max(x), length = 200)), out)

    ggplot(out, aes(x = x, y = diff)) +
      geom_ribbon(aes(ymin = lower, ymax = upper, x = x), alpha = 0.2) +
      geom_line()
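The code above relies on the objects `df`, `m` and `pdat`, whose definitions do not appear here. As a rough sketch of one setup that would make it runnable: the simulated data below is a stand-in and not the OP's data, the model formula is taken from "Model 2" in the anova output, and the 200-point grid mirrors the `seq(min(x), max(x), length = 200)` call used for plotting. The exact construction of `pdat` is an assumption.

```r
library(mgcv)

set.seed(1)  # simulated stand-in data; not the OP's data
n <- 500
df <- data.frame(
  x = runif(n, 0, 10),
  g = factor(sample(c("0", "1"), n, replace = TRUE))
)
df$y <- sin(df$x) + ifelse(df$g == "1", 0.3 * df$x, 0) + rnorm(n, sd = 0.5)

# One smooth of x per level of g, plus a parametric group effect
m <- gam(y ~ g + s(x, by = g), data = df, method = "REML")

# Prediction grid: the same 200 x values for each level of g.
# x varies fastest, so rows 1-200 are g = "0" and rows 201-400 are g = "1",
# which keeps xp[r1, ] and xp[r2, ] aligned on x.
pdat <- expand.grid(
  x = seq(min(df$x), max(df$x), length = 200),
  g = factor(c("0", "1"), levels = levels(df$g))
)

xp <- predict(m, newdata = pdat, type = "lpmatrix")
dim(xp)       # 400 rows (200 per level), one column per model coefficient
colnames(xp)  # smooth columns are named like "s(x):g0.1", "s(x):g1.1"
```

With these objects in place, the row-differencing steps and `smooth_diff(m, pdat, '0', '1', 'g')` should run as written. The resulting numbers will differ from the output shown, since the data are simulated.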