Error in boot() related to replacement length and data or data type? - R

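boot() fails on one data set and succeeds on another, so it must be a data problem? I just can't work out what is different. But at least I now think I have a reproducible case. In both cases, a numeric dependent variable is regressed (lm) on the interaction between an integer variable and a factor variable. The boot() call fails with the error:

    Error in boot(data = data, statistic = bs_p, R = 1000) : 
      number of items to replace is not a multiple of replacement length

My statistic function, which returns the p-values, is: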

    bs_p <- function (data, i) {
      d <- data[i,]
      fit <- lm (y~x*fac, data=d)
      return(summary(fit)$coefficients[,4])
    }

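Some (or at least one) of the bootstrap resamples do not contain all factor levels. Those resamples produce fewer coefficients (and correspondingly fewer p-values), which causes the error when the bootstrap results are combined. I think you need a stratified bootstrap or a residual bootstrap (assuming that bootstrapping p-values is sensible at all, which I doubt).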
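A stratified bootstrap along these lines can be sketched with the `strata` argument of `boot()`, which resamples within each factor level so that every replicate keeps all levels. This is only a minimal sketch: the toy data frame `toy` and the statistic `bs_coef` are made up for illustration and are not the data or function from the question, and the statistic returns coefficients rather than p-values.

```r
library(boot)

set.seed(1)  # reproducible resamples

# Hypothetical toy data: an unbalanced factor with one rare level
toy <- data.frame(
  x   = rep(1:10, times = 3),
  fac = factor(rep(c("A", "B", "C"), times = c(20, 7, 3)))
)
toy$y <- 2 + 0.5 * toy$x + rnorm(nrow(toy))

# Statistic: regression coefficients fitted on the resampled rows
bs_coef <- function(data, i) {
  d <- data[i, ]
  coef(lm(y ~ x * fac, data = d))
}

# strata = toy$fac resamples within each level, so every replicate
# keeps the original number of rows per level
res <- boot(data = toy, statistic = bs_coef, R = 200, strata = toy$fac)

dim(res$t)  # one row per replicate, one column per coefficient
```

A very small stratum can still yield a resample in which a slope is not estimable; `coef()` then reports `NA` for that entry, but the length of the returned vector stays the same.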

I had a similar error and solved it with this hand-written code. I hope it helps others:

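    bs_p <- function (data, i) {
      d <- data[i,]
      fit <- lm (y~x*fac, data=d)

      cf <- coef(fit)

      # coefficient names expected from a fit on the full data set
      all_names <- names(coef(lm(y~x*fac, data=data)))

      # identify coefficients missing from this resample and create dummy (zero) ones
      df <- setdiff(all_names, names(cf))
      ad <- rep(0, length(df))
      names(ad) <- df

      # fixed order, so every replicate has the same length and layout
      return(c(cf, ad)[all_names])
    }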

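Thank you; that must be it. My factor levels are not evenly distributed. When I changed them, every resample necessarily contained all levels and no error was generated. I admit that bootstrapped p-values are not especially meaningful. The coefficients themselves are enough for my purposes, and adding t-statistics and p-values really seems like too much. A member of my thesis committee doesn't see it that way. Oh well. Thanks again.

I don't understand. If you bootstrap your coefficients, you can use the results to derive p-values. I can't see why you would bootstrap the p-values. You may want to consult a statistician.

I agree with your comment above. Thanks again.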
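The suggestion to bootstrap the coefficients and derive the inference from them could look roughly like the sketch below. It assumes a percentile interval from `boot.ci()` and a simple tail-proportion p-value, which is one common convention and not necessarily what the commenter had in mind. The data frame `toy` and the function `bs_coef` are hypothetical.

```r
library(boot)

set.seed(2)  # reproducible resamples

# Hypothetical toy data (not the data from the question)
toy <- data.frame(
  x   = rep(1:10, times = 4),
  fac = factor(rep(c("A", "B"), each = 20))
)
toy$y <- 1 + 0.8 * toy$x + rnorm(nrow(toy))

# Statistic: regression coefficients fitted on the resampled rows
bs_coef <- function(data, i) {
  coef(lm(y ~ x * fac, data = data[i, ]))
}

R <- 999
res <- boot(data = toy, statistic = bs_coef, R = R, strata = toy$fac)

# Percentile interval for the second coefficient (the slope on x)
ci <- boot.ci(res, type = "perc", index = 2)
ci$percent[4:5]  # lower and upper limits

# Rough two-sided p-value for H0: slope = 0, taken as twice the smaller
# tail proportion of replicates on either side of zero, capped at 1
slope <- res$t[, 2]
lower_tail <- (1 + sum(slope <= 0)) / (R + 1)
upper_tail <- (1 + sum(slope >= 0)) / (R + 1)
p_boot <- min(1, 2 * min(lower_tail, upper_tail))
```

Whether such a p-value is appropriate depends on the model and the resampling scheme, so treat it as a starting point for a conversation with a statistician rather than a recipe.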
The data and the boot() call from the question:

    library(boot)

    y <- c(17.820, 13.764, 18.880, 25.830, 26.576, 29.832, 22.610, 24.180, 26.572, 26.030, 29.200, 28.560, 28.600, 16.614, 16.302, 18.080, 22.704, 28.101, 38.280, 17.100, 19.292, 33.165, 18.395, 19.434, 27.544, 17.010, 21.560, 28.120, 17.513, 21.646,24.060, 27.984, 20.830, 21.588, 26.280, 29.640, 17.313, 16.344, 16.362, 34.496, 22.785, 20.203, 29.040, 19.092, 20.890,20.739, 17.700, 17.424, 28.737, 18.318, 39.470, 28.072, 17.176, 28.098)
    x <- as.integer(c(9,  5,  0,  8,  3,  4,  9,  6,  9,  2, 15, 10,  5,  1, 11, 11,  4, 8, 13,  1,  2,  4,  7,  7, 12,  1,  6,  6,  4,  3,  5,  5,  7,  9,  8, 3, 3, 14,  6,  4,  3,  6, 17,  3,  6,  6,  7,  1,  6, 10 , 2, 14 , 5,  8))
    fac <- as.factor(c("F", "F", "F", "F", "F", "Ds", "F", "Ds","F","F","F","E", "Ds","F", "F", "E", "Ds","F", "Ds", "F", "Ds","E", "F", "E", "F", "Ds", "E", "Ds","F", "F", "F",  "Ds","Ds", "F", "Ds","F", "F", "E", "F","F","F", "F", "F", "Ds","F", "F", "F", "F", "Ds", "E", "F", "F", "F", "E"))
    data <- data.frame(x=x, y=y, fac=fac)

    results <- boot(data=data, statistic=bs_p, R=1000)