Removing Bonferroni outlier test results in a loop


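I am modelling my data with linear regression. I want to run the Bonferroni outlier test repeatedly and remove the flagged records from the data each time. My problem is that I cannot extract the ids from outlierResult. Below is reproducible code; I would like to write a while loop that follows the pseudocode. I am coding in R.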

# URL <- "http://www.math.uah.edu/stat/data/Galton.csv"
# download.file(URL, destfile = "./galton.csv", method="curl")
galton <-read.csv("galton.csv")
attach(galton)

dim(galton)
head(galton)

##creating outliers
set.seed(1)
random_index <- sample(1:nrow(galton), size = 5, replace = FALSE, prob = NULL)
print(random_index)
galton[random_index,"Height"] = galton[random_index,"Height"] +100

set.seed(2)
random_index2 <- sample(1:nrow(galton), size = 5, replace = FALSE, prob = NULL)
galton[random_index2,"Height"] = galton[random_index2,"Height"] +75

set.seed(3)
random_index3 <- sample(1:nrow(galton), size = 5, replace = FALSE, prob = NULL)
galton[random_index3,"Height"] = galton[random_index3,"Height"] +50


linear_reg <- lm(Height~Father+Mother,data=galton)

require(car, quietly=TRUE)
outlierResult <-outlierTest(linear_reg)
outlierResult


# the pseudocode
# while outlierResult is not empty
#   remove the corresponding records
#   linear_reg <- lm(Height~Father+Mother,data=galton)
#   outlierResult <-outlierTest(linear_reg)

See below for the URL. The trick is to notice that, if I am reading it right, outlierResult gives you the row names.

library(car, quietly=TRUE)
galton <-read.csv("http://www.math.uah.edu/stat/data/Galton.csv")
attach(galton)

dim(galton)
head(galton)

##creating outliers
set.seed(1)
random_index <- sample(1:nrow(galton), size = 5, replace = FALSE, prob = NULL)
print(random_index)
galton[random_index,"Height"] = galton[random_index,"Height"] +100

set.seed(2)
random_index2 <- sample(1:nrow(galton), size = 5, replace = FALSE, prob = NULL)
galton[random_index2,"Height"] = galton[random_index2,"Height"] +75

set.seed(3)
random_index3 <- sample(1:nrow(galton), size = 5, replace = FALSE, prob = NULL)
galton[random_index3,"Height"] = galton[random_index3,"Height"] +50




currentData <- galton
linear_reg <- lm(Height~Father+Mother,data=currentData)

outlierResult <-outlierTest(linear_reg)
outlierResult

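# Repeat while the test result is non-empty
while (length(outlierResult) != 0) {
  # Row names of the flagged observations (names of the first list element)
  exclusionRows <- names(outlierResult[[1]])
  # Keep every row whose name was not flagged
  inclusionRows <- !(rownames(currentData) %in% exclusionRows)
  currentData <- currentData[inclusionRows, ]

  # Refit on the reduced data and test again
  linear_reg <- lm(Height ~ Father + Mother, data = currentData)
  outlierResult <- outlierTest(linear_reg)
}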
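I think this is a programming-related question. Maybe it would be better that way :)

Thanks for your answer. There is a problem in the third iteration of the loop, when outlierResult reads "No Studentized residuals with Bonferonni p < 0.05". In that case it still returns the observation with the largest rstudent, so the loop never exits until every row of the data has been used up.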
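One possible way around the termination problem described in the last comment is to loop on the test's significance flag instead of on the length of the result. The sketch below assumes that the object returned by outlierTest() carries a logical element named signif, which is the case in the versions of car I am aware of; check str(unclass(outlierResult)) on your installation before relying on it. The data here are simulated stand-ins for the Galton file, and simData, bad and minRows are hypothetical names introduced only for this illustration.

```r
library(car)

# Simulated stand-in for the Galton data (hypothetical values, not the real file)
set.seed(42)
n <- 200
simData <- data.frame(Father = rnorm(n, 69, 2.5),
                      Mother = rnorm(n, 64, 2.3))
simData$Height <- 20 + 0.4 * simData$Father + 0.3 * simData$Mother +
  rnorm(n, 0, 2)

# Inject a few gross outliers, in the spirit of the question
bad <- sample(seq_len(n), 5)
simData$Height[bad] <- simData$Height[bad] + 100

currentData <- simData
linear_reg <- lm(Height ~ Father + Mother, data = currentData)
outlierResult <- outlierTest(linear_reg)

# Hypothetical safety guard so the loop can never consume the whole data set
minRows <- 10

# Assumption: outlierResult$signif is TRUE only when at least one
# Bonferroni-adjusted p-value is below the cutoff
while (isTRUE(outlierResult$signif) && nrow(currentData) > minRows) {
  # Row names of the flagged observations
  exclusionRows <- names(outlierResult$rstudent)
  currentData <- currentData[!(rownames(currentData) %in% exclusionRows), ]

  # Refit on the reduced data and test again
  linear_reg <- lm(Height ~ Father + Mother, data = currentData)
  outlierResult <- outlierTest(linear_reg)
}

nrow(currentData)
```

With this condition the loop stops as soon as the test reports only the largest studentized residual without a significant Bonferroni p-value, so that observation is left in the data rather than removed.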