Error in randomForest: NA, missing values in object
When I try:
marketing.rf <- randomForest(formula = as.numeric(y) ~., data = marketing.train, importance = TRUE)
And when I try:
y.val <- ifelse(marketing.train$y=="yes", 1,0)
marketing.rf <- randomForest(formula = as.numeric(y.val) ~., data = marketing.train, importance = TRUE)
I also tried using as.factor(y), but it shows a similar error.
I used dput(marketing.test$y) to inspect the values, but could not find any NA or invalid values among them.
I am very new to R. Can someone help me fix this? Thanks.
Here is a sample of the training data:
age job marital edu default balance housing loan y
58 management married tertiary no 2143 yes no no
33 entrepreneur married secondary no 2 yes yes no
33 unknown single unknown no 1 no no no
42 entrepreneur divorced tertiary yes 2 yes no no
Here is a complete example with reprex data. Without your data I can't give a perfect answer, but if you follow this logic you should be fine.
library(randomForest)
library(dplyr)

# Generate some fake data
fake_data <- data.frame(
  age = runif(500, 30, 65),
  marital = sample(c("single", "married", "divorced"), 500, replace = TRUE),
  default = sample(c("yes", "no"), 500, replace = TRUE),
  balance = runif(500, 0, 2100),
  housing = sample(c("yes", "no"), 500, replace = TRUE),
  loan = sample(c("yes", "no"), 500, replace = TRUE),
  stringsAsFactors = FALSE
)

# Add some missing data for the example
fake_data[sample(x = 1:500, size = 5), "loan"] <- NA

# Drop the rows where loan is NA
fake_data_2 <- fake_data[!is.na(fake_data$loan), ]
cat("You have removed", nrow(fake_data) - nrow(fake_data_2), "records\n")

# Add the target and make sure it is a factor
# (y is just a copy of loan here, purely for illustration)
fake_data_2$y <- as.factor(fake_data_2$loan)

# Convert character columns to factors
fake_data_2 <- fake_data_2 %>%
  mutate_if(is.character, as.factor)

fit <- randomForest(y ~ ., data = fake_data_2)
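The question only inspected y (and on marketing.test rather than marketing.train), but a formula like y ~ . pulls in every column, so a missing value in any predictor can trigger the same error. As a rough sketch of how to check all columns at once, assuming a small made-up data frame called toy (a hypothetical stand-in for marketing.train, not the asker's data):

```r
# Hypothetical toy data standing in for marketing.train
toy <- data.frame(
  age     = c(58, 33, NA, 42, 51, 29),
  balance = c(2143, 2, 1, NA, 870, 15),
  housing = c("yes", "yes", "no", "yes", NA, "no"),
  y       = c("no", "no", "no", "yes", "yes", "no"),
  stringsAsFactors = TRUE
)

# Count missing values in every column, not only in y
na_counts <- colSums(is.na(toy))
print(na_counts)

# Keep only the rows that are fully observed
toy_complete <- toy[complete.cases(toy), ]
cat("Rows before:", nrow(toy), " rows after:", nrow(toy_complete), "\n")
```

The formula interface of randomForest uses na.action = na.fail by default, which is where the "missing values in object" message comes from. Passing na.action = na.omit in the call, or imputing with the package's na.roughfix, are alternatives to filtering by hand. Note that values such as "unknown" in the sample data are ordinary strings, not NA, so they would not be counted here.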
The "y" column has missing values, so it doesn't know how to train on those rows. Use train_dat…

Thanks @MDEWITT! But now it shows this error: Error in randomForest.default(m, y, ...): length of response must be the same as predictors

@Andrea Glad to hear! If it works, please accept the answer so that others know it solved your problem (the small green check mark below the up/down arrows on this post).
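The thread does not show what was run to get the "length of response" error, so the following is only one plausible cause: a response vector such as y.val is built from the unfiltered data, the NA rows are then dropped from the data frame, and the two no longer have the same number of rows. A minimal sketch, assuming a made-up data frame train (hypothetical, not the asker's data):

```r
# Hypothetical stand-in for marketing.train
set.seed(42)
n <- 100
train <- data.frame(
  age     = round(runif(n, 18, 70)),
  balance = round(runif(n, 0, 2100)),
  y       = sample(c("yes", "no"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
train$balance[c(3, 17, 58)] <- NA  # inject a few missing predictor values

# Response built from the unfiltered data: one value per original row
y_val <- ifelse(train$y == "yes", 1, 0)

# Dropping NA rows afterwards shrinks the data frame but not y_val
train_clean <- na.omit(train)
cat("length(y_val):", length(y_val), " nrow(train_clean):", nrow(train_clean), "\n")

# Safer: keep the response as a column, then filter, so rows stay aligned
train$y <- factor(train$y)
train_clean <- na.omit(train)

# Fit only if the randomForest package is available
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(y ~ ., data = train_clean)
}
```

Keeping y inside the data frame and referring to it by name in the formula means any row filtering applies to the response and the predictors together. Making y a factor also tells randomForest to fit a classification forest rather than a regression on 0/1 values.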