R: the train() function with an SVM (caret)

r machine-learning

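I am running an SVM with the caret package. My whole data frame (named total, which contains both the training and the test rows) consists of numbers scaled between 0 and 1. My Y is binary (0/1). All variables have class "num". The code is as follows: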

model_SVM <- train(
  Y ~ ., training,
  method = "svmPoly",
  trControl = trainControl(
    method = "cv", number = 10,
    verboseIter = TRUE
  )
)

summary(model_SVM) 
model_SVM

SVMprediction <-predict(model_SVM, testing)
cmSVM <-confusionMatrix(SVMprediction,testing$Y) # ERROR
print(SVMprediction)

I got past that error by adding the following:

SVMprediction<-as.factor(SVMprediction)
testing$Y<-as.factor(testing$Y)

When I check the levels, SVMprediction has 361 levels and testing$Y has 2 levels. If Y only has two levels, how can the SVM predictions end up with 361 levels?

Thanks

PS: full code:

totalY <- total

total <- total%>%
  select(-Y)

# Missing Values with MICE
mod_mice <- mice(data = total, m = 5,meth='cart')
total <- complete(mod_mice)
post_mv_var_top10 <- total


Y <- totalY$Y
total<-cbind(total,Y)


train_ <- total%>%
  filter(is.na(Y)==FALSE)

test_ <- total%>%
  filter(is.na(Y)==TRUE)


inTraining <- createDataPartition(train_$Y, p = .70, list = FALSE)
training <- train_[ inTraining,]
testing  <- train_[-inTraining,]


# MODEL SVM

model_SVM <- train(
  Y ~ ., training,
  method = "svmPoly",
  trControl = trainControl(
    method = "cv", number = 10,
    verboseIter = TRUE
  )
)

summary(model_SVM)

SVMprediction <-predict(model_SVM, testing)

SVMprediction<-as.factor(SVMprediction)
testing$Y<-as.factor(testing$Y)

cmSVM <-confusionMatrix(SVMprediction,testing$Y) # ERROR 2
print(cmSVM)

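From the documentation for confusionMatrix: "data: a factor of predicted classes; reference: a factor of classes to be used as the true results." So yes, convert to factors before calling confusionMatrix.

@phiver The problem seems to be the number of levels in SVMprediction. I have posted the whole code, but I don't understand why SVMprediction has so many levels when Y has only two.

Check whether your predictions are actually 0s and 1s, or predicted probabilities.

This is a common problem: most R classification models assume the outcome vector is a factor. Passing binary numeric data makes the model perform regression, which does not produce predicted classes. Providing a reproducible example and the output of sessionInfo would help answer your question definitively.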
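The comment about regression can be illustrated without any model at all. The sketch below uses base R only; the prediction values, the variable names and the 0.5 cutoff are made up for illustration and are not taken from the question's data. It shows why calling as.factor() on continuous predictions gives one level per distinct value, and how thresholding gives a two-level factor instead.

```r
# Hypothetical numeric predictions, standing in for the output of a model
# that was trained on a numeric 0/1 outcome (i.e. in regression mode).
pred_numeric <- c(0.12, 0.87, 0.31, 0.66, 0.93, 0.05)
y_true <- c(0, 1, 0, 1, 1, 0)

# as.factor() creates one level per distinct value, not one per class.
pred_as_factor <- as.factor(pred_numeric)
n_levels_raw <- nlevels(pred_as_factor)

# Thresholding (0.5 is an assumed cutoff) turns the numbers into classes.
# Fixing levels = c(0, 1) keeps both levels even if one class is never predicted.
pred_class <- factor(ifelse(pred_numeric > 0.5, 1, 0), levels = c(0, 1))
reference <- factor(y_true, levels = c(0, 1))

# Both factors now share the same two levels, so they can be cross-tabulated.
print(table(Predicted = pred_class, Reference = reference))
```

With 361 distinct predicted values in the test set, the same mechanism would explain the 361 levels reported in the question.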

Error in confusionMatrix.default(SVMprediction, testing$Y) : 
  the data cannot have more levels than the reference
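One way to avoid this error, following the advice in the comments, is to make Y a factor before calling train(), so that caret fits a classifier and predict() returns class labels. The following is only a sketch on synthetic data: the data frame df, its columns x1 and x2, the "no"/"yes" labels and the 5-fold setting are assumptions for illustration, and method = "svmPoly" needs the kernlab package installed alongside caret.

```r
library(caret)  # train(), trainControl(), createDataPartition(), confusionMatrix()

set.seed(42)

# Synthetic stand-in for the question's data: predictors scaled to [0, 1].
n <- 200
df <- data.frame(x1 = runif(n), x2 = runif(n))

# Key step: the outcome is a factor with two named levels, not a number.
df$Y <- factor(ifelse(df$x1 + df$x2 > 1, "yes", "no"), levels = c("no", "yes"))

idx <- createDataPartition(df$Y, p = 0.70, list = FALSE)
training <- df[idx, ]
testing <- df[-idx, ]

# With a factor outcome, train() fits a classification SVM.
model_SVM <- train(
  Y ~ ., data = training,
  method = "svmPoly",
  trControl = trainControl(method = "cv", number = 5)
)

# predict() now returns a factor with the same levels as testing$Y,
# so no as.factor() workaround is needed afterwards.
SVMprediction <- predict(model_SVM, testing)
cmSVM <- confusionMatrix(SVMprediction, testing$Y)
print(cmSVM)
```

For the data in the question, the equivalent change would be something like converting Y with factor(Y, levels = c(0, 1), labels = c("no", "yes")) before the train/test split; the label names here are arbitrary.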
