R SVM: is the number of predicted values for the test set correct?


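So I have a dataset with nrow = 218, and I am working through this example (git here). I have split my data into train (nrow = 163; ~75%) and test (nrow = 55; ~25%).

When I get to the `pred` step, I am not sure whether the number of predicted values I get for the test set is correct.

I was actually able to get a result with 55 rows using the code further down. The changes I made: for `pt2`, I converted it with `as.factor` instead of `as.character`, and I changed the `pred` call (see the marked step in that code).

OK, I realized that I was training the model on the training dataset and then testing it on the test set. I need to first test it by re-predicting the train set, and then feed it the test set: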
```r
summary(model_svm)
# Use the predictions on the data
pred <- predict(model_svm, train)

model_svm <- svm(trainlabel ~ as.matrix(test))
summary(model_svm)
# Use the predictions on the data
pred <- predict(model_svm, test)
```

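So I updated the factor, but the `as.matrix` bit of `pred` does not seem to make a difference. When using the test set of 55, `pred` still has 163 rows. Not sure if that is correct.

Also, @not_dave, when I try this on the animal sounds from the site you pointed to, and on some real data, the ROC curve I get is, I assume, not the desired one (I am looking for a curve with a clear shoulder).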
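One possible explanation for the 163 rows, offered as an assumption rather than a confirmed diagnosis: a formula such as `trainlabel ~ as.matrix(train)` refers to the object `train` by name, so `predict` may end up re-evaluating that same object instead of reading the rows of the new data. A minimal sketch of an alternative is to keep features and label in one data frame and fit through the `data` argument, so that columns are looked up by name in whatever data frame is passed to `predict`. The names `features`, `train_df`, `test_df`, and `model_df`, and the random data, are hypothetical stand-ins.

```r
library(e1071)

set.seed(123)
# Hypothetical stand-in data: 218 rows, 22 numeric features, one binary label
features <- as.data.frame(matrix(rnorm(22 * 218), ncol = 22))
features$Cluster <- factor(sample(c(0, 1), 218, replace = TRUE))

# 75% / 25% split, as in the question
train_ind <- sample(seq_len(nrow(features)), size = floor(0.75 * nrow(features)))
train_df <- features[train_ind, ]
test_df  <- features[-train_ind, ]

# Columns are referenced by name through `data`,
# so predict() can find the same columns in newdata
model_df <- svm(Cluster ~ ., data = train_df)

pred_train <- predict(model_df, train_df)
pred_test  <- predict(model_df, test_df)

length(pred_train)  # 163, one prediction per training row
length(pred_test)   # 55, one prediction per test row
```

With this layout the number of predictions follows the number of rows in the data frame given to `predict`, which makes a mismatch easy to spot.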
```r
# load libraries
library(data.table)
library(e1071)

# create dataset with random values
featuredata_all <- matrix(rnorm(23*218), ncol=23)

# scale features (columns 1 to 22)
pt1 <- scale(featuredata_all[,1:22], center=TRUE)

# make the label column a factor
pt2 <- as.factor(ifelse(featuredata_all[,23]>0, 0, 1)) # since the label is a string I kept it separate

# join data (optional)
ft <- cbind.data.frame(pt1, pt2) # to preserve the label in text
colnames(ft)[23] <- "Cluster"

## 75% of the sample size
smp_size <- floor(0.75 * nrow(ft))

## set the seed to make your partition reproducible
set.seed(123)
train_ind <- sample(seq_len(nrow(ft)), size = smp_size)

# split the features into train and test
train <- ft[train_ind,1:22]  # 163 reads
test  <- ft[-train_ind,1:22] # 55 reads
dim(train)
# [1] 163  22

dim(test)
# [1] 55  22

# split the labels into train and test
trainlabel <- ft[train_ind,23]  # 163 labels
testlabel  <- ft[-train_ind,23] # 55 labels
length(trainlabel)
# [1] 163

length(testlabel)
# [1] 55

# Support Vector Machine for classification
model_svm <- svm(x = as.matrix(train), y = trainlabel, probability = TRUE)
summary(model_svm)

# Call:
#   svm.default(x = as.matrix(train), y = trainlabel, probability = T)
#
#
# Parameters:
#   SVM-Type:  C-classification
# SVM-Kernel:  radial
# cost:  1
#
# Number of Support Vectors:  159
#
# ( 78 81 )
#
#
# Number of Classes:  2
#
# Levels:
#   0 1

# Use the predictions on the data
# ---------------- This is where the question is ---------------- #
pred <- predict(model_svm, as.matrix(test))
length(pred)
# [1] 55
# ----------------------------------------------------------------#

# confusion table: predicted class (rows) against true test label (columns)
print(table(pred[1:nrow(test)], testlabel))
#    testlabel
#    0  1
# 0 14 14
# 1 11 16
```