R: Specifying the positive class of the outcome variable in caret's train()

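I would like to know whether there is a way to specify which class of the outcome variable is treated as positive in caret's train() function. A minimal example: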

library(caret)
library(dplyr)  # provides %>% and mutate()

# Settings
ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE, summaryFunction = twoClassSummary, classProbs = TRUE)

# Data
data <- mtcars %>% mutate(am = factor(am, levels = c(0,1), labels = c("automatic", "manual"), ordered = T))

# Train
set.seed(123)
model1 <- train(am ~ disp + wt, data = data, method = "glm", family = "binomial", trControl = ctrl, tuneLength = 5)

# Data (factor ordering switched)
data <- mtcars %>% mutate(am = factor(am, levels = c(1,0), labels = c("manual", "automatic"), ordered = T))

# Train
set.seed(123)
model2 <- train(am ~ disp + wt, data = data, method = "glm", family = "binomial", trControl = ctrl, tuneLength = 5)

# Specificity and sensitivity are switched between the two models
model1
model2
The problem is not in train() but in twoClassSummary(), which looks like this:

function (data, lev = NULL, model = NULL) 
{
  lvls <- levels(data$obs)

  [...]    

  out <- c(rocAUC, 
           sensitivity(data[, "pred"], data[, "obs"], 
             lev[1]),  # Hard coded positive class
           specificity(data[, "pred"], data[, "obs"], 
             lev[2])) # Hard coded negative class
  names(out) <- c("ROC", "Sens", "Spec")
  out
}
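The effect of the hard-coded lev[1] and lev[2] can be reproduced by calling caret's sensitivity() and specificity() directly. The sketch below uses made-up labels (not the mtcars model) and assumes caret is installed. It shows that the defaults take the first factor level as the positive class, and that naming the class explicitly changes which value is reported as sensitivity.

```r
library(caret)

# Toy observed and predicted labels, invented for illustration only
obs  <- factor(c("manual", "manual", "manual", "automatic", "automatic"),
               levels = c("automatic", "manual"))
pred <- factor(c("manual", "manual", "automatic", "automatic", "manual"),
               levels = c("automatic", "manual"))

# Defaults: the first level ("automatic") is positive, the other is negative
sens_default <- sensitivity(pred, obs)
spec_default <- specificity(pred, obs)

# Explicit choice: "manual" is positive, "automatic" is negative
sens_manual <- sensitivity(pred, obs, positive = "manual")
spec_manual <- specificity(pred, obs, negative = "automatic")

print(c(sens_default = sens_default, spec_default = spec_default,
        sens_manual = sens_manual, spec_manual = spec_manual))
```

With these labels the two pairs of values are mirror images of each other, which is the same swap seen between model1 and model2.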
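twoClassSummary() always treats the first factor level (lev[1]) as the positive class and the second (lev[2]) as the negative class, which is why switching the level order switches Sens and Spec.

If you are not willing to change the order of the levels, there is a non-invasive way to change the twoClassSummary() function.

sensitivity() and specificity() take the name of the positive and of the negative level, respectively (a suboptimal design choice). We therefore add these two arguments to a custom function and pass them on to the respective functions to solve the problem.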

customTwoClassSummary <- function(data, lev = NULL, model = NULL, positive = NULL, negative=NULL) 
{
  lvls <- levels(data$obs)
  if (length(lvls) > 2) 
    stop(paste("Your outcome has", length(lvls), "levels. The twoClassSummary() function isn't appropriate."))
  caret:::requireNamespaceQuietStop("ModelMetrics")
  if (!all(levels(data[, "pred"]) == lvls)) 
    stop("levels of observed and predicted data do not match")
  rocAUC <- ModelMetrics::auc(ifelse(data$obs == lev[2], 0, 
                                     1), data[, lvls[1]])
  out <- c(rocAUC, 
           # Only change happens here!
           sensitivity(data[, "pred"], data[, "obs"], positive=positive), 
           specificity(data[, "pred"], data[, "obs"], negative=negative))
  names(out) <- c("ROC", "Sens", "Spec")
  out
}
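The extra positive and negative arguments are not among the ones the summary function is normally called with (data, lev, model), so they have to be fixed by some other means. One way is to wrap the function in an anonymous function that forwards `...`. The base-R sketch below shows the idea in isolation; summarise_scores, run_summary, and wrapped are hypothetical names chosen for illustration and are not part of caret.

```r
# Hypothetical summary function with an extra argument (label)
# that the calling code does not know about
summarise_scores <- function(x, digits = NULL, label = NULL) {
  paste0(label, ": ", round(mean(x), digits))
}

# Hypothetical caller that, like a framework, only passes the arguments it knows
run_summary <- function(f, x) f(x, digits = 1)

# Wrapper: forwards everything it receives through `...` and fixes `label`
wrapped <- function(...) summarise_scores(..., label = "mean")

result <- run_summary(wrapped, c(1, 2, 4))
print(result)
```

The caller never needs to learn about label; the wrapper supplies it on every call while everything else passes through unchanged.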

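The custom function is then passed to trainControl() through an anonymous wrapper:

ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE, 
                     # Trick: an anonymous function fixes the extra arguments of the call
                     summaryFunction = function(...) customTwoClassSummary(..., 
                                       positive = "manual", negative="automatic"), 
                     classProbs = TRUE)

The ... argument ensures that all other arguments that caret passes to the anonymous function are forwarded to customTwoClassSummary().

I think @Johannes's answer is an example of over-engineering a simple process.

Just reverse the order of the factor levels:

   df$target <- factor(df$target, levels=rev(levels(df$target)))

Fair point. I added this sentence to my answer so that people looking for a quick fix are not sent on a detour. I do think the OP knew about that fix, though, and wanted to specify the positive level explicitly to make the code more robust.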
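The level-reversal fix can be checked in base R before retraining. The sketch below uses an unordered version of the am factor from the question (the question itself uses ordered = T). The names am, am_rev, and am_rel are chosen here for illustration.

```r
# Unordered factor built from mtcars$am; "automatic" is the first level
am <- factor(mtcars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Reverse the level order; the value of each observation is unchanged
am_rev <- factor(am, levels = rev(levels(am)))

# Alternative for unordered factors: move one named level to the front
am_rel <- relevel(am, ref = "manual")

print(levels(am))
print(levels(am_rev))
print(levels(am_rel))
```

Only the level order changes, so a summary function that reads the first level as positive would now report "manual" as the positive class. relevel() is meant for unordered factors, so for the ordered factor in the question the rev(levels(...)) form is the one to use.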