R: Error tuning mtry in random forest (regression)

Tags: r, random-forest, r-caret

I have the following code to tune the mtry hyperparameter of a random forest regression model:

set.seed(42)

mtry <- 1:10

# Define train control
trControl <- trainControl(method = "cv",
                          number = 10,
                          search = "grid")

for (i in mtry) {
  rf_random <- train(Price.Gas~., data=data_train,
                 method = "rf",
                 mtry = i,
                 metric = "RMSE",
                 trControl = trControl)
}

How can I test different mtry values?

By default, caret tunes mtry over a grid, so you do not need a loop. Instead, define the candidate values in tuneGrid=:

library(caret)
set.seed(42)

data_train = data.frame(Price.Gas = rnorm(100),matrix(rnorm(1000),ncol=10))

trControl <- trainControl(method = "cv",number = 10)

rf_random <- train(Price.Gas~., data=data_train,
                   method = "rf",
                   tuneGrid = data.frame(mtry = 1:10),
                   metric = "RMSE",
                   trControl = trControl)

Random Forest 

100 samples
 10 predictor

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 89, 90, 91, 89, 91, 90, ... 
Resampling results across tuning parameters:

  mtry  RMSE       Rsquared   MAE      
   1    0.8556649  0.2122988  0.6921878
   2    0.8458829  0.2102749  0.6808978
   3    0.8518204  0.1975061  0.6909111
   4    0.8451160  0.1918390  0.6871511
   5    0.8386129  0.2037676  0.6808157
   6    0.8476718  0.1949056  0.6889514
   7    0.8434816  0.2082844  0.6833892
   8    0.8447137  0.1979602  0.6860908
   9    0.8419739  0.1960369  0.6825207
  10    0.8533284  0.1876459  0.6892574

RMSE was used to select the optimal model using the smallest value.
The final value used for the model was mtry = 5.
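
As a small sketch of how the selected value can be read back programmatically: the code below refits the same example and inspects the bestTune and results components of the object returned by train(). It assumes the caret and randomForest packages are installed, and it reuses the simulated data_train from the answer rather than real data.

```r
library(caret)
set.seed(42)

# Simulated data, as in the answer: 1 response and 10 numeric predictors
data_train <- data.frame(Price.Gas = rnorm(100), matrix(rnorm(1000), ncol = 10))

trControl <- trainControl(method = "cv", number = 10)

rf_random <- train(Price.Gas ~ ., data = data_train,
                   method = "rf",
                   tuneGrid = data.frame(mtry = 1:10),
                   metric = "RMSE",
                   trControl = trControl)

# mtry value chosen by caret (the candidate with the smallest cross-validated RMSE)
print(rf_random$bestTune)

# Full table of resampling results, one row per candidate mtry
res <- rf_random$results
print(res[which.min(res$RMSE), c("mtry", "RMSE", "Rsquared", "MAE")])
```

The final random forest, refit on all of data_train with the chosen mtry, is stored in rf_random$finalModel.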
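I get this error: invalid mtry: reset to within valid range. I think I chose the wrong range for mtry. How can I make sure this error does not occur without having to supply the mtry values myself? Is there a rule of thumb?

For example, instead of specifying tuneGrid=, you can specify tuneLength=5, and caret will choose the values for you. I guess your mtry cannot exceed the number of columns.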