R: Is there a more elegant way to test whether a forecasting model is correct?

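I have a modelled/forecast change and an actual change. The forecast change is in a column named forecastHPIChange and the actual change is in a column named HPIChange. The data look like this: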

        HPIChange forecastHPIChange
1              NA      1.547368e-02
2   -0.0026155187      1.485668e-02
3    0.0002906977      1.251108e-02
4   -0.0077877127      1.718729e-02
5    0.0200058841      2.143551e-02
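
For the 143 instances, I want to test whether the sign of the forecast matches the sign of the actual change. So there are effectively four cases: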

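  • Forecast positive, actual positive -> correct positive
  • Forecast negative, actual negative -> correct negative
  • Forecast positive, actual negative -> incorrect positive
  • Forecast negative, actual positive -> incorrect negative

To check this I cobbled together the code below, whose results I could collect into a data frame, but I would like to know whether there is a more elegant way to do the check: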

    data1 %>%
      filter(forecastHPIChange > 0 & HPIChange > 0) %>%
      summarise(correct = n())

    data1 %>%
      filter(forecastHPIChange < 0 & HPIChange < 0) %>%
      summarise(correct = n())

    data1 %>%
      filter(forecastHPIChange < 0 & HPIChange > 0) %>%
      summarise(wrong = n())

    data1 %>%
      filter(forecastHPIChange > 0 & HPIChange < 0) %>%
      summarise(wrong = n())
    
Try confusionMatrix from the caret package:

    library(caret)
    
    make_factor <- function(x) factor(sign(x), levels = c(-1, 1))
    signs <- as.data.frame(lapply(data1, make_factor))
    with(signs, confusionMatrix(forecastHPIChange, reference = HPIChange))
    
which gives:

    Confusion Matrix and Statistics
    
              Reference
    Prediction -1 1
            -1  0 0
            1   2 2
    
                   Accuracy : 0.5             
                     95% CI : (0.0676, 0.9324)
        No Information Rate : 0.5             
        P-Value [Acc > NIR] : 0.6875          
    
                      Kappa : 0               
     Mcnemar's Test P-Value : 0.4795          
    
                Sensitivity : 0.0             
                Specificity : 1.0             
             Pos Pred Value : NaN             
             Neg Pred Value : 0.5             
                 Prevalence : 0.5             
             Detection Rate : 0.0             
       Detection Prevalence : 0.0             
          Balanced Accuracy : 0.5        
    
For the input shown, not all factor levels are present, but if the actual input does contain all factor levels then we could drop make_factor and just use sign instead.
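To see why the explicit levels matter for the sample input, here is a minimal sketch in base R. The vector name forecast and the two result names are illustrative only and are not part of the answer above.

```r
# Forecast changes from the sample data: every value is positive
forecast <- c(0.01547368, 0.01485668, 0.01251108, 0.01718729, 0.02143551)

# Without explicit levels, only the sign that actually occurs becomes a level
lv_plain <- levels(factor(sign(forecast)))

# With explicit levels, -1 is kept even though it never occurs
lv_fixed <- levels(factor(sign(forecast), levels = c(-1, 1)))

print(lv_plain)
print(lv_fixed)
```

One side effect of fixing the levels to -1 and 1 is that a change of exactly zero has sign 0, which is not among the levels and therefore becomes NA.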

Note: the input data in reproducible form is:

    data1 <- structure(list(HPIChange = c(NA, -0.0026155187, 0.0002906977, 
    -0.0077877127, 0.0200058841), forecastHPIChange = c(0.01547368, 
    0.01485668, 0.01251108, 0.01718729, 0.02143551)), .Names = c("HPIChange", 
    "forecastHPIChange"), class = "data.frame", row.names = c("1", 
    "2", "3", "4", "5"))
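If only the four counts and the share of correctly signed forecasts are needed, they can also be tallied without caret. The following is a rough sketch in base R using the sample data; the helper to_sign and the names counts and hit_rate are made up for illustration.

```r
data1 <- data.frame(
  HPIChange = c(NA, -0.0026155187, 0.0002906977, -0.0077877127, 0.0200058841),
  forecastHPIChange = c(0.01547368, 0.01485668, 0.01251108, 0.01718729, 0.02143551)
)

# Fix the levels so that all four cells appear even if a sign never occurs
to_sign <- function(x) factor(sign(x), levels = c(-1, 1))

# table() drops rows with NA in either column by default
counts <- table(forecast = to_sign(data1$forecastHPIChange),
                actual   = to_sign(data1$HPIChange))

# Share of complete rows where the forecast sign matches the actual sign
hit_rate <- sum(diag(counts)) / sum(counts)

print(counts)
print(hit_rate)
```

The diagonal of counts holds the two "correct" cases and the off-diagonal cells hold the two "incorrect" ones, so the hit rate here should agree with the Accuracy line in the confusionMatrix output.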
    

Starting with the following data1 (the sample data is changed slightly so that all classes TP, FP, TN, FN have data points):

    # last two forecast values are negative, matching the printed data below
    data1 <- structure(list(HPIChange = c(NA, -0.0026155187, 0.0002906977, 
    -0.0077877127, 0.0200058841), forecastHPIChange = c(0.01547368, 
    0.01485668, 0.01251108, -0.01718729, -0.02143551)), .Names = c("HPIChange", 
    "forecastHPIChange"), class = "data.frame", row.names = c("1", 
    "2", "3", "4", "5"))
    
     data1
          HPIChange forecastHPIChange
    1            NA        0.01547368
    2 -0.0026155187        0.01485668
    3  0.0002906977        0.01251108
    4 -0.0077877127       -0.01718729
    5  0.0200058841       -0.02143551
    
    # transform the data1 to dataset data2 where we have only + and - labels (represented by +1 and -1)
    data2 <- as.data.frame(sapply(data1, function(x) ifelse(x > 0, 1, -1)))
    
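    # cross-tabulate actual vs. forecast sign; pairs are (HPIChange, forecastHPIChange)
    table(data2)

    #          forecastHPIChange
    # HPIChange -1 1
    #        -1  1 1   # (-1, -1) = TN   (-1,  1) = FP
    #         1  1 1   # ( 1, -1) = FN   ( 1,  1) = TP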
    
    # using the package caret
    library(caret)
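    # confusionMatrix() expects factors with identical levels
    lv <- c(-1, 1)
    confusionMatrix(factor(data2$forecastHPIChange, levels = lv),
                    factor(data2$HPIChange, levels = lv))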
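To label each row with its class rather than only counting, a sketch along the following lines may help. It uses base R and the modified sample data, and, like the ifelse call above, it assumes that a change of exactly zero counts as negative. The names f, a, outcome and class_counts are illustrative.

```r
data1 <- data.frame(
  HPIChange = c(NA, -0.0026155187, 0.0002906977, -0.0077877127, 0.0200058841),
  forecastHPIChange = c(0.01547368, 0.01485668, 0.01251108, -0.01718729, -0.02143551)
)

f <- data1$forecastHPIChange > 0   # forecast is positive
a <- data1$HPIChange > 0           # actual is positive (NA where missing)

# Nested ifelse assigns one of the four classes; rows with a missing actual stay NA
outcome <- ifelse(f & a, "TP",
           ifelse(!f & !a, "TN",
           ifelse(f & !a, "FP", "FN")))

# Fixed levels keep classes with zero rows in the tally
class_counts <- table(factor(outcome, levels = c("TP", "TN", "FP", "FN")))

print(outcome)
print(class_counts)
```

The outcome vector can be attached to data1 as an extra column if the per-row classification is needed later.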