Linear regression loop in R

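I need the beta coefficients and the residual variance for multiple stocks. My question is: how can I create a loop of linear regressions and extract these values into an output?

This is my data. MR is my independent variable and the remaining columns are the dependent variables; I have to run a separate linear regression for each of them.

Thanks, everyone.

//Edit: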

> dput(head(Beta_market_model_test))
structure(list(...1 = structure(c(1422748800, 1425168000, 1427846400, 
1430438400, 1433116800, 1435708800), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), R1 = c(-0.0225553678146582, 0.084773882172773, -0.00628335525823254, 
0.189767902403849, -0.129765571642446, -0.02268699227135), R2 = c(-0.000634819869861802, 
0.0566396021070485, 0.0504313735522286, -0.0275926732076482, 
0.0473125483284236, -0.0501700832780339), R3 = c(-0.0607564272876455, 
0.0915928283206455, -0.116429377153136, 0.0338313435925748, -0.0731748018356279, 
-0.082292041771696), R4 = c(0.036716647443291, 0.0409790469126645, 
-0.0594941218382615, 0.0477272727272728, 0.0115690527838033, 
-0.0187634024303074), R5 = c(0.00286365940192601, 0.0128875748616479, 
0.000174637626924046, 0.0238214018458469, 0.0120599342185406, 
-0.0627587867116033), R6 = c(-0.0944601447872712, 0.090838356632893, 
-0.0577132600192821, 0.136928528648433, -0.0137770071043408, 
0.0214549609033041), MR = c(-0.0388483879770769, 0.0858362570727453, 
-0.0178553084990147, 0.0567646974926548, -0.0391124787432181, 
-0.014626289866472)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

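We can use cbind to specify several response variables in a single lm call (df1 here stands for the Beta_market_model_test data):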

model <- lm(cbind(R1, R2, R3, R4, R5, R6) ~ MR, data = df1)
s1 <- summary(model)

Checking the summary:

summary(model)
Response R1 :

Call:
lm(formula = R1 ~ MR, data = Beta_market_model_test)

Residuals:
       1        2        3        4        5        6 
 0.03757 -0.06851  0.01791  0.08624 -0.06919 -0.00402 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) 0.006368   0.028060   0.227   0.8316  
MR          1.711625   0.577571   2.963   0.0414 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.06831 on 4 degrees of freedom
Multiple R-squared:  0.6871,    Adjusted R-squared:  0.6088 
F-statistic: 8.782 on 1 and 4 DF,  p-value: 0.04141


Response R2 :

Call:
lm(formula = R2 ~ MR, data = Beta_market_model_test)

Residuals:
       1        2        3        4        5        6 
-0.01047  0.03882  0.03925 -0.04355  0.03750 -0.06155 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.01232    0.02079   0.593    0.585
MR           0.06402    0.42797   0.150    0.888

Residual standard error: 0.05062 on 4 degrees of freedom
Multiple R-squared:  0.005564,  Adjusted R-squared:  -0.243 
F-statistic: 0.02238 on 1 and 4 DF,  p-value: 0.8883


Response R3 :

Call:
lm(formula = R3 ~ MR, data = Beta_market_model_test)

Residuals:
        1         2         3         4         5         6 
 0.035081  0.014541 -0.049701 -0.002909  0.023029 -0.020041 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) -0.04197    0.01431  -2.934  0.04266 * 
MR           1.38661    0.29449   4.709  0.00925 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.03483 on 4 degrees of freedom
Multiple R-squared:  0.8472,    Adjusted R-squared:  0.8089 
F-statistic: 22.17 on 1 and 4 DF,  p-value: 0.009249


Response R4 :

Call:
lm(formula = R4 ~ MR, data = Beta_market_model_test)

Residuals:
         1          2          3          4          5          6 
 0.0438966  0.0002996 -0.0603723  0.0182067  0.0188503 -0.0208810 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.007732   0.016804    0.46    0.669
MR          0.383843   0.345886    1.11    0.329

Residual standard error: 0.04091 on 4 degrees of freedom
Multiple R-squared:  0.2354,    Adjusted R-squared:  0.04425 
F-statistic: 1.232 on 1 and 4 DF,  p-value: 0.3293


Response R5 :

Call:
lm(formula = R5 ~ MR, data = Beta_market_model_test)

Residuals:
        1         2         3         4         5         6 
 0.013692 -0.001676  0.006728  0.015178  0.022942 -0.056863 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.002917   0.013351  -0.218    0.838
MR           0.203653   0.274801   0.741    0.500

Residual standard error: 0.0325 on 4 degrees of freedom
Multiple R-squared:  0.1207,    Adjusted R-squared:  -0.09909 
F-statistic: 0.5492 on 1 and 4 DF,  p-value: 0.4998


Response R6 :

Call:
lm(formula = R6 ~ MR, data = Beta_market_model_test)

Residuals:
       1        2        3        4        5        6 
-0.04498 -0.03837 -0.03832  0.04938  0.03608  0.03622 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) 0.006197   0.020555   0.302   0.7781  
MR          1.433135   0.423083   3.387   0.0276 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.05004 on 4 degrees of freedom
Multiple R-squared:  0.7415,    Adjusted R-squared:  0.6769 
F-statistic: 11.47 on 1 and 4 DF,  p-value: 0.0276

We can get this in a tidy format with tidy from broom:

library(purrr)
library(broom)
map_dfr(summary(model), tidy, .id = 'dep_var')
# A tibble: 12 x 6
#   dep_var   term        estimate std.error statistic p.value
#   <chr>       <chr>          <dbl>     <dbl>     <dbl>   <dbl>
# 1 Response R1 (Intercept)  0.00637    0.0281     0.227 0.832  
# 2 Response R1 MR           1.71       0.578      2.96  0.0414 
# 3 Response R2 (Intercept)  0.0123     0.0208     0.593 0.585  
# 4 Response R2 MR           0.0640     0.428      0.150 0.888  
# 5 Response R3 (Intercept) -0.0420     0.0143    -2.93  0.0427 
# 6 Response R3 MR           1.39       0.294      4.71  0.00925
# 7 Response R4 (Intercept)  0.00773    0.0168     0.460 0.669  
# 8 Response R4 MR           0.384      0.346      1.11  0.329  
# 9 Response R5 (Intercept) -0.00292    0.0134    -0.218 0.838  
#10 Response R5 MR           0.204      0.275      0.741 0.500  
#11 Response R6 (Intercept)  0.00620    0.0206     0.302 0.778  
#12 Response R6 MR           1.43       0.423      3.39  0.0276
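
Since the question asks specifically for the betas and the residual variances, here is a minimal base R sketch that pulls both out of the multi-response fit. It assumes the six rows from the dput above, with the date column left out; the names beta, resid_var and result are only illustrative. The residual variance is taken as the residual sum of squares divided by the residual degrees of freedom, i.e. the square of the "Residual standard error" printed by summary.

```r
# Six monthly returns copied from the dput() output above (date column omitted)
df1 <- data.frame(
  R1 = c(-0.0225553678146582, 0.084773882172773, -0.00628335525823254,
         0.189767902403849, -0.129765571642446, -0.02268699227135),
  R2 = c(-0.000634819869861802, 0.0566396021070485, 0.0504313735522286,
         -0.0275926732076482, 0.0473125483284236, -0.0501700832780339),
  R3 = c(-0.0607564272876455, 0.0915928283206455, -0.116429377153136,
         0.0338313435925748, -0.0731748018356279, -0.082292041771696),
  R4 = c(0.036716647443291, 0.0409790469126645, -0.0594941218382615,
         0.0477272727272728, 0.0115690527838033, -0.0187634024303074),
  R5 = c(0.00286365940192601, 0.0128875748616479, 0.000174637626924046,
         0.0238214018458469, 0.0120599342185406, -0.0627587867116033),
  R6 = c(-0.0944601447872712, 0.090838356632893, -0.0577132600192821,
         0.136928528648433, -0.0137770071043408, 0.0214549609033041),
  MR = c(-0.0388483879770769, 0.0858362570727453, -0.0178553084990147,
         0.0567646974926548, -0.0391124787432181, -0.014626289866472)
)

# One call fits R1..R6 on MR; the result is a multi-response ("mlm") object
model <- lm(cbind(R1, R2, R3, R4, R5, R6) ~ MR, data = df1)

# coef() is a 2 x 6 matrix: rows "(Intercept)" and "MR", one column per stock
beta <- coef(model)["MR", ]

# Residual variance = residual sum of squares / residual degrees of freedom
resid_var <- colSums(residuals(model)^2) / df.residual(model)

result <- data.frame(stock = names(beta), beta = beta,
                     resid_var = resid_var, row.names = NULL)
print(result)
```

The beta column should agree with the MR estimates in the tidy table, and sqrt(resid_var) with the residual standard errors in the summary output.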

I am only posting this to ask a question about my code:

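library(dplyr)
library(tidyr)
library(purrr)   # provides map()
library(broom)

df %>%
  select(-...1) %>%                      # drop the date column
  pivot_longer(R1:R6) %>%                # long format: one row per stock and month
  group_by(name) %>%
  nest(data = c(MR, value)) %>%
  # Note: MR ~ value regresses the market return on each stock return, which is
  # how the output below was produced; for market-model betas use value ~ MR
  # (r.squared and p.value are unchanged, sigma and the fit statistics differ).
  mutate(model = map(data, ~ lm(MR ~ value, data = .)), 
         glance = map(model, ~ glance(.x))) %>%
  unnest(glance) %>% 
  select(- c(data, model))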

# A tibble: 6 x 13
# Groups:   name [6]
  name  r.squared adj.r.squared  sigma statistic p.value    df logLik   AIC   BIC deviance
  <chr>     <dbl>         <dbl>  <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>
1 R1      0.687          0.609  0.0331    8.78   0.0414      1  13.2  -20.3 -20.9  0.00438
2 R2      0.00556       -0.243  0.0590    0.0224 0.888       1   9.69 -13.4 -14.0  0.0139 
3 R3      0.847          0.809  0.0231   22.2    0.00925     1  15.3  -24.6 -25.2  0.00214
4 R4      0.235          0.0443 0.0517    1.23   0.329       1  10.5  -15.0 -15.6  0.0107 
5 R5      0.121         -0.0991 0.0555    0.549  0.500       1  10.1  -14.1 -14.7  0.0123 
6 R6      0.742          0.677  0.0301   11.5    0.0276      1  13.7  -21.5 -22.1  0.00362
# ... with 2 more variables: df.residual <int>, nobs <int>
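
For the market model itself the stock return goes on the left-hand side. A hedged sketch of the same pivot-and-nest idea with the formula written as ret ~ MR, pulling out only the beta and the residual variance, could look like this. It assumes the six rows from the dput (date column left out); the column names stock, ret, beta and resid_var are illustrative, not part of the original code.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Six monthly returns copied from the dput() output above (date column omitted)
df <- data.frame(
  R1 = c(-0.0225553678146582, 0.084773882172773, -0.00628335525823254,
         0.189767902403849, -0.129765571642446, -0.02268699227135),
  R2 = c(-0.000634819869861802, 0.0566396021070485, 0.0504313735522286,
         -0.0275926732076482, 0.0473125483284236, -0.0501700832780339),
  R3 = c(-0.0607564272876455, 0.0915928283206455, -0.116429377153136,
         0.0338313435925748, -0.0731748018356279, -0.082292041771696),
  R4 = c(0.036716647443291, 0.0409790469126645, -0.0594941218382615,
         0.0477272727272728, 0.0115690527838033, -0.0187634024303074),
  R5 = c(0.00286365940192601, 0.0128875748616479, 0.000174637626924046,
         0.0238214018458469, 0.0120599342185406, -0.0627587867116033),
  R6 = c(-0.0944601447872712, 0.090838356632893, -0.0577132600192821,
         0.136928528648433, -0.0137770071043408, 0.0214549609033041),
  MR = c(-0.0388483879770769, 0.0858362570727453, -0.0178553084990147,
         0.0567646974926548, -0.0391124787432181, -0.014626289866472)
)

fits <- df %>%
  # long format: one row per stock and month, MR repeated for every stock
  pivot_longer(R1:R6, names_to = "stock", values_to = "ret") %>%
  # one nested data frame (MR, ret) per stock
  nest(data = c(MR, ret)) %>%
  mutate(
    model     = map(data, ~ lm(ret ~ MR, data = .x)),   # stock return on market return
    beta      = map_dbl(model, ~ coef(.x)[["MR"]]),     # slope = beta
    resid_var = map_dbl(model, ~ sigma(.x)^2)           # squared residual standard error
  ) %>%
  select(stock, beta, resid_var)

print(fits)
```

With this direction the betas should match the MR estimates from the cbind answer.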

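Update

Thanks to my dear friend @akrun, who always gives me valuable advice.

If you would rather avoid pivoting, since it can increase the number of rows beyond the limit, you can also use the following code: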

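library(dplyr)
library(tidyr)
library(purrr)   # provides map_dfr()
library(broom)

df %>% 
  select(-1) %>%                          # drop the date column
  # fit <stock> ~ MR for every column except MR and keep each model summary
  summarise(across(-MR, ~ list(lm(reformulate('MR', response = cur_column()), 
                                   data = df) %>% 
                                  summary))) %>% 
  unclass %>% 
  map_dfr(~ tidy(.x[[1]]))                # rows come in pairs, in column order R1..R6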

# A tibble: 12 x 5
   term        estimate std.error statistic p.value
   <chr>          <dbl>     <dbl>     <dbl>   <dbl>
 1 (Intercept)  0.00637    0.0281     0.227 0.832  
 2 MR           1.71       0.578      2.96  0.0414 
 3 (Intercept)  0.0123     0.0208     0.593 0.585  
 4 MR           0.0640     0.428      0.150 0.888  
 5 (Intercept) -0.0420     0.0143    -2.93  0.0427 
 6 MR           1.39       0.294      4.71  0.00925
 7 (Intercept)  0.00773    0.0168     0.460 0.669  
 8 MR           0.384      0.346      1.11  0.329  
 9 (Intercept) -0.00292    0.0134    -0.218 0.838  
10 MR           0.204      0.275      0.741 0.500  
11 (Intercept)  0.00620    0.0206     0.302 0.778  
12 MR           1.43       0.423      3.39  0.0276
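
The reformulate idea also works in a plain loop with no extra packages, which is probably the closest thing to the "loop" asked for in the question. This is only a sketch under the same assumptions as before: the six rows from the dput without the date column, with stocks and out as illustrative names.

```r
# Six monthly returns copied from the dput() output above (date column omitted)
df <- data.frame(
  R1 = c(-0.0225553678146582, 0.084773882172773, -0.00628335525823254,
         0.189767902403849, -0.129765571642446, -0.02268699227135),
  R2 = c(-0.000634819869861802, 0.0566396021070485, 0.0504313735522286,
         -0.0275926732076482, 0.0473125483284236, -0.0501700832780339),
  R3 = c(-0.0607564272876455, 0.0915928283206455, -0.116429377153136,
         0.0338313435925748, -0.0731748018356279, -0.082292041771696),
  R4 = c(0.036716647443291, 0.0409790469126645, -0.0594941218382615,
         0.0477272727272728, 0.0115690527838033, -0.0187634024303074),
  R5 = c(0.00286365940192601, 0.0128875748616479, 0.000174637626924046,
         0.0238214018458469, 0.0120599342185406, -0.0627587867116033),
  R6 = c(-0.0944601447872712, 0.090838356632893, -0.0577132600192821,
         0.136928528648433, -0.0137770071043408, 0.0214549609033041),
  MR = c(-0.0388483879770769, 0.0858362570727453, -0.0178553084990147,
         0.0567646974926548, -0.0391124787432181, -0.014626289866472)
)

# Every column except MR is treated as a stock return series
stocks <- setdiff(names(df), "MR")

# Pre-allocate one output row per stock
out <- data.frame(stock = stocks, alpha = NA_real_,
                  beta = NA_real_, resid_var = NA_real_)

for (i in seq_along(stocks)) {
  # reformulate() builds the formula <stock> ~ MR from strings
  fit <- lm(reformulate("MR", response = stocks[i]), data = df)
  out$alpha[i]     <- coef(fit)[["(Intercept)"]]
  out$beta[i]      <- coef(fit)[["MR"]]
  out$resid_var[i] <- sigma(fit)^2   # squared residual standard error
}

print(out)
```

Unlike the map_dfr output above, each row here is labelled with its stock, so the result can be joined back to other per-stock data.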
