How do I calculate yearly averages for unbalanced panel data in R?


I have quarterly unbalanced panel data that looks like this:

Firm    Date    Var_1               
AAA 19701130    24.46               
AAA 19701231    NA              
AAA 19710131    NA              
AAA 19710228    34.19325                
AAA 19710331    NA              
AAA 19710430    NA              
AAA 19710531    29.0235             
AAA 19710630    NA              
AAA 19710731    NA              
AAA 19710831    16.256875               
AAA 19710930    NA              
AAA 19711031    NA              
AAA 19711130    17.22125                
AAA 19711231    NA              
BBB 19730630    4.57                
BBB 19730731    NA              
BBB 19730831    NA              
BBB 19730930    8.736               
BBB 19731031    NA              
BBB 19731130    NA              
BBB 19731231    4.97                
BBB 19740131    NA              
BBB 19740228    NA              
BBB 19740331    6.85125             
BBB 19740430    NA              
BBB 19740531    NA              
BBB 19740630    6.87225             
BBB 19740731    NA              
BBB 19740831    NA              
BBB 19740930    5.454875                
BBB 19741031    NA              
BBB 19741130    NA              
BBB 19741231    4.56875             
BBB 19750131    NA              
BBB 19750228    NA              
BBB 19750331    6.276               
BBB 19750430    NA              
BBB 19750531    NA              
BBB 19750630    6.0145              
BBB 19750731    NA              
BBB 19750831    NA              
BBB 19750930    8.376               
BBB 19751031    NA              
BBB 19751130    NA              
BBB 19751231    9.17875             
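The actual data runs to tens of thousands of rows. The key point is that each firm reports at different month-ends. How can I calculate the mean of Var_1 for each firm in each year? The final result should be by year rather than by quarter. The ideal result looks like this: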

Firm    Date    Var_1   
AAA     1970    24.46   
AAA     1971    24.17   
BBB     1973    6.09    
BBB     1974    5.94    
BBB     1975    7.46    

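We can use one of the group-by functions. Group by 'Firm' and by the first four characters of 'Date' (the year), then take the mean of 'Var_1', ignoring the NA values.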

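library(dplyr)
df1 %>% 
    # the first four characters of Date (yyyymmdd) are the year
    group_by(Firm, Date = substr(Date, 1, 4)) %>% 
    # na.rm = TRUE skips the months in which nothing was reported
    summarise(Var_1 = round(mean(Var_1, na.rm = TRUE), 2))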
#   Firm  Date Var_1
#  <chr> <chr> <dbl>
#1   AAA  1970 24.46
#2   AAA  1971 24.17
#3   BBB  1973  6.09
#4   BBB  1974  5.94
#5   BBB  1975  7.46
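
For readers who prefer to avoid the dplyr dependency, the same grouping idea can be sketched in base R. This is only an illustration, not part of the original answer: it assumes Date is stored as a number in yyyymmdd form, it builds df1 from a subset of the rows shown in the question, and the Year column and the yearly object are names introduced here for the example.

```r
# Subset of the question's sample data (Date assumed numeric, yyyymmdd)
df1 <- data.frame(
  Firm  = c("AAA", "AAA", "AAA", "AAA", "AAA", "AAA",
            "BBB", "BBB", "BBB", "BBB"),
  Date  = c(19701130, 19701231, 19710228, 19710531, 19710831, 19711130,
            19730630, 19730731, 19730930, 19731231),
  Var_1 = c(24.46, NA, 34.19325, 29.0235, 16.256875, 17.22125,
            4.57, NA, 8.736, 4.97),
  stringsAsFactors = FALSE
)

# Integer division by 10000 drops the month and day, leaving the year
df1$Year <- df1$Date %/% 10000

# With the formula interface, aggregate() omits rows whose Var_1 is NA
# (na.action = na.omit is the default), so no na.rm argument is needed
yearly <- aggregate(Var_1 ~ Firm + Year, data = df1, FUN = mean)
yearly$Var_1 <- round(yearly$Var_1, 2)
yearly
```

On this subset the result should have one row per firm and year (AAA 1970, AAA 1971, BBB 1973), with values matching the corresponding rows of the dplyr output above. Unlike substr(), the integer division keeps the year numeric, which may be convenient if it is later used for sorting or merging.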
