How to calculate yearly averages for unbalanced panel data in R?
I have quarterly unbalanced panel data that looks like this:
Firm Date Var_1
AAA 19701130 24.46
AAA 19701231 NA
AAA 19710131 NA
AAA 19710228 34.19325
AAA 19710331 NA
AAA 19710430 NA
AAA 19710531 29.0235
AAA 19710630 NA
AAA 19710731 NA
AAA 19710831 16.256875
AAA 19710930 NA
AAA 19711031 NA
AAA 19711130 17.22125
AAA 19711231 NA
BBB 19730630 4.57
BBB 19730731 NA
BBB 19730831 NA
BBB 19730930 8.736
BBB 19731031 NA
BBB 19731130 NA
BBB 19731231 4.97
BBB 19740131 NA
BBB 19740228 NA
BBB 19740331 6.85125
BBB 19740430 NA
BBB 19740531 NA
BBB 19740630 6.87225
BBB 19740731 NA
BBB 19740831 NA
BBB 19740930 5.454875
BBB 19741031 NA
BBB 19741130 NA
BBB 19741231 4.56875
BBB 19750131 NA
BBB 19750228 NA
BBB 19750331 6.276
BBB 19750430 NA
BBB 19750531 NA
BBB 19750630 6.0145
BBB 19750731 NA
BBB 19750831 NA
BBB 19750930 8.376
BBB 19751031 NA
BBB 19751130 NA
BBB 19751231 9.17875
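The actual data runs to tens of thousands of rows. The key point is that each firm reports at a different month end. How can I calculate the mean of Var_1 for each firm in each year? The final result should be yearly rather than quarterly. The ideal result looks like this: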
Firm Date Var_1
AAA 1970 24.46
AAA 1971 24.17
BBB 1973 6.09
BBB 1974 5.94
BBB 1975 7.46
We can use one of the group-by functions. Group by "Firm" and by the year substring of "Date", then take the mean of "Var_1", ignoring the NA values:
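library(dplyr)

# Group by firm and by year (the first four characters of the yyyymmdd date),
# then average Var_1 while skipping the NA rows
df1 %>%
  group_by(Firm, Date = substr(Date, 1, 4)) %>%
  summarise(Var_1 = round(mean(Var_1, na.rm = TRUE), 2))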
# Firm Date Var_1
# <chr> <chr> <dbl>
#1 AAA 1970 24.46
#2 AAA 1971 24.17
#3 BBB 1973 6.09
#4 BBB 1974 5.94
#5 BBB 1975 7.46
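For comparison, the same grouping can be sketched in base R without dplyr. This is only an illustration under a few assumptions: `df1` is rebuilt here from a handful of rows of the question's data, `Date` is assumed to be stored as a number in yyyymmdd form, and the `Year` column and the `res` object are names introduced for this example.

```r
# A few rows from the question's data (Date assumed numeric, yyyymmdd)
df1 <- data.frame(
  Firm  = c("AAA", "AAA", "AAA", "AAA", "AAA", "AAA",
            "BBB", "BBB", "BBB", "BBB"),
  Date  = c(19701130, 19701231, 19710228, 19710531, 19710831, 19711130,
            19730630, 19730731, 19730930, 19731231),
  Var_1 = c(24.46, NA, 34.19325, 29.0235, 16.256875, 17.22125,
            4.57, NA, 8.736, 4.97)
)

# Integer division by 10000 keeps only the year part of yyyymmdd
df1$Year <- df1$Date %/% 10000

# The formula interface of aggregate() drops rows with NA in Var_1
# (default na.action = na.omit) before averaging within each Firm/Year group
res <- aggregate(Var_1 ~ Firm + Year, data = df1, FUN = mean)
res$Var_1 <- round(res$Var_1, 2)
print(res)
```

On this sample the rounded means for AAA 1970, AAA 1971, and BBB 1973 agree with the values in the desired output. One difference from the dplyr version: a firm-year in which every Var_1 is NA disappears from the `aggregate()` result, whereas `mean(Var_1, na.rm = TRUE)` inside `summarise()` keeps that group with a value of NaN.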