R: How to aggregate data and run a custom function to calculate confidence intervals
I have a data frame with three variables: Year, Location, and Concentration. I want to aggregate the data by year and location and calculate a confidence interval for the concentration.
Year <- rep(c(2010, 2011, 2012, 2013), each=15)
Location <- rep(c("Texas", "Colorado", "Washington"), times = 4, each = 5)
Concentration <- runif(60, 0, 100)
conc_data <- cbind.data.frame(Year, Location, Concentration)
head(conc_data)
Year Location Concentration
1 2010 Texas 22.54480
2 2010 Texas 70.38605
3 2010 Texas 79.53292
4 2010 Texas 95.62562
5 2010 Texas 38.81795
6 2010 Colorado 68.69821
If we supply an anonymous function (function(x)), then x refers to the Concentration values of each Location/Year group:
aggregate(cbind(lwr = Concentration) ~ Location + Year, data = conc_data,
function(x) confidence_interval_lwr(x, 0.95))
# Location Year lwr
#1 Colorado 2010 13.1289089
#2 Texas 2010 14.3379460
#3 Washington 2010 30.4922382
#4 Colorado 2011 18.9369171
#5 Texas 2011 0.6261571
#6 Washington 2011 12.2817138
#7 Colorado 2012 3.7365737
#8 Texas 2012 11.1165898
#9 Washington 2012 32.9729329
#10 Colorado 2013 23.9445299
#11 Texas 2013 3.0298597
#12 Washington 2013 9.0199863
Note: because no seed was set when the Concentration column was created with runif, your values will differ.
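One way to make the numbers reproducible is to call set.seed() before runif(). The following is a minimal sketch; the seed value 42 is an arbitrary choice for illustration and does not come from the original post, so it will not reproduce the output shown above.

```r
# Fix the random number generator state so runif() yields the same draws on every run.
# The seed value 42 is arbitrary (an assumption made for this example).
set.seed(42)

Year <- rep(c(2010, 2011, 2012, 2013), each = 15)
Location <- rep(c("Texas", "Colorado", "Washington"), times = 4, each = 5)
Concentration <- runif(60, 0, 100)

conc_data <- data.frame(Year, Location, Concentration)
```

Re-running the whole block, including the set.seed() call, should then give the same Concentration values each time.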
confidence_interval_lwr <- function(vector, interval) {
  # Standard deviation of sample
  vec_sd <- sd(vector)
  # Sample size
  n <- length(vector)
  # Mean of sample
  vec_mean <- mean(vector)
  # Margin of error according to the t distribution
  error <- qt((interval + 1)/2, df = n - 1) * vec_sd / sqrt(n)
  # Lower bound of the confidence interval, as a named vector
  lwr <- c("lower" = vec_mean - error)
  return(lwr)
}
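If both bounds are wanted, the same approach can be extended. The sketch below is an illustration rather than part of the original answer: confidence_interval is a hypothetical variant of confidence_interval_lwr that returns the lower and upper bound together, and the seed value is arbitrary. When the function passed to aggregate() returns a vector of length two, the result holds a matrix column, which do.call(data.frame, ...) flattens into ordinary columns.

```r
# Hypothetical variant of confidence_interval_lwr that returns both bounds.
confidence_interval <- function(vector, interval) {
  n <- length(vector)
  # Margin of error according to the t distribution
  error <- qt((interval + 1) / 2, df = n - 1) * sd(vector) / sqrt(n)
  c(lwr = mean(vector) - error, upr = mean(vector) + error)
}

# Example data built the same way as in the question (the seed is arbitrary).
set.seed(1)
conc_data <- data.frame(
  Year = rep(c(2010, 2011, 2012, 2013), each = 15),
  Location = rep(c("Texas", "Colorado", "Washington"), times = 4, each = 5),
  Concentration = runif(60, 0, 100)
)

res <- aggregate(Concentration ~ Location + Year, data = conc_data,
                 FUN = function(x) confidence_interval(x, 0.95))

# aggregate() stores the two bounds as one matrix column;
# do.call(data.frame, ...) splits it into Concentration.lwr and Concentration.upr.
res <- do.call(data.frame, res)
head(res)
```

For a single group, the bounds should agree with t.test(x)$conf.int, which is a convenient sanity check.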
Year Location lwr
1 2010 Texas 8.2
2 2010 Colorado 5.9
3 2010 Washington 15.0
4 2011 Texas 10.0
5 2011 Colorado 2.0
6 2011 Washington 18.0