Summarising data with two functions in dplyr
Consider this example data frame:
d <- read.table(text="
trt rep y
1 1 30
1 1 50
1 1 70
1 2 0
1 2 0
1 2 0
2 1 10
2 1 0
2 1 0
2 2 5
2 2 0
2 2 .
"
, header = TRUE, check.names = FALSE, na.strings = ".")
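I want two summaries of y for each rep within trt. The first, by_rep1, holds the group mean (mean_y). The second is the proportion of positive values: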
by_rep2 = d %>%
group_by(trt, rep) %>%
summarise_each(funs(round(mean(.>0, na.rm=TRUE),2)), y)
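The join below uses `by_rep1`, but the code that builds it does not appear in the post. A minimal sketch, assuming it is just the per-group mean of `y` stored as `mean_y` (the column name seen in the joined output). The data frame is rebuilt inline so the snippet runs on its own:

```r
library(dplyr)

# Same values as the example data; the "." in the original is read as NA
d <- data.frame(
  trt = rep(1:2, each = 6),
  rep = rep(rep(1:2, each = 3), times = 2),
  y   = c(30, 50, 70, 0, 0, 0, 10, 0, 0, 5, 0, NA)
)

# Assumed definition of by_rep1: mean of y for each rep within trt
by_rep1 <- d %>%
  group_by(trt, rep) %>%
  summarise(mean_y = mean(y, na.rm = TRUE))
```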
I did it the long way because I don't know how to do it in one step:
inner_join(by_rep1, by_rep2, by = c("trt", "rep"))
# trt rep mean_y y
# (int) (int) (dbl) (dbl)
#1 1 1 50.000000 1.00
#2 1 2 0.000000 0.00
#3 2 1 3.333333 0.33
#4 2 2 2.500000 0.50
Does anyone know how to apply both functions in a single step?

You can put both of them in a single summarise statement:
d %>%
  group_by(trt, rep) %>%
  summarise(mean_y = mean(y, na.rm = TRUE),
            y = round(mean(y > 0, na.rm = TRUE), 2))
Source: local data frame [4 x 4]
Groups: trt [?]
trt rep mean_y y
(int) (int) (dbl) (dbl)
1 1 1 50.000000 1.00
2 1 2 0.000000 0.00
3 2 1 3.333333 0.33
4 2 2 2.500000 0.50
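The second column works because a comparison such as `y > 0` returns a logical vector, and `mean()` counts TRUE as 1 and FALSE as 0, so the mean is the share of TRUE values. A small sketch on the trt 2, rep 2 group, where one value is missing (the variable names are only illustrative):

```r
y <- c(5, 0, NA)           # values of y for trt 2, rep 2

is_positive <- y > 0       # TRUE FALSE NA

# na.rm = TRUE drops the NA, leaving 1 TRUE out of 2 values
prop_positive <- mean(is_positive, na.rm = TRUE)
prop_positive
# [1] 0.5
```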
We can also use data.table:
library(data.table)
setDT(d)[, .(mean_y = mean(y, na.rm = TRUE), y = round(mean(y > 0,
na.rm = TRUE), 2)) , .(trt, rep)]
# trt rep mean_y y
#1: 1 1 50.000000 1.00
#2: 1 2 0.000000 0.00
#3: 2 1 3.333333 0.33
#4: 2 2 2.500000 0.50
It can also be done using only base R:
do.call(data.frame, aggregate(y~., d, FUN = function(x)
c(mean_y=mean(x, na.rm=TRUE), y=round(mean(x > 0, na.rm=TRUE),2)), na.action=NULL))
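The `do.call(data.frame, ...)` wrapper is there because, when `FUN` returns more than one value, `aggregate()` stores the results in a single matrix column. A sketch of the difference, with the data rebuilt inline so it runs on its own (the helper name `two_stats` is made up for this example):

```r
d <- data.frame(
  trt = rep(1:2, each = 6),
  rep = rep(rep(1:2, each = 3), times = 2),
  y   = c(30, 50, 70, 0, 0, 0, 10, 0, 0, 5, 0, NA)
)

# Hypothetical helper returning both summaries for one group
two_stats <- function(x) {
  c(mean_y = mean(x, na.rm = TRUE),
    y      = round(mean(x > 0, na.rm = TRUE), 2))
}

# na.action = NULL keeps the row with the missing y; na.rm handles it inside FUN
res <- aggregate(y ~ ., d, FUN = two_stats, na.action = NULL)
ncol(res)          # 3 columns: trt, rep, and y
is.matrix(res$y)   # the y column is a two-column matrix

# Expanding the matrix column gives one ordinary column per statistic
flat <- do.call(data.frame, res)
names(flat)        # trt, rep, y.mean_y, y.y
```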