
R: using ddply on a data frame with categorical variables


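Straight to the question.

I have a data.frame (structure below) made up mostly of categorical variables: most are binary (Yes or No), and one has three levels (data.frame$tertile).

I want to produce a data frame with summary statistics for all the categorical variables, i.e. the proportion of patients with "Yes", grouped by data.frame$tertile.

Can ddply be used with categorical variables? I have already managed to use ddply for the continuous variables: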

x <- ddply(data.frame, .(tertile), numcolwise(mean))
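To get the proportions of "Yes" and "No", take the mean of a logical vector: the result is the fraction of the time the value is TRUE (TRUE = 1, FALSE = 0).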
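
The mean-of-a-logical idea (TRUE counts as 1, FALSE as 0) can be sketched as follows. This is a minimal illustration, not the answerer's original code: the data frame `toy` and the helper `prop_yes` are hypothetical names made up for this example, and the values are invented.

```r
library(plyr)

# Hypothetical toy data (not the asker's real data set)
toy <- data.frame(
  smoker  = c("Yes", "No", "No", "Yes"),
  mi      = c("No", "No", "Yes", "Yes"),
  tertile = c(1, 1, 2, 2)
)

# x == "Yes" is a logical vector; its mean is the share of TRUE values
prop_yes <- function(x) mean(x == "Yes")

# Apply prop_yes to every non-grouping column within each tertile
res <- ddply(toy, .(tertile), colwise(prop_yes))
print(res)
```

Multiplying the result by 100 gives percentages instead of proportions.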

You can try:

 library(plyr)
 # Percentage of the first level ("Yes" in dat), rounded to 2 decimals
 fun1 <- function(x) round(100*(table(x)/length(x))[1],2)
 ddply(dat, .(tertile), colwise(fun1))

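It would be very helpful if you could provide code to reproduce your data frame.

You seem to have mainly, or only, factor variables. What do you take "Yes" and "No" to mean?

Thanks Akrun and BBrill... works brilliantly and saved me a lot of time... you are both legends.
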
vec <- c("Yes","No")
vec2 <- c(1,2,3)
tmp <- data.frame("smoker" = sample(vec,10, replace=TRUE),
             "mi" = sample(vec,10, replace=TRUE),
             "tertile" = sample(vec2,10, replace=TRUE))
dat <- structure(list(smoker = structure(c(2L, 2L, 1L, 2L, 2L, 2L, 2L, 
1L, 2L, 2L), .Label = c("Yes", "No"), class = "factor"), mi = structure(c(1L, 
2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L), .Label = c("Yes", "No"), class = "factor"), 
angina = structure(c(2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 
2L), .Label = c("Yes", "No"), class = "factor"), pvd = structure(c(2L, 
2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L), .Label = c("Yes", "No"
), class = "factor"), isch.stroke = structure(c(1L, 1L, 1L, 
2L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("Yes", "No"), class = "factor"), 
ht.1 = structure(c(1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L
), .Label = c("Yes", "No"), class = "factor"), tertile = structure(c(3L, 
3L, 3L, 2L, 3L, 1L, 1L, 3L, 3L, 1L), .Label = c("1", "2", 
"3"), class = "factor")), .Names = c("smoker", "mi", "angina", 
"pvd", "isch.stroke", "ht.1", "tertile"), row.names = c(NA, -10L
), class = "data.frame")


  ddply(dat, .(tertile),colwise(fun1) )
#  tertile smoker     mi angina   pvd isch.stroke ht.1
#1       1   0.00   0.00  33.33 33.33        0.00    0
#2       2   0.00 100.00   0.00  0.00        0.00    0
#3       3  33.33  66.67  50.00 50.00       66.67  100
 library(dplyr)
 # Note: summarise_each() and funs() are deprecated in newer dplyr releases in favour of across()
 dat %>%
   group_by(tertile) %>%
   summarise_each(funs(fun1))
  #Source: local data frame [3 x 7]

 #   tertile smoker     mi angina   pvd isch.stroke ht.1
 #1       1   0.00   0.00  33.33 33.33        0.00    0
 #2       2   0.00 100.00   0.00  0.00        0.00    0
 #3       3  33.33  66.67  50.00 50.00       66.67  100
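
One caveat worth noting: `fun1` picks the first entry of `table(x)`, which is "Yes" in `dat` only because the factor levels were declared as `c("Yes", "No")`. For plain character columns, `table()` sorts the values alphabetically, so the first entry would be "No". A hedged variant that looks "Yes" up by name might look like this; `fun_yes` is a hypothetical helper, not part of the original answer.

```r
# Hypothetical variant of fun1 that does not depend on level order
fun_yes <- function(x) {
  # Fix the levels so "Yes" is always present in the table, even with count 0
  x <- factor(x, levels = c("Yes", "No"))
  prop <- table(x) / length(x)
  # Look the entry up by name and return a plain number (percentage)
  round(100 * as.numeric(prop["Yes"]), 2)
}

fun_yes(c("No", "Yes", "No"))
fun_yes(c("No", "No"))
```

It can be dropped into the same calls, for example `ddply(dat, .(tertile), colwise(fun_yes))`.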