Calculating gender percentages from a grouped data frame in R

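I have a fairly large data frame containing information about individuals who have been assigned to treatment groups. I am trying to produce the mean of each variable and the percentage of each gender for every group. I was able to calculate the means, but I can't figure out how to get the gender percentages.

Below, I generate a small replica of my data: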

library(plyr)

# create variables and data frame
sampleid  <- 1:100
gender    <- rep(c("female", "male"), c(50, 50))
score     <- rnorm(100)
age       <- sample(25:35, 100, replace = TRUE)
treatment <- rep(1:5, each = 4)  # length 20; data.frame() recycles it to 100 rows
d <- data.frame(sampleid, gender, age, score, treatment)

> head(d)

  sampleid gender age      score treatment
1        1 female  34  1.6917201         1
2        2 female  26 -1.6189545         1
3        3 female  28  1.2867895         1
4        4 female  34 -0.5027578         1
5        5 female  29 -1.3652895         2
6        6 female  26 -2.4430843         2
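
I also need an additional column, "percent female", that gives the percentage of females in each of the treatment groups 1 to 5. Can anyone help me add this?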

Try this:

groupstat<-ddply(d, .(treatment),summarise,
                 meansc= mean(score),
                 meanage= mean(age),
                 meanID= mean(sampleid),
                 nfem= length(gender[gender=="female"]), # number females per treatment group
                 nmale= length(gender[gender=="male"]), # number of males per treatment group
                 percentfem= nfem/(nfem+nmale)) # proportion of females by treatment group (multiply by 100 for a percentage)
groupstat

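As a cross-check on the ddply result, the same figures can be obtained in base R with a contingency table. This is only a minimal sketch, not part of the original answer: it rebuilds just the deterministic columns of the sample data (the names `counts`, `pct`, and `pct_female` are made up for illustration) and assumes that row-wise percentages are what is wanted.

```r
# Rebuild the deterministic part of the sample data from the question
sampleid  <- 1:100
gender    <- rep(c("female", "male"), c(50, 50))
treatment <- rep(rep(1:5, each = 4), times = 5)  # same values data.frame() recycling would give
d <- data.frame(sampleid, gender, treatment)

# Count each gender within each treatment group (rows = treatment, columns = gender)
counts <- table(d$treatment, d$gender)

# margin = 1 normalises each row, so every treatment group sums to 100
pct <- 100 * prop.table(counts, margin = 1)

# Percentage of females per treatment group, named by treatment
pct_female <- pct[, "female"]
print(pct_female)
# for this sample data: treatments 1..5 -> 60, 60, 50, 40, 40
```

Because the first 50 rows are all female and the treatment labels cycle in blocks of four, the groups are not balanced by gender, which is why the percentages differ between treatments.
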
I would first split the data into treatment groups (split(d, f = d$treatment)) and then compute the proportion of females in each group (function(x) sum(x$gender == "female") / length(x$gender)).


Create another variable with female = 1 and male = 0, then take its mean.
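
The indicator-variable idea could look roughly like the following in base R. This is a sketch under assumptions: the column names `female` and `percentfem` are chosen here for illustration, only the deterministic columns of the sample data are rebuilt, and `aggregate()` stands in for whatever grouping tool is already in use.

```r
# Rebuild the deterministic part of the sample data from the question
sampleid  <- 1:100
gender    <- rep(c("female", "male"), c(50, 50))
treatment <- rep(rep(1:5, each = 4), times = 5)
d <- data.frame(sampleid, gender, treatment)

# Indicator variable: 1 for female, 0 for male (hypothetical column name)
d$female <- as.integer(d$gender == "female")

# The mean of a 0/1 indicator is the proportion of ones; scale to a percentage
percentfem <- aggregate(female ~ treatment, data = d,
                        FUN = function(x) 100 * mean(x))
names(percentfem)[2] <- "percentfem"
print(percentfem)
```

The same trick works inside the ddply call without a helper column, since mean(gender == "female") averages the logical vector directly.
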

The split-based approach described above can be written as a single call:

sapply(split(d, f = d$treatment), function(x) sum(x$gender == "female") / length(x$gender))