R: summary expressed as percentages

I have a sample dataset, shown below. I can easily get a summary from it:

a <- structure(list(Occ = c(1, 1, 2, 2, 3, 3, 4, 5, 5, 5), 
Type = c("A", "B", "C", "A", "A", "A", "B", "C", "C", "B"), 
Alc = c("A", "B", "N", "A", "N", "N", "N", "A", "B", "B"), 
Count = c(10, 10, 20, 10, 15, 15, 10, 10, 20, 15)),
.Names = c("Occ", "Type", "Alc", "Count"), row.names = c(NA, -10L), class = "data.frame")
a$Occ <- factor(a$Occ)
a$Type <- factor(a$Type)
a$Alc<- factor(a$Alc)
a
   Occ Type Alc Count
1    1    A   A    10
2    1    B   B    10
3    2    C   N    20
4    2    A   A    10
5    3    A   N    15
6    3    A   N    15
7    4    B   N    10
8    5    C   A    10
9    5    C   B    20
10   5    B   B    15

summary(a)
Occ   Type  Alc       Count     
1:2   A:4   A:3   Min.   :10.0  
2:2   B:3   B:3   1st Qu.:10.0  
3:2   C:3   N:4   Median :12.5  
4:1               Mean   :13.5  
5:3               3rd Qu.:15.0  
                  Max.   :20.0 

Thanks for your help.

Here is a starting point. You may need to modify it slightly to suit your specific needs.

library(data.table)
dt = as.data.table(a)

for(b in names(dt)[1:3]) print(dt[, sum(Count), by = b][, V1 := 100*V1/sum(V1)])
#   Occ        V1
#1:   1 14.814815
#2:   2 22.222222
#3:   3 22.222222
#4:   4  7.407407
#5:   5 33.333333
#   Type       V1
#1:    A 37.03704
#2:    B 25.92593
#3:    C 37.03704
#   Alc       V1
#1:   A 22.22222
#2:   B 33.33333
#3:   N 44.44444

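The best base R function for computing these values is probably xtabs. Here I wrap it in some formatting so that the result is expressed as a percentage.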

myfactors <- names(a)[sapply(a, is.factor)]
lapply(myfactors, function(f) {
    round(xtabs(as.formula(paste0("Count~", f)), a)/sum(a$Count)*100,2)
})
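
A variation on the same idea, offered only as a sketch: `prop.table` can do the normalisation, and `reformulate` can build the formula instead of `paste0` plus `as.formula`. The helper name `pct_summary` and its `weight` and `digits` arguments are hypothetical and not part of the original answer. The data frame is rebuilt here so the block runs on its own.

```r
# Sample data from the question, rebuilt so this block is self-contained
a <- data.frame(
  Occ   = factor(c(1, 1, 2, 2, 3, 3, 4, 5, 5, 5)),
  Type  = factor(c("A", "B", "C", "A", "A", "A", "B", "C", "C", "B")),
  Alc   = factor(c("A", "B", "N", "A", "N", "N", "N", "A", "B", "B")),
  Count = c(10, 10, 20, 10, 15, 15, 10, 10, 20, 15)
)

# Hypothetical helper: share of the total `weight` column falling in each
# level of every factor column, expressed as a percentage
pct_summary <- function(df, weight = "Count", digits = 2) {
  factor_cols <- names(df)[vapply(df, is.factor, logical(1))]
  out <- lapply(factor_cols, function(f) {
    # xtabs sums the weight column within each level of f
    totals <- xtabs(reformulate(f, response = weight), data = df)
    # prop.table divides by the grand total; scale to percent and round
    round(100 * prop.table(totals), digits)
  })
  setNames(out, factor_cols)
}

res <- pct_summary(a)
print(res)
```

Naming the list elements with `setNames` makes it easier to pull out a single factor afterwards, for example `res$Type`.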

Upvoted the other two answers, which look better.

library(Hmisc)  # for wtd.table
# Weighted table of each factor (weights = Count), scaled to percent, rounded to 1 decimal
sapply(
  sapply(
    sapply(a[1:3], wtd.table, a$Count, "table"),
    "/", sum(a$Count) / 100),
  round, 1)
$Occ
   1    2    3    4    5 
14.8 22.2 22.2  7.4 33.3 

$Type
   A    B    C 
37.0 25.9 37.0 

$Alc
   A    B    N 
22.2 33.3 44.4