Count the levels of each column in a data frame
> head(Gene)
  Key Func.ensGene Func.genericGene Func.refGene
1   1   intergenic       intergenic   intergenic
2   2   intergenic       intergenic   intergenic
3   3   intergenic       intergenic     intronic
4   4       exonic           exonic       exonic
5   5   intergenic       intergenic     intronic
6   6   intergenic       intergenic     intronic
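Desired output:
Type       Func.ensGene Func.genericGene Func.refGene
exonic                1                1            1
intergenic            5                5            2
intronic              0                0            3
My attempted solution only works for one column:
unique(Gene["Func.ensGene"])
Can I get the output table shown above, and also a bar plot in which the x-axis shows "Type" and the bars represent the counts for each column?

We can collect all the unique levels from the data frame, then convert each column to a factor with those levels before calling table(). Because every column shares the same levels, a level that is absent from a column is still reported, with a count of 0:
unique_names <- unique(unlist(df[-1]))
sapply(df[-1], function(x) table(factor(x, levels = unique_names)))
#           Func.ensGene Func.genericGene Func.refGene
#intergenic            5                5            2
#exonic                1                1            1
#intronic              0                0            3

Alternatively, simply use xtabs() and stack() (see ?xtabs and ?stack):
xtabs( ~ values + ind, stack(df[, -1]))
Or even shorter, as @nicola suggested:
table(stack(df[, -1]))
In both cases you get:
#            ind
#values       Func.ensGene Func.genericGene Func.refGene
#  exonic                1                1            1
#  intergenic            5                5            2
#  intronic              0                0            3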
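
For the bar plot requested in the question, one possible follow-up is a minimal base R sketch that starts from the same count table. It assumes the sample data from this thread, recreated here so the snippet runs on its own; the names counts and mids are illustrative and do not come from the original posts.

```r
# Sample data, recreated so the snippet runs on its own
df <- data.frame(
  Key = 1:6,
  Func.ensGene = c("intergenic", "intergenic", "intergenic", "exonic", "intergenic", "intergenic"),
  Func.genericGene = c("intergenic", "intergenic", "intergenic", "exonic", "intergenic", "intergenic"),
  Func.refGene = c("intergenic", "intergenic", "intronic", "exonic", "intronic", "intronic"),
  stringsAsFactors = FALSE
)

# Count table: rows are the types (values), columns are the source columns (ind)
counts <- table(stack(df[-1]))

# barplot() draws one group of bars per matrix column, so transpose the table
# to get one group per type on the x-axis and one bar per original column
mids <- barplot(
  t(counts),
  beside = TRUE,       # side-by-side bars instead of stacked
  legend.text = TRUE,  # legend labels are taken from the row names
  xlab = "Type",
  ylab = "Count"
)
```

Dropping beside = TRUE would stack the three columns on top of each other for each type, which may be harder to compare.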

Like the base R solutions, but with data.table and some magrittr you get a data.frame directly (instead of a table), which is useful if you prefer to keep working on a data.frame:
library(magrittr)
library(data.table)
setDT(df)
df %>%
  melt(id.vars = "Key") %>%
  .[, .N, .(variable, value)] %>%
  dcast(value ~ variable, value.var = "N", fill = 0)
        value Func.ensGene Func.genericGene Func.refGene
1:     exonic            1                1            1
2: intergenic            5                5            2
3:   intronic            0                0            3
Or more concisely (as suggested by Henrik), relying on dcast() defaulting to length as its aggregation function:
dcast(melt(df, "Key"), value ~ variable)
If you prefer tidyverse functions:
library(dplyr)
library(tidyr)
df %>%
  gather(key = "variable", value = "value", -Key) %>%  # keep Key out of the reshaping
  group_by(variable, value) %>%
  count() %>%
  spread(variable, n, fill = 0)
# A tibble: 3 x 4
# Groups:   value [3]
  value      Func.ensGene Func.genericGene Func.refGene
  <chr>             <dbl>            <dbl>        <dbl>
1 exonic                1                1            1
2 intergenic            5                5            2
3 intronic              0                0            3
Data:
df <- data.frame(
  Key = 1:6,
  Func.ensGene = c("intergenic", "intergenic", "intergenic", "exonic", "intergenic", "intergenic"),
  Func.genericGene = c("intergenic", "intergenic", "intergenic", "exonic", "intergenic", "intergenic"),
  Func.refGene = c("intergenic", "intergenic", "intronic", "exonic", "intronic", "intronic"),
  stringsAsFactors = FALSE
)

dput(head(Gene)) would be more suitable than just head(Gene).
For those who prefer something less readable: dcast(melt(df, "Key"), value ~ variable)
Thanks @Henrik, I forgot about the summarising feature of dcast().
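
To illustrate the dput() suggestion from the comments, here is a short sketch of sharing a sample as code that others can paste. It uses the df object from this thread as a stand-in for the asker's Gene, which is an assumption; txt and df_copy are illustrative names.

```r
# Stand-in for the asker's Gene data frame (assumed to match the sample shown)
df <- data.frame(
  Key = 1:6,
  Func.ensGene = c("intergenic", "intergenic", "intergenic", "exonic", "intergenic", "intergenic"),
  Func.genericGene = c("intergenic", "intergenic", "intergenic", "exonic", "intergenic", "intergenic"),
  Func.refGene = c("intergenic", "intergenic", "intronic", "exonic", "intronic", "intronic"),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object, including its column types
dput(head(df))

# Round trip: capture that code as text, then evaluate it to get the data back
txt <- capture.output(dput(head(df)))
df_copy <- eval(parse(text = txt))
str(df_copy)
```

Unlike the printed output of head(), the dput() text keeps the distinction between character and factor columns, which matters for answers that depend on factor levels.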