Reshaping repeated measures in R from next-generation sequencing data
I have a dataset in R:
ddat <- data.frame(gene=rep(1:4,1), ID.pat=rep(c("0", "1"), each=10), allele.freq =runif(20,min=0,max=1), SNV=round(runif(20,min=0,max=4)))
ddat
gene ID.pat allele.freq SNV
1 1 0 0.96841970 1
2 2 0 0.77859462 2
3 3 0 0.38308071 0
4 4 0 0.03842660 4
5 1 0 0.11313244 1
6 2 0 0.25727911 0
7 3 0 0.73430856 1
8 4 0 0.93272543 0
9 1 0 0.48698303 3
10 2 0 0.76762848 1
11 3 1 0.86238286 1
12 4 1 0.87513463 2
13 1 1 0.78232771 2
14 2 1 0.24493196 1
15 3 1 0.41582649 0
16 4 1 0.49521680 4
17 1 1 0.17983000 2
18 2 1 0.06170987 0
19 3 1 0.23552103 1
20 4 1 0.26549472 0
How can I modify the code to produce the desired output?

Load the reshape2 package:
library(reshape2)
First, modify your SNV variable (in your desired output data frame the columns carry the prefix "SNP_", so I will use that):
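The code for this step is not shown here. One way to do it, assuming the prefix is simply pasted onto the existing SNV values, is sketched below. The small ddat is made-up illustrative data with the same columns as in the question, not the question's actual values.

```r
# Illustrative data with the same columns as the question's ddat
# (values are made up for this sketch).
ddat <- data.frame(
  gene        = c(1, 2, 3, 4),
  ID.pat      = c("0", "0", "1", "1"),
  allele.freq = c(0.968, 0.779, 0.862, 0.875),
  SNV         = c(1, 2, 1, 2)
)

# Prepend "SNP_" so the wide-format columns become SNP_1, SNP_2, ...
ddat$SNV <- paste0("SNP_", ddat$SNV)

ddat$SNV
```

Doing the renaming before reshaping means the column names come out right without any renaming afterwards.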
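Then use dcast from reshape2 to cast the data frame into wide format, passing an aggregation function (fun.aggregate) that combines several allele frequencies falling into the same cell: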
dcast(ddat,ID.pat+gene~SNV,fun.aggregate,value.var="allele.freq")
Your output will then look like this:
ID.pat gene SNP_0 SNP_1 SNP_2 SNP_3 SNP_4
1 0 1 <NA> 0.387 0.125 0.825 <NA>
2 0 2 <NA> <NA> 0.296,0.775 <NA> 0.971
3 0 3 <NA> 0.172 <NA> <NA> 0.873
4 0 4 0.87 0.337 <NA> <NA> <NA>
5 1 1 <NA> 0.49 <NA> 0.455 <NA>
6 1 2 0.169 <NA> 0.402 <NA> <NA>
7 1 3 <NA> <NA> 0.754,0.168 0.509 <NA>
8 1 4 <NA> <NA> 0.86 0.737 0.625
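Because the question's data come from runif without a seed, the numbers above cannot be reproduced exactly. As a rough end-to-end sketch of the steps described here, the following uses a small hand-made data frame (the values are illustrative assumptions, not taken from the question) in which one patient/gene pair has two variants with the same SNV value.

```r
library(reshape2)

# Hand-made data (illustrative values): gene 2 in patient "1" has two
# rows with SNV == 2, so that cell must hold two frequencies.
ddat <- data.frame(
  gene        = c(1, 1, 2, 2, 2),
  ID.pat      = c("0", "0", "1", "1", "1"),
  allele.freq = c(0.387, 0.125, 0.296, 0.775, 0.169),
  SNV         = c(1, 2, 2, 2, 0)
)

# Prefix the SNV values so the new columns are named SNP_0, SNP_1, ...
ddat$SNV <- paste0("SNP_", ddat$SNV)

# Collapse all frequencies in one cell into a comma-separated string;
# an empty cell becomes a character NA.
fun.aggregate <- function(x)
  if (length(x) == 0) as.character(NA) else paste(round(x, 3), collapse = ",")

# One row per patient/gene pair, one column per SNV value.
wide <- dcast(ddat, ID.pat + gene ~ SNV, fun.aggregate, value.var = "allele.freq")
wide
```

In this sketch the row for patient "1", gene 2 should hold "0.296,0.775" in SNP_2, and cells with no observations are NA.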
For reference, the desired output from the question (empty cells instead of NA):
  ID.pat gene SNP_0 SNP_1       SNP_2 SNP_3 SNP_4
1      0    1       0.387       0.125 0.825
2      0    2             0.296,0.775       0.971
3      0    3       0.172                   0.873
4      0    4  0.87 0.337
5      1    1        0.49             0.455
6      1    2 0.169             0.402
7      1    3             0.754,0.168 0.509
8      1    4                    0.86 0.737 0.625
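The aggregation function passed to dcast has to be defined before the call. It pastes together the rounded allele frequencies that fall into the same cell and returns NA for an empty cell (dcast calls it on a zero-length vector to obtain the fill value for combinations that do not occur):
# collapse all allele frequencies in one cell into a comma-separated string
fun.aggregate <- function(x)
  if(length(x)==0) as.character(NA) else paste(round(x,3),collapse=",")
# one row per patient/gene pair, one column per SNV value
dcast(ddat,ID.pat+gene~SNV,fun.aggregate,value.var="allele.freq")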
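To see why the aggregation function has a special case for empty input, it can help to call it directly on a few vectors. This is only a small base R sketch; the input values are illustrative.

```r
fun.aggregate <- function(x)
  if (length(x) == 0) as.character(NA) else paste(round(x, 3), collapse = ",")

empty  <- fun.aggregate(numeric(0))         # no observation in the cell
single <- fun.aggregate(0.87)               # one observation
double <- fun.aggregate(c(0.2961, 0.7749))  # two observations in one cell
```

Returning a character NA for the empty case keeps every cell of the wide table the same type as the pasted strings.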