R: Arranging chromosome numbers in a data frame

I have a file containing the chromosomes of each sample and their frequencies:

 a
 sample   Chr_No   frequency
 sample-1  chr1:         0
 sample-1  chr2:         0
 sample-1  chr3:         0
 sample-1  chr4:         1
 sample-1  chr5:         0
 sample-1  chr6:         0
 sample-1  chr7:         0
 sample-1  chr8:         0
 sample-1  chr9:         1
 sample-1  chr10         0
 sample-1  chr11         0
 ......
I want to convert it into a data frame with one column per chromosome, so in R I used:

 b <- dcast( a, sample ~ Chr_No, value.var = "frequency", fill = 0 )
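The question does not show the resulting column order, but the usual symptom is that the chromosome columns come out in character order rather than numeric order. A minimal base R sketch of that effect, using a made-up vector `chr` and `method = "radix"` so the result does not depend on the locale:

```r
# Hypothetical chromosome labels (colons already stripped), for illustration only.
chr <- paste0("chr", 1:11)

# Character sorting compares strings position by position,
# so "chr10" and "chr11" land before "chr2".
sorted_chr <- sort(chr, method = "radix")
print(sorted_chr)
```

This is why the names need either a natural sort or an explicit factor level order before or after reshaping.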

First remove the colon from the names, then use mixedsort to arrange the names as chr1, chr2, and so on:

library(gtools)

names(b) <- sub(":", "", names(b))
cbind(b[1], b[-1][mixedsort(names(b[-1]))])


#    sample chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11
#1 sample-1    0    0    0    1    0    0    0    0    1     0     0
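If gtools is not available, the same natural ordering can be sketched in base R by sorting on the numeric part of each name. This is only an illustration: `nm` is a hypothetical vector of column names, and the sketch assumes every name has the form chr followed by digits (labels such as chrX or chrY would become NA and need separate handling).

```r
# Hypothetical column names in character order, as they might appear after reshaping.
nm <- c("chr1", "chr10", "chr11", "chr2", "chr3")

# Drop the "chr" prefix and convert the remainder to an integer.
num <- as.integer(sub("^chr", "", nm))

# Reorder the names by their numeric part.
sorted_nm <- nm[order(num)]
print(sorted_nm)
```

The resulting vector can be used to index the chromosome columns in the same way as the mixedsort result.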

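Another option is to fix the order before dcast: convert 'Chr_No' to a factor column, specifying the levels after removing the ':' at the end of the strings in 'Chr_No'.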

library(data.table)
setDT(a)[, Chr_No := factor(sub(':$', '', Chr_No), levels = paste0("chr", 1:11))]
Then, run dcast:

dcast( a, sample ~ Chr_No, value.var = "frequency", fill = 0 )
#     sample chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11
#1: sample-1    0    0    0    1    0    0    0    0    1     0     0
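The levels above are hard-coded as chr1 to chr11. As a hedged variation, the levels could instead be derived from the values present in the data. The sketch below uses base R only and a made-up vector `chr`; it assumes every label is chr followed by digits, with an optional trailing colon.

```r
# Hypothetical labels in arbitrary order, some with a trailing colon.
chr <- c("chr10", "chr2:", "chr1:", "chr11", "chr9:")

# Strip the trailing colon, as in the answer above.
chr_clean <- sub(":$", "", chr)

# Build the level order from the numeric part of the unique labels.
lev <- unique(chr_clean)
lev <- lev[order(as.integer(sub("^chr", "", lev)))]

# A factor with these levels keeps that order in later summaries or reshaping.
chr_f <- factor(chr_clean, levels = lev)
print(levels(chr_f))
```

Functions that respect factor levels, such as table(), then report the chromosomes in numeric order.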
Data

a <- structure(list(sample = c("sample-1", "sample-1", "sample-1", 
"sample-1", "sample-1", "sample-1", "sample-1", "sample-1", "sample-1", 
"sample-1", "sample-1"), Chr_No = c("chr1:", "chr2:", "chr3:", 
"chr4:", "chr5:", "chr6:", "chr7:", "chr8:", "chr9:", "chr10", 
"chr11"), frequency = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 
0L)), class = "data.frame", row.names = c(NA, -11L))

Comments

It doesn't work in my case.
Yes, I realize you need more than that. I added an answer; see if it works for your case.
@RochiSaurabh Updated the answer, assuming that sample is in the first column.