Spread function on a data frame with duplicates (R)
I am trying to find an efficient way to separate the replicates in my data set into their own columns. I am using this data as an example:
x <- data.frame(sample = c("AA", "AA", "BB", "BB", "CC", "CC"),
Gene = c("HSA-let1","HSA-let1","HSA-let1","HSA-let1","HSA-let1","HSA-let1"),
Cq = c(14.55, 14.45, 13.55, 13.45, 16.55, 16.45))
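I get a duplicate identifiers error. I tried the fix below, which puts both values into a single column, separated by ", ". This almost works, but I want them in separate columns: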
library(data.table)  # provides setDT() and dcast()
x_test <- dcast(setDT(x), Gene ~ sample, value.var = 'Cq',
                fun.aggregate = function(x) toString(unique(x)))
You can modify the sample column so that each replicate gets its own label:
library(data.table)
setDT(x)[, sample := paste(sample, ifelse(!duplicated(sample), '1', '2'), sep = '_')]
dcast(x, ...~sample, value.var = 'Cq')
# Gene AA_1 AA_2 BB_1 BB_2 CC_1 CC_2
# 1: HSA-let1 14.55 14.45 13.55 13.45 16.55 16.45
Note: spread should be called as spread(x, sample, Cq).
Edit
If the number of replicates varies (not always 2), you can do the following:
x <- setDT(x)[order(sample),]
x[, sample := paste(sample, unlist(lapply(table(x$sample), function(x) 1:x)), sep = '_')]
dcast(x, ...~sample, value.var = 'Cq')
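As a possible alternative for the variable-replicate case, here is a minimal sketch that uses rowid() from data.table to number the replicates, so the rows do not have to be sorted first. It assumes a reasonably recent data.table; the column name rep_id is made up for illustration, and the data is the question's example with the last CC row dropped so that the replicate counts differ.

```r
library(data.table)

# The question's example data minus the last CC row, so counts differ per sample
x <- data.table(sample = c("AA", "AA", "BB", "BB", "CC"),
                Gene   = "HSA-let1",
                Cq     = c(14.55, 14.45, 13.55, 13.45, 16.55))

# rowid() numbers each occurrence of a (Gene, sample) pair in order of appearance
# (rep_id is a hypothetical helper column name)
x[, rep_id := rowid(Gene, sample)]

# Cast to wide format: one column per sample/replicate combination
wide <- dcast(x, Gene ~ sample + rep_id, value.var = "Cq", sep = "_")
print(wide)
```

Because the replicate index lives in its own column, the original sample column is left untouched and can still be used for grouping later.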
You can try this:
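library(dplyr)
library(tidyr)  # spread() comes from tidyr
# Note: grouping by Gene numbers the rows across all samples (AA_1, AA_2, BB_3, ...)
x %>% group_by(Gene) %>%
  mutate(sample = paste(sample, seq(n()), sep = "_")) %>%
  spread(sample, Cq)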
Make the sample values unique, then spread:
library(dplyr)
library(tidyr)
x %>%
  group_by(sample) %>%
  mutate(rn = row_number()) %>%   # replicate index within each sample
  ungroup() %>%
  mutate(sample = paste(sample, rn, sep = "_")) %>%
  select(-rn) %>%
  spread(key = sample, value = Cq)
# # A tibble: 1 x 7
# Gene AA_1 AA_2 BB_1 BB_2 CC_1 CC_2
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 HSA-let1 14.6 14.4 13.6 13.4 16.6 16.4
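For what it is worth, newer versions of tidyr (1.0.0 and later) offer pivot_wider() as the successor to spread(). The following is only a sketch of how the same idea might look with it, assuming dplyr and tidyr are installed; the helper column rn is the same replicate index as above, and the result is stored in a variable named wide purely for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
x <- data.frame(sample = c("AA", "AA", "BB", "BB", "CC", "CC"),
                Gene   = "HSA-let1",
                Cq     = c(14.55, 14.45, 13.55, 13.45, 16.55, 16.45))

wide <- x %>%
  group_by(Gene, sample) %>%
  mutate(rn = row_number()) %>%   # replicate index within each Gene/sample pair
  ungroup() %>%
  # build the column names from sample and rn, joined with "_"
  pivot_wider(names_from = c(sample, rn), values_from = Cq, names_sep = "_")

print(wide)
```

Passing both sample and rn to names_from avoids having to paste the labels together by hand.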