R: Recode a categorical variable if its frequency is below a defined value


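The data set (d) holds SNP genotypes in the columns rs3, rs4, rs5 and rs6.

To check the frequency of the SNP genotypes (0, 1, 2) in one column, we can use the table command:

table(d$rs3)
The output will be: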

0 1 2 
5 2 1
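
To inspect every SNP at once instead of one column at a time, the same count can be taken per column. The following is only a sketch: the data frame d below is hypothetical (the sample values are assumptions, with rs3 chosen so that it reproduces the counts 5, 2, 1 shown above), and the name freq is introduced just for illustration.

```r
# Hypothetical data: these values are assumed for illustration only.
# rs3 is chosen to reproduce the counts 5, 2, 1 from table(d$rs3).
d <- data.frame(
  rs3 = c(1, 1, 0, 2, 0, 0, 0, 0, NA),
  rs4 = c(0, 0, 0, 0, 0, 2, 2, 2, 1),
  rs5 = c(0, 1, 0, 1, 0, 0, NA, 2, 2),
  rs6 = c(0, 0, 0, 0, 0, 1, 1, 1, 1)
)

# Fixing the factor levels to 0:2 keeps a zero count for genotypes that are
# absent from a column, so every column yields three counts and sapply()
# can simplify the result to a 3 x 4 matrix (genotypes in rows, SNPs in columns).
freq <- sapply(d, function(x) table(factor(x, levels = 0:2)))
freq
```

Row "2" of freq then shows directly which columns have a rare genotype 2.
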
Here we want to recode the variable: if the frequency of genotype 2 is below 3, recode 2 as 1. We can try:

d[] <- lapply(d, function(x) {
    # Recode 2 as 1 only in columns where 2 occurs fewer than 3 times
    if (sum(x == 2, na.rm = TRUE) < 3) replace(x, x == 2, 1) else x
})
d
#   rs3 rs4 rs5 rs6
#1   1   0   0   0
#2   1   0   1   0
#3   0   0   0   0
#4   1   0   1   0
#5   0   0   0   0
#6   0   2   0   1
#7   0   2  NA   1
#8   0   2   1   1
#9  NA   1   1   1
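
The threshold and the two genotype codes are hard-coded in the call above. If they need to vary, the same lapply idea can be wrapped in a small function. This is a sketch only: recode_rare and its arguments from, to and min_count are hypothetical names, not part of the original answer, and the data frame d is an assumed example.

```r
# Hypothetical data, assumed for illustration only.
d <- data.frame(
  rs3 = c(1, 1, 0, 2, 0, 0, 0, 0, NA),
  rs4 = c(0, 0, 0, 0, 0, 2, 2, 2, 1),
  rs5 = c(0, 1, 0, 1, 0, 0, NA, 2, 2),
  rs6 = c(0, 0, 0, 0, 0, 1, 1, 1, 1)
)

# Hypothetical helper: in every column where `from` occurs fewer than
# `min_count` times, replace `from` with `to`; other columns are left as is.
recode_rare <- function(df, from = 2, to = 1, min_count = 3) {
  df[] <- lapply(df, function(x) {
    if (sum(x == from, na.rm = TRUE) < min_count) replace(x, x == from, to) else x
  })
  df
}

res3 <- recode_rare(d)                 # same rule as above: fewer than 3
res4 <- recode_rare(d, min_count = 4)  # stricter rule: rs4 is recoded too
```

Assigning into df[] keeps the data frame structure, so the function returns a data frame with the same column names and row count as its input.
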
The same approach with dplyr:

library(dplyr)
# across() is the current replacement for the deprecated mutate_each()/funs() (dplyr >= 1.0.0)
d %>%
    mutate(across(everything(),
                  ~ if (sum(.x == 2, na.rm = TRUE) < 3)
                      replace(.x, .x == 2, 1) else .x))

Here is another possible (vectorized) solution:

indx <- colSums(d == 2, na.rm = TRUE) < 3 # Select the columns where 2 occurs fewer than 3 times
d[indx][d[indx] == 2] <- 1 # Insert 1 wherever the selected columns equal 2
d
#   rs3 rs4 rs5 rs6
# 1   1   0   0   0
# 2   1   0   1   0
# 3   0   0   0   0
# 4   1   0   1   0
# 5   0   0   0   0
# 6   0   2   0   1
# 7   0   2  NA   1
# 8   0   2   1   1
# 9  NA   1   1   1
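
To see what the vectorized version does at each step, and to check that it agrees with the lapply version, the two can be run side by side. This is a sketch under assumptions: the data frame d is hypothetical, and the names n_two, vec and lap are introduced only for this comparison.

```r
# Hypothetical data, assumed for illustration only.
d <- data.frame(
  rs3 = c(1, 1, 0, 2, 0, 0, 0, 0, NA),
  rs4 = c(0, 0, 0, 0, 0, 2, 2, 2, 1),
  rs5 = c(0, 1, 0, 1, 0, 0, NA, 2, 2),
  rs6 = c(0, 0, 0, 0, 0, 1, 1, 1, 1)
)

# Step 1: d == 2 is a logical matrix; colSums() counts the 2s per column,
# ignoring missing values.
n_two <- colSums(d == 2, na.rm = TRUE)

# Step 2: flag the columns in which genotype 2 is rare (fewer than 3).
indx <- n_two < 3

# Step 3: within the flagged columns only, overwrite every 2 with 1.
vec <- d
vec[indx][vec[indx] == 2] <- 1

# The lapply version, applied to a separate copy for comparison.
lap <- d
lap[] <- lapply(lap, function(x) {
  if (sum(x == 2, na.rm = TRUE) < 3) replace(x, x == 2, 1) else x
})

# TRUE when both approaches give the same data frame.
same <- isTRUE(all.equal(vec, lap))
same
```

A column with no 2 at all (rs6 here) is also flagged by indx, which is harmless because there is nothing in it to overwrite.
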