R: How to distinguish multiple columns based on a string


I have data like this:

df<- structure(list(`1` = structure(c(3L, 3L, 4L, 3L, 2L, 2L, 3L, 
3L, 4L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 3L, 3L, 4L, 
4L, 4L, 2L), .Label = c("Het", "Het1-Het2", "Homo", "No"), class = "factor"), 
    `2` = structure(c(4L, 5L, 4L, 5L, 4L, 4L, 4L, 5L, 4L, 4L, 
    4L, 5L, 5L, 5L, 5L, 4L, 5L, 3L, 3L, 1L, 4L, 5L, 5L, 5L, 4L, 
    2L), .Label = c("Het", "Het1-Het2", "Het2", "Homo", "No"), class = "factor"), 
    `3` = structure(c(3L, 4L, 4L, 4L, 3L, 3L, 3L, 4L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 3L, 4L, 3L, 3L, 4L, 
    2L), .Label = c("Het", "Het1-Het2", "Homo", "No"), class = "factor")), class = "data.frame", row.names = c(NA, 
-26L))

We can do this with the table() function, sorting the result by frequency:

out <- data.frame(table(df))
out[order(out$Freq, decreasing = TRUE), ]  # partial output shown
          X1        X2        X3 Freq
55      Homo      Homo      Homo    5
60        No        No      Homo    5
79      Homo        No        No    4
9        Het      Het2       Het    2
54 Het1-Het2      Homo      Homo    2
56        No      Homo      Homo    2
59      Homo        No      Homo    2
76        No      Homo        No    2
1        Het       Het       Het    1
26 Het1-Het2 Het1-Het2 Het1-Het2    1
2  Het1-Het2       Het       Het    0
3       Homo       Het       Het    0
...
For example, the Freq of 5 in the first row means that Homo was observed in X1, X2, and X3 together in 5 rows.
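
Most combinations in a table like this never occur, so it can help to drop the zero counts before reading it. The following is only a sketch on a small made-up data frame (toy, seen, and target are hypothetical names, not the asker's data); it keeps the observed combinations and then pulls out the one where X1 is No and neither X2 nor X3 is No.

```r
# Toy data (hypothetical, not the asker's df): three columns named 1, 2, 3.
toy <- data.frame(
  `1` = c("No", "No", "Homo", "Homo", "No"),
  `2` = c("Homo", "Homo", "No", "Homo", "No"),
  `3` = c("Homo", "Homo", "No", "Homo", "Homo"),
  check.names = FALSE
)

# Count every combination; data.frame() renames the columns to X1, X2, X3.
out <- data.frame(table(toy))

# Keep only combinations that actually occur, most frequent first.
seen <- out[out$Freq > 0, ]
seen <- seen[order(seen$Freq, decreasing = TRUE), ]
print(seen)

# Combinations where X1 is "No" and neither X2 nor X3 is "No".
target <- seen[seen$X1 == "No" & seen$X2 != "No" & seen$X3 != "No", ]
print(sum(target$Freq))
```

Summing Freq over the matching combinations gives the same count as filtering the original rows directly.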


The Freq of 4 in the third row can be read the same way: there are 4 rows where X1 is Homo, X2 is No, and X3 is No.

With dplyr, you can filter for just the values you want:

library(dplyr)

df %>%
  filter(`1` == "No",
         `2` != "No" & `3` != "No")
   1    2    3
1 No Homo Homo
2 No Homo Homo

Use tally() to count the matching rows:

df %>%
  filter(`1` == "No",
         `2` != "No" & `3` != "No") %>%
  tally()
  n
1 2
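Of course, @Luis's solution is simpler (and the one I would prefer), as long as you adapt it to your condition (i.e., use & rather than | between columns 2 and 3). The modification assumes I read your request correctly: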

df[df$`1` == "No" & (df$`2` != "No" & df$`3` != "No"),]
    1    2    3
9  No Homo Homo
16 No Homo Homo

sum(df$`1` == "No" & (df$`2` != "No" & df$`3` != "No"))
[1] 2
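
To see why the choice between & and | matters here, the sketch below runs both versions of the condition on a small made-up data frame (toy, both_differ, and either_differs are hypothetical names, not part of the original answers). A row such as No, No, Homo passes the | version but fails the & version.

```r
# Toy rows (hypothetical): one row per combination of interest.
toy <- data.frame(
  `1` = c("No", "No", "No", "Homo"),
  `2` = c("Homo", "No", "No", "Homo"),
  `3` = c("Homo", "Homo", "No", "Homo"),
  check.names = FALSE
)

# AND: column 1 is "No" and neither column 2 nor column 3 is "No".
both_differ <- toy$`1` == "No" & (toy$`2` != "No" & toy$`3` != "No")

# OR: column 1 is "No" and at least one of columns 2 and 3 is not "No".
either_differs <- toy$`1` == "No" & (toy$`2` != "No" | toy$`3` != "No")

print(sum(both_differ))     # 1: only row 1 matches
print(sum(either_differs))  # 2: rows 1 and 2 match
```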

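Have you thought about logical comparisons? df$`1` == "No" & (df$`2` != "No" | df$`3` != "No") gives you the rows where No appears in the first column but not in the second or the third. Also, just so you know, giving columns names that start with a number (or consist only of numbers) is not good practice. And in future, it always helps to include your own attempt at solving the problem, so that people can address the specific issue in your code.
I like your answer. Thanks!
I liked it and accepted your answer.
@Learner, great, thank you. Glad I could help :)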
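
On the point about numeric column names: one possible fix is to rename the columns once, so that the backticks are no longer needed. This is only a sketch on made-up data; toy, renamed, and the col prefix are arbitrary choices for illustration.

```r
# Toy data (hypothetical) with purely numeric column names.
toy <- data.frame(
  `1` = c("No", "Homo"),
  `2` = c("Homo", "No"),
  check.names = FALSE
)
print(names(toy))  # "1" "2": these need backticks, e.g. toy$`1`

# Option 1: add a prefix of your own choosing.
renamed <- toy
names(renamed) <- paste0("col", names(toy))
print(renamed$col1 == "No")  # no backticks required

# Option 2: let base R build syntactically valid names.
print(make.names(names(toy)))  # "X1" "X2"
```

The second option is the same renaming that data.frame() applies by default, which is where the X1, X2, X3 headers in the table() output come from.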