Creating a count table of unique values in R when a single cell contains multiple values
I am trying to create a count table from a data table that looks like this:
df <- data.frame(
  Spring = c("skirt, pants, shirt", "tshirt"),
  Summer = c("shorts, skirt", "pants, shoes"),
  Fall   = c("Scarf", "purse, pants")
)
               Spring        Summer         Fall
1 skirt, pants, shirt shorts, skirt        Scarf
2              tshirt  pants, shoes purse, pants
One tidyverse possibility could be:
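library(dplyr)
library(tidyr)

df %>%
  mutate_if(is.factor, as.character) %>%   # strsplit() needs character input
  gather(var, val) %>%                     # long format: one row per cell
  mutate(val = strsplit(val, ", ")) %>%    # split each cell into its items
  unnest(val) %>%                          # one row per item
  group_by(var) %>%
  summarise(val = n_distinct(val))         # distinct items per original column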
  var      val
  <chr>  <int>
1 Fall       3
2 Spring     4
3 Summer     4
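As a hedged alternative, the same long-format count can be sketched with the newer tidyr verbs. This assumes tidyr 1.0.0 or later (for pivot_longer()) and rebuilds the example data with stringsAsFactors = FALSE; the names counts and n_unique are made up for illustration and do not come from the answers.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, kept as character columns
df <- data.frame(
  Spring = c("skirt, pants, shirt", "tshirt"),
  Summer = c("shorts, skirt", "pants, shoes"),
  Fall   = c("Scarf", "purse, pants"),
  stringsAsFactors = FALSE
)

counts <- df %>%
  # long format: one row per (column name, cell)
  pivot_longer(cols = everything(), names_to = "var", values_to = "val") %>%
  # one row per item; the pattern also consumes spaces after the comma
  separate_rows(val, sep = ",\\s*") %>%
  group_by(var) %>%
  summarise(n_unique = n_distinct(val))

counts
```

separate_rows() replaces the strsplit() plus unnest() pair here, so no list column is created along the way.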
Using summarise_all (this is the basic idea from @Sonny, and it requires only dplyr):
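library(dplyr)

# Count the distinct comma-separated items in one column
getCount <- function(x) {
  x <- as.character(x)  # handle factor columns
  # split on a comma plus any following spaces, so " pants" and "pants" match
  length(unique(unlist(strsplit(x, ",\\s*"))))
}

df %>%
  summarise_all(getCount)  # pass the function directly; funs() is deprecated
  Spring Summer Fall
1      4      4    3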
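One thing to watch when splitting cells like these: if the separator pattern is a bare comma, every item after the first keeps its leading space, so the same word can be counted twice. A minimal sketch with a made-up vector x (not part of the question's data) shows the difference:

```r
# Hypothetical column where "skirt" appears both first and after a comma
x <- c("pants, skirt", "skirt")

# Splitting on "," alone leaves a leading space on " skirt"
naive <- unique(unlist(strsplit(x, ",")))
length(naive)   # 3: "pants", " skirt", "skirt"

# Splitting on a comma plus optional whitespace gives clean items
clean <- unique(unlist(strsplit(x, ",\\s*")))
length(clean)   # 2: "pants", "skirt"
```

With the example data in the question both patterns happen to give the same counts, because no item appears both at the start of a cell and after a comma within the same column.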
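Comments:
I don't know how to put a table here... I'm sorry! I'll try to fix it.
Please put it in the question.
@NelsonGon I tried, I hope this helps!
Thank you, though this does not work on my actual data, because some of my columns have the same situation, e.g. the table is structured like this at the start. Do you know how to solve this?
Can you describe what exactly the problem is?
Now I see the problem. You can use: df %>% mutate_if(is.factor, as.character) %>% mutate(test = strsplit(test, ", ")) %>% unnest(test) %>% summarise_all(n_distinct)
Great!! Thank you very much. You helped a lot!!
Thank you, but I am not getting the right answer with my data. This is what I have; this is my actual data: test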
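The first pipeline can also return the counts in wide format by appending spread():
df %>%
  mutate_if(is.factor, as.character) %>%
  gather(var, val) %>%
  mutate(val = strsplit(val, ", ")) %>%
  unnest(val) %>%
  group_by(var) %>%
  summarise(val = n_distinct(val)) %>%
  spread(var, val)   # one column per original column name
   Fall Spring Summer
  <int>  <int>  <int>
1     3      4      4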
Or, using the same basic idea inline, without a helper function (this also requires only dplyr):
df %>%
  mutate_if(is.factor, as.character) %>%
  summarise_all(list(~ n_distinct(unlist(strsplit(., ", ")))))
  Spring Summer Fall
1      4      4    3
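For completeness, the same per-column count can be sketched in base R with no packages at all. This is only an illustration: the helper name count_unique_items is made up, and the sketch assumes items are separated by commas with optional surrounding whitespace.

```r
# Same example data as in the question
df <- data.frame(
  Spring = c("skirt, pants, shirt", "tshirt"),
  Summer = c("shorts, skirt", "pants, shoes"),
  Fall   = c("Scarf", "purse, pants")
)

# Hypothetical helper: split on commas, trim whitespace, count distinct items
count_unique_items <- function(x) {
  items <- trimws(unlist(strsplit(as.character(x), ",", fixed = TRUE)))
  length(unique(items))
}

# Apply the helper to every column; the result is a named integer vector
counts <- sapply(df, count_unique_items)
counts
```

as.character() keeps the helper safe on factor columns, which data.frame() created by default before R 4.0.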