R: factor levels as column names with counts as values
I would like to use the factor levels of several variables as column names, with the count for each PatID as the values. This is what I have:
data_sample <- data.frame(
  PatID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)
Can anyone help? I tried dcast and other approaches, but I never got the result I wanted.
library(tidyverse)
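data_sample %>%
  gather(status_num, value, -PatID) %>%      # wide to long: one row per status entry
  filter(value != "NA", value != ".") %>%    # drop the placeholder strings
  count(PatID, value) %>%                    # Improvement by @antoniosk
  spread(value, n, fill = 0)                 # long to wide: one column per level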
# A tibble: 3 x 4
  PatID  I250  M206  X560
1     1     2     1     0
2     2     2     1     1
3     3     1     2     2
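In current tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same pipeline with the newer functions, not part of the original answer; the name `result` is an assumption chosen for illustration, and `values_fill = 0` assumes tidyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

data_sample <- data.frame(
  PatID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)

# `result` is a hypothetical name used only in this sketch
result <- data_sample %>%
  # wide to long: one row per (PatID, status column) pair
  pivot_longer(-PatID, names_to = "status_num", values_to = "value") %>%
  # drop the placeholder strings "NA" and "."
  filter(!value %in% c("NA", ".")) %>%
  # number of occurrences of each level per patient
  count(PatID, value) %>%
  # long to wide: one column per level, 0 where a level never occurs
  pivot_wider(names_from = value, values_from = n, values_fill = 0)

print(result)
```

Note that "NA" in this data is a literal string rather than a missing value, which is why it is filtered by comparison and not with is.na().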
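With data.table you can do something like this (dcast is given no aggregate function here, so it falls back to length, which yields the counts):
library(data.table)
melt(as.data.table(data_sample), measure.vars = patterns("^status"))[!value %in% c(".", "NA"), dcast(.SD, PatID ~ value)]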
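The data.table one-liner from the comment can be unpacked into separate steps, which may be easier to follow. This is a rough sketch under a few assumptions: the intermediate names `long` and `wide`, the column name `status_num`, and the explicit `fun.aggregate = length` are additions for readability and do not appear in the original comment.

```r
library(data.table)

data_sample <- data.frame(
  PatID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)

# Step 1: wide to long, stacking every column whose name starts with "status"
# (`long` and `status_num` are hypothetical names for this sketch)
long <- melt(
  as.data.table(data_sample),
  id.vars = "PatID",
  measure.vars = patterns("^status"),
  variable.name = "status_num",
  value.name = "value"
)

# Step 2: drop the placeholder strings "." and "NA"
long <- long[!value %in% c(".", "NA")]

# Step 3: long to wide, counting rows per (PatID, level) combination;
# stating fun.aggregate explicitly avoids relying on dcast's default
wide <- dcast(long, PatID ~ value, fun.aggregate = length, value.var = "value")

print(wide)
```

Spelling out `fun.aggregate = length` makes the counting step visible instead of leaving it to the fallback that dcast applies when several rows share the same cell.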