R: factor levels as column names, with counts as values

I want to use the factor levels of several variables as column names, with the count for each PatID as the values. This is what I have:

data_sample <- data.frame(
  PatID   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)
Can anyone help? I tried dcast and other approaches, but I never got the result I wanted.

library(tidyverse)
data_sample %>%
  gather(status_num, value, -PatID) %>%
  filter(value != "NA", value != ".") %>%
  count(PatID, value) %>%  # Improvement by @antoniosk 
  spread(value, n, fill = 0)

# Resulting counts per PatID (missing combinations are 0 because of fill = 0):
#   PatID I250 M206 X560
# 1     1    2    1    0
# 2     2    2    1    1
# 3     3    1    2    2
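
The same reshape can also be written with `pivot_longer()` and `pivot_wider()`, which tidyr now recommends over `gather()` and `spread()`. The following is only a sketch, assuming tidyr 1.0 or later and dplyr are installed; the name `counts_wide` is made up for illustration and does not come from the original answer.

```r
library(dplyr)
library(tidyr)

data_sample <- data.frame(
  PatID   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA"),
  stringsAsFactors = FALSE
)

# counts_wide is a hypothetical name used only in this sketch
counts_wide <- data_sample %>%
  # stack status1..status3 into one long "value" column
  pivot_longer(-PatID, names_to = "status_num", values_to = "value") %>%
  # drop the placeholder strings "NA" and "."
  filter(!value %in% c("NA", ".")) %>%
  # one row per PatID/code pair, with its count in column n
  count(PatID, value) %>%
  # one column per code; PatID/code pairs that never occur become 0
  pivot_wider(names_from = value, values_from = n,
              values_fill = list(n = 0L))

print(counts_wide)
```

Note that `"NA"` is an ordinary string in this data, not a real missing value, which is why it is filtered by comparison rather than with `is.na()`.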
With data.table you can do the following:
library(data.table)
melt(as.data.table(data_sample), measure.vars = patterns("^status"))[
  !value %in% c(".", "NA"), dcast(.SD, PatID ~ value)]
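
For readers who prefer the data.table steps spelled out, here is a step-by-step sketch of the same melt/dcast idea. It assumes the data.table package is installed; the names `long` and `counts_wide` are illustrative only. Passing `fun.aggregate = length` explicitly makes the counting visible instead of relying on dcast's default when a cell has several rows.

```r
library(data.table)

data_sample <- data.frame(
  PatID   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA"),
  stringsAsFactors = FALSE
)

# long and counts_wide are hypothetical names used only in this sketch
# 1. stack all status* columns into a single "value" column
long <- melt(as.data.table(data_sample),
             id.vars = "PatID",
             measure.vars = patterns("^status"),
             variable.name = "status_num",
             value.name = "value")

# 2. drop the placeholder strings "." and "NA"
long <- long[!value %in% c(".", "NA")]

# 3. one column per code, counting rows per PatID;
#    empty cells get length() of an empty vector, i.e. 0
counts_wide <- dcast(long, PatID ~ value,
                     fun.aggregate = length,
                     value.var = "value")

print(counts_wide)
```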