R reshape from wide to long: multiple variables, observations with multiple indices


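I have some data made up of observations with multiple indices, $y_{ibc}$, stored in a messy wide format. I have been fiddling with tidyr and reshape2 but could not figure it out (reshaping really is my nemesis).

Here is an example: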

df <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9), a1b1c1 = c(5, 
2, 1, 4, 3, 1, 0, 1, 3), a2b1c1 = c(3, 4, 1, 1, 3, 2, 1, 4, 4
), a3b1c1 = c(4, 0, 0, 1, 1, 1, 0, 0, 1), a1b2c1 = c(1, 0, 4, 
2, 4, 1, 0, 4, 2), a2b2c1 = c(2, 0, 1, 0, 1, 0, 3, 2, 0), a3b2c1 = c(2, 
4, 3, 0, 2, 3, 3, 3, 4), yc1 = c(1, 2, 2, 1, 2, 2, 2, 1, 1), a1b1c2 = c(4, 
2, 3, 0, 4, 4, 2, 1, 4), a2b1c2 = c(3, 0, 3, 3, 4, 4, 3, 2, 2
), a3b1c2 = c(3, 1, 0, 1, 4, 0, 2, 2, 3), a1b2c2 = c(2, 2, 0, 
3, 2, 1, 4, 1, 0), a2b2c2 = c(3, 0, 2, 3, 4, 4, 4, 0, 4), a3b2c2 = c(0, 
0, 0, 2, 0, 0, 1, 4, 3), yc2 = c(2, 2, 2, 1, 2, 2, 2, 1, 1), X = c(5, 
6, 3, 7, 4, 3, 2, 3, 2)), row.names = c(NA, -9L), class = c("tbl_df", 
"tbl", "data.frame"))

Using tidyr and dplyr:

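library(tidyverse)

df %>% 
  # stack every a?b?c? column into name/value pairs
  pivot_longer(cols = matches("a.b.c."), names_to = "name", values_to = "value") %>% 
  # split e.g. "a1b1c1" after characters 2 and 4 into "a1", "b1", "c1"
  separate(name, into = c("a", "b", "c"), sep = c(2,4)) %>% 
  # pick the y value that belongs to the row's c level
  mutate(y = case_when(c == "c1" ~ yc1,
                       c == "c2" ~ yc2)) %>% 
  # spread the a levels back into columns a1, a2, a3
  pivot_wider(names_from = a, values_from = value) %>% 
  # reorder the columns; yc1 and yc2 are dropped here
  select(id, b, c, y, a1, a2, a3, X)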
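
First, pivot all the a/b/c columns to long format and separate the three values into individual columns. Then combine the y columns into a single column using mutate and case_when, based on the value of c (you could also use if_else, since there are only two options, but case_when extends to more values). Finally, pivot the a columns back to wide format and use select to put the columns in the right order, which also drops yc1 and yc2.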
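
To make the steps above easier to follow, here is a minimal sketch on a small made-up data frame that uses the if_else variant mentioned in the explanation. The object names toy and long and the values in toy are assumptions for illustration only; they are not part of the original data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two a levels, one b level, two c levels
toy <- tibble(
  id     = c(1, 2),
  a1b1c1 = c(5, 2),
  a2b1c1 = c(3, 4),
  yc1    = c(1, 2),
  a1b1c2 = c(4, 2),
  a2b1c2 = c(3, 0),
  yc2    = c(2, 2),
  X      = c(5, 6)
)

long <- toy %>%
  # Step 1: stack every a?b?c? column into name/value pairs
  pivot_longer(cols = matches("a.b.c."), names_to = "name", values_to = "value") %>%
  # Step 2: split "a1b1c1" after characters 2 and 4 into "a1", "b1", "c1"
  separate(name, into = c("a", "b", "c"), sep = c(2, 4)) %>%
  # Step 3: with only two c levels, if_else() can stand in for case_when()
  mutate(y = if_else(c == "c1", yc1, yc2)) %>%
  # Step 4: spread the a levels back into columns, then reorder and drop yc1/yc2
  pivot_wider(names_from = a, values_from = value) %>%
  select(id, b, c, y, a1, a2, X)

print(long)
```

Each id ends up with one row per b/c combination, so the two-row toy frame becomes four rows. With more than two c levels, case_when as in the answer is the safer choice, since if_else would silently send every non-c1 row to yc2.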
