R: reshape a data frame by moving part of each existing row into a new row
I have data in the following format:
structure(list(choice = structure(c(1L, 1L, 2L, 1L), .Label = c("option1",
"option2"), class = "factor"), option1var1 = structure(c(1L,
1L, 1L, 1L), .Label = "A", class = "factor"), option1var2 = structure(c(1L,
1L, 1L, 2L), .Label = c("B", "H"), class = "factor"), option2var1 = structure(c(1L,
1L, 2L, 3L), .Label = c("C", "F", "I"), class = "factor"), option2var2 = structure(1:4, .Label = c("D",
"E", "G", "K"), class = "factor")), .Names = c("choice", "option1var1",
"option1var2", "option2var1", "option2var2"), class = "data.frame", row.names = c(NA,
-4L))
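There are six columns, counting the row names: the first holds the respondent ID, the second records which option the respondent chose (option1 or option2), columns 3 and 4 hold the attributes of option 1, and columns 5 and 6 hold the attributes of option 2.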
I want to transform the data frame so that it looks like this:
structure(list(respondent = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L),
choice = c(1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L), option = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("option1", "option2"
), class = "factor"), var1 = structure(c(1L, 2L, 1L, 2L,
1L, 3L, 1L, 4L), .Label = c("A", "C", "F", "I"), class = "factor"),
var2 = structure(c(1L, 2L, 1L, 3L, 1L, 4L, 5L, 6L), .Label = c("B",
"D", "E", "G", "H", "K"), class = "factor")), .Names = c("respondent",
"choice", "option", "var1", "var2"), class = "data.frame", row.names = c(NA,
-8L))
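This requires splitting each row in two, keeping the option1 data in one row and moving the option2 data to a second row, and creating a new numeric variable that records which option was chosen (1 for the chosen option, 0 for the other).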
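I could not find anything about this type of transformation, either here or in the R documentation I looked at. Does anyone know how to do it?

Assuming the original data frame is df1 and the final output is df2: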
library(tidyverse)
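df2 <- df1 %>%
  # The row position serves as the respondent ID
  mutate(respondent = 1:n()) %>%
  # Stack the four option* columns into name/value pairs
  gather(Option, Value, starts_with("option")) %>%
  # Split e.g. "option1var1" after the 7th character into "option1" and "var1"
  separate(Option, into = c("option", "Var"), sep = 7) %>%
  # 1 if this row's option is the one the respondent chose, otherwise 0
  mutate(choice = ifelse(choice == option, 1L, 0L)) %>%
  # Turn var1 and var2 back into columns
  spread(Var, Value) %>%
  select(respondent, choice, option, starts_with("var")) %>%
  arrange(respondent, option)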
df2
#   respondent choice  option var1 var2
# 1          1      1 option1    A    B
# 2          1      0 option2    C    D
# 3          2      1 option1    A    B
# 4          2      0 option2    C    E
# 5          3      0 option1    A    B
# 6          3      1 option2    F    G
# 7          4      1 option1    A    H
# 8          4      0 option2    I    K
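
For what it's worth, the same reshape can probably be written in one step with pivot_longer(), which supersedes gather() and spread(). The following is only a sketch, assuming tidyr 1.0.0 or later; the regular expression in names_pattern is my own assumption about the column naming (an "option<digit>" prefix followed by "var<digit>"), and df1 is rebuilt by hand here so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's df1 by hand (factor columns, as in the dput output)
df1 <- data.frame(
  choice      = c("option1", "option1", "option2", "option1"),
  option1var1 = c("A", "A", "A", "A"),
  option1var2 = c("B", "B", "B", "H"),
  option2var1 = c("C", "C", "F", "I"),
  option2var2 = c("D", "E", "G", "K"),
  stringsAsFactors = TRUE
)

df2 <- df1 %>%
  # The row position serves as the respondent ID
  mutate(respondent = row_number()) %>%
  # The first capture group of each column name goes into `option`;
  # ".value" turns the second group (var1, var2) into output column names
  pivot_longer(
    cols = starts_with("option"),
    names_to = c("option", ".value"),
    names_pattern = "(option\\d+)(var\\d+)"
  ) %>%
  # 1 if this row's option is the one the respondent chose, otherwise 0
  mutate(choice = as.integer(as.character(choice) == option)) %>%
  select(respondent, choice, option, var1, var2)

df2
```

This should give the same eight rows as above, returned as a tibble. Because pivot_longer() keeps each respondent's rows together in the original column order, no arrange() step is needed.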