Recoding data values in a data frame into combined values in R


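I am trying to compare marital status. My variable has the values "married", "not married", "engaged", "single", and "not married". How can I make this data read only "married" and "not married"? (Engaged should count as married, and single should count as not married.)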

Sample dataset

df <- data.frame(mstatus = sample(x = c("married",
                                        "not married",
                                        "engaged",
                                        "single",
                                        "not married"),
                                  size = 15, replace = TRUE))
This is what I have so far:

library(dplyr)

# Lower-case the status values so that differently cased entries compare equal
df2 <- df %>%
  mutate(mstatus = tolower(mstatus))

If we need to recode mstatus, one option is fct_recode from forcats:

library(dplyr)
library(forcats)
df2 %>%
      mutate(mstatus = fct_recode(mstatus, married = "engaged",
         `not married` = "single"))
#      mstatus
#1     married
#2 not married
#3     married
#4 not married
#5 not married
Or, if there are many values to change, use fct_collapse, which can take a vector of old values for each new level:

df2 %>%
   mutate(mstatus = fct_collapse(mstatus, married = c('engaged'), 
         `not married` = c("single")))
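
As a minimal sketch of passing several old values per new level: the example below uses a hypothetical character vector named status (not part of the original data) and folds five raw values into two levels. It assumes forcats is installed, and the "nota married" entry stands in for a typo of the kind that shows up in hand-entered data.

```r
library(forcats)

# Hypothetical raw values, including one misspelling
status <- c("married", "not married", "engaged", "single", "nota married")

# Each argument names a new level and lists every old value folded into it
collapsed <- fct_collapse(
  status,
  married       = c("married", "engaged"),
  `not married` = c("not married", "single", "nota married")
)

as.character(collapsed)
levels(collapsed)
```

Listing every old value explicitly keeps the mapping visible in one place, which may be easier to audit than a chain of == comparisons.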
You can use mutate() from dplyr together with case_when():

df <- df %>% dplyr::mutate(mstatus = dplyr::case_when(
    mstatus == "married" | mstatus == "engaged"  ~ "married",
    mstatus == "not married" | mstatus == "single" ~ "not married"
))

I think the simplest way is an ifelse statement:

df2$mstatus_new <- ifelse(df2$mstatus=="engaged"|df2$mstatus=="married", "married", "not married")
Data:

df2 <- data.frame(
  mstatus = c("married", "not married", "engaged", "single", "nota married"))
df2
       mstatus
1      married
2  not married
3      engaged
4       single
5 nota married
Result:

df2
       mstatus mstatus_new
1      married     married
2  not married not married
3      engaged     married
4       single not married
5 nota married not married
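
Note that in the result above the misspelled "nota married" ends up as "not married" only because ifelse sends every non-matching value to its else branch. If it would be preferable for unexpected values to surface rather than be absorbed silently, one possible base R sketch uses a named lookup vector. The name lookup and the choice to leave unmatched values as NA are assumptions made for illustration, not part of the original answers.

```r
# Hypothetical lookup table: names are raw values, elements are target categories
lookup <- c(
  "married"     = "married",
  "engaged"     = "married",
  "not married" = "not married",
  "single"      = "not married"
)

df2 <- data.frame(
  mstatus = c("married", "not married", "engaged", "single", "nota married")
)

# Index the lookup by the lower-cased raw value; unname() drops the carried-over names
df2$mstatus_new <- unname(lookup[tolower(df2$mstatus)])

# Raw values missing from the lookup come back as NA, so typos are easy to list
df2$mstatus[is.na(df2$mstatus_new)]
```

Once the unmatched values have been reviewed, they can be added to the lookup under the appropriate category.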