How to categorize values in a specific column based on other columns in R

Tags: r, dataframe, dplyr, tidyverse

I have a data frame with the following details:

BatchId      Datetime              Purchase_Status        Current_Progress
PRT-10011    2021-03-01 15:18:24   Sold                   Pending
PRT-10012    2021-03-12 18:11:04   Sold                   
PRT-10013    2021-03-15 21:13:45   Open                   
PRT-10014                          Open                   
PRT-10015    2021-03-18 10:06:36   Return                 Pending
PRT-10016                          Process                Pending
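
Because the dput output is not reproduced here, the table above can be rebuilt by hand. This is only a sketch: it assumes the blank cells are NA and that Datetime is stored as character rather than as a date-time class.

```r
# Rebuild the sample data shown in the question.
# Assumption: blank cells are NA; Datetime is kept as a character column.
df <- data.frame(
  BatchId = c("PRT-10011", "PRT-10012", "PRT-10013",
              "PRT-10014", "PRT-10015", "PRT-10016"),
  Datetime = c("2021-03-01 15:18:24", "2021-03-12 18:11:04",
               "2021-03-15 21:13:45", NA,
               "2021-03-18 10:06:36", NA),
  Purchase_Status = c("Sold", "Sold", "Open", "Open", "Return", "Process"),
  Current_Progress = c("Pending", NA, NA, NA, "Pending", "Pending"),
  stringsAsFactors = FALSE
)

str(df)
```

If the real data stores blanks as empty strings instead of NA, the conditions used later need to test for both.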

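I need to add one more column, Category, based on the following conditions:

  • If Purchase_Status is Sold and Current_Progress is not blank, NA or null, concatenate the Purchase_Status value and the Current_Progress value with "-".
  • If Purchase_Status is Sold and Current_Progress is blank, NA or null, concatenate the Purchase_Status value with the text "Not Updated" using "-".
  • If Purchase_Status is Open and Datetime is not blank, NA or null, concatenate the Purchase_Status value with the text "Order Placed" using "-".
  • If Purchase_Status is Open and Datetime is blank, NA or null, concatenate the Purchase_Status value with the text "Order Not Placed" using "-".
  • For every other Purchase_Status (anything except Sold and Open), treat it as Other and concatenate it with the text "Order Not Placed" or "Order Placed", depending on whether the Datetime column has a value.

Expected output df: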

BatchId      Datetime              Purchase_Status        Current_Progress     Category
PRT-10011    2021-03-01 15:18:24   Sold                   Pending              Sold - Pending
PRT-10012    2021-03-12 18:11:04   Sold                                        Sold - Not Updated
PRT-10013    2021-03-15 21:13:45   Open                                        Open - Order Placed
PRT-10014                          Open                                        Open - Order Not Placed
PRT-10015    2021-03-18 10:06:36   Return                 Pending              Other - Order Placed
PRT-10016                          Process                Pending              Other - Order Not Placed

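As mentioned in the comments, you should be able to use dplyr::case_when for this. Your call should look something like this:

library(dplyr)  # provides %>%, mutate() and case_when()

df %>%
  dplyr::mutate(Category = dplyr::case_when(
    Purchase_Status == "Sold" & !is.na(Current_Progress) ~ paste(Purchase_Status, Current_Progress, sep = "-")
    # OTHER CASES HERE, each separated by a comma
  ))

Add the other cases, using ~ to map each condition to its value.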
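
The question asks for "blank, NA or null" to be treated alike, which is.na() alone does not cover, because an empty string is not NA. One possible way to handle that is a small helper; is_blank is a hypothetical name, not part of dplyr or base R.

```r
# Hypothetical helper (not from any package): TRUE where a value is NA
# or a string that is empty or contains only whitespace.
is_blank <- function(x) {
  is.na(x) | trimws(as.character(x)) == ""
}

is_blank(c("Pending", "", NA, "  "))
# FALSE TRUE TRUE TRUE
```

Inside case_when, a condition such as Purchase_Status == "Sold" & !is_blank(Current_Progress) would then cover both representations of a missing value.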

library(dplyr)
library(tidyr)  # for replace_na()

df %>%
  replace_na(list(Current_Progress = "")) %>%  # simplifies below to test for just ""
                                               # instead of "" and NA
  mutate(Category = case_when(
    Purchase_Status == "Sold" & Current_Progress != "" ~ paste0(Purchase_Status, "-", Current_Progress),
    Purchase_Status == "Sold" ~ paste0(Purchase_Status, "-Not Updated"),
    Purchase_Status == "Open" & !is.na(Datetime) ~ paste0(Purchase_Status, "-Order Placed"),  # Open rows depend on Datetime
    Purchase_Status == "Open" ~ paste0(Purchase_Status, "-Order Not Placed"),
    is.na(Datetime) ~ "Other-Order Not Placed",  # any remaining status counts as "Other"
    TRUE ~ "Other-Order Placed")
  )

dplyr::case_when tests the conditions in order, so once none of the earlier cases has matched, the last step needs no test of its own and can simply use TRUE.

         BatchId            Datetime Purchase_Status Current_Progress               Category
12426  PRT-10011 2019-05-20 10:46:49            Sold          Pending           Sold-Pending
21988  PRT-10012 2020-09-24 12:28:10            Sold                        Sold-Not Updated
22555  PRT-10013 2019-05-31 06:12:12            Open                       Open-Order Placed
12486  PRT-10014                <NA>            Open                   Open-Order Not Placed
15432  PRT-10015 2019-09-26 11:36:58          Return          Pending     Other-Order Placed
16934  PRT-10016                <NA>         Process          Pending Other-Order Not Placed
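
The way case_when evaluates its conditions in order can be seen on a small vector. This is an illustrative sketch with made-up values (x and size are hypothetical names), not part of the original data.

```r
library(dplyr)

x <- c(12, 7, 2)

# Conditions are checked top to bottom; the first one that is TRUE wins.
size <- case_when(
  x > 10 ~ "large",
  x > 5  ~ "medium",  # 12 also satisfies this, but it was already matched above
  TRUE   ~ "small"    # fallback for anything not matched earlier
)

size
# "large" "medium" "small"
```

This is why the more specific conditions have to come before the more general ones, with the catch-all TRUE last.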

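A great use case for dplyr::case_when. Could you please include your data in a form we can load directly, e.g. by including the output of dput(your_data_frame) in your question?

@JonSpring - I have updated the question with the dput output.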