R: mutate several columns conditionally, each with different settings (R, dplyr, conditional, mutate)

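I have been searching, but I haven't found how to write a simple "if" that applies to many columns in dplyr.

I have the following code (it works):

I don't want to repeat the condition. I would like something like this: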

PlantGrowth %>% mutate_if_foo (
  group=="ctrl",{
   a=weight*2,
   b=weight*1.5,
   c=weight*4,
   d=weight*5
  }
)%>% mutate_if_foo (
  group!="ctrl",{
   a=weight*100,
   b=weight/100,
   c=weight*100,
   d=weight/1000
  }
)

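I have found plenty of answers about mutate_if, mutate_all, mutate_at and case_when, but they don't answer my question.

Please use dplyr/tidyverse.

Thanks in advance.

Edit

I tried this, following @Rohit_das's idea of using a function: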

mtcars %>% ( function(df) { 
  if (df$am==1){
    df%>% mutate(
      a=df$mpg*3,
      b=df$cyl*10) 
   }else{ 
     df%>% mutate(
      a=df$disp*300,
      d=df$cyl*1000) 
   }
})

But I get a warning:

In if (df$am == 1) { : 
the condition has length > 1 
and only the first element will be used
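The warning appears because if () expects a single TRUE or FALSE, while df$am == 1 holds one value per row, so only the first row decides which branch runs for the whole data frame. Since R 4.2.0 the same code stops with an error instead of a warning. A minimal sketch of a row-by-row version, assuming that columns b and d should be NA in the rows where the attempt above does not define them (that choice is an assumption, not something stated in the question):

```r
library(dplyr)

# if_else() is vectorised: it picks a value for every row,
# whereas if () can only look at a single TRUE/FALSE.
result <- mtcars %>%
  mutate(
    a = if_else(am == 1, mpg * 3, disp * 300),
    b = if_else(am == 1, cyl * 10, NA_real_),   # assumed: NA when am != 1
    d = if_else(am == 1, NA_real_, cyl * 1000)  # assumed: NA when am == 1
  )
```

This still repeats the condition once per column, which is exactly what the question wants to avoid, but it shows why the function-based attempt misbehaves.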

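I'm not sure I understand the problem here. If you just want to make the code less verbose, simply create a custom function:

# group and weight are passed in explicitly: a function called inside
# mutate() cannot see the data frame's columns on its own
customif <- function(group, weight, x, y) {
  if_else(group == "ctrl", weight * x, weight * y)
}

Then you can call this function in your mutate:

PlantGrowth %>% mutate(
  a = customif(group, weight, 2, 100),
  b = customif(group, weight, 1.5, 1/100),
  c = customif(group, weight, 4, 100),
  d = customif(group, weight, 5, 1/1000)
)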

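I think I found a neat solution. It takes the input data frame and creates the new columns a to d, naming each one dynamically. The first column uses x = 2, y = 100 and z = "a", the second uses the next row of the parameter table, and so on. The nice thing about functional programming like this is that it is easy to extend: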

library(tidyverse)

iterate <- tibble(x = c(2, 1.5, 4, 5),
                  y = c(100, 1/100, 100, 1/1000),
                  z = c("a", "b", "c", "d"))

fun <- function(x, y, z) {
  PlantGrowth %>% 
    mutate(!!z := if_else(group == "ctrl", weight * x, weight * y)) %>% 
    select(3)
}

PlantGrowth %>% 
  bind_cols(
    pmap_dfc(iterate, fun)
    ) %>% 
  as_tibble

This gives you the same df:

# A tibble: 30 x 6
   weight group     a     b     c     d
    <dbl> <fct> <dbl> <dbl> <dbl> <dbl>
 1   4.17 ctrl   8.34  6.26  16.7  20.8
 2   5.58 ctrl  11.2   8.37  22.3  27.9
 3   5.18 ctrl  10.4   7.77  20.7  25.9
 4   6.11 ctrl  12.2   9.17  24.4  30.6
 5   4.5  ctrl   9     6.75  18    22.5
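The two ideas doing the work in that answer are the row-wise iteration of pmap and the dynamic column name created with !!z :=. The sketch below isolates both; the names described and one_column are made up for illustration only:

```r
library(dplyr)
library(purrr)
library(tibble)

iterate <- tibble(x = c(2, 1.5, 4, 5),
                  y = c(100, 1/100, 100, 1/1000),
                  z = c("a", "b", "c", "d"))

# pmap_chr() calls the function once per row of `iterate`;
# the columns x, y and z are matched to the arguments by name.
described <- pmap_chr(iterate, function(x, y, z) {
  paste0(z, ": weight * ", x, " for ctrl, weight * ", y, " otherwise")
})

# `!!z := value` creates a column whose name is the string stored in z.
z <- "a"
one_column <- PlantGrowth %>% mutate(!!z := weight * 2)
```

Printing described shows one line per parameter row, which is a quick way to check the table before running the real mutate.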

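I think I found the answer. I tested it on mtcars, but I haven't tested it on my real code yet.

Please comment if you think my approach is wrong.

The filter conditions must be mutually exclusive, otherwise I would get duplicate rows.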

        library(dplyr)
        library(magrittr)
        library(tibble) # only if necessary to preserve rownames
        mtcars %>% ( function(df) { 
                rbind(
                    (df 
                     %>% tibble::rownames_to_column(.) %>%tibble::rowid_to_column(.)  # to preserve rownames
                     %>%dplyr::filter(am==1) 
                     %>%dplyr::mutate(
                        a=mpg*3,
                        b=cyl*10,d=NA)),
                    (df 
                     %>% tibble::rownames_to_column(.) %>%tibble::rowid_to_column(.)  # to preserve rownames
                     %>%dplyr::filter(am!=1) 
                     %>%dplyr::mutate(
                      a=disp*3,
                      d=cyl*100,b=NA))
                )
        })     %>%arrange(rowid)
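The same filter, mutate and re-bind idea can be wrapped in a small helper so that each condition is written only once, close to the mutate_if_foo syntax wished for in the question. This is only a sketch: mutate_rows and its temporary .row column are hypothetical names, not part of dplyr, and rows that match no condition are left with NA in the new columns.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): run mutate(...) only on the
# rows where `condition` is TRUE, then put all rows back in their original order.
mutate_rows <- function(.data, condition, ...) {
  indexed <- .data %>% mutate(.row = row_number())
  hit  <- indexed %>% filter({{ condition }}) %>% mutate(...)
  miss <- indexed %>% filter(!.row %in% hit$.row)
  # bind_rows() fills columns that exist on one side only with NA
  bind_rows(hit, miss) %>% arrange(.row) %>% select(-.row)
}

result <- PlantGrowth %>%
  mutate_rows(group == "ctrl",
              a = weight * 2, b = weight * 1.5, c = weight * 4, d = weight * 5) %>%
  mutate_rows(group != "ctrl",
              a = weight * 100, b = weight / 100, c = weight * 100, d = weight / 1000)
```

Because the untouched rows are carried along on every call, the conditions no longer have to cover all rows between them, and overlapping conditions simply overwrite earlier values instead of duplicating rows.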

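A custom function can be a solution, and thanks for the example, but it is not what I'm looking for. I used simple operations here, but the real ones may be more complex and may not share the same n-tuple of parameters. I'm looking for a way to state each condition only once, perhaps by applying your custom function once per condition. That is possible in procedural code, but I'm looking for a solution in the dplyr pipeline style; maybe dplyr isn't designed for that, just as SQL isn't. And it's true: this is only about reducing verbosity.

It's a lovely piece of brain work, but in this case it is more complex than the original code :p

@phili_b That's certainly true! It should only be used when it makes the code easier. If you find you need to apply the same logic to more than 4 columns and the code is getting "long", this helps you stay sane :)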
