How to use a loop inside summarise() in dplyr

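I am trying to create a large number of aggregate variables with dplyr's summarise() function. I thought about using a for loop, but it does not work. Does anyone have an idea?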

library(dplyr)
library(rlang)
iris %>% 
  group_by(Species) %>% 
  summarise(
    total_Petal=sum(Petal.Length),
    total_Sepal=sum(Sepal.Length)
  )
# Trying the equivalent with a for loop (this is the part that fails)
iris %>% 
  group_by(Species) %>% 
  summarise(
    for (part in c("Petal","Sepal")) {
      !!sym(paste0("total_",part)) := sum(!!sym(paste0(part,".Length")))
    }
  )

Thanks a lot.

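You should not use a for loop inside summarise(). If you need to apply the same function to several columns, the way to do it is across(). See the example below: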

library(dplyr)

iris %>% 
 group_by(Species) %>% 
 summarise(across(c("Petal.Length", "Sepal.Length"), sum, .names = "total_{.col}"))

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 3 x 3
#>   Species    total_Petal.Length total_Sepal.Length
#>   <fct>                   <dbl>              <dbl>
#> 1 setosa                   73.1               250.
#> 2 versicolor              213                 297.
#> 3 virginica               278.                329.
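If the column names really have to be assembled programmatically, as the loop in the question tries to do, one possible alternative (a sketch, not part of the answer above) is to build the expressions first and splice them into summarise() with `!!!`. The names `parts`, `exprs` and `result` are illustrative only.

```r
library(dplyr)
library(rlang)

parts <- c("Petal", "Sepal")

# Build one unevaluated call per part, e.g. sum(Petal.Length)
exprs <- lapply(parts, function(part) {
  expr(sum(!!sym(paste0(part, ".Length"))))
})

# The list names become the output column names
names(exprs) <- paste0("total_", parts)

# Splice the named expressions in as if they had been typed by hand
result <- iris %>%
  group_by(Species) %>%
  summarise(!!!exprs)

result
```

This keeps the "loop" outside summarise(), where ordinary R code can run. For the simple case of one function over many columns, across() is still the shorter option.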

If you want to drop the trailing ".Length", my suggestion is to use rename_with() as a second step:

iris %>% 
 group_by(Species) %>% 
 summarise(across(ends_with(".Length"), sum, .names = "total_{.col}")) %>% 
 rename_with(stringr::str_remove, ends_with(".Length"), pattern = "\\.Length$")
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 3 x 3
#>   Species    total_Sepal total_Petal
#>   <fct>            <dbl>       <dbl>
#> 1 setosa            250.        73.1
#> 2 versicolor        297.       213  
#> 3 virginica         329.       278. 

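I wrote the pattern as "\\.Length$" rather than ".Length" so that the dot is treated as a literal dot ("\\.") and the match is anchored at the very end of the name ("$").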
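The difference between a literal suffix and a regular expression can be checked in base R alone. This is a small sketch; the vector `cols` and its third, made-up entry are assumptions added purely to show what the unescaped dot does.

```r
# Hypothetical column names; the third one has no literal ".Length" suffix
cols <- c("total_Sepal.Length", "total_Petal.Length", "total_PetalXLength")

# Literal suffix test: no regular expression involved
is_suffix <- endsWith(cols, ".Length")
is_suffix
#> [1]  TRUE  TRUE FALSE

# Unescaped dot: "." matches any character, so "XLength" is removed too
loose <- sub(".Length$", "", cols)
loose
#> [1] "total_Sepal" "total_Petal" "total_Petal"

# Escaped dot plus "$": only a real ".Length" at the end is removed
strict <- sub("\\.Length$", "", cols)
strict
#> [1] "total_Sepal"        "total_Petal"        "total_PetalXLength"
```

The selection helper ends_with() also matches its argument literally, which is why the regular expression belongs in the `pattern` argument of str_remove() and not in the column selection.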

Because across() operates on existing columns, you do not need to (and should not) quote the column names. They are names, not strings.

Both work. If you need to use strings, you can. Perhaps it would be better to wrap them in all_of(...), but it works either way.

Sure, but it is very confusing (not only, but especially, for beginners). I strongly recommend keeping strings and names strictly separate (they are fundamentally different things), and I hope a future version of R will finally deprecate the historical mistake of allowing strings to be treated as names.

Sounds good, I added a note to my answer.

Thanks for your answer. What if I want the final column to be named "total_Petal" rather than "total_Petal.Length"?
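The two styles discussed in the comments can be put side by side. This is a minimal sketch, assuming a dplyr version that provides across() and all_of(); the object names `by_name`, `by_string` and `cols` are illustrative only.

```r
library(dplyr)

# Bare column names, as recommended in the comments
by_name <- iris %>%
  group_by(Species) %>%
  summarise(across(c(Petal.Length, Sepal.Length), sum, .names = "total_{.col}"))

# Column names stored as strings, made explicit with all_of()
cols <- c("Petal.Length", "Sepal.Length")
by_string <- iris %>%
  group_by(Species) %>%
  summarise(across(all_of(cols), sum, .names = "total_{.col}"))

# Both approaches select the same columns and produce the same totals
all.equal(as.data.frame(by_name), as.data.frame(by_string))
```

Wrapping a character vector in all_of() makes it clear that the strings are data rather than names, and it raises an error if one of the listed columns does not exist.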