How do I combine transmute with grep? (R, dplyr)

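I am trying to find a way to create a new table from an existing data frame using the rowSums() function. For example, my existing data frame is called 'asn', and for each row I want to sum the values of all variables whose names contain "2011". I want a new table with a single column named asn_y2011 that holds, for each row, the sum of the variables containing "2011".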

Data

asn <- structure(list(row = 1:3, south_2010 = c(1L, 5L, 7L), south_2011 = c(4L, 0L, 4L), south_2012 = c(5L, 8L, 6L), north_2010 = c(3L, 4L, 1L), north_2011 = c(2L, 6L, 0L), north_2012 = c(1L, 1L, 2L)), class = "data.frame", row.names = c(NA, -3L))

The existing 'asn' data frame looks like this:

row south_2010 south_2011 south_2012 north_2010 north_2011 north_2012
  1      1           4         5          3          2          1
  2      5           0         8          4          6          1
  3      7           4         6          1          0          2

I am trying to use the following code:

asn %>%   
   transmute(asn_y2011 = rowSums(, grep("2011")))

to get something like this:

row    asn_y2011
 1         6
 2         6
 3         4

I think this code does what you are asking for:

library(magrittr)
tibble::tibble(row = 1:3, south_2011 = c(4, 0, 4), north_2011 = c(2, 6, 0)) %>%
  tidyr::gather(- row, key = "key", value = "value") %>%
  dplyr::mutate(year = purrr::map_chr(.x = key, .f = function(x)stringr::str_split(x, pattern = "_")[[1]][2])) %>%
  dplyr::group_by(row, year) %>%
  dplyr::summarise(sum(value))

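I first load the magrittr package so that I can use the pipe %>%. I have explicitly named the package each function comes from, but you are welcome to load the packages with library instead.

I then create a tibble (a data frame) like the one you specified.

I use gather to reshape the data frame into long format before creating the new variable year. I then sum the values for each combination of row and year.

Building on your own code, the grep() call should look like this: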

library(dplyr)

asn %>%
  transmute(row, asn_y2011 = rowSums(.[grep("2011", names(.))]))

#   row asn_y2011
# 1   1         6
# 2   2         6
# 3   3         4
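To see why the corrected call works, it can help to look at what grep() returns on its own. The following is a minimal base R sketch, assuming the asn data frame from the question; the variable name cols_2011 is introduced here only for illustration.

```r
# Data from the question, assigned to `asn`.
asn <- data.frame(
  row = 1:3,
  south_2010 = c(1L, 5L, 7L), south_2011 = c(4L, 0L, 4L), south_2012 = c(5L, 8L, 6L),
  north_2010 = c(3L, 4L, 1L), north_2011 = c(2L, 6L, 0L), north_2012 = c(1L, 1L, 2L)
)

# grep() needs both a pattern and the character vector to search.
# Applied to the column names, it returns the positions of the matches.
cols_2011 <- grep("2011", names(asn))  # illustrative name, not from the original
cols_2011
# [1] 3 6

# Indexing the data frame with those positions keeps only the 2011 columns,
# which rowSums() then adds up row by row.
sums_2011 <- rowSums(asn[cols_2011])
sums_2011
```

The original attempt, rowSums(, grep("2011")), passes neither a data frame to rowSums() nor a vector of names to grep(), so there is nothing to search or to sum.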

Alternatively, you can use tidy selection inside c_across() together with rowwise().
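A minimal sketch of the c_across() approach, assuming the asn data frame from the question and dplyr 1.0.0 or later (the version that introduced c_across()). The result name res is illustrative only.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for c_across()

# Data from the question, assigned to `asn`.
asn <- data.frame(
  row = 1:3,
  south_2010 = c(1L, 5L, 7L), south_2011 = c(4L, 0L, 4L), south_2012 = c(5L, 8L, 6L),
  north_2010 = c(3L, 4L, 1L), north_2011 = c(2L, 6L, 0L), north_2012 = c(1L, 1L, 2L)
)

res <- asn %>%
  rowwise() %>%                       # evaluate the next step one row at a time
  transmute(row,
            asn_y2011 = sum(c_across(contains("2011")))) %>%
  ungroup()                           # drop the rowwise grouping again

res
```

Here contains("2011") selects the columns by name, so no call to grep() is needed. On larger data, rowSums() is usually faster than a rowwise() pipeline.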

You can try this approach:

library(tidyverse)
df2 <- df %>% 
  select(grep("_2011|row", names(df), value = TRUE)) %>% 
  rowwise() %>% 
  mutate(asn_y2011 = sum(c_across(south_2011:north_2011))) %>% 
  select(row, asn_y2011)
  
#     row asn_y2011
#   <int>     <int>
# 1     1         6
# 2     2         6
# 3     3         4

Data

df <- structure(list(row = 1:3, south_2010 = c(1L, 5L, 7L), south_2011 = c(4L, 0L, 4L), south_2012 = c(5L, 8L, 6L), north_2010 = c(3L, 4L, 1L), north_2011 = c(2L, 6L, 0L), north_2012 = c(1L, 1L, 2L)), class = "data.frame", row.names = c(NA,-3L))

An option in base R with Reduce:
cbind(df['row'], asn_y2011 = Reduce(`+`, df[endsWith(names(df), '2011')]))
#  row asn_y2011
#1   1         6
#2   2         6
#3   3         4

Another base R option, using rowSums:
cbind(asn[1],asn_y2011 = rowSums(asn[grep("2011",names(asn))]))

which gives
  row asn_y2011
1   1         6
2   2         6
3   3         4

pivot_longer is the new and improved function that replaces gather. It is more powerful and more intuitive.
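As a rough sketch of how the gather-based answer might look with pivot_longer, assuming the asn data frame from the question, tidyr 1.0.0 or later, and dplyr 1.0.0 or later (for the .groups argument). The column names region and total and the object name long_sums are illustrative and do not appear in the original answers.

```r
library(dplyr)
library(tidyr)  # assumes tidyr >= 1.0.0 for pivot_longer()

# Data from the question, assigned to `asn`.
asn <- data.frame(
  row = 1:3,
  south_2010 = c(1L, 5L, 7L), south_2011 = c(4L, 0L, 4L), south_2012 = c(5L, 8L, 6L),
  north_2010 = c(3L, 4L, 1L), north_2011 = c(2L, 6L, 0L), north_2012 = c(1L, 1L, 2L)
)

long_sums <- asn %>%
  # Split names such as "south_2011" into a region part and a year part.
  pivot_longer(-row, names_to = c("region", "year"), names_sep = "_") %>%
  group_by(row, year) %>%
  summarise(total = sum(value), .groups = "drop")

# Keep only the 2011 totals and rename the column as in the question.
long_sums %>%
  filter(year == "2011") %>%
  transmute(row, asn_y2011 = total)
```

Because names_sep splits the column names directly, the separate str_split() step from the gather version is not needed, and the same pipeline yields the totals for every year rather than only 2011.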