R: How do I use map() to add a grouped index to the data frames in a data frame column?
I have two measurements for each of two groups of samples. Here is a simplified version with 6 samples per group:
library(tidyverse)
df <- tibble(group = c(rep("group_A", 12), rep("group_B", 12)),
sample = rep(1:6, 4),
measurement = rep(c(rep("meas_A", 6), rep("meas_B", 6)), 2),
value = round(runif(24, min = 0, max = 60)))
These data can be held as nested data frames in a list-column (df2$data) and mapped over as follows:
df2 %>% mutate(data = map(data, ~spread(.x, group_meas, value)))
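The construction of df2 is not shown in the question. The following is only a sketch of one way it might have been built, assuming the same united data is reused under four condition labels. The labels "Three" and "Four", the helper name df_united, and the seed are assumptions made for illustration.

```r
library(tidyverse)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df <- tibble(group = rep(c("group_A", "group_B"), each = 12),
             sample = rep(1:6, 4),
             measurement = rep(rep(c("meas_A", "meas_B"), each = 6), 2),
             value = round(runif(24, min = 0, max = 60)))

# Combine group and measurement into the single key column used by spread()
df_united <- unite(df, group_meas, group, measurement)

# Hypothetical nesting: one copy of the data per (made-up) condition label
df2 <- tibble(condition = c("One", "Two", "Three", "Four"),
              data = list(df_united, df_united, df_united, df_united))

# With no repeated samples, spreading each nested data frame works
wide <- df2 %>% mutate(data = map(data, ~ spread(.x, group_meas, value)))
```

Each element of wide$data then has one row per sample and one column per group/measurement combination.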
My problem arises when a sample has been measured more than once: spread() then fails because there are duplicate identifiers for rows.
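The failure can be reproduced on a tiny data frame. This is a minimal sketch with made-up values (the tibble dup is hypothetical); the exact wording of the error depends on the installed tidyr version, so the message is captured rather than quoted.

```r
library(tidyverse)

# Hypothetical data: sample 1 is measured twice for the same group/measurement
dup <- tibble(sample = c(1, 1, 2),
              group_meas = rep("group_A_meas_A", 3),
              value = c(10, 12, 30))

# spread() cannot place two values in the same output cell, so it errors;
# tryCatch() turns the error into its message text
result <- tryCatch(spread(dup, group_meas, value),
                   error = function(e) conditionMessage(e))
print(result)
```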
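I think the best way to solve this is to add a new index column, grouped on the combined group/measurement, which provides unique row identifiers. This works for a single data frame: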
df %>% unite(group_meas, group, measurement) %>%
group_by(group_meas) %>%
mutate(gr_m_index = row_number())
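To illustrate why the index helps, here is a small sketch on made-up data (the tibble dup is hypothetical) in which sample 1 appears twice for one measurement. The ungroup() call is an addition for safety before spreading, not something stated in the question.

```r
library(tidyverse)

# Hypothetical data: sample 1 is measured twice for group_A_meas_A
dup <- tibble(sample = c(1, 1, 2, 1, 2),
              group_meas = c(rep("group_A_meas_A", 3), rep("group_A_meas_B", 2)),
              value = c(10, 12, 30, 7, 9))

# Number the rows within each group/measurement combination
indexed <- dup %>%
  group_by(group_meas) %>%
  mutate(gr_m_index = row_number()) %>%
  ungroup()

# sample + gr_m_index now identify each output row uniquely, so spread() succeeds
wide <- spread(indexed, group_meas, value)
```

Note that the index counts rows within each group_meas, so values from different measurements land on the same output row only when both sample and gr_m_index match; otherwise the cell is filled with NA.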
However, I cannot get this to scale to the nested list:
df2 %>% mutate(data = map(data, ~ group_by(.x, group_meas) %>%
mutate(gr_m_index = row_number())))
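I think this must be a tidyeval problem, because I get the following error, which suggests it is looking in the wrong place:
Evaluation error: Column `gr_m_index` must be length 24 (the number of rows) or one, not 4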
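How can I use map() to add a grouped index to the data frames in a data frame column?
As far as I can tell from the error message, row_number() returns c(1, 2, 3, 4). This is because the row numbers are computed from df2 rather than from the nested data frames.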
Either of the following approaches should work:
Method 1. Define all the transformations to be mapped as a standalone function
index_spread <- function(data){
return(data %>%
group_by(group_meas) %>%
mutate(gr_m_index = row_number()) %>%
spread(group_meas, value))
}
df2 %>% mutate(data = map(data, index_spread)) %>% unnest(data)
# A tibble: 24 x 7
   condition sample gr_m_index group_A_meas_A group_A_meas_B group_B_meas_A group_B_meas_B
   <chr>      <int>      <int>          <dbl>          <dbl>          <dbl>          <dbl>
 1 One            1          1             12             43             39             52
 2 One            2          2             11             60              8             20
 3 One            3          3             41             23             16             29
 4 One            4          4             23             47             23             36
 5 One            5          5             46             56              1             30
 6 One            6          6             30             13             23             11
 7 Two            1          1             12             43             39             52
 8 Two            2          2             11             60              8             20
 9 Two            3          3             41             23             16             29
10 Two            4          4             23             47             23             36
# ... with 14 more rows
Method 2. Apply map() directly to the column, without going through mutate()
df2$data <- map(df2$data, ~group_by(.x, group_meas) %>%
mutate(gr_m_index = row_number()) %>%
spread(group_meas, value))
df2 %>% unnest(data)
# (same output as above)
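As a side note, spread() has been superseded in tidyr by pivot_wider(). The following sketch shows how the standalone-function approach might look with pivot_wider() instead. This swaps in a different function from the one used in the answer, and the names index_widen, toy, and nested are hypothetical; it assumes tidyr 1.0.0 or later.

```r
library(tidyverse)

# Same idea as index_spread, but using pivot_wider() in place of spread()
index_widen <- function(data) {
  data %>%
    group_by(group_meas) %>%
    mutate(gr_m_index = row_number()) %>%  # index within each group/measurement
    ungroup() %>%
    pivot_wider(names_from = group_meas, values_from = value)
}

# Hypothetical toy data: 3 samples, 2 group/measurement combinations
toy <- tibble(sample = rep(1:3, 2),
              group_meas = rep(c("group_A_meas_A", "group_A_meas_B"), each = 3),
              value = c(1, 2, 3, 4, 5, 6))

# Hypothetical nested data frame with two condition labels
nested <- tibble(condition = c("One", "Two"), data = list(toy, toy))

out <- nested %>%
  mutate(data = map(data, index_widen)) %>%
  unnest(data)
```

Because row_number() is evaluated inside index_widen, it is computed on each nested data frame rather than on the outer one.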