Indexing after sub-grouping or sub-nesting in R


This is my dataset:

  record_id voucher_number ice
1          1           app1 app
2          1          00000   1
3          1          11111   1
4          1          22222   1
5          1          11111   2
6          2           app2 app
7          2          33333   1
8          2          44444   1
9          2          33333   2
10         2          33333   3
11         3           app3 app
12         3          55555   1
13         3          66666   1
14         3          55555   2
15         3          66666   2
16         3          55555   3
17         3          77777   1

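Now, after grouping by `record_id`, I want to create a new variable called `sets`. `sets` should index the `voucher_number` column within each group: when `ice` is `app`, `sets` should also be `app`; otherwise, each voucher number that is unique within the group gets the next index: 1, 2, 3, 4 and so on.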

I used the following:

sets = match(voucher_number, unique(voucher_number))

This works, but I cannot exclude the value `app` when building the index. What is the most efficient way to do this?
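To make the problem concrete, here is a minimal base R sketch restricted to the first group. The two vectors are copied from the rows with `record_id` 1 above; nothing else is assumed. Because `app1` is itself a unique value, `match()` assigns it index 1 and every real voucher number is shifted up by one.

```r
# voucher numbers and ice values for record_id 1, copied from the dataset above
voucher_number <- c("app1", "00000", "11111", "22222", "11111")
ice <- c("app", "1", "1", "1", "2")

# the approach from the question: position of each value among the unique values
sets <- match(voucher_number, unique(voucher_number))
print(sets)
# [1] 1 2 3 4 3
# the "app" row is counted as index 1, so the real vouchers start at 2
```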

Here is a solution using dplyr:
data <- read.table(text = "  record_id voucher_number ice
1          1           app1 app
2          1          00000   1
3          1          11111   1
4          1          22222   1
5          1          11111   2
6          2           app2 app
7          2          33333   1
8          2          44444   1
9          2          33333   2
10         2          33333   3
11         3           app3 app
12         3          55555   1
13         3          66666   1
14         3          55555   2
15         3          66666   2
16         3          55555   3
17         3          77777   1", header = TRUE, stringsAsFactors = FALSE)

library(dplyr)

data %>% 
  group_by(record_id) %>% 
  mutate(sets = if_else(ice == "app",
                        "app",
                        as.character(row_number() - 1)))
# A tibble: 17 x 4
# Groups:   record_id [3]
   record_id voucher_number ice   sets 
       <int> <chr>          <chr> <chr>
 1         1 app1           app   app  
 2         1 00000          1     1    
 3         1 11111          1     2    
 4         1 22222          1     3    
 5         1 11111          2     4    
 6         2 app2           app   app  
 7         2 33333          1     1    
 8         2 44444          1     2    
 9         2 33333          2     3    
10         2 33333          3     4    
11         3 app3           app   app  
12         3 55555          1     1    
13         3 66666          1     2    
14         3 55555          2     3    
15         3 66666          2     4    
16         3 55555          3     5    
17         3 77777          1     6 

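Thanks @starja! Let's add a layer to it. Please look at positions #3, #5, #7, #9 and #10, where the voucher number is the same for two different record ids. If the voucher number is the same, I want the `sets` variable to show the same index. What is the most efficient way to do this?

See my edit, does that work for you? Unfortunately my code is a bit more complicated now; maybe there is a simpler solution.

What does this `cur_group_id()` mean?

It is the id of the group a row belongs to. The groups are generated by `group_by`, so it also serves as the index for that row.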
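To illustrate the comment about `cur_group_id()`, here is a small sketch on hypothetical toy data (the data frame `toy` and its columns `g` and `v` are made up for this example and are not part of the question). It assumes dplyr 1.0.0 or later, where `cur_group_id()` is available. Each distinct combination of the grouping columns gets one integer id, and the ids follow the sorted group keys rather than the order in which rows appear.

```r
library(dplyr)

# hypothetical toy data, not the dataset from the question
toy <- data.frame(
  g = c("a", "a", "b", "b", "a"),
  v = c("x", "y", "x", "x", "x"),
  stringsAsFactors = FALSE
)

# one id per distinct (g, v) combination; rows in the same group share the id
toy_ids <- toy %>%
  group_by(g, v) %>%
  mutate(id = cur_group_id()) %>%
  ungroup()

print(toy_ids$id)
# [1] 1 2 3 3 1
```

Rows 1 and 5 share the pair (a, x) and therefore the same id, which is the property the edited answer relies on.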
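# Turn the global group ids into per-record indices starting at 1 and mark "app" rows
clean_id <- function(index, exclusion) {
  is_app <- exclusion %in% "app"
  # a logical mask is used instead of index[-which(...)], which would drop
  # every element when a group contains no "app" row
  correction_value <- min(index[!is_app])
  index <- as.character(index - correction_value + 1)
  index[is_app] <- "app"
  index
}

data %>% 
  # one id per (record_id, voucher_number) pair; ids follow the sorted group keys
  group_by(record_id, voucher_number) %>%
  mutate(sets = cur_group_id()) %>% 
  # shift the ids so that each record_id starts at 1, then restore "app"
  group_by(record_id) %>% 
  mutate(sets = clean_id(sets, ice))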

# A tibble: 17 x 4
# Groups:   record_id [3]
   record_id voucher_number ice   sets 
       <int> <chr>          <chr> <chr>
 1         1 app1           app   app  
 2         1 00000          1     1    
 3         1 11111          1     2    
 4         1 22222          1     3    
 5         1 11111          2     2    
 6         2 app2           app   app  
 7         2 33333          1     1    
 8         2 44444          1     2    
 9         2 33333          2     1    
10         2 33333          3     1    
11         3 app3           app   app  
12         3 55555          1     1    
13         3 66666          1     2    
14         3 55555          2     1    
15         3 66666          2     2    
16         3 55555          3     1    
17         3 77777          1     3
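The answer mentions that a simpler solution may exist. One possible sketch, offered as an alternative and not as the answerer's method, reuses the `match(unique())` idea from the question but builds the lookup table only from the non-`app` rows. The data frame name `vouchers` is hypothetical; its contents are copied from the dataset above. Note that this numbers vouchers in order of first appearance within each `record_id`, which happens to coincide with the sorted order in this data.

```r
library(dplyr)

# same values as the dataset in the question, rebuilt so the example is self-contained
vouchers <- data.frame(
  record_id = rep(1:3, times = c(5, 5, 7)),
  voucher_number = c("app1", "00000", "11111", "22222", "11111",
                     "app2", "33333", "44444", "33333", "33333",
                     "app3", "55555", "66666", "55555", "66666", "55555", "77777"),
  ice = c("app", "1", "1", "1", "2",
          "app", "1", "1", "2", "3",
          "app", "1", "1", "2", "2", "3", "1"),
  stringsAsFactors = FALSE
)

result <- vouchers %>%
  group_by(record_id) %>%
  mutate(sets = if_else(
    ice == "app",
    "app",
    # look each voucher up among the unique non-app vouchers of its group
    as.character(match(voucher_number, unique(voucher_number[ice != "app"])))
  )) %>%
  ungroup()

print(result$sets)
```

On this dataset the `sets` column comes out the same as in the output above, with repeated voucher numbers inside a `record_id` sharing one index.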