R: group_id by changing row values
1) First, I have this data frame:
df <- data.frame(value=c("a","a","a", "b", "b", "b", "a", "a", "a"),
                 desired_id=c(1,1,1,2,2,2,3,3,3))
2) Second, I have this data frame:
df <- data.frame(value=c("a","a","a", "b", "b", "b", "a", "a", "a"),
                 value2=c("a","a","c", "b", "b", "c", "a", "a", "d"),
                 desired_id=c(1,1,2,3,3,4,5,5,6))
How can I generate desired_id from the value and value2 columns?
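My groups are again assigned row by row. That is, each time the combination of value and value2 changes from one row to the next, the next higher desired_id should be assigned.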
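Similar to above, I tried assigning the result of df %>% group_by(value, value2) %>% group_indices() to a new column, but this does not work, because all rows with value == "a" & value2 == "a" are assigned the same group index, even when they are not adjacent.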
Thanks, everyone!

We can use rleid (run-length encoding id) from data.table. It basically increments by 1 for every element that is not equal to the previous element.
library(data.table)
library(dplyr)
df %>%
  mutate(newcol = rleid(value))
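To make the "increments by 1 whenever the value changes" behaviour concrete, here is a minimal base R sketch of the same idea for a single vector. The helper name run_id is hypothetical (it is not part of data.table), and the sketch assumes the vector contains no NA values; it only illustrates the logic and is not how rleid is implemented.

```r
# run_id() is a hypothetical helper for illustration, not part of data.table.
# Assumption: x has no NA values (a comparison with NA would yield NA).
run_id <- function(x) {
  if (length(x) == 0L) return(integer(0))
  # TRUE marks the start of a new run: the first element, and every
  # element that differs from the one before it.
  starts <- c(TRUE, x[-1L] != x[-length(x)])
  # The running count of run starts is the run id.
  cumsum(starts)
}

value <- c("a", "a", "a", "b", "b", "b", "a", "a", "a")
run_id(value)
# 1 1 1 2 2 2 3 3 3
```

The second run of "a" gets id 3 rather than 1, because only adjacent elements are compared.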
For the second dataset, it is
df %>%
  mutate(new = rleid(value, value2))
#  value value2 desired_id new
#1     a      a          1   1
#2     a      a          1   1
#3     a      c          2   2
#4     b      b          3   3
#5     b      b          3   3
#6     b      c          4   4
#7     a      a          5   5
#8     a      a          5   5
#9     a      d          6   6
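The difference between a run id and an id per distinct combination can be seen with a small base R sketch. The combination id below is built with match() on a pasted key, which is only an analogue of grouping by combination (it numbers combinations by first appearance, so it is not dplyr's exact numbering). The variable names and the "|" separator are assumptions made for illustration.

```r
value  <- c("a", "a", "a", "b", "b", "b", "a", "a", "a")
value2 <- c("a", "a", "c", "b", "b", "c", "a", "a", "d")

# Assumption: "|" does not occur in either column, so each pasted key
# identifies exactly one (value, value2) combination.
key <- paste(value, value2, sep = "|")

# One id per distinct combination, numbered by first appearance.
combo_id <- match(key, unique(key))
combo_id
# 1 1 2 3 3 4 1 1 5
```

Rows 7 and 8 fall back to id 1 because the combination ("a", "a") was already seen in rows 1 and 2, whereas the run-based id in the output above gives them 5.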
Or use rle from base R:
df$newcol <- with(rle(df$value), rep(seq_along(values), lengths))
df$newcol
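The rle line above covers the single-column case. For the second dataset, one possible base R sketch is to run rle on a key built from both columns. The "|" separator is an assumption (it must not occur in the data), and stringsAsFactors = FALSE is set explicitly because rle does not accept factors.

```r
df <- data.frame(
  value  = c("a", "a", "a", "b", "b", "b", "a", "a", "a"),
  value2 = c("a", "a", "c", "b", "b", "c", "a", "a", "d"),
  stringsAsFactors = FALSE
)

# Assumption: "|" never occurs in value or value2, so two different
# (value, value2) pairs cannot produce the same key.
key <- paste(df$value, df$value2, sep = "|")

# Number the runs of identical consecutive keys, then repeat each run
# number by the length of its run.
df$newcol <- with(rle(key), rep(seq_along(values), lengths))
df$newcol
# 1 1 2 3 3 4 5 5 6
```

This matches the desired_id column from the question without loading any packages.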