R: Numbering rows within groups from 1 to n in a data frame


For a data frame like this:

   cat        val  
1  aaa 0.05638315  
2  aaa 0.25767250  
3  aaa 0.30776611  
4  aaa 0.46854928  
5  aaa 0.55232243  
6  bbb 0.17026205  
7  bbb 0.37032054  
8  bbb 0.48377074  
9  bbb 0.54655860  
10 bbb 0.81240262  
11 ccc 0.28035384  
12 ccc 0.39848790  
13 ccc 0.62499648  
14 ccc 0.76255108  
15 ccc 0.88216552 
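I would like to number the rows within each group with a repeating sequence: the numbers run only from 1 to 3, then the sequence starts again from 1 within the same group, and it also restarts at 1 whenever a new group begins: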

   cat        val num  
1  aaa 0.05638315   1  
2  aaa 0.25767250   2  
3  aaa 0.30776611   3  
4  aaa 0.46854928   1  
5  aaa 0.55232243   2  
6  bbb 0.17026205   1  
7  bbb 0.37032054   2  
8  bbb 0.48377074   3  
9  bbb 0.54655860   1  
10 bbb 0.81240262   2  
11 ccc 0.28035384   1  
12 ccc 0.39848790   2  
13 ccc 0.62499648   3  
14 ccc 0.76255108   1  
15 ccc 0.88216552   2

How can I do this?

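This should work. Take the unique values of cat in the data.frame, extract the rows belonging to each one, and build an integer vector that starts at 1 and cycles through the sequence (1, 2, 3), one vector per cat value. The vectors are then concatenated and attached as a new column.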

df <- data.frame(cat=c(rep("aaa", 5), rep("bbb", 2), rep("ccc", 4), rep("ddd", 7)), 
                 val = rnorm(n = 18))

df$num <- do.call(c, lapply(unique(df$cat), (function(i){
  slice <- df[df$cat==i,]
  rep(1:3, 1+as.integer(nrow(slice)/3))[1:nrow(slice)]
})))
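
One thing to keep in mind with this approach: the per-group vectors are concatenated in the order of unique(df$cat) and assigned by position, so it relies on the rows of each cat value being contiguous. The following is a minimal sketch, assuming a small made-up data frame with interleaved groups (not the data from the question), that sorts by cat first so the assumption holds:

```r
# Hypothetical toy data: the two groups are interleaved, not contiguous
df <- data.frame(cat = c("aaa", "bbb", "aaa", "bbb", "aaa"),
                 val = 1:5)

# Sort by group so that every group occupies a contiguous block of rows
df <- df[order(df$cat), ]

# Same idea as above: one cycling 1, 2, 3 vector per group, concatenated
df$num <- do.call(c, lapply(unique(df$cat), function(i) {
  n <- sum(df$cat == i)          # number of rows in this group
  rep(1:3, length.out = n)       # 1, 2, 3, 1, 2, ... truncated to n
}))
```

Using rep(..., length.out = n) here is only a shorter way of writing the rep/truncate step from the answer. If the original row order has to be preserved, a position-aware function such as ave may be a better fit.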

Here is another solution. It triggers a warning, but I find it neat:

df=data.frame(cat=rep(letters[1:3],each=5),val=rnorm(3*5))
df[,"n"] <- tapply(df[,"val"],df[,"cat"],function(vec) rep.int(1:3,times=ceiling(length(vec)/3))[1:length(vec)])
df

Here is a classic split/apply/combine approach:

df <- unsplit(lapply(split(df, df$cat), function(x) 
              cbind(x, id = rep(1:3, length.out = nrow(x)))), df$cat)

#    cat        val id
# 1  aaa 0.05638315  1
# 2  aaa 0.25767250  2
# 3  aaa 0.30776611  3
# 4  aaa 0.46854928  1
# 5  aaa 0.55232243  2
# 6  bbb 0.17026205  1
# 7  bbb 0.37032054  2
# 8  bbb 0.48377074  3
# 9  bbb 0.54655860  1
# 10 bbb 0.81240262  2
# 11 ccc 0.28035384  1
# 12 ccc 0.39848790  2
# 13 ccc 0.62499648  3
# 14 ccc 0.76255108  1
# 15 ccc 0.88216552  2
There is also a data.table alternative:

library(data.table)
setDT(df)
df[, id := rep(1:3, length.out = .N), by = cat]

There is also ave:

ave(df$val, df$cat, FUN = function(x) rep(1:3, length.out = length(x)))
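
To make the ave idea concrete, here is a minimal sketch that rebuilds the data from the question and wraps the call in a small helper. The helper name number_within_groups and its parameter n are assumptions introduced only for illustration; n generalises the cycle length from 3 to any value.

```r
# Rebuild the example data from the question
df <- data.frame(
  cat = rep(c("aaa", "bbb", "ccc"), each = 5),
  val = c(0.05638315, 0.25767250, 0.30776611, 0.46854928, 0.55232243,
          0.17026205, 0.37032054, 0.48377074, 0.54655860, 0.81240262,
          0.28035384, 0.39848790, 0.62499648, 0.76255108, 0.88216552)
)

# Hypothetical helper: cycle 1..n within each group.
# ave() writes each group's result back to that group's original row
# positions, so the groups do not need to be sorted or contiguous.
number_within_groups <- function(groups, n = 3) {
  ave(seq_along(groups), groups,
      FUN = function(i) rep(seq_len(n), length.out = length(i)))
}

df$num <- number_within_groups(df$cat, n = 3)
```

Passing row indices (seq_along) instead of val keeps the result an integer vector and makes it clear that the values of val play no role in the numbering.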