R: Insert a row of NAs after each group of data using data.table


I am trying to add a row of NAs after each group of data in R.

A similar question was asked earlier. The accepted answer there works well, as shown below:

group <- c("a","b","b","c","c","c","d","d","d","d")
xvalue <- c(16:25)
yvalue <- c(1:10)
df <- data.frame(cbind(group,xvalue,yvalue))
df_new <- as.data.frame(lapply(df, as.character), stringsAsFactors = FALSE)
head(do.call(rbind, by(df_new, df$group, rbind, NA)), -1 )
     group xvalue yvalue
a.1      a     16      1
a.2   <NA>   <NA>   <NA>
b.2      b     17      2
b.3      b     18      3
b.31  <NA>   <NA>   <NA>
c.4      c     19      4
c.5      c     20      5
c.6      c     21      6
c.41  <NA>   <NA>   <NA>
d.7      d     22      7
d.8      d     23      8
d.9      d     24      9
d.10     d     25     10

You can try:
df$group <- as.character(df$group)
setDT(df)[, .SD[1:(.N+1)], by=group][is.na(xvalue), group:=NA][!.N]
#     group xvalue yvalue
#1:     a     16      1
#2:    NA     NA     NA
#3:     b     17      2
#4:     b     18      3
#5:    NA     NA     NA
#6:     c     19      4
#7:     c     20      5
#8:     c     21      6
#9:    NA     NA     NA
#10:    d     22      7
#11:    d     23      8
#12:    d     24      9
#13:    d     25     10
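
The chain above relies on the fact that asking for row .N + 1 of a group's .SD returns a row of NAs. A minimal sketch of the three steps on a hypothetical toy table (the names `dt`, `x`, `padded`, and `result` are illustrative and not from the question):

```r
library(data.table)

# Hypothetical toy data, not the data from the question
dt <- data.table(group = c("a", "b", "b"), x = 1:3)

# Step 1: each group's .SD has .N rows, so row .N + 1 comes back as NAs
padded <- dt[, .SD[1:(.N + 1)], by = group]

# Step 2: the grouping column is still filled in on the padding rows,
# so blank it wherever the value column is NA
padded[is.na(x), group := NA]

# Step 3: inside i, .N is the total row count, so !.N drops the last row
result <- padded[!.N]
print(result)
```

Note that step 2 assumes the value column has no genuine NAs of its own; if it does, those rows would lose their group label as well.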

Or, based on @eddi's comment, it can be simplified further:
 setDT(df)[df[, c(.I, NA), group]$V1][!.N]
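
To see why this one-liner works, it may help to look at the index vector on its own. The sketch below uses a hypothetical toy table (`dt`, `x`, `idx`, and `result` are illustrative names): `.I` holds the row numbers of each group, appending NA adds one missing index per group, and subsetting a data.table by an NA index yields a row of NAs.

```r
library(data.table)

# Hypothetical toy data, not the data from the question
dt <- data.table(group = c("a", "b", "b"), x = 1:3)

# Row numbers per group, each followed by one NA
idx <- dt[, c(.I, NA), by = group]$V1
print(idx)

# Subsetting by an NA index produces an all-NA row (group included);
# !.N then drops the trailing NA row
result <- dt[idx][!.N]
print(result)
```

Because the NA rows come straight from the subsetting step, nothing has to be overwritten afterwards.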

One way I can think of is to first construct a vector as follows:
foo <- function(x) {
    o = order(rep.int(seq_along(x), 2L))
    c(x, rep.int(NA, length(x)))[o]
}
join_values = head(foo(unique(df_new$group)), -1L)
# [1] "a" NA  "b" NA  "c" NA  "d"
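
The interleaving done by foo can be traced step by step. This is a sketch on a hypothetical three-element vector; the intermediate names (`doubled`, `keys`, `interleaved`) are illustrative and do not appear in the answer.

```r
x <- c("a", "b", "c")

# Append one NA per element: "a" "b" "c" NA NA NA
doubled <- c(x, rep.int(NA, length(x)))

# Give the i-th value and the i-th NA the same sort key: 1 2 3 1 2 3
keys <- rep.int(seq_along(x), 2L)

# order() breaks ties by position, so each value comes before its NA
o <- order(keys)
interleaved <- doubled[o]
print(interleaved)

# Drop the trailing NA, as head(..., -1L) does in the answer
print(head(interleaved, -1L))
```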

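This is a very neat solution, though I think you could avoid messing with group and just create an index and leave it there (or remove it afterwards), maybe something like setDT(df)[, indx := .GRP, group][, .SD[1:(.N+1)], indx] or just setDT(df)[, indx := group][, .SD[1:(.N+1)], indx][, indx := NULL][]

@DavidArenburg I think it can be made more compact if I use .I.

The last solution is very neat. I think df[, c(.I, NA), group] is easier to read and understand.

Do you think this would improve performance in some way? Because akrun's solution, for me...

@DavidArenburg, I don't see why they should all be idiomatic here. It's just another way. I used a join because it gives the answer directly, rather than having to replace values with NAs afterwards.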

Using the join_values vector constructed above, set the key on group and join:

setkey(setDT(df_new), group)
df_new[.(join_values), allow.cartesian=TRUE]
#     group xvalue yvalue
#  1:     a     16      1
#  2:    NA     NA     NA
#  3:     b     17      2
#  4:     b     18      3
#  5:    NA     NA     NA
#  6:     c     19      4
#  7:     c     20      5
#  8:     c     21      6
#  9:    NA     NA     NA
# 10:     d     22      7
# 11:     d     23      8
# 12:     d     24      9
# 13:     d     25     10
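
The join can be sketched on a smaller, hypothetical table to show where the NA rows come from (`dt`, `x`, and `res` are illustrative names; `join_values` is written out by hand here rather than built with foo). An NA in the lookup vector matches no key in the table, so the join returns one row with NAs for it, while each real key returns all rows of its group.

```r
library(data.table)

# Hypothetical toy data, keyed on group
dt <- data.table(group = c("a", "b", "b"), x = 1:3)
setkey(dt, group)

# Lookup vector with an NA between the group labels
join_values <- c("a", NA, "b")

# Unmatched NA gives an all-NA row; allow.cartesian = TRUE is kept as in
# the answer, since the result can have more rows than the table itself
res <- dt[.(join_values), allow.cartesian = TRUE]
print(res)
```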