R: Aggregating a column based on NAs in another column


I want to aggregate group2 based on the NAs in group1:

Datetime            group1  group2
2011-08-08 21:00:00   1       1
2011-08-08 21:10:00   NA      2
2011-08-08 21:20:00   NA      3
2011-08-08 21:30:00   2       4
2011-08-08 21:40:00   NA      5
2011-08-08 21:50:00   NA      6
2011-08-08 22:00:00   3       7
This is the output I want:

Datetime            group1  group2
2011-08-08 21:00:00   1       1
2011-08-08 21:30:00   2       9 
2011-08-08 22:00:00   3       18
Edit: 9 = 2+3+4 and 18 = 5+6+7

aggregate(group2~group1, data=Data, subset(Data,group1==NA),sum)

Any suggestions would be appreciated. Can I do this with aggregate? Or should I use a different package?

It looks like na.locf from the zoo package is very useful here.

Assuming dat is your original data, we can take the datetimes of the rows where group1 is not NA and use cbind to combine them with the aggregated group2 data.

> library(zoo)
> Datetime <- dat$Datetime[!is.na(dat$group1)]
> cbind(Datetime, aggregate(group2~group1, na.locf(dat, fromLast = TRUE), sum))
#              Datetime group1 group2
# 1 2011-08-08 21:00:00      1      1
# 2 2011-08-08 21:30:00      2      9
# 3 2011-08-08 22:00:00      3     18

PS: Thanks for updating/editing your question with data (+1).

Using data.table:

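library(data.table)
# Assumes DT holds the question's data as a data.table, e.g. DT <- as.data.table(dat)
# Start a new block on the row after each non-NA group1, then sum group2 per block
DT1 <- DT[, group1:=cumsum(!is.na(c(0, group1[1:(.N-1)])))][,
       list(Datetime=Datetime[.N],group2=sum(group2)), by=group1][,c(2,1,3), with=FALSE]
DT1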
 #              Datetime group1 group2
 #1: 2011-08-08 21:00:00      1      1
 #2: 2011-08-08 21:30:00      2      9
 #3: 2011-08-08 22:00:00      3     18
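
To see what the cumsum() step above is doing, here is a minimal sketch of the same grouping idea without data.table. The data frame dat and the helper names block and last are assumptions made for illustration (dat is rebuilt from the question's table); this is not part of the original answer.

```r
# Rebuild the question's data (assumed layout: Datetime as character).
dat <- data.frame(
  Datetime = c("2011-08-08 21:00:00", "2011-08-08 21:10:00",
               "2011-08-08 21:20:00", "2011-08-08 21:30:00",
               "2011-08-08 21:40:00", "2011-08-08 21:50:00",
               "2011-08-08 22:00:00"),
  group1 = c(1, NA, NA, 2, NA, NA, 3),
  group2 = 1:7,
  stringsAsFactors = FALSE
)

# A new block starts on the first row and on every row that follows
# a non-NA group1, so each block ends at a non-NA group1 value.
block <- cumsum(c(TRUE, !is.na(dat$group1[-nrow(dat)])))

# Mark the last row of each block; it carries the Datetime and group1 to keep.
last <- !duplicated(block, fromLast = TRUE)

result <- data.frame(
  Datetime = dat$Datetime[last],
  group1   = dat$group1[last],
  group2   = as.vector(tapply(dat$group2, block, sum))
)
result
```

Because the block id is computed from positions only, the sketch does not depend on group1 happening to contain 1, 2, 3.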

A solution using base R:

ddf = structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "2011-08-08", class = "factor"), 
    time = structure(1:7, .Label = c("21:00:00", "21:10:00", 
    "21:20:00", "21:30:00", "21:40:00", "21:50:00", "22:00:00"
    ), class = "factor"), group1 = c(1L, NA, NA, 2L, NA, NA, 
    3L), group2 = 1:7), .Names = c("Date", "time", "group1", 
"group2"), class = "data.frame", row.names = c(NA, -7L))

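ddf$group1a = ddf$group1
# Walk upward from the last row, filling each NA with the value from the row below
for(i in nrow(ddf):1)   
     if(is.na(ddf$group1a[i])) 
          ddf$group1a[i] = ddf$group1a[i+1]
# Sum group2 within each filled group and reshape into a two-column data frame
outdf = stack(with(ddf, tapply(group2, group1a, sum)))
names(outdf) = c("group2","group1")
outdf = outdf[,c(2,1)]
outdf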

#  group1 group2
#1      1      1
#2      2      9
#3      3     18

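@Richard, yes. But since the NAs don't occur in any pattern, I couldn't figure it out. Also, I don't need code, just any suggestions.

Hi, I followed the lines above and got the following error message: Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument. na.locf(dat, fromLast=TRUE) is doing its job. However, aggregate(group2~group1, na.locf(dat, fromLast=TRUE), sum) does not work. Any ideas?

Make sure the group1 and group2 columns are of class numeric. Check sapply(dat, class).

I used akrun's dat below to create a data frame and made sure group1 and group2 are numeric (they were integer before). I also cleared the R console and restarted R. Somehow I still get the same error message. What could be the cause?

Thanks for your help, I solved it. na.locf(dat, fromLast=TRUE) turned group1 and group2 into character. So cbind(Datetime, aggregate(as.numeric(group2)~as.numeric(group1), na.locf(dat, fromLast=TRUE), sum)) is the solution.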
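
One way to sidestep the character coercion described in the last comment is to fill only the group1 column rather than the whole data frame, so the numeric columns never pass through a character conversion. The sketch below does this in base R; fill_backward is a hypothetical helper written for this example (not a zoo function), and dat is assumed to match the question's table with Datetime stored as character.

```r
# Rebuild the question's data (assumed layout: Datetime as character).
dat <- data.frame(
  Datetime = c("2011-08-08 21:00:00", "2011-08-08 21:10:00",
               "2011-08-08 21:20:00", "2011-08-08 21:30:00",
               "2011-08-08 21:40:00", "2011-08-08 21:50:00",
               "2011-08-08 22:00:00"),
  group1 = c(1, NA, NA, 2, NA, NA, 3),
  group2 = 1:7,
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace each NA with the next non-NA value below it.
# Trailing NAs (with no value below them) are left as NA.
fill_backward <- function(x) {
  r <- rev(x)                                   # work from the end
  idx <- cummax(seq_along(r) * (!is.na(r)))     # index of latest non-NA seen
  idx[idx == 0] <- NA                           # nothing seen yet: keep NA
  rev(r[idx])
}

# Fill only group1; Datetime and group2 keep their original classes.
filled <- dat
filled$group1 <- fill_backward(dat$group1)

sums <- aggregate(group2 ~ group1, data = filled, FUN = sum)
result <- cbind(Datetime = dat$Datetime[!is.na(dat$group1)], sums)

sapply(result, class)   # group1 and group2 should still be numeric types
result
```

With the columns left numeric, the as.numeric() wrappers inside the aggregate formula are no longer needed.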