Summing a column's values over every 10 rows of a data frame in R

I have a data frame in R with 300,000 rows that looks like this:

11 2990000 3000000 0.00000000
11 3000000 3010000 2.30247191
11 3010000 3020000 0.32213483
11 3020000 3030000 0.91696629
11 3030000 3040000 1.81595506
11 3040000 3050000 0.27269663
11 3050000 3060000 2.21988764
11 3060000 3070000 3.44640449
11 3070000 3080000 2.02134831
11 3080000 3090000 1.22123596 #10th row
11 3090000 3100000 3.47089888
11 3100000 3110000 3.08921348
11 3110000 3120000 3.11786517
11 3120000 3130000 1.44325843
11 3130000 3140000 0.00000000
11 3140000 3150000 0.00000000
11 3150000 3160000 2.55146067
11 3160000 3170000 0.63460674
11 3170000 3180000 1.08415730
11 3180000 3190000 2.73101124 #20th row
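I want to build a new data frame that sums column 4 over every 10 rows and, for each block of 10, outputs column 2 from the first row and column 3 from the tenth row. The output should be: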

11 2990000 3090000 14.5391
11 3090000 3190000 18.12247

You can try the tidyverse:

library(tidyverse)
d %>% 
  group_by(gr=gl(n()/10,10)) %>% 
  summarise(Sum=sum(V4)) 
# A tibble: 2 x 2
  gr      Sum
  <fct> <dbl>
1 1      14.5
2 2      18.1
In base R, you can use ave:

ave(d$V4, factor(rep(1:(nrow(d)/10), each=10)), FUN=sum)[seq(10, nrow(d), 10)] 
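The one-liner above relies on nrow(d) being an exact multiple of 10. As a hedged sketch for the general case, integer division on the row index yields a grouping vector of the right length, so a trailing partial block simply forms its own smaller group. The vector `x` and block size `n` below are made-up illustration values, not data from the question.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)   # hypothetical values; length 7 is not a multiple of 3
n <- 3                        # hypothetical block size

# Map row positions 1..7 to block ids 0,0,0,1,1,1,2
grp <- (seq_along(x) - 1) %/% n

# One sum per block; the last block holds only the leftover element
block_sums <- as.vector(tapply(x, grp, sum))
block_sums
```

Whether a partial last block should be kept or dropped depends on the data; this sketch keeps it.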

Here is a solution with data.table:

library("data.table")
D <- fread(
"11 2990000 3000000 0.00000000
11 3000000 3010000 2.30247191
11 3010000 3020000 0.32213483
11 3020000 3030000 0.91696629
11 3030000 3040000 1.81595506
11 3040000 3050000 0.27269663
11 3050000 3060000 2.21988764
11 3060000 3070000 3.44640449
11 3070000 3080000 2.02134831
11 3080000 3090000 1.22123596
11 3090000 3100000 3.47089888
11 3100000 3110000 3.08921348
11 3110000 3120000 3.11786517
11 3120000 3130000 1.44325843
11 3130000 3140000 0.00000000
11 3140000 3150000 0.00000000
11 3150000 3160000 2.55146067
11 3160000 3170000 0.63460674
11 3170000 3180000 1.08415730
11 3180000 3190000 2.73101124")

D[, .(V2=V2[1], V3=V3[.N], V4=sum(V4)), by=gl(D[, .N]/10, 10)]
# > D[, .(V2=V2[1], V3=V3[.N], V4=sum(V4)), by=gl(D[, .N]/10, 10)]
#    gl      V2      V3       V4
# 1:  1 2990000 3090000 14.53910
# 2:  2 3090000 3190000 18.12247

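To summarize the existing answers: first construct a grouping variable, for which you can use gl() or rep(). To compute the sums you can then use, for example, aggregate().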
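The two steps can be sketched in base R as follows. This is a minimal sketch rather than code from any of the answers: the data frame `d`, its column names V1 to V4, and the helper column `grp` are assumptions for illustration, with the values copied from the 20 rows shown in the question.

```r
# Example data: the 20 rows from the question (column names V1..V4 are assumed)
d <- data.frame(
  V1 = 11,
  V2 = seq(2990000, 3180000, by = 10000),
  V3 = seq(3000000, 3190000, by = 10000),
  V4 = c(0.00000000, 2.30247191, 0.32213483, 0.91696629, 1.81595506,
         0.27269663, 2.21988764, 3.44640449, 2.02134831, 1.22123596,
         3.47089888, 3.08921348, 3.11786517, 1.44325843, 0.00000000,
         0.00000000, 2.55146067, 0.63460674, 1.08415730, 2.73101124)
)

# Step 1: grouping variable, one value per block of 10 rows
# (assumes nrow(d) is a multiple of 10)
d$grp <- rep(seq_len(nrow(d) / 10), each = 10)

# Step 2: sum V4 within each block
sums <- aggregate(V4 ~ grp, data = d, FUN = sum)

# Attach V2 from the first row and V3 from the last row of each block
res <- data.frame(
  V1 = d$V1[seq(1, nrow(d), by = 10)],
  V2 = d$V2[seq(1, nrow(d), by = 10)],
  V3 = d$V3[seq(10, nrow(d), by = 10)],
  V4 = sums$V4
)
res
```

Here rep() builds the grouping variable; gl(nrow(d) / 10, 10) would give an equivalent factor.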
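v <- D$V4
n <- 10

# using @jogo's example data
cbind(
  D[seq(1, nrow(D), n), 1:2],     # V1 and V2 from the first row of each block
  V3 = D$V3[seq(n, nrow(D), n)],  # V3 from the last row of each block
  V4 = as.vector(tapply(v, (seq_along(v) - 1) %/% n, sum))  # block sums
)

#    V1      V2      V3       V4
# 1: 11 2990000 3090000 14.53910
# 2: 11 3090000 3190000 18.12247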