R: average the first 5 data points of each year

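I am trying to average the first 5 "var1" data points for each "year". My data is as follows. The number of data points differs from year to year. Thank you very much for your help! :)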

Something like this:

t <- read.csv("t.txt", sep="") ## Read data
myMean <- function(x) ifelse(length(x)<5, mean(x), mean(x[1:5]))
ans <- aggregate(var1 ~ year, data = t, FUN = myMean)
ans
  year var1
1 2008   14
2 2009   13
3 2010   12
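
The length check in myMean is there because x[1:5] pads a shorter vector with NA. A minimal sketch of an alternative, assuming the goal is simply "at most the first five values": head(x, 5) never returns more elements than exist, so no special case is needed. The data frame df and the helper name first5_mean below are hypothetical and not taken from the question.

```r
# Hypothetical sample data: 2008 has 6 rows, 2009 has only 3.
df <- data.frame(
  year = c(rep(2008, 6), rep(2009, 3)),
  var1 = c(1, 2, 3, 4, 5, 100, 10, 20, 30)
)

# head(x, 5) returns at most 5 elements, so short groups need no special case.
first5_mean <- function(x) mean(head(x, 5))

# One row per year, holding the mean of that year's first (up to) 5 values.
ans <- aggregate(var1 ~ year, data = df, FUN = first5_mean)
print(ans)
```

This also avoids ifelse, which is designed for vectorised conditions rather than a single TRUE/FALSE test.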
Using data.table, we convert the 'data.frame' to a 'data.table' with setDT(df1), group by 'year', take the first 5 values of 'var1' with head, and compute the mean.

library(data.table)
setDT(df1)[, list(var1=mean(head(var1,5))), year]
#   year var1
#1: 2010  2.4
#2: 2009  2.6
#3: 2008  2.8
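
Since "the first 5 values" depends on row order, it may help to sort explicitly before grouping. The following is only a sketch under assumptions: the column day and the sample values are hypothetical, invented to illustrate the point, and the data.table package must be installed.

```r
library(data.table)

# Hypothetical data, deliberately stored in reverse order within each year.
df1 <- data.frame(
  year = c(2008, 2008, 2008, 2008, 2008, 2008, 2009, 2009, 2009),
  day  = c(6, 5, 4, 3, 2, 1, 3, 2, 1),
  var1 = c(100, 5, 4, 3, 2, 1, 30, 20, 10)
)

# setDT() converts the data frame to a data.table by reference (no copy).
setDT(df1)

# Sort by year, then by the hypothetical 'day' column, so that "first 5"
# means the five earliest observations of each year.
setorder(df1, year, day)

# Group by year and average at most the first 5 values of var1.
res <- df1[, .(var1 = mean(head(var1, 5))), by = year]
print(res)
```

Without the setorder() call, the 2008 group would start with the value 100 and its mean would come out differently, which is why the expected ordering should be stated up front.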

Here is another option, using split and sapply:

sapply( split(X$var1,X$year), function(x) ifelse(length(x)<5, mean(x), mean(x[1:5])) )

In the results below, akrun is a data.table, PoChoi is a data frame, and mra68 is a named vector:

> akrun
   year var1
1: 2011  5.5
2: 2010  2.4
3: 2009  2.6
4: 2008  2.8
> PoChoi.1
  year var1
1 2008  2.8
2 2009  2.6
3 2010  2.4
4 2011  5.5
> PoChoi.2
  year var1
1 2008  2.8
2 2009  2.6
3 2010  2.4
4 2011  5.5
> mra68.1
2008 2009 2010 2011 
 2.8  2.6  2.4  5.5 
> mra68.2
2008 2009 2010 2011 
 2.8  2.6  2.4  5.5 

A larger example:

library(microbenchmark)
library(data.table)

set.seed(1)

X <- data.frame( year = sample( 1500:2015, 10000, replace=TRUE ),
                 var1 = sample( 1:10, 10000, replace=TRUE ) )

myMean <- function(x) ifelse(length(x)<5, mean(x), mean(x[1:5]))

microbenchmark(
  akrun    = setDT(X)[, list(var1=mean(head(var1,5))), year],
  PoChoi.1 = aggregate(var1 ~ year, data = X, FUN = myMean),
  PoChoi.2 = aggregate(var1 ~ year, data = X, FUN = function(x) ifelse(length(x)<5, mean(x), mean(x[1:5]))),
  mra68.1  = sapply( split(X$var1,X$year), myMean ),
  mra68.2  = sapply( split(X$var1,X$year), function(x) ifelse(length(x)<5, mean(x), mean(x[1:5])) ),
  times = 1000
)

# Unit: milliseconds
#     expr      min       lq     mean   median       uq       max neval
#    akrun 15.44811 23.50436 36.81674 43.12405 44.22435  69.62202  1000
# PoChoi.1 33.96411 51.52858 83.29682 95.53486 99.60884 241.59967  1000
# PoChoi.2 33.64844 51.70747 83.47835 95.07223 99.44127 247.55881  1000
#  mra68.1 11.05145 17.33191 27.21526 31.41954 32.34819 126.89461  1000
#  mra68.2 11.05054 17.16615 26.96236 31.25061 32.14054  85.44422  1000

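The benchmarked expressions return different container types, so it can be worth checking that they agree on the numbers before comparing timings. A rough base R sketch of such a check follows. The smaller sample, the helper first5_mean, and the use of vapply in place of sapply are choices made for this illustration and are not part of the benchmark above.

```r
set.seed(1)

# Smaller hypothetical sample than the benchmark, with only four years.
X <- data.frame(year = sample(2008:2011, 200, replace = TRUE),
                var1 = sample(1:10, 200, replace = TRUE))

# Mean of at most the first 5 values of a vector.
first5_mean <- function(x) mean(head(x, 5))

# Data frame result: one row per year, sorted by year.
by_aggregate <- aggregate(var1 ~ year, data = X, FUN = first5_mean)

# Named numeric vector: names are the years as character strings.
# vapply() is like sapply() but enforces one numeric value per group.
by_split <- vapply(split(X$var1, X$year), first5_mean, numeric(1))

# Compare the values after dropping the names from the vector result.
same <- isTRUE(all.equal(by_aggregate$var1, unname(by_split)))
print(same)
```
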
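Did you search and do some research first? Many similar R questions have already been answered. Also, please write down your expected answers for 2008 and 2009, because the answer will differ depending on how the data are sorted.

Possible duplicate question.

Thank you for your comments! :)

I like this answer, assuming it is what the OP wants, because it handles the case where a year has fewer than 5 observations.

This simple code is great, and it is exactly what I wanted! Thank you very much! :)

Using aggregate with the formula var1 ~ year is more intuitive for new R users, but its drawback is that it is slower by comparison.

Thank you very much, akrun, for the suggestion. The code below is perfect for me to test and learn from :)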