R: Calculating one column from another column's values using inferred data


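I am trying to create a function that sums actual and inferred values from one column to create another column. My data are in the following format: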

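Nest <- c("a","b","c","d","e","a","c","a","d","c","b")  # nest IDs must be quoted strings
Age <- c(5,5,4,6,5,7,6,9,10,8,10)                        # chick age in days at each visit
Brood <- c(4,3,4,4,3,4,3,3,4,3,1)                        # number of chicks seen at each visit
df <- data.frame(Nest, Age, Brood)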
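
Take nest c as an example of how the calculation works. On the first visit, in row 3, the nest is 4 days old and contains 4 chicks, so Sum.Br = 4*4 = 16. The next time it is seen, in row 7, the chicks are 6 days old but only 3 remain. Sum.Br therefore takes the previous value (16) and adds half of the intervening days at the old chick count (4) and the other half at the new chick count (3), so 16+4+3 = 23. In row 10 the chicks are 8 days old (2 days since the last nest visit) and 3 are still in the nest, so Sum.Br = 23+3+3 = 29.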
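
The arithmetic for nest c can be checked with a few lines of R. This is only a sketch of the rule as described above; the variable names `first`, `steps` and `sum_br` are made up for illustration and do not appear in the original post.

```r
# Visits to nest "c" from the example data (rows 3, 7 and 10)
age   <- c(4, 6, 8)
brood <- c(4, 3, 3)

# First visit: age in days times the number of chicks present
first <- age[1] * brood[1]

# Each later interval: days elapsed times the mean of the old and new brood size
steps <- diff(age) * (head(brood, -1) + tail(brood, -1)) / 2

# Running total per visit
sum_br <- first + c(0, cumsum(steps))
print(sum_br)  # 16 23 29
```

Multiplying the elapsed days by the mean brood size gives the same result as counting half of the days at the old brood size and half at the new one.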

I tried to do this with a series of ifelse commands wrapped in transform:

tmp <- transform(df, Sum.Br = ave(Brood, Nest, FUN = function(x)
                                  c(df$Age*x[1],
                                    ifelse(x[2] == x[1],
                                           df$Age*x[2],
                                           df$Age[x[1]]*x[1] + (df$Age[x[2]]-df$Age[x[1]])*((x[1]+x[2])/2)),
                                    ifelse(x[3] == x[2],
                                           ifelse(x[2]==x[1],
                                                  df$Age*x[3],
                                                  df$Age[x[1]]*x[1] + (df$Age[x[2]]-df$Age[x[1]])*((x[1]+x[2])/2) + (df$Age[[3]]-df$Age[x[2]])*x[3]),
                                           ifelse(x[2]==x[1],
                                                  df$Age[x[2]]*x[2] + (df$Age[x[3]]-df$Age[x[2]])*((x[2]+x[3])/2),
                                                  df$Age[x[1]]*x[1] + (df$Age[x[2]]-df$Age[x[1]])*((x[1]+x[2])/2) + (df$Age[x[3]]-df$Age[x[2]])*((x[2]+x[3])/2))))

tmp

Other users may be interested to know that I have solved this problem.

I used ddply from the plyr package to split the data frame into pieces by nest:

library(plyr)  # provides ddply

tmp <- ddply(df, "Nest", function(x){
  df2 <- data.frame(Nest = x$Nest)    # Create a dataframe with columns "Nest"
  df2$Age = x$Age                     # "Age"
  df2$Brood = x$Brood                 # and "Brood" from "df"

# The next bit is a bit long-winded, but serves the purpose
# Create a vector which contains the Sum.Brood values for each visit to that nest
# This takes the Age*Brood for the first visit, and then adds the product of the difference in age between visits and the mean brood between visits

  brood.sum = c(x$Age[1]*x$Brood[1],      
                x$Age[1]*x$Brood[1] + (x$Age[2]-x$Age[1])*((x$Brood[1]+x$Brood[2])/2),
                x$Age[1]*x$Brood[1] + (x$Age[2]-x$Age[1])*((x$Brood[1]+x$Brood[2])/2) + (x$Age[3]-x$Age[2])*((x$Brood[2]+x$Brood[3])/2),
                x$Age[1]*x$Brood[1] + (x$Age[2]-x$Age[1])*((x$Brood[1]+x$Brood[2])/2) + (x$Age[3]-x$Age[2])*((x$Brood[2]+x$Brood[3])/2) + (x$Age[4]-x$Age[3])*((x$Brood[3]+x$Brood[4])/2),
                x$Age[1]*x$Brood[1] + (x$Age[2]-x$Age[1])*((x$Brood[1]+x$Brood[2])/2) + (x$Age[3]-x$Age[2])*((x$Brood[2]+x$Brood[3])/2) + (x$Age[4]-x$Age[3])*((x$Brood[3]+x$Brood[4])/2) + (x$Age[5]-x$Age[4])*((x$Brood[4]+x$Brood[5])/2),
                x$Age[1]*x$Brood[1] + (x$Age[2]-x$Age[1])*((x$Brood[1]+x$Brood[2])/2) + (x$Age[3]-x$Age[2])*((x$Brood[2]+x$Brood[3])/2) + (x$Age[4]-x$Age[3])*((x$Brood[3]+x$Brood[4])/2) + (x$Age[5]-x$Age[4])*((x$Brood[4]+x$Brood[5])/2) + (x$Age[6]-x$Age[5])*((x$Brood[5]+x$Brood[6])/2))

# Add the non-NA elements of that vector to a new column in "df2"

  df2$bs = brood.sum[!is.na(brood.sum)]  
  df2})                               # return the per-nest data frame, including "bs"

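tmp

You mention days, but it is not clear what you mean. The Age column? Also, the desired behaviour would be easier to understand if you explained it with an example: how is the 23 in row 7 of the "Sum.Br" column calculated?

Sorry, that was unclear. Yes, the Age column is in days. On the first visit nest c is 4 days old and contains 4 chicks, so 4*4 = 16. The next time it is seen, in row 7, the chicks are 6 days old but only 3 remain. Sum.Br therefore takes the previous value (16) and adds half of the intervening days at the old chick count (4) and the other half at the new chick count (3), so 16+4+3 = 23. I will add this to the question as well.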
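
# Copy the per-nest sums back onto the original rows of df,
# matching on the combination of Nest and Age
df$Sum.Br <- tmp$bs[match(paste(df$Nest, df$Age, sep="_"),
                          paste(tmp$Nest, tmp$Age, sep="_"))]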
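
The hand-written brood.sum vector in the ddply approach covers at most six visits per nest. A possible generalisation in base R is sketched below, using diff and cumsum inside ave. It is not the method from the original answer; the helper name `cum_brood_days` is hypothetical, and the sketch assumes that the rows of each nest already appear in order of increasing Age, as they do in the example data.

```r
Nest  <- c("a","b","c","d","e","a","c","a","d","c","b")
Age   <- c(5,5,4,6,5,7,6,9,10,8,10)
Brood <- c(4,3,4,4,3,4,3,3,4,3,1)
df <- data.frame(Nest, Age, Brood)

# Hypothetical helper: running brood-days for one nest.
# Assumes `age` is sorted in increasing order.
cum_brood_days <- function(age, brood) {
  first <- age[1] * brood[1]
  # Days between visits times the mean brood size over that interval
  steps <- diff(age) * (head(brood, -1) + tail(brood, -1)) / 2
  first + c(0, cumsum(steps))
}

# ave() passes the row indices of each nest to FUN and puts the
# results back in the original row order
df$Sum.Br <- ave(seq_len(nrow(df)), df$Nest,
                 FUN = function(i) cum_brood_days(df$Age[i], df$Brood[i]))

print(df)
```

Because ave returns values in the original row order, no separate match step is needed, and a nest with a single visit simply gets Age*Brood.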