Calculating a balance value between a year and the previous year in R

I am trying to get a balance value from a df that looks like this:

df1

Name   Year    Ch1    Origin
A      1995    x1      a
A      1996    x2      b
A      1997    x3      a
A      2000    x4      a
B      1997    y1      c
B      1998    y2      c
where Ch1 is numeric. I want to add an extra column to get this value:

Name   Year   Ch1    Bil
A      1995    x1    
A      1996    x2    %
A      1997    x3    %
A      2000    x4    %
B      1997    y1  
B      1998    y2    %
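Bil should be Xi / Xi-1 if Xi >= Xi-1, and -Xi-1 / Xi if Xi < Xi-1. I tried a loop that compares each row with the previous one (df[i-1,3]) and assigns the result to df$Bil.

We can use lag from dplyr: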

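library(dplyr)
# Ch1 must be numeric here (see the data section).
# Assign the result (df1 <- df1 %>% ...) to keep the new column.
df1 %>% 
  arrange(Name, Year) %>%   # sort by Name, then Year
  group_by(Name) %>%        # lag() is computed within each Name
  mutate(Bil = case_when(Ch1 >= lag(Ch1) ~ Ch1 / lag(Ch1), 
                         Ch1 < lag(Ch1) ~ -lag(Ch1) / Ch1))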
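
To make the effect of lag concrete, here is a minimal sketch on made-up numbers. The toy data frame and the helper column Prev are illustrative assumptions, not part of the original question. Within each Name, lag returns the previous row's value, and NA for the first row of a group, which is why Bil is NA there.

```r
library(dplyr)

# Hypothetical toy data: numeric Ch1 values chosen so the ratios are easy to check
toy <- data.frame(
  Name = c("A", "A", "A", "B", "B"),
  Year = c(1995, 1996, 1997, 1997, 1998),
  Ch1  = c(100, 200, 50, 80, 40)
)

res <- toy %>%
  arrange(Name, Year) %>%
  group_by(Name) %>%
  mutate(
    Prev = dplyr::lag(Ch1),                      # previous row's Ch1 within the same Name
    Bil  = case_when(Ch1 >= Prev ~ Ch1 / Prev,   # increase: Xi / Xi-1
                     Ch1 <  Prev ~ -Prev / Ch1)  # decrease: -Xi-1 / Xi
  ) %>%
  ungroup()

print(res)
# Bil is NA, 2, -4 for A and NA, -2 for B
```

Writing dplyr::lag explicitly avoids picking up stats::lag by accident when dplyr is not attached or another package masks the name.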
Data

df1 <- structure(list(Name = structure(c(1L, 1L, 1L, 1L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), Year = c(1995L, 1996L, 1997L, 2000L, 
1997L, 1998L), Ch1 = structure(1:6, .Label = c("x1", "x2", "x3", 
"x4", "y1", "y2"), class = "factor"), Origin = structure(c(1L, 
2L, 1L, 1L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L))
# replace the factor Ch1 with random numeric values
df1 <- df1 %>% mutate(Ch1 = round(runif(n = 6, min = 100, max = 1000), 2))

df1

Here is a data.table approach, using shift:

library(data.table)
dat <- as.data.table(df1)

dat$value <- rnorm(6, 20, 1) #adding a numeric column 

dat1 <- dat[order(Year)][, 
             Bil := ifelse(test = shift(x = value, n = 1, type = 'lag') > value, 
                           yes = shift(x = value, n = 1, type = 'lag')/value, 
                           no = value/shift(x = value, n = 1, type = 'lag'))]

> dat
   Name Year Ch1 Origin    value
1:    A 1995  x1      a 19.23394
2:    A 1996  x2      b 21.16079
3:    A 1997  x3      a 20.87078
4:    A 2000  x4      a 20.50770
5:    B 1997  y1      c 20.39450
6:    B 1998  y2      c 20.53281

> dat1
   Name Year Ch1 Origin    value      Bil
1:    A 1995  x1      a 19.23394       NA
2:    A 1996  x2      b 21.16079 1.100179
3:    A 1997  x3      a 20.87078 1.013895
4:    B 1997  y1      c 20.39450 1.023353
5:    B 1998  y2      c 20.53281 1.006782
6:    A 2000  x4      a 20.50770 1.001224

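What is unclear? I can try to rephrase it.

What does lag do?

I get an error message when I try to run it. Could the reason be that there is no Ch1 value for the previous year? Error:

# A tibble: 3396 x 4
# Groups: Name [943]
  Name   Year   Q14a    Bil
1 Ch1    2010  38000     NA
2 Ch1    2011  43200   1.14
3 Ch1    2012  41080  -1.05
4 Ch1    2013  43400   1.06
5 Ch1    2014  43183  -1.01
6 Ch1    2015  42600  -1.01
# … with 3386 more rows

For some reason the result computed by the code is not "saved" in the df; all the NA and negative values are marked in red in the error.

You could try replacing every lag with dplyr::lag.

This does not work: the ordering is done by Year rather than by Name and Year, so the wrong estimates are computed.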
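
The ordering problem raised in the last comment can be avoided by sorting on Name and Year and shifting within each Name. The following is a minimal sketch of such a grouped data.table variant, assuming made-up numbers and a hypothetical helper column Prev. It also applies the negative sign from the question's rule, which the data.table answer above leaves out.

```r
library(data.table)

# Hypothetical toy data (assumed values, same shape as df1 with a numeric Ch1)
dat <- data.table(
  Name = c("A", "A", "A", "B", "B"),
  Year = c(1995, 1996, 1997, 1997, 1998),
  Ch1  = c(100, 200, 50, 80, 40)
)

setorder(dat, Name, Year)  # sort in place by Name first, then Year

# Previous value within each Name; NA for the first row of every group
dat[, Prev := shift(Ch1, n = 1, type = "lag"), by = Name]

# Xi / Xi-1 on an increase, -Xi-1 / Xi on a decrease; NA stays NA
dat[, Bil := ifelse(Ch1 >= Prev, Ch1 / Prev, -Prev / Ch1)]

dat[, Prev := NULL]  # drop the helper column

print(dat)
# Bil is NA, 2, -4 for A and NA, -2 for B
```

Because := modifies dat by reference, the new column is kept without reassigning the result.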