Calculating a balance value between a year and the previous year in R
I am trying to get a balance value from a df that looks like this:
df1
Name Year Ch1 Origin
A 1995 x1 a
A 1996 x2 b
A 1997 x3 a
A 2000 x4 a
B 1997 y1 c
B 1998 y2 c
where Ch1 is numeric. I want to add an extra column to get this value:
Name Year Ch1 Bil
A 1995 x1
A 1996 x2 %
A 1997 x3 %
A 2000 x4 %
B 1997 y1
B 1998 y2 %
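where % is X_i / X_{i-1} if X_i >= X_{i-1}, and -X_{i-1} / X_i if X_i < X_{i-1}. I tried a loop that compares each row with df[i-1,3] and assigns the result to df$Bil.
We can use lag from dplyr: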
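library(dplyr)
df1 %>%
  arrange(Name, Year) %>%   # sort by Year within each Name
  group_by(Name) %>%        # lag() then only looks back within the same Name
  mutate(Bil = case_when(Ch1 >= lag(Ch1) ~ Ch1 / lag(Ch1),    # increase: positive ratio
                         Ch1 < lag(Ch1) ~ -lag(Ch1) / Ch1))   # decrease: negative ratio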
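As a quick check of the rule, here is a minimal sketch on a small data frame. The name toy, the helper column prev, and all numbers are made up for illustration and are not from the question. It calls dplyr::lag explicitly, since a bare lag can resolve to stats::lag when dplyr is not attached, and it assigns the result, because the pipeline on its own only prints the new column without storing it.

```r
library(dplyr)

# Illustrative values only (not from the question)
toy <- data.frame(
  Name = c("A", "A", "A", "B", "B"),
  Year = c(1995L, 1996L, 1997L, 1997L, 1998L),
  Ch1  = c(100, 150, 120, 200, 200)
)

toy_bil <- toy %>%
  arrange(Name, Year) %>%
  group_by(Name) %>%
  mutate(
    prev = dplyr::lag(Ch1),            # previous observation within the same Name
    Bil  = case_when(
      Ch1 >= prev ~ Ch1 / prev,        # increase or no change
      Ch1 <  prev ~ -prev / Ch1        # decrease
    )
  ) %>%
  ungroup()

toy_bil$Bil
# NA  1.50 -1.25  NA  1.00
```

The first row of each Name has no previous value, so neither condition is TRUE and case_when returns NA there.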
Data
df1 <- structure(list(Name = structure(c(1L, 1L, 1L, 1L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), Year = c(1995L, 1996L, 1997L, 2000L,
1997L, 1998L), Ch1 = structure(1:6, .Label = c("x1", "x2", "x3",
"x4", "y1", "y2"), class = "factor"), Origin = structure(c(1L,
2L, 1L, 1L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
df1<-df1 %>% mutate(Ch1 = round(runif(n=6,100,1000),2))
df1
Here is a data.table approach, using shift:
library(data.table)
dat <- as.data.table(df1)
dat$value <- rnorm(6, 20, 1) #adding a numeric column
dat1 <- dat[order(Year)][,
Bil := ifelse(test = shift(x = value, n = 1, type = 'lag') > value,
yes = shift(x = value, n = 1, type = 'lag')/value,
no = value/shift(x = value, n = 1, type = 'lag'))]
> dat
Name Year Ch1 Origin value
1: A 1995 x1 a 19.23394
2: A 1996 x2 b 21.16079
3: A 1997 x3 a 20.87078
4: A 2000 x4 a 20.50770
5: B 1997 y1 c 20.39450
6: B 1998 y2 c 20.53281
> dat1
Name Year Ch1 Origin value Bil
1: A 1995 x1 a 19.23394 NA
2: A 1996 x2 b 21.16079 1.100179
3: A 1997 x3 a 20.87078 1.013895
4: B 1997 y1 c 20.39450 1.023353
5: B 1998 y2 c 20.53281 1.006782
6: A 2000 x4 a 20.50770 1.001224
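What is not understandable? I can try to rephrase it.
What does lag do? I get an error message when I try to run it. Could the reason be that there is no Ch1 value for the previous year? Error:
# A tibble: 3396 x 4
# Groups: Name [943]
  Name Year Q14a Bil
1 Ch1 2010 38000 NA
2 Ch1 2011 43200 1.14
3 Ch1 2012 41080 -1.05
4 Ch1 2013 43400 1.06
5 Ch1 2014 43183 -1.01
6 Ch1 2015 42600 -1.01
# … with 3386 more rows
For some reason, the result computed by the code is not "saved" in the df; all NA and negative values are marked in red in the error.
You can try replacing each lag with dplyr::lag.
This does not work: the ordering is done by Year rather than by Name and Year, so the wrong estimates are computed.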
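The ordering problem raised in the last comment can be avoided by sorting on Name and Year and shifting within each Name. The following is only a sketch under assumptions: the table toy, the helper column prev, and all numbers are invented for illustration, and the sign rule is the one stated in the question (negative ratio on a decrease). "Previous" here means the previous available row for the same Name, which need not be the previous calendar year.

```r
library(data.table)

# Illustrative values only (not from the question or the answers)
toy <- data.table(
  Name  = c("A", "A", "A", "B", "B"),
  Year  = c(1996L, 1995L, 1997L, 1998L, 1997L),
  value = c(150, 100, 120, 200, 200)
)

setorder(toy, Name, Year)   # sort by Name, then Year (by reference)

# Previous value within each Name; the first row of a group gets NA
toy[, prev := shift(value, n = 1, type = "lag"), by = Name]

# Positive ratio on an increase, negative ratio on a decrease
toy[, Bil := ifelse(value >= prev, value / prev, -prev / value)]

toy$Bil
# NA  1.50 -1.25  NA  1.00
```

Because := modifies toy in place, the Bil column is stored without a separate assignment.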