Discounting based on previous occurrences of a value in R
I have the following (sample) dataset (printed as df further down):
My dataset is large, so a data.table solution would be greatly appreciated.
Thanks in advance for your help.

Use .I to create a row-index column ('rn') on the dataset. Subset the data to the 'sub_class' values for which 'sub_new_I' is 1, keep the unique rows by 'firm', 'year', 'sub_class', then group by 'firm', 'sub_class'. Within each group, create the discounted_I column ('newdf') by taking the difference between the next 'year' (lead) and the current 'year', dividing it by 5, applying exp to its negative, and lagging the result by one row. This gives each row exp(-(year - previous year)/5). Finally, join the result back to the original data on 'rn'.
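# Row index used to join the result back onto df.
df[, rn := .I]
# Keep sub_classes that are ever flagged new, one row per firm/year/sub_class,
# then discount by the gap to the previous year: exp(-(year - previous year) / 5).
newdf <- unique(df[df[, sub_class %in% sub_class[sub_new_I == 1]]],
                by = c('firm', 'year', 'sub_class'))[, discounted_I :=
                  shift(exp(-(shift(year, type = 'lead') - year) / 5)),
                .(firm, sub_class)]
# Update df by reference; the i. prefix takes the value from newdf even if df already has the column.
df[newdf, discounted_I := i.discounted_I, on = .(rn)]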
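As a check on what the lag-of-lead expression computes, the discount can be written out directly. This is a sketch of the formula implied by the code above; the symbols $t_k$ and $d_k$ are introduced here for illustration and do not appear in the original post.

```latex
% t_{k-1} < t_k are consecutive years in which the same (firm, sub_class) occurs.
\[
  d_k = \exp\!\left(-\frac{t_k - t_{k-1}}{5}\right)
\]
% Worked values for firm A, sub_class 147 (years 1994, 2000, 2002, 2003):
\[
  d_{2000} = e^{-6/5} \approx 0.3012, \qquad
  d_{2002} = e^{-2/5} \approx 0.6703, \qquad
  d_{2003} = e^{-1/5} \approx 0.8187
\]
```

These agree with the discounted_I values shown for rows 12, 14 and 18 of the sample data.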
Another simple idea, using fifelse and a temporary diff.year column:
> df
firm year patent_number class sub_class sub_new_I discounted_I
1: A 1994 5505081 73 147 0 0.0000
2: A 1994 5505081 73 12.07 0 0.0000
3: A 1994 5606110 73 12.08 0 0.0000
4: A 1994 5606110 73 147 0 0.0000
5: A 1994 5837890 73 116.03 0 0.0000
6: A 1994 5837890 73 184 0 0.0000
7: A 1994 5837890 73 185 0 0.0000
8: A 1994 5837890 73 186 0 0.0000
9: A 1994 5837890 73 198 0 0.0000
10: A 1994 5837890 73 210 0 0.0000
11: A 2000 6725912 165 144 0 0.0000
12: A 2000 6725912 165 147 1 0.3012
13: A 2000 6725912 165 140 0 0.0000
14: A 2002 6748800 73 147 1 0.6703
15: A 2002 6748800 73 E29.272 0 0.0000
16: A 2002 6748800 73 E29.309 0 0.0000
17: A 2003 7153136 434 59 0 0.0000
18: A 2003 6997049 73 147 1 0.8187
19: A 2003 6997049 73 E29.272 1 0.8187
20: A 2003 6997049 73 E29.309 1 0.8187
21: B 1975 4026555 463 3 0 0.0000
22: B 1975 4026555 463 168 0 0.0000
23: B 1975 4026555 463 473 0 0.0000
24: B 1975 4026555 463 960 0 0.0000
25: B 1975 4026555 463 552 0 0.0000
26: B 1975 4026555 463 31 0 0.0000
27: B 1976 4155095 348 701 0 0.0000
28: B 1976 4155095 348 593 0 0.0000
29: B 1977 4137556 361 91.2 0 0.0000
30: B 1977 4137556 361 72 0 0.0000
31: B 1977 4137556 361 58 0 0.0000
32: B 1977 4137556 361 59 0 0.0000
33: B 1977 4137556 361 111 0 0.0000
34: B 1977 4137556 361 222 0 0.0000
35: B 1977 4137556 361 93.05 0 0.0000
36: B 1977 4137556 361 117 0 0.0000
37: B 1977 4137556 361 709 0 0.0000
38: B 1978 4253157 707 104.1 0 0.0000
39: B 1978 4253157 707 93.25 0 0.0000
40: B 1978 4253157 707 552 1 0.5488
Output of the first solution (note the added rn column; rows without a computed discount are NA):
firm year patent_number class sub_class sub_new_I rn discounted_I
1: A 1994 5505081 73 147 0 1 NA
2: A 1994 5505081 73 12.07 0 2 NA
3: A 1994 5606110 73 12.08 0 3 NA
4: A 1994 5606110 73 147 0 4 NA
5: A 1994 5837890 73 116.03 0 5 NA
6: A 1994 5837890 73 184 0 6 NA
7: A 1994 5837890 73 185 0 7 NA
8: A 1994 5837890 73 186 0 8 NA
9: A 1994 5837890 73 198 0 9 NA
10: A 1994 5837890 73 210 0 10 NA
11: A 2000 6725912 165 144 0 11 NA
12: A 2000 6725912 165 147 1 12 0.3011942
13: A 2000 6725912 165 140 0 13 NA
14: A 2002 6748800 73 147 1 14 0.6703200
15: A 2002 6748800 73 E29.272 0 15 NA
16: A 2002 6748800 73 E29.309 0 16 NA
17: A 2003 7153136 434 59 0 17 NA
18: A 2003 6997049 73 147 1 18 0.8187308
19: A 2003 6997049 73 E29.272 1 19 0.8187308
20: A 2003 6997049 73 E29.309 1 20 0.8187308
21: B 1975 4026555 463 3 0 21 NA
22: B 1975 4026555 463 168 0 22 NA
23: B 1975 4026555 463 473 0 23 NA
24: B 1975 4026555 463 960 0 24 NA
25: B 1975 4026555 463 552 0 25 NA
26: B 1975 4026555 463 31 0 26 NA
27: B 1976 4155095 348 701 0 27 NA
28: B 1976 4155095 348 593 0 28 NA
29: B 1977 4137556 361 91.2 0 29 NA
30: B 1977 4137556 361 72 0 30 NA
31: B 1977 4137556 361 58 0 31 NA
32: B 1977 4137556 361 59 1 32 NA
33: B 1977 4137556 361 111 0 33 NA
34: B 1977 4137556 361 222 0 34 NA
35: B 1977 4137556 361 93.05 0 35 NA
36: B 1977 4137556 361 117 0 36 NA
37: B 1977 4137556 361 709 0 37 NA
38: B 1978 4253157 707 104.1 0 38 NA
39: B 1978 4253157 707 93.25 0 39 NA
40: B 1978 4253157 707 552 1 40 0.5488116
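# Years since the previous row for the same firm and sub_class.
# shift() is data.table's lag; base R's lag() does not shift a plain vector (it only works here if dplyr is attached).
df[, diff.year := year - shift(year), by = .(firm, sub_class)]
# Discount only rows flagged as new (sub_new_I == 1); all other rows get 0.
df[, discounted_I := fifelse(sub_new_I == 1, exp(-diff.year / 5), 0)]
# Drop the helper column.
df[, diff.year := NULL]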
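To try the fifelse idea without the full dataset, here is a minimal self-contained sketch. The table toy is a hypothetical subset (firm A, sub_class 147 only) built from the sample rows, and it assumes the rows are already sorted by year within each firm/sub_class group.

```r
library(data.table)

# Hypothetical toy subset of the sample data: firm A, sub_class 147.
toy <- data.table(
  firm      = "A",
  year      = c(1994, 1994, 2000, 2002, 2003),
  sub_class = "147",
  sub_new_I = c(0, 0, 1, 1, 1)
)

# Gap in years to the previous row within each firm/sub_class group.
toy[, diff.year := year - shift(year), by = .(firm, sub_class)]

# Discount flagged rows by exp(-gap / 5); unflagged rows get 0.
toy[, discounted_I := fifelse(sub_new_I == 1, exp(-diff.year / 5), 0)]

# Drop the helper column.
toy[, diff.year := NULL]

print(toy)
```

If the real data is not sorted, ordering it first (for example with setorder(df, firm, sub_class, year)) should be needed for shift to pick up the correct previous year. A flagged row with no earlier occurrence would come out as NA rather than 0 under this sketch.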