R conditional subtraction, part 2

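I have a big data.frame (TOTAL) with some value columns (cols11-16). I need to subtract a base from them, where the base is multiplied by a value that depends on two conditions in TOTAL.

The data.frame (TOTAL) looks something like this: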

Channel    Hour      Category cols11 cols12 cols13 cols14 cols15 base
TV1        01:00:00  New      2      5      4      5      6      2.4
TV5        23:00:00  Old      1      5      3      9      7      1.8
TV1        02:00:00  New      8      7      9      2      4      5.4
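
There are 4 different channels and 24 different hours (00:00:00-23:00:00).

I also have four other vectors holding the conditioned variable, which has to be multiplied with the base according to hour and channel. For each channel I have a vector like this: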

TV1Slope:
TV1Slope00 TV1Slope01 TV1Slope02.. TV1Slope23
 0.0012      0.0015    0.013       0.0009

TV5Slope:
TV5Slope00 TV5Slope01 TV5Slope02.. TV5Slope23
0.0032      0.0023    0.016       0.002
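
Each slope vector is a named numeric vector, so one slope can be pulled out by building its name from the channel and the two-digit hour. A minimal sketch, assuming only the four TV1 values shown above (the elided hours are left out rather than guessed, and `slope_for` is a hypothetical helper name, not part of the question):

```r
# Excerpt of TV1Slope with only the four values shown in the question;
# the elided hours (03-22) are intentionally not filled in.
TV1Slope <- c(TV1Slope00 = 0.0012, TV1Slope01 = 0.0015,
              TV1Slope02 = 0.013,  TV1Slope23 = 0.0009)

# Hypothetical helper: look up the slope for one channel/hour pair.
slope_for <- function(slopes, channel, hour) {
  key <- paste0(channel, "Slope", substr(hour, 1, 2))  # e.g. "TV1Slope01"
  unname(slopes[key])                                  # NA if the name is absent
}

slope_for(TV1Slope, "TV1", "01:00:00")  # 0.0015
slope_for(TV1Slope, "TV1", "05:00:00")  # NA: hour 05 is not in this excerpt
```

Looking slopes up one at a time like this is fine for checking a single row, but for a large data.frame a vectorised lookup over all rows is preferable.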

TOTAL$Uplift0 <- (TOTAL$cols11 - TOTAL$base * conditionedvariable)
TOTAL$Uplift1 <- (TOTAL$cols12 - TOTAL$base * conditionedvariable)
TOTAL$Uplift2 <- (TOTAL$cols13 - TOTAL$base * conditionedvariable)
TOTAL$Uplift3 <- (TOTAL$cols14 - TOTAL$base * conditionedvariable)
TOTAL$Uplift4 <- (TOTAL$cols15 - TOTAL$base * conditionedvariable)
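
As a quick check of the formula on the three sample rows, the sketch below plugs in the illustrative slopes listed above (0.0015 for TV1 at hour 01, 0.002 for TV5 at hour 23, 0.013 for TV1 at hour 02). The vectors `cols11`, `base`, and `slope` are stand-ins for the corresponding columns and looked-up values. Note that `*` binds tighter than `-` in R, so the base is scaled before it is subtracted.

```r
# First value column and base of the three sample rows, with the slope each
# row should pick up according to the illustrative vectors in the question.
cols11 <- c(2, 1, 8)
base   <- c(2.4, 1.8, 5.4)
slope  <- c(0.0015, 0.002, 0.013)   # TV1 hour 01, TV5 hour 23, TV1 hour 02

# Same expression as Uplift0: multiplication happens before subtraction.
uplift0 <- cols11 - base * slope
uplift0   # 1.9964 0.9964 7.9298
```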
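
For the first row, where Channel is TV1 and Hour is 01:00:00: 2 - 2.4*0.0015
For the second row, where Channel is TV5 and Hour is 23:00:00: 1 - 1.8*0.002
For the third row, where Channel is TV1 and Hour is 02:00:00: 8 - 5.4*0.013

We paste the 'Channel' column together with a substring of the 'Hour' column ('nm1') and concatenate the 'TV1Slope' and 'TV5Slope' vectors ('TV15'). After removing the 'Slope' substring from the names of 'TV15' with sub, we match 'nm1' against those names and pick up the corresponding 'TV15' values ('val'). Using grep, we subset the columns whose names start with 'cols', do the calculation, and assign the result to new columns ('nm2').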

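# Build a lookup key per row: channel + two-digit hour, e.g. "TV101"
nm1 <- with(TOTAL, paste0(Channel, substr(Hour, 1, 2)))
# Combine the per-channel slope vectors into one named vector
TV15 <- c(TV1Slope, TV5Slope)
# Drop "Slope" from the names so they line up with the keys, then look up each row's slope
val <- TV15[match(nm1, sub('Slope', '', names(TV15)))]
# Columns holding the values to adjust, and the names of the result columns
indx <- grep('^cols', names(TOTAL))
nm2 <- paste0('Uplift', seq_along(indx) - 1)
# cols - base * slope, computed for all 'cols' columns at once
TOTAL[nm2] <- TOTAL[indx] - (TOTAL$base * val)
TOTAL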
#  Channel     Hour Category cols11 cols12 cols13 cols14 cols15 base   Uplift0
#1     TV1 01:00:00      New      2      5      4      5      6  2.4 1.9946026
#2     TV5 23:00:00      Old      1      5      3      9      7  1.8 0.9823184
#3     TV1 02:00:00      New      8      7      9      2      4  5.4 7.9619720
#   Uplift1  Uplift2  Uplift3  Uplift4
#1 4.994603 3.994603 4.994603 5.994603
#2 4.982318 2.982318 8.982318 6.982318
#3 6.961972 8.961972 1.961972 3.961972

Data used in the example above:

TOTAL <- structure(list(Channel = c("TV1", "TV5", "TV1"), Hour = c("01:00:00", 
"23:00:00", "02:00:00"), Category = c("New", "Old", "New"), cols11 = c(2L, 
1L, 8L), cols12 = c(5L, 5L, 7L), cols13 = c(4L, 3L, 9L), cols14 = c(5L, 
9L, 2L), cols15 = c(6L, 7L, 4L), base = c(2.4, 1.8, 5.4)), .Names = c("Channel", 
"Hour", "Category", "cols11", "cols12", "cols13", "cols14", "cols15", 
"base"), class = "data.frame", row.names = c(NA, -3L))

set.seed(24)
TV1Slope <- setNames(runif(24)/100, sprintf('TV1Slope%02d', 0:23))
set.seed(29)
TV5Slope <- setNames(runif(24)/100, sprintf('TV5Slope%02d', 0:23))
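
One thing worth checking with this kind of lookup is that `match` returns NA for any channel/hour pair that has no slope, and the NA then propagates silently into the Uplift columns. A minimal sketch of a guard, using hypothetical toy inputs (`df`, `slopes`, and the channel TV9 are made up for illustration and do not come from the question):

```r
# Toy inputs (hypothetical): TV9 has no slope vector, so its key cannot match.
df <- data.frame(Channel = c("TV1", "TV9"),
                 Hour    = c("01:00:00", "01:00:00"),
                 stringsAsFactors = FALSE)
slopes <- c(TV1Slope01 = 0.0015, TV5Slope01 = 0.0023)

# Same key construction and lookup as in the answer above.
key <- paste0(df$Channel, substr(df$Hour, 1, 2))          # "TV101" "TV901"
val <- slopes[match(key, sub("Slope", "", names(slopes)))]

# Report rows whose channel/hour pair found no slope before using `val`.
unmatched <- df[is.na(val), , drop = FALSE]
if (nrow(unmatched) > 0) {
  message("no slope for: ",
          paste(unmatched$Channel, unmatched$Hour, collapse = "; "))
}
```

Whether unmatched rows should be dropped, left as NA, or treated as an error depends on the data, so the sketch only reports them.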