R: using data.table to calculate a column that depends on previous rows


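I have multiple products, each with associated daily sales. I want to forecast the expected daily sales of each product from its running cumulative sales and the total sales I expect over a period of time.

The first table ("key") lists the expected total sales for each product, together with the daily sales I forecast based on how much has already been sold. For example, if the cumulative sales of product A are 650, I have sold 43% of the 1500 total, so I expect to sell 75 the next day, because 40% [...]

Not sure whether this will really help with an actual dataset of the given size.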

library(data.table)

#convert key into a list for fast lookup
keyLs <- lapply(split(key, by="Product"), 
    function(x) list(TotalSales=x[,TotalSales[1L]], 
                     Percent=x[,Percent], 
                     Forecast=x[,Forecast]))

#for each product, use recursion to calculate cumulative sales after finding the forecasted sales
futureSales <- data[, {
        byChar <- as.character(.BY)
        list(Date=Date[Time=="Future"], 
            Cum=Reduce(function(x, y) {
                pct <- x / keyLs[[byChar]]$TotalSales
                res <- x + keyLs[[byChar]]$Forecast[findInterval(pct, c(0, keyLs[[byChar]]$Percent))]
                if (res >= keyLs[[byChar]]$TotalSales) return(keyLs[[byChar]]$TotalSales)
                res
            },
            x=rep(0L, sum(Time=="Future")),
            init=sum(Sales[Time=="Past"]),
            accumulate=TRUE)[-1])
    },
    by=.(Product)]
futureSales 

#calculate other sales stats
futureSales[data, on=.(Date, Product)][,
    Cum := ifelse(is.na(Cum), cumsum(Sales), Cum),
    by=.(Product)][,
        ':=' (
            Percent.Actual = Cum / keyLs[[as.character(.BY)]]$TotalSales,
            Forecast = ifelse(Sales > 0, 0, c(0, diff(Cum)))
        ), by=.(Product)][]
#     Product Date Cum   Time Sales Percent.Actual Forecast
#  1:       A    1 190   Past   190      0.1266667        0
#  2:       A    2 355   Past   165      0.2366667        0
#  3:       A    3 488   Past   133      0.3253333        0
#  4:       A    4 608   Past   120      0.4053333        0
#  5:       A    5 683 Future     0      0.4553333       75
#  6:       A    6 758 Future     0      0.5053333       75
#  7:       A    7 833 Future     0      0.5553333       75
#  8:       A    8 908 Future     0      0.6053333       75
#  9:       A    9 958 Future     0      0.6386667       50
# 10:       B    1  72   Past    72      0.0960000        0
# 11:       B    2 130   Past    58      0.1733333        0
# 12:       B    3 193   Past    63      0.2573333        0
# 13:       B    4 244   Past    51      0.3253333        0
# 14:       B    5 304 Future     0      0.4053333       60
# 15:       B    6 349 Future     0      0.4653333       45
# 16:       B    7 394 Future     0      0.5253333       45
# 17:       B    8 439 Future     0      0.5853333       45
# 18:       B    9 484 Future     0      0.6453333       45
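
The lookup inside the `Reduce` call above can be looked at in isolation. The following is a minimal sketch, not the answer's own code: the `key` table is not shown in the post, so `total_sales`, `percent`, `forecast` and the helper `lookup_forecast` are hypothetical, chosen only so that 650 cumulative sales of product A maps to a forecast of 75 as described in the question. The `rightmost.closed = TRUE` argument is also an addition that the original code does not use.

```r
# Hypothetical key for a single product. The original `key` table is not
# shown, so these brackets are assumptions for illustration only.
total_sales <- 1500
percent  <- c(0.2, 0.4, 0.6, 0.8, 1.0)  # assumed upper bound of each bracket
forecast <- c(100, 90, 75, 50, 25)      # assumed daily forecast per bracket

# Map cumulative sales to the forecast of the bracket they fall into.
# rightmost.closed = TRUE keeps a share of exactly 1 in the last bracket
# instead of returning an index past the end of `forecast`.
lookup_forecast <- function(cum) {
  pct <- cum / total_sales
  forecast[findInterval(pct, c(0, percent), rightmost.closed = TRUE)]
}

lookup_forecast(650)   # 43% sold, falls in the third bracket
lookup_forecast(1500)  # 100% sold, stays in the last bracket
```

Prepending 0 to the bracket bounds, as the answer does with `c(0, keyLs[[byChar]]$Percent)`, makes `findInterval` return 1 for the first bracket, so its result can index `forecast` directly.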
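
Why does the Cum value restart at row 10?

Please edit to avoid the impression of a "wall of text". I think your data and final table are missing a Product column, which would answer @MKR's question.

The "final" table is merged into the previously built "data" table, which has a Product column. The Cum column resets for each product.

This answer gives a significant speed improvement, thank you! I would still benefit greatly if it could be faster. Also, when applying the modified script to my full dataset, I ran into a peculiar problem: once Cum reaches the TotalSales limit, the forecast for the next day is a negative number equal to the total forecast sales of all previous future time steps. Thoughts?

How do you handle cumulative sales exceeding 1500 within a year? This is not mentioned in the OP.

Great question. We assume that, as a small company selling unique products, we can only sell up to the forecast maximum before the available product runs out. So once 1500 units of product A are sold, the forecast becomes 0.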
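
The rule from the last comment (the forecast drops to 0 once TotalSales is reached) can be sketched with a plain loop. This is an illustration under assumptions, not a diagnosis of the negative values reported above: the function `forecast_path` and the key values are hypothetical, the key is picked so that the figures for product A in the output (608 rising to 958) are reproduced, and the starting cumulative sales are assumed not to exceed the total.

```r
# Hypothetical helper: roll cumulative sales forward day by day, capping at
# total_sales so the daily forecast can never be negative.
# Assumes start_cum <= total_sales.
forecast_path <- function(start_cum, n_days, total_sales, percent, forecast) {
  cum  <- numeric(n_days)
  prev <- start_cum
  for (i in seq_len(n_days)) {
    # Bracket for the share sold so far; a share of exactly 1 stays in the
    # last bracket because of rightmost.closed = TRUE.
    idx  <- findInterval(prev / total_sales, c(0, percent),
                         rightmost.closed = TRUE)
    prev <- min(prev + forecast[idx], total_sales)  # cap at the total
    cum[i] <- prev
  }
  # The daily forecast is the day-over-day increase, 0 once the cap is hit.
  data.frame(Day = seq_len(n_days), Cum = cum,
             Forecast = diff(c(start_cum, cum)))
}

# Assumed key values (not from the post).
percent  <- c(0.2, 0.4, 0.6, 0.8, 1.0)
forecast <- c(100, 90, 75, 50, 25)

product_a <- forecast_path(608, 5, 1500, percent, forecast)
near_cap  <- forecast_path(1400, 6, 1500, percent, forecast)
```

Because the cap is applied before the increase is computed, the derived `Forecast` column is the capped increase itself. The same function could be called per product inside a data.table `by=` group.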