Converting a data frame to a time series in R
Tags: r, time-series, dataset, data-cleaning, arima
I have a data frame:
> dsa[1:20]
Ordered.Item date Qty
1: 2011001FAM002025001 2019-06-01 19440.00
2: 2011001FAM002025001 2019-05-01 24455.53
3: 2011001FAM002025001 2019-04-01 16575.06
4: 2011001FAM002025001 2019-03-01 880.00
5: 2011001FAM002025001 2019-02-01 5000.00
6: 2011001FAM002035001 2019-04-01 175.00
7: 2011001FAM004025001 2019-06-01 2000.00
8: 2011001FAM004025001 2019-05-01 2500.00
9: 2011001FAM004025001 2019-04-01 3000.00
10: 2011001FAM012025001 2019-06-01 1200.00
11: 2011001FAM012025001 2019-04-01 1074.02
12: 2011001FAM022025001 2019-06-01 350.00
13: 2011001FAM022025001 2019-05-01 110.96
14: 2011001FAM022025001 2019-04-01 221.13
15: 2011001FAM022035001 2019-06-01 500.00
16: 2011001FAM022035001 2019-05-01 18.91
17: 2011001FAM027025001 2019-06-01 210.00
18: 2011001FAM028025001 2019-04-01 327.21
19: 2011001FBK005035001 2019-05-01 500.00
20: 2011001FBL001025001 2019-06-01 15350.00
> str(dsa)
Classes ‘data.table’ and 'data.frame': 830 obs. of 3 variables:
$ Ordered.Item: Factor w/ 435 levels "2011001FAM002025001",..: 1 1 1 1 1 2 3 3 3 4 ...
$ date : Date, format: "2019-06-01" "2019-05-01" "2019-04-01" ...
$ Qty : num 19440 24456 16575 880 5000 ...
- attr(*, ".internal.selfref")=<externalptr>
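This data contains SKUs and their monthly sales quantities.
Since I plan to forecast with ARIMA, I tried to convert the data frame to a time series, but I get a strange output: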
> timesr<-ts(data=dsa,start=c(12,2018),frequency = 12)
> head(timesr)
Ordered.Item date Qty
[1,] 1 18048 19440.00
[2,] 1 18017 24455.53
[3,] 1 17987 16575.06
[4,] 1 17956 880.00
[5,] 1 17928 5000.00
[6,] 2 17987 175.00
For per-SKU ARIMA modeling, you could try something like this:
library(forecast) # provides auto.arima() and forecast()
# Create data frame
dsa = read.table(text = '
ID Ordered.Item date Qty
1 2011001FAM002025001 2019-06-01 19440.00
2 2011001FAM002025001 2019-05-01 24455.53
3 2011001FAM002025001 2019-04-01 16575.06
4 2011001FAM002025001 2019-03-01 880.00
5 2011001FAM002025001 2019-02-01 5000.00
6 2011001FAM002035001 2019-04-01 175.00
7 2011001FAM004025001 2019-06-01 2000.00
8 2011001FAM004025001 2019-05-01 2500.00
9 2011001FAM004025001 2019-04-01 3000.00
10 2011001FAM012025001 2019-06-01 1200.00
11 2011001FAM012025001 2019-04-01 1074.02
12 2011001FAM022025001 2019-06-01 350.00
13 2011001FAM022025001 2019-05-01 110.96
14 2011001FAM022025001 2019-04-01 221.13
15 2011001FAM022035001 2019-06-01 500.00
16 2011001FAM022035001 2019-05-01 18.91
17 2011001FAM027025001 2019-06-01 210.00
18 2011001FAM028025001 2019-04-01 327.21
19 2011001FBK005035001 2019-05-01 500.00
20 2011001FBL001025001 2019-06-01 15350.00
', header = TRUE)
dsa$ID <- NULL # drop the row-number column
# Reshape to wide format: one row per date, one Qty column per SKU
dsa2 <- reshape(data = dsa, idvar = "date", v.names = "Qty", timevar = "Ordered.Item", direction = "wide")
dsa2 <- dsa2[order(as.Date(dsa2$date, "%Y-%m-%d")), ] # Sort by date
# Forecast for SKU 2011001FAM002025001
fit <- auto.arima(ts(dsa2$Qty.2011001FAM002025001))
fcast <- forecast(fit, h = 60) # forecast 60 periods ahead
plot(fcast)
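The ts() call above omits start and frequency, so the series is indexed 1, 2, 3, ... rather than by calendar month. Below is a minimal base-R sketch of building a month-aware series for a single SKU. It assumes the data are monthly, that every date is the first day of a month, and that months with no sales record should stay NA. The helper name sku_to_ts is hypothetical and not part of any package; the rows are copied from the question.

```r
# Hypothetical helper: turn one SKU's rows into a monthly ts object.
# Assumes `date` is always the first day of a month.
sku_to_ts <- function(df, sku) {
  rows <- df[df$Ordered.Item == sku, ]
  rows$date <- as.Date(rows$date)
  # Full monthly grid from the first to the last observed month
  months <- seq(min(rows$date), max(rows$date), by = "month")
  qty <- rows$Qty[match(months, rows$date)]  # NA where a month is missing
  first <- as.POSIXlt(months[1])
  ts(qty, start = c(first$year + 1900, first$mon + 1), frequency = 12)
}

# A few rows taken from the question's data
dsa <- data.frame(
  Ordered.Item = c(rep("2011001FAM002025001", 5), rep("2011001FAM012025001", 2)),
  date = as.Date(c("2019-06-01", "2019-05-01", "2019-04-01", "2019-03-01",
                   "2019-02-01", "2019-06-01", "2019-04-01")),
  Qty = c(19440.00, 24455.53, 16575.06, 880.00, 5000.00, 1200.00, 1074.02)
)

x <- sku_to_ts(dsa, "2011001FAM002025001")  # Feb 2019 to Jun 2019
y <- sku_to_ts(dsa, "2011001FAM012025001")  # May 2019 has no row, so it is NA
print(x)
print(y)
```

The resulting object can be passed to auto.arima() in place of the plain ts(...) call. With only a handful of observations per SKU, any fitted model should be treated with caution.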
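class(dsa2) is data.table/data.frame. My impression was that for ARIMA forecasting the whole data frame has to be converted to ts, doesn't it?
The code above does apply the ts() function (third line from the end).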
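On why converting the whole data frame gives the odd numbers in the question: as far as I understand it, ts() coerces a data frame to a numeric matrix with data.matrix(), so a factor column becomes its integer level codes and a Date column becomes the number of days since 1970-01-01. A small sketch using three rows from the question's data (the variable names df, m, qty and series are only for illustration):

```r
df <- data.frame(
  Ordered.Item = factor(c("2011001FAM002025001", "2011001FAM002025001",
                          "2011001FAM002035001")),
  date = as.Date(c("2019-06-01", "2019-05-01", "2019-04-01")),
  Qty = c(19440.00, 24455.53, 175.00)
)

m <- data.matrix(df)  # the coercion ts() applies to a data frame
print(m)

# The "strange" date values are day counts since 1970-01-01
print(as.Date(m[, "date"], origin = "1970-01-01"))

# Only the numeric Qty column of one SKU is the actual series
qty <- df$Qty[df$Ordered.Item == "2011001FAM002025001"]
series <- ts(rev(qty), start = c(2019, 5), frequency = 12)  # oldest value first
print(series)
```

So the whole data frame does not need to be a ts object. What the model needs is one numeric vector per SKU, ordered by date, wrapped in ts().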