Indicate ts frequency by group in R
Here are three groups:
timeseries=structure(list(Data = structure(c(10L, 14L, 18L, 22L, 26L, 29L,
32L, 35L, 38L, 1L, 4L, 7L, 11L, 15L, 19L, 23L, 27L, 30L, 33L,
36L, 39L, 2L, 5L, 8L, 12L, 16L, 20L, 24L, 28L, 31L, 34L, 37L,
40L, 3L, 6L, 9L, 13L, 17L, 21L, 25L), .Label = c("01.01.2018",
"01.01.2019", "01.01.2020", "01.02.2018", "01.02.2019", "01.02.2020",
"01.03.2018", "01.03.2019", "01.03.2020", "01.04.2017", "01.04.2018",
"01.04.2019", "01.04.2020", "01.05.2017", "01.05.2018", "01.05.2019",
"01.05.2020", "01.06.2017", "01.06.2018", "01.06.2019", "01.06.2020",
"01.07.2017", "01.07.2018", "01.07.2019", "01.07.2020", "01.08.2017",
"01.08.2018", "01.08.2019", "01.09.2017", "01.09.2018", "01.09.2019",
"01.10.2017", "01.10.2018", "01.10.2019", "01.11.2017", "01.11.2018",
"01.11.2019", "01.12.2017", "01.12.2018", "01.12.2019"), class = "factor"),
client = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("Horns", "Kornev"), class = "factor"), stuff = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("chickens",
"hooves", "Oysters"), class = "factor"), Sales = c(374L,
12L, 120L, 242L, 227L, 268L, 280L, 419L, 12L, 172L, 336L,
117L, 108L, 150L, 90L, 117L, 116L, 146L, 120L, 211L, 213L,
67L, 146L, 118L, 152L, 122L, 201L, 497L, 522L, 65L, 268L,
441L, 247L, 348L, 445L, 477L, 62L, 226L, 476L, 306L)), .Names = c("Data",
"client", "stuff", "Sales"), class = "data.frame", row.names = c(NA,
-40L))
As we can see, the groups have different start dates:

Kornev Oysters 01.03.2018 - 01.06.2019
Horns hooves 01.07.2019 - 01.07.2020
Horns chickens 01.04.2017 - 01.02.2018

In my code, I create a forecast for each group:
# first, the grouping variable
timeseries$group
listed_ts <- lapply(listed,
function(x) ts(x[["Sales"]], start = c(2017, 1), frequency = 12) )
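To see why a single hard-coded start is a problem, here is a minimal sketch using only the Kornev / Oysters sales copied from the data above. The variable names (oysters, fixed, correct) are illustrative and not part of the original code.

```r
# Sales for the Kornev / Oysters group, copied from the data above
# (16 monthly values, 01.03.2018 to 01.06.2019).
oysters <- c(117, 108, 150, 90, 117, 116, 146, 120,
             211, 213, 67, 146, 118, 152, 122, 201)

# Hard-coded start, as in the lapply() call above.
fixed <- ts(oysters, start = c(2017, 1), frequency = 12)

# Start taken from the group's first date (March 2018).
correct <- ts(oysters, start = c(2018, 3), frequency = 12)

start(fixed)    # 2017 1
end(fixed)      # 2018 4
start(correct)  # 2018 3
end(correct)    # 2019 6
```

With the fixed start, the series is shifted 14 months into the past, so any forecast built on it would be labelled with the wrong months.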
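Here, I think a better option is Map, where you can provide start as a list of vectors, i.e. Map(function(x, y) ts(x[["Sales"]], start = y, frequency = 12), listed, list(c(2017, 1), c(2018, 1), c(2017, 1))). Since the start is based on the first element of each group, extract the year and month as a list and then pass it to Map, i.e. library(dplyr); library(lubridate); timeseries %>% group_by(group) %>% summarise(Data = dmy(first(Data)), yearmonth = list(c(year(Data), month(Data)))) %>% pull(yearmonth). Note that the dates are in day.month.year format, so dmy() is the matching parser.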
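The Map idea from the comment can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame is rebuilt compactly from the values in the question, the group column is assumed to be the client and stuff columns pasted together (the original assignment is cut off), and the helper name starts is made up for this example.

```r
# Rebuild the question's data compactly (same groups, dates and Sales values).
n <- c(11, 16, 13)  # rows per group, in the order they appear in the data
timeseries <- data.frame(
  Data = format(c(seq(as.Date("2017-04-01"), by = "month", length.out = 11),
                  seq(as.Date("2018-03-01"), by = "month", length.out = 16),
                  seq(as.Date("2019-07-01"), by = "month", length.out = 13)),
                "%d.%m.%Y"),
  client = rep(c("Horns", "Kornev", "Horns"), times = n),
  stuff  = rep(c("chickens", "Oysters", "hooves"), times = n),
  Sales  = c(374, 12, 120, 242, 227, 268, 280, 419, 12, 172, 336,
             117, 108, 150, 90, 117, 116, 146, 120, 211, 213,
             67, 146, 118, 152, 122, 201,
             497, 522, 65, 268, 441, 247, 348, 445, 477, 62, 226, 476, 306)
)

# ASSUMPTION: the grouping variable combines client and stuff.
timeseries$group <- paste(timeseries$client, timeseries$stuff)

listed <- split(timeseries, timeseries$group)

# One c(year, month) vector per group, taken from that group's earliest date.
# Computing it from each element of `listed` keeps starts and data aligned.
starts <- lapply(listed, function(x) {
  first_date <- min(as.Date(as.character(x$Data), format = "%d.%m.%Y"))
  as.integer(c(format(first_date, "%Y"), format(first_date, "%m")))
})

# Map() walks over both lists in parallel and keeps the group names.
listed_ts <- Map(function(x, s) ts(x[["Sales"]], start = s, frequency = 12),
                 listed, starts)

sapply(listed_ts, start)  # one column per group: year, then month
```

Because starts is derived from listed itself, the two lists are guaranteed to be in the same order, which avoids any mismatch between group order and start-date order.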
This might help:
# determine all groups
groups <- unique(timeseries$group)
# find starting date per group and save them as a list of elements c('YEAR','Month')
timeseries$date <- as.Date(as.character(timeseries$Data), '%d.%m.%Y')
timeseries <- timeseries[order(timeseries$date),]
start_dates <- format(timeseries$date[match(groups, timeseries$group)], "%Y %m")
start_dates <- strsplit(start_dates, ' ')
# Back to your code
# now the list
listed <- split(timeseries,timeseries$group)
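# Edited the lapply function in order to consider the starting dates.
# split() orders its output by group name, while start_dates follows the order of groups,
# so each start date is looked up by group name rather than by position.
# to have a smaller output, I post the str(listed)
listed_ts <- lapply(seq_along(listed),
function(k) ts(listed[[k]][["Sales"]],
start = as.integer(start_dates[[match(names(listed)[k], groups)]]), frequency = 12) )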