R: How do I create an age-adjusted time series?
I have hospitalization data for the past 13 months. My data contains the variables/columns: ID, age, admission date, discharge date, length of stay, age group (5-year increments starting at 0:4), average length of stay by age group, and average length of stay by admission date.

I have already created ACF and PACF plots:
```
## plot the series
plot(hospitalized$`Hospital Admission Date`, hospitalized$avg_stay_pDate, type = "l",
     ylab = "Days Spent in Hospital", xlab = "Date Admitted", cex.lab = 1.4)
## plot ACF of data
forecast::Acf(hospitalized$avg_stay_pDate, main = "", cex.lab = 1.4,
              lag.max = 30, ylab = "ACF")
## plot partial ACF
forecast::Acf(hospitalized$avg_stay_pDate, main = "", cex.lab = 1.4,
              lag.max = 30, ylab = "PACF",
              type = "partial")
```
From these plots I read off AR(5) and MA(3), although admittedly I don't really know what I'm talking about here. I just know that those are the lags after which the PACF and ACF, respectively, stop being significant.
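To make "where the plots become insignificant" concrete, here is a minimal sketch that lists the lags whose sample ACF or PACF falls outside the usual approximate 95% band of ±1.96/√n. It uses base R's `acf` and `pacf` on a simulated series `y`, which is only a stand-in for `hospitalized$avg_stay_pDate` and is an assumption, not the real data.

```r
# Simulated AR(1) series standing in for the real data (assumption)
set.seed(1)
y <- arima.sim(model = list(ar = 0.7), n = 300)

# Approximate 95% significance bound drawn on ACF/PACF plots
bound <- qnorm(0.975) / sqrt(length(y))

# Sample ACF (dropping lag 0, which is always 1) and PACF up to lag 30
acf_vals  <- as.numeric(acf(y, lag.max = 30, plot = FALSE)$acf)[-1]
pacf_vals <- as.numeric(pacf(y, lag.max = 30, plot = FALSE)$acf)

# Lags whose correlation lies outside the band
sig_acf  <- which(abs(acf_vals)  > bound)
sig_pacf <- which(abs(pacf_vals) > bound)

print(sig_acf)
print(sig_pacf)
```

Roughly speaking, a PACF that cuts off after lag p points towards an AR(p) term and an ACF that cuts off after lag q points towards an MA(q) term. With 30 lags, one or two isolated crossings are expected by chance alone.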
I compared AIC and BIC for AR, MA, ARMA, ARIMA, and SARIMA models. SARIMA had the lowest values, so I chose SARIMA.
```
## Find the best model (BIC)
BIC(lm_mod)
ar1_simp$bic
ma1_simp$bic
arma1_simp$bic
arima1_simp$bic
sarima1_simp$bic
## Find the best model (AIC)
AIC(lm_mod)
ar1_simp$aic
ma1_simp$aic
arma1_simp$aic
arima1_simp$aic
sarima1_simp$aic
```
Output:
[1] 5701.889
[1] 5149.502
[1] 5149.502
[1] 5149.502
[1] 5143.878
[1] 5082.838
[1] 5688.029
[1] 5103.301
[1] 5103.301
[1] 5103.301
[1] 5102.309
[1] 5032.39
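The code that produced the fitted objects above is not shown, so the following is only a sketch of how such a comparison might be set up. It fits three candidate models with base R's `stats::arima` on a simulated series and tabulates AIC and BIC. The series `y`, the model orders, and the list names are all assumptions chosen for illustration.

```r
# Simulated series standing in for the real data (assumption)
set.seed(123)
y <- arima.sim(model = list(ar = 0.6), n = 200)

# Candidate models; the orders are illustrative only
fits <- list(
  ar1    = arima(y, order = c(1, 0, 0)),
  ma1    = arima(y, order = c(0, 0, 1)),
  arma11 = arima(y, order = c(1, 0, 1))
)

# Collect both information criteria in one table, lowest BIC first
ic <- data.frame(
  model = names(fits),
  AIC   = sapply(fits, AIC),
  BIC   = sapply(fits, BIC)
)
ic <- ic[order(ic$BIC), ]
print(ic)
```

Objects returned by `forecast::Arima` also carry `$aic` and `$bic` fields, which is what the comparison above reads. Information criteria are only comparable between models fitted to the same response with the same order of differencing.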
Then I tried to use my model for prediction:
```
## Subset the data
hosp.test = hosp_ts[563:750]
hosp.train = hosp_ts[1:562]
## Fit model to subset of the data
train.mod = forecast::Arima(hosp.train,
                            order = c(5, 1, 3),
                            seasonal = list(order = c(5, 1, 3), period = 40))
## Forecast
# h = steps ahead
pred = forecast::forecast(train.mod, h = 188)
## plot predictions
plot(pred, xlab = "Week", ylab = "Google Index Score", cex.lab = 1.4,
     ylim = c(0, 45), xlim = c(0, 750))
## calculate errors/bias
sqrt(mean((hosp.test - pred$mean)^2)) ### RPMSE
mean(pred$mean - hosp.test) ### bias
sd(hospitalized$avg_stay_pDate)
```
My RPMSE is 3.97, my bias is -2.192, and the standard deviation of the series is 10.72. I don't think my model is bad, except that I can't figure out how large to make the seasonal period. After looking at the graph I thought maybe 60, but RStudio said it wouldn't let me create an 8 GB vector (ha!). What I actually need is a predictive time series that is adjusted for age, and I don't really see how to do that.
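One possible reading of "adjusted for age" is direct standardization: compute the mean stay per admission date within each age group, then average those means using one fixed set of age-group weights, so that day-to-day changes in the age mix do not move the series. The sketch below illustrates that idea in base R. The toy data frame and its column names (`admit_date`, `age_group`, `stay`) are hypothetical, and choosing the overall age-group shares as reference weights is an assumption.

```r
# Hypothetical toy data; column names and values are assumptions
set.seed(42)
n <- 2000
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 60)
toy <- data.frame(
  admit_date = sample(dates, n, replace = TRUE),
  age_group  = sample(c("0-4", "5-9", "10-14"), n, replace = TRUE,
                      prob = c(0.5, 0.3, 0.2)),
  stringsAsFactors = FALSE
)
# Older groups stay longer on average in this toy example
lambda <- c("0-4" = 3, "5-9" = 5, "10-14" = 8)
toy$stay <- rpois(n, lambda = lambda[toy$age_group])

# Fixed reference weights: overall share of each age group
w <- prop.table(table(toy$age_group))

# Mean stay for every (date, age group) cell
cell <- aggregate(stay ~ admit_date + age_group, data = toy, FUN = mean)
cell$w <- as.numeric(w[as.character(cell$age_group)])

# Weighted mean per date; renormalise in case a group is absent that day
adj <- sapply(split(cell, cell$admit_date),
              function(d) sum(d$stay * d$w) / sum(d$w))

# Age-standardized series, ordered by date
adj_ts <- ts(as.numeric(adj))
print(head(adj_ts))
```

A series like `adj_ts` could then be modelled in place of the unadjusted daily mean. An alternative is to keep the raw series and pass an age summary, such as the daily mean age, through the `xreg` argument of `forecast::Arima`. Which of the two fits the actual requirement depends on what the adjusted forecast is meant to answer.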