R: Fitting a polynomial curve to time series data


I have a time series plot with the monthly frequency of articles on the y-axis. The data looks like this:

     Count.V       Date      Month       Week       Year
2637       6 2006-01-02 2006-01-01 2006-01-02 2006-01-01
406        4 2006-01-03 2006-01-01 2006-01-02 2006-01-01
543        4 2006-01-04 2006-01-01 2006-01-02 2006-01-01
998        3 2006-01-05 2006-01-01 2006-01-02 2006-01-01
1400       4 2006-01-06 2006-01-01 2006-01-02 2006-01-01
2218       4 2006-02-01 2006-02-01 2006-01-30 2006-01-01
2792       6 2006-02-02 2006-02-01 2006-01-30 2006-01-01
2488      10 2006-02-03 2006-02-01 2006-01-30 2006-01-01
954        8 2006-02-04 2006-02-01 2006-01-30 2006-01-01
2622       3 2006-02-06 2006-02-01 2006-02-06 2006-01-01
2321      11 2006-02-07 2006-02-01 2006-02-06 2006-01-01
2452      10 2006-03-21 2006-03-01 2006-03-20 2006-01-01
2267       5 2006-03-22 2006-03-01 2006-03-20 2006-01-01
1408       3 2006-03-23 2006-03-01 2006-03-20 2006-01-01
2602       3 2006-03-24 2006-03-01 2006-03-20 2006-01-01
2489       5 2006-03-25 2006-03-01 2006-03-20 2006-01-01
2771       1 2006-03-27 2006-03-01 2006-03-27 2006-01-01
I use ggplot2 to plot it:

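library(ggplot2)
library(scales)  # provides date_format()

# Sum Count.V within each month and draw the monthly totals as a line
MyPlot <- ggplot(data = df, aes(x = Month, y = Count.V)) +
    stat_summary(fun.y = sum, geom = "line") +
    scale_x_date(labels = date_format("%m-%y"), breaks = "3 months")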
Something is not right:

What am I doing wrong?

Edit: added a part of the data frame that covers several months:

> dput(df)
structure(list(Count.V = c(6L, 4L, 4L, 3L, 4L, 5L, 2L, 8L, 6L,
5L, 12L, 1L, 2L, 3L, 4L, 2L, 4L, 4L, 4L, 6L, 6L, 2L, 4L,
4L, 6L, 10L, 8L, 3L, 11L, 8L, 13L, 3L, 9L, 7L, 4L, 7L, 9L, 5L,
4L, 5L, 6L, 5L, 9L, 5L, 11L, 4L, 6L, 2L, 8L, 3L, 5L, 4L, 3L,
5L, 4L, 2L, 3L, 3L, 3L, 8L, 6L, 1L, 3L, 10L, 5L, 3L, 3L, 3L, 5L,
1L, 8L, 4L, 3L, 2L, 1L, 4L, 4L, 5L, 7L, 8L, 3L, 4L, 7L, 5L,
3L, 3L, 4L, 6L, 3L, 2L, 3L, 2L, 5L, 6L, 4L, 5L, 8L, 3L, 4L),
Date = structure(c(13150, 13151, 13152, 13153, 13154, 13155,
13157, 13158, 13159, 13161, 13162, 13164, 13165, 13166, 13168, 
13169, 13171, 13172, 13173, 13174, 13175, 13176, 13178, 13179, 
13180, 13181, 13182, 13183, 13185, 13186, 13187, 13188, 13189, 
13190, 13192, 13193, 13194, 13195, 13196, 13197, 13199, 13200, 
13201, 13202, 13203, 13204, 13206, 13207, 13208, 13209, 13210, 
13211, 13214, 13215, 13216, 13217, 13218, 13220, 13221, 13222, 
13223, 13224, 13225, 13227, 13228, 13229, 13230, 13231, 13232, 
13234, 13235, 13236, 13237, 13238, 13239, 13241, 13242, 13243, 
13244, 13245, 13246, 13248, 13249, 13250, 13251, 13252, 13253, 
13256, 13257, 13258, 13259, 13260, 13262, 13263, 13264, 13265, 
13266, 13267, 13270, 13271), class = "Date"), Month = structure(c(13149,
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13180, 13180, 13180, 13180, 
13180, 13180, 13180, 13180, 13180, 13180, 13180, 13180, 13180, 
13180, 13180, 13180, 13180, 13180, 13180, 13180, 13180, 13180, 
13180, 13180, 13208, 13208, 13208, 13208, 13208, 13208, 13208, 
13208, 13208, 13208, 13208, 13208, 13208, 13208, 13208, 13208, 
13208, 13208, 13208, 13208, 13208, 13208, 13208, 13208, 13208, 
13208, 13239, 13239, 13239, 13239, 13239, 13239, 13239, 13239, 
13239, 13239, 13239, 13239, 13239, 13239, 13239, 13239, 13239, 
13239, 13239, 13239, 13239, 13239, 13239, 13239, 13269, 13269
), class = "Date"), Week = structure(c(13150, 13150, 13150,
13150, 13150, 13150, 13157, 13157, 13157, 13157, 13157, 13164, 
13164, 13164, 13164, 13164, 13171, 13171, 13171, 13171, 13171, 
13171, 13178, 13178, 13178, 13178, 13178, 13178, 13185, 13185, 
13185, 13185, 13185, 13185, 13192, 13192, 13192, 13192, 13192, 
13192, 13199, 13199, 13199, 13199, 13199, 13199, 13206, 13206, 
13206, 13206, 13206, 13206, 13213, 13213, 13213, 13213, 13213, 
13220, 13220, 13220, 13220, 13220, 13220, 13227, 13227, 13227, 
13227, 13227, 13227, 13234, 13234, 13234, 13234, 13234, 13234, 
13241, 13241, 13241, 13241, 13241, 13241, 13248, 13248, 13248, 
13248, 13248, 13248, 13255, 13255, 13255, 13255, 13255, 13262, 
13262, 13262, 13262, 13262, 13262, 13262, 13269, 13269), class = "Date"),
Year = structure(c(13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149,
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 13149, 
13149, 13149, 13149, 13149), class = "Date")), .Names = c("Count.V",
"Date", "Month", "Week", "Year"), row.names = c(2637L, 406L,
543L, 998L, 1400L, 2667L, 1211L, 140L, 737L, 545L, 2573L, 978L,
2119L, 842L, 1866L, 1002L, 1956L, 1229L, 2278L, 1889L, 1285L,
1020L, 964L, 1584L, 2218L, 2792L, 2488L, 954L, 2622L, 2321L,
796L, 501L, 294L, 2476L, 2541L, 642L, 177L, 1222L, 1249L, 990L,
2776L, 580L, 1181L, 1792L, 431L, 224L, 214L, 679L, 1601L, 1655L,
645L, 2785L, 1507L, 1580L, 1274L, 2083L, 157L, 2491L, 2733L,
1533L, 2332L, 328L, 1995L, 1598L, 2452L, 2267L, 1408L, 2602L,
2489L, 2771L, 232323L, 1714L, 907L, 1522L, 882L, 2727L, 844L, 2105L,
2530L, 1160L, 2075L, 1435L, 821L, 1284L, 2406L, 2357L, 1499L,
2145L, 1539L, 1890L, 1856L, 27L, 887L, 1500L, 812L, 1677L, 1965L,
2580L, 823L, 1482L), class = "data.frame")

Try using mean instead of sum, like this:

library(ggplot2)
library(scales)  # provides date_format()

# Note: fun.y is named fun in ggplot2 >= 3.3.0
ggplot(data = df, aes(x = Month, y = Count.V)) +
    stat_summary(fun.y = mean, geom = "line") +
    stat_smooth(method = "lm", formula = y ~ poly(x, 3), size = 1) +
    geom_point() +
    scale_x_date(labels = date_format("%m-%y"), breaks = "3 months")

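Thanks, but when I convert Volkskrant.df$Date

Try converting the date variable first: Volkskrant.df$Date

@MamounBenghezal Yes, I am a bit slow today. However, the date data in my data frame is already in Date format. I think the actual problem with the curve is that stat_summary(fun.y = sum, geom = "line") aggregates the counts of the Count.V variable for each month, and stat_smooth cannot handle that. For example, if I plot the graph without stat_summary, it gives the right curve.

@MamounBenghezal, I have updated the question to include more values of df$Month.

@MamounBenghezal Added dput(df), although I do not get an error message when my data is in Date format. It is just that the polynomial curve does not represent the data correctly (it is fitted to the per-$Date values, whereas the line shows the aggregated counts per $Month).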
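
The diagnosis in the comments above (the line shows monthly totals while the polynomial is fitted to the individual daily rows) suggests one possible workaround: aggregate to one row per month first, then fit the curve on those totals. This is only a sketch, not the method from the answer. The data frame below is made-up toy data, and the names monthly, x and fitted are introduced purely for illustration.

```r
# Toy data (made up for illustration): one count per day for a year.
set.seed(1)
dates <- seq(as.Date("2006-01-01"), as.Date("2006-12-31"), by = "day")
df <- data.frame(
  Date    = dates,
  Count.V = rpois(length(dates), lambda = 5)
)

# First day of each month, stored as a Date like the Month column in the question.
df$Month <- as.Date(format(df$Date, "%Y-%m-01"))

# Aggregate first: one row per month holding the summed count.
monthly <- aggregate(Count.V ~ Month, data = df, FUN = sum)

# Fit the cubic polynomial on the monthly totals, not on the daily rows.
monthly$x <- as.numeric(monthly$Month)
fit <- lm(Count.V ~ poly(x, 3), data = monthly)
monthly$fitted <- predict(fit)
```

Because the line and the fit now come from the same twelve monthly totals, passing monthly to ggplot() with geom_line() and stat_smooth(method = "lm", formula = y ~ poly(x, 3)) should draw both layers on the same scale.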