Python: computing seasonality and trend of time series data for each cluster


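I have this time series data, and I want to use "modal_price" to determine the trend and seasonality type (multiplicative or additive) for each cluster of APMC and Commodity. The dataset has about 60,000 rows like the ones below; the APMC and Commodity values repeat while the date changes. The dataset looks like this: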

            APMC     Commodity       qtl_weight  min_price  max_price  modal_price  district_name  Year  Month
date
2014-12-01  Akole    bajri                   40       1375       1750         1563  Ahmadnagar     2014     12
2014-12-01  Akole    paddy-unhusked         346       1400       1800         1625  Ahmadnagar     2014     12
2014-12-01  Akole    wheat                   55       1500       1900         1675  Ahmadnagar     2014     12
2014-12-01  Akole    bhagar/vari             59       2000       2600         2400  Ahmadnagar     2014     12
2014-12-01  Akole    gram                     9       3200       3300         3235  Ahmadnagar     2014     12
2014-12-01  Jamkhed  cotton               44199       3950       4033         3991  Ahmadnagar     2014     12
2014-12-01  Jamkhed  bajri                  846       1300       1488         1394  Ahmadnagar     2014     12
2014-12-01  Jamkhed  wheat(husked)          155       1879       2231         2055  Ahmadnagar     2014     12
2014-12-01  Kopar    gram                   421       1983       2698         2463  Ahmadnagar     2014     12
2014-12-01  Kopar    greengram               18       6734       7259         6759  Ahmadnagar     2014     12
2014-12-01  Kopar    soybean               1507       2945       3247         3199  Ahmadnagar     2014     12
2016-11-01  Sanga    wheat(husked)          222       1730       2173         1994  Ahmadnagar     2016     11

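I tried a pivot table (with APMC, Commodity, and date as the index), but that does not help me compute the mean for each cluster (APMC, Commodity), which I need for the trend. I just need to know how to compute the mean of "modal_price" for each cluster (APMC, Commodity) and add it as a column to the DataFrame or pivot table.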
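For reference, the two decomposition types mentioned in the question are conventionally written as follows, where $y_t$ is the observed modal price, $T_t$ the trend, $S_t$ the seasonal component, and $R_t$ the remainder (these symbols are introduced here for illustration and do not appear in the original post):

```latex
\text{additive:}\quad y_t = T_t + S_t + R_t
\qquad
\text{multiplicative:}\quad y_t = T_t \times S_t \times R_t
```

As a rough rule of thumb, an additive form fits when the seasonal swing stays about the same size as the price level changes, and a multiplicative form fits when the swing grows or shrinks in proportion to the level.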

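Perhaps groupby will give you the information you need for the trend, and transform will then let you project it back onto the same index? Something like: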

# group by your cluster (one group per Year, APMC, and Commodity)
g = df.groupby(["Year", "APMC", "Commodity"])
# compute the mean per cluster, broadcast back to the original row index
trend = g["modal_price"].transform("mean")
# store it as a new column alongside the original data
df["trend"] = trend
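To see what the snippet above produces, here is a minimal self-contained sketch. The rows are made-up sample data in the same shape as the question's table (only the columns needed for the grouping are included), so the numbers are illustrative and not taken from the real dataset.

```python
import pandas as pd

# Hypothetical sample rows shaped like the question's data (not the real dataset).
df = pd.DataFrame(
    {
        "APMC": ["Akole", "Akole", "Akole", "Jamkhed", "Jamkhed"],
        "Commodity": ["bajri", "bajri", "wheat", "bajri", "bajri"],
        "modal_price": [1563, 1600, 1675, 1394, 1400],
        "Year": [2014, 2014, 2014, 2014, 2015],
    },
    index=pd.to_datetime(
        ["2014-11-01", "2014-12-01", "2014-12-01", "2014-12-01", "2015-01-01"]
    ).rename("date"),
)

# transform("mean") returns one value per original row, so the result
# lines up with df and can be assigned directly as a new column.
df["trend"] = (
    df.groupby(["Year", "APMC", "Commodity"])["modal_price"].transform("mean")
)

print(df)
```

Every row in the same (Year, APMC, Commodity) group receives the same mean, which is why transform is used here instead of agg: agg would collapse each group to a single row.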

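What if I have to compute the same thing as a rolling mean for each season?

Modified the above to include "Year" in the groupby.

Shouldn't Month be included as well, since the trend is computed month by month? And what about a rolling mean()? How do I do that? Some commodities only have 3-4 months of data; how do I set the window size for those cases?

The trend can be on whatever basis you want; however, your question asked for the "mean of each cluster (APMC, Commodity) per year (to compute the trend)". If you want a rolling mean, please edit your question or ask a new one. On the other hand, if you want to group by a date attribute (year, month, day of week), you do not need to project a column for it; you can do something like df.groupby(df.column.dt.year). That saves you a column or three.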
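The rolling-mean question from the comments was not answered in the thread. One possible approach, sketched under assumptions: apply rolling inside each cluster via transform, and pass min_periods so that clusters with only a few months of data still get values. The sample data, the WINDOW value of 3, and the column name rolling_trend are all hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical monthly prices for two clusters; the second has only 3 months.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01",
             "2015-01-01", "2015-02-01", "2015-03-01"]
        ),
        "APMC": ["Akole"] * 4 + ["Kopar"] * 3,
        "Commodity": ["bajri"] * 4 + ["gram"] * 3,
        "modal_price": [100.0, 110.0, 120.0, 130.0, 200.0, 220.0, 240.0],
    }
).sort_values(["APMC", "Commodity", "date"])

WINDOW = 3  # assumed window length in rows (months); tune for your data

# Rolling mean within each cluster. min_periods=1 lets the window start
# with however many observations exist, so short series are not all NaN.
df["rolling_trend"] = (
    df.groupby(["APMC", "Commodity"])["modal_price"]
      .transform(lambda s: s.rolling(WINDOW, min_periods=1).mean())
)

# Grouping by a date attribute directly, without a helper "Year" column.
yearly_mean = df.groupby(df["date"].dt.year)["modal_price"].mean()

print(df)
print(yearly_mean)
```

In the question's data the dates are the index, so the last grouping would use df.index.year instead of df["date"].dt.year. The window counts rows, not calendar months, so it assumes one row per month per cluster with no gaps.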