Python: Grouping a DataFrame by a user-defined span of months
Tags: python, pandas, group-by, pandas-groupby


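What is the best way to group data into winters that run from October to April? With an evenly spaced frequency I cannot get seasonal sums over the winter months for the seasons 1972/1973, 1973/1974, and so on. It is probably a small thing, but I cannot see how to do it without starting to write an overkill solution.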

                 sd_x       sd_y
1972-10-31   0.000000   0.709677
1972-11-30   1.720838   4.366667
1972-12-31  15.893438   5.600000
1973-01-31   6.256230   6.548387
1973-02-28   0.653714  53.142857
1973-03-31   0.000000  70.354839
1973-04-30   0.000000  11.700000
1973-10-31   0.000000   0.096774
1973-11-30   0.000000   4.266667
1973-12-31   0.394652  53.419355
1974-01-31   4.540915  46.645161
1974-02-28   2.978056  35.571429
1974-03-31   0.000000   4.967742
1974-04-30   0.000000   0.000000
1974-10-31   0.000000   0.064516
1974-11-30   0.000000   1.000000
1974-12-31   5.585954  20.096774
1975-01-31  50.498147  24.580645
1975-02-28  35.906097  22.000000
1975-03-31   0.457109   5.483871
1975-04-30   0.000000   0.433333

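Use pd.offsets.MonthBegin to shift the months back by 4. Because the index holds month-end dates, subtracting MonthBegin(5) lands on the first day of the month four months earlier: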

shifted_months = df.index - pd.offsets.MonthBegin(5)
shifted_months

DatetimeIndex(['1972-06-01', '1972-07-01', '1972-08-01', '1972-09-01',
               '1972-10-01', '1972-11-01', '1972-12-01', '1973-06-01',
               '1973-07-01', '1973-08-01', '1973-09-01', '1973-10-01',
               '1973-11-01', '1973-12-01', '1974-06-01', '1974-07-01',
               '1974-08-01', '1974-09-01', '1974-10-01', '1974-11-01',
               '1974-12-01'],
              dtype='datetime64[ns]', freq=None)
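The offset of 5 can look inconsistent with a shift of 4 months, so here is a small sketch of how MonthBegin behaves on month-end timestamps. The variable names are illustrative only and the two dates are taken from the question's index.

```python
import pandas as pd

# Month-end stamps like the ones in the question's index.
october_end = pd.Timestamp("1972-10-31")
april_end = pd.Timestamp("1973-04-30")

# The first MonthBegin step only rolls back to the 1st of the same month ...
one_step = october_end - pd.offsets.MonthBegin(1)

# ... so five steps end up four calendar months earlier.
shifted_october = october_end - pd.offsets.MonthBegin(5)
shifted_april = april_end - pd.offsets.MonthBegin(5)

print(one_step, shifted_october, shifted_april)
```

October and the following April both land in 1972 after the shift, which is what lets .year act as a season label.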
Then we can use the .year attribute of the shifted index with groupby and sum:

df.groupby(shifted_months.year).sum()

           sd_x        sd_y
1972  24.524220  152.422427
1973   7.913623  144.967128
1974  92.447307   73.659139
We can relabel the index as seasons with rename:

df.groupby(shifted_months.year).sum().rename(lambda x: '{}/{}'.format(x, x + 1))

                sd_x        sd_y
1972/1973  24.524220  152.422427
1973/1974   7.913623  144.967128
1974/1975  92.447307   73.659139
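For a season that starts in some other month, the same grouping can be sketched with plain year/month arithmetic instead of date offsets. This is a variation, not the approach used above. The helper name winter_sums and its start_month parameter are hypothetical, and the frame holds only the first winter from the question plus October 1973.

```python
import pandas as pd

def winter_sums(df, start_month=10):
    """Sum each column per season beginning in start_month (hypothetical helper)."""
    idx = df.index
    # Months before the start month belong to the season that began
    # in the previous calendar year.
    season_start = idx.year - (idx.month < start_month).astype(int)
    sums = df.groupby(season_start).sum()
    # Label each season as "start year/next year".
    return sums.rename(lambda y: "{}/{}".format(y, y + 1))

dates = pd.to_datetime([
    "1972-10-31", "1972-11-30", "1972-12-31", "1973-01-31",
    "1973-02-28", "1973-03-31", "1973-04-30", "1973-10-31",
])
df = pd.DataFrame(
    {
        "sd_x": [0.0, 1.720838, 15.893438, 6.256230, 0.653714, 0.0, 0.0, 0.0],
        "sd_y": [0.709677, 4.366667, 5.6, 6.548387, 53.142857,
                 70.354839, 11.7, 0.096774],
    },
    index=dates,
)

result = winter_sums(df)
print(result)
```

Unlike the offset version, this assigns every month of the year to a season, so it assumes the frame contains only the months you want summed, as in the question.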

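What is the expected output?
The sum of sd_x for 1972/1973, the sum of sd_x for 1973/1974, and so on. I want the sum of sd for every winter from October to April. The index could then look like 1973, 1974, 1975, ..., but each year should contain the values from October to April.
You were too quick ;-) Sadly I cannot accept two answers. This answer has the better explanation, while MaxU's answer is a nice one-liner! Thank you both for the quick answers.
@Manuel Too bad for us, too. :-)
I prefer df.index - pd.DateOffset(months=4)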
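The DateOffset variant from the last comment can be compared with the MonthBegin shift in a short sketch. The four sample dates come from the question's index. The variable names are illustrative.

```python
import pandas as pd

index = pd.to_datetime(["1972-10-31", "1973-02-28", "1973-04-30", "1973-10-31"])

# Season label from the answer: snap to the month start, then go back four months.
via_month_begin = (index - pd.offsets.MonthBegin(5)).year

# Season label from the comment: subtract four calendar months directly.
via_date_offset = (index - pd.DateOffset(months=4)).year

print(list(via_month_begin))
print(list(via_date_offset))
```

Both produce the same year labels for these dates. The DateOffset form does not depend on the dates being month-ends.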