How do I normalize the values for each day in a pandas DataFrame?


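I have a DataFrame containing a set of values measured at different times. I want to normalize the values taken on each day so that they sum to one. How can I do this?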

Specifically, I have data of the following form:

                              value
datetime                           
2017-03-08 14:36:06.616166  1002.49
2017-03-08 15:06:07.661818   992.68
2017-03-08 15:36:08.597443   984.34
2017-03-08 16:06:09.265451   989.32
2017-03-08 16:36:10.581452  1004.00
2017-03-08 17:06:11.269434  1003.97
2017-03-08 17:36:12.117443   994.80
2017-03-08 18:06:12.809445   994.17
2017-03-08 18:36:14.029444   997.93
2017-03-08 19:06:14.654631   989.65
2017-03-08 19:36:15.413438   991.14
2017-03-08 20:06:16.145432   984.65
2017-03-08 20:36:17.265443   993.30
2017-03-08 21:06:18.117434   981.18
2017-03-08 21:36:19.165447   987.64
2017-03-08 22:06:19.909443   985.26
2017-03-08 22:36:20.569442   980.40
2017-03-08 23:06:21.197446   988.59
2017-03-08 23:36:21.989448   984.59
2017-03-09 00:06:22.665448   983.91
2017-03-09 00:36:23.281681   993.65
2017-03-09 01:06:23.857440   986.69
2017-03-09 01:36:24.441713   984.04
2017-03-09 02:06:25.117453   989.92
2017-03-09 02:36:25.953449   978.82
2017-03-09 03:06:26.521704   987.42
2017-03-09 03:36:27.157448   996.66
2017-03-09 04:06:27.725445   996.66
2017-03-09 04:36:29.201442   996.66
2017-03-09 05:06:29.765443   989.82
...                             ...
2017-03-22 20:16:24.007637   833.74
2017-03-22 20:46:24.583127   834.69
2017-03-22 21:16:25.217536   829.66
I want to normalize all the values for 2017-03-08, 2017-03-09, and so on separately, and add the normalized values as a new column.

A simple normalization function for a list of values looks like this:

def normalize(x, summation = None):
    if summation is None:
        summation = sum(x) # normalize to unity
    return [element/summation for element in x]
So, for 2017-03-08, the normalized values would be as follows:

                              value  value_day_normalized
datetime                                                 
2017-03-08 14:36:06.616166  1002.49  0.0532386976171
2017-03-08 15:06:07.661818   992.68  0.0527177232197
2017-03-08 15:36:08.597443   984.34  0.0522748153223
2017-03-08 16:06:09.265451   989.32  0.0525392855057
2017-03-08 16:36:10.581452  1004.00  0.0533188883755
2017-03-08 17:06:11.269434  1003.97  0.0533172951817
2017-03-08 17:36:12.117443   994.80  0.0528303089203
2017-03-08 18:06:12.809445   994.17  0.0527968518489
2017-03-08 18:36:14.029444   997.93  0.052996532148
2017-03-08 19:06:14.654631   989.65  0.0525568106383
2017-03-08 19:36:15.413438   991.14  0.0526359392674
2017-03-08 20:06:16.145432   984.65  0.0522912783257
2017-03-08 20:36:17.265443   993.30  0.0527506492265
2017-03-08 21:06:18.117434   981.18  0.0521069989007
2017-03-08 21:36:19.165447   987.64  0.0524500666486
2017-03-08 22:06:19.909443   985.26  0.0523236732678
2017-03-08 22:36:20.569442   980.40  0.0520655758599
2017-03-08 23:06:21.197446   988.59  0.052500517788
2017-03-08 23:36:21.989448   984.59  0.0522880919379
How can this be done? I suspect it involves the DataFrame method groupby, but I don't know how to approach the problem.

You can compute a Series with the same length as the original df, which lets you add it directly as a new column.
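The code for this answer is missing from the text. As a minimal sketch, assuming the approach is the groupby the question hints at combined with transform, the per-day normalization could look like the following. The sample frame reuses four rows from the question, and the name daily_total is illustrative only.

```python
import pandas as pd

# Illustrative sample: four of the rows shown in the question.
idx = pd.to_datetime([
    "2017-03-08 14:36:06.616166",
    "2017-03-08 15:06:07.661818",
    "2017-03-09 00:06:22.665448",
    "2017-03-09 00:36:23.281681",
])
df = pd.DataFrame({"value": [1002.49, 992.68, 983.91, 993.65]}, index=idx)
df.index.name = "datetime"

# Group rows by calendar day (timestamps truncated to midnight).
# transform("sum") returns the daily total aligned to every original row,
# so the result has the same length and index as df.
daily_total = df.groupby(df.index.normalize())["value"].transform("sum")

# Divide each value by its day's total so every day sums to one.
df["value_day_normalized"] = df["value"] / daily_total
print(df)
```

Because transform keeps the original index, the result can be assigned as a column without any merge or reindexing step.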

A second solution is also possible.
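The method names for this alternative were lost from the text, so the following is only an assumption about what was meant: one plausible variant resamples the column into daily bins and applies transform within each bin. The sample data and column name below are taken from the question; everything else is illustrative.

```python
import pandas as pd

# Illustrative sample: four of the rows shown in the question.
idx = pd.to_datetime([
    "2017-03-08 14:36:06.616166",
    "2017-03-08 15:06:07.661818",
    "2017-03-09 00:06:22.665448",
    "2017-03-09 00:36:23.281681",
])
df = pd.DataFrame({"value": [1002.49, 992.68, 983.91, 993.65]}, index=idx)
df.index.name = "datetime"

# resample("D") bins the rows by calendar day; transform applies the
# function to each daily bin and returns a Series indexed like the original.
df["value_day_normalized"] = (
    df["value"].resample("D").transform(lambda x: x / x.sum())
)
print(df)
```

This variant requires a DatetimeIndex, whereas a groupby on a derived date key would also work when the timestamps live in an ordinary column.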