Python: Sorting by datetime in a loop over multiple DataFrames, using globals() as the variable reference

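I have 20 DataFrames in the format below, each with about 140,000 rows. The dates are in the format "%Y/%m/%d", i.e. YYYY/MM/DD (the sample data below uses hyphens rather than slashes).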

    In [1]: import pandas as pd

            data1 = pd.DataFrame({'Day': ['2020-04-07','2020-04-07', '2020-04-07','2020-08-11','2020-08-11','2020-08-11','2020-06-14','2020-06-14','2020-06-14'],
                                  'Time': ['23:41:18', '23:42:56', '23:44:34','10:23:10','15:24:46','10:24:13','23:41:18','23:42:56','23:44:34'],
                                  'V': [1044.865, 1044.889, 1044.914,320.014,320.033,320.018,1044.865,1044.889,1044.914]})

            data2 = pd.DataFrame({'Day': ['2020-04-07','2020-04-07', '2020-04-07','2020-08-11','2020-08-11','2020-08-11','2020-06-14','2020-06-14','2020-06-14'],
                                  'Time': ['23:41:18', '23:42:56', '23:44:34','10:23:10','15:24:46','10:24:13','23:41:18','23:42:56','23:44:34'],
                                  'V': [1044.865, 1044.887, 1044.914,320.014,320.033,320.018,1044.865,1044.889,1044.914]})

            data3 = pd.DataFrame({'Day': ['2020-04-07','2020-04-07', '2020-04-07','2020-08-11','2020-08-11','2020-08-11','2020-06-14','2020-06-14','2020-06-14'],
                                  'Time': ['23:41:18', '23:42:56', '23:44:34','10:23:10','15:24:46','10:24:13','23:41:18','23:42:56','23:44:34'],
                                  'V': [1044.865, 1044.888, 1044.914,320.014,320.033,320.018,1044.865,1044.889,1044.914]})

    In [2]: data2.head(15)
    Out[2]:
              Day      Time         V
    0  2020-04-07  23:41:18  1044.865
    1  2020-04-07  23:42:56  1044.887
    2  2020-04-07  23:44:34  1044.914
    3  2020-08-11  10:23:10   320.014
    4  2020-08-11  15:24:46   320.033
    5  2020-08-11  10:24:13   320.018
    6  2020-06-14  23:41:18  1044.865
    7  2020-06-14  23:42:56  1044.889
    8  2020-06-14  23:44:34  1044.914

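I am using the loop below to try to sort each DataFrame by the date in the 'Day' column and then by the 'Time' column. In my real DataFrames there are roughly 3 measurements per minute.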

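My goal is to avoid typing the DataFrame names 20 times. I found this approach works well for the .index.drop().reset_index() attributes, but for some reason .sort_values() has no effect when used in the loop shown here: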

    In [3]: for n in range(1, 4, 1):
                globals()["data" + str(n)]['Day'] = pd.to_datetime(globals()["data" + str(n)]['Day'],
                                                                   format='%Y-%m-%d')  # the sample dates use hyphens
                globals()["data" + str(n)].sort_values(by=['Day', 'Time'])
                globals()["data" + str(n)]['Day'] = globals()["data" + str(n)]['Day'].astype(str)
    Out[3]:
              Day      Time         V
    0  2020-04-07  23:41:18  1044.865
    1  2020-04-07  23:42:56  1044.887
    2  2020-04-07  23:44:34  1044.914
    3  2020-08-11  10:23:10   320.014
    4  2020-08-11  15:24:46   320.033
    5  2020-08-11  10:24:13   320.018
    6  2020-06-14  23:41:18  1044.865
    7  2020-06-14  23:42:56  1044.889
    8  2020-06-14  23:44:34  1044.914

However, if I use the loop only to convert the 'Day' column to datetime and then call .sort_values() by typing the DataFrame name manually, it works:

    In [4]: for n in range(1, 4, 1):
                globals()["data" + str(n)]['Day'] = pd.to_datetime(globals()["data" + str(n)]['Day'],
                                                                   format='%Y-%m-%d')  # the sample dates use hyphens
            data2.sort_values(by=['Day', 'Time'])

    Out[4]:
              Day      Time         V
    0  2020-04-07  23:41:18  1044.865
    1  2020-04-07  23:42:56  1044.887
    2  2020-04-07  23:44:34  1044.914
    6  2020-06-14  23:41:18  1044.865
    7  2020-06-14  23:42:56  1044.889
    8  2020-06-14  23:44:34  1044.914
    3  2020-08-11  10:23:10   320.014
    4  2020-08-11  10:24:13   320.018
    5  2020-08-11  15:24:46   320.033

Do you have any suggestions for making this work more dynamically?

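First thing: stop using globals()! It is almost never appropriate. For your use case, a dictionary or a plain list would probably be suitable.

I'll look into that, thanks for the link!

You have to either assign the result of sort_values() to a temp variable or use its inplace parameter.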
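
To make the two comments above concrete, here is a minimal sketch rather than a definitive fix. It assumes the frames are kept in a dictionary instead of separate global variables; the name `frames` and the small sample frames are hypothetical stand-ins for the 20 real ones. The key point is that `sort_values()` returns a new, sorted DataFrame and leaves the original untouched, so in the loop from the question the sorted result was simply discarded. In the manual case it only appeared to work because the interactive prompt printed the returned frame.

```python
import pandas as pd

# Hypothetical stand-ins for the real DataFrames, keyed by name.
frames = {
    "data1": pd.DataFrame({
        "Day": ["2020-08-11", "2020-04-07", "2020-06-14"],
        "Time": ["10:23:10", "23:41:18", "23:42:56"],
        "V": [320.014, 1044.865, 1044.889],
    }),
    "data2": pd.DataFrame({
        "Day": ["2020-08-11", "2020-08-11", "2020-04-07"],
        "Time": ["15:24:46", "10:24:13", "23:44:34"],
        "V": [320.033, 320.018, 1044.914],
    }),
}

for name, df in frames.items():
    # The sample dates use hyphens, so the format string does too.
    df["Day"] = pd.to_datetime(df["Day"], format="%Y-%m-%d")
    # sort_values returns a new DataFrame: store it back under the same key.
    # Replacing the value of an existing key while iterating is safe.
    frames[name] = df.sort_values(by=["Day", "Time"]).reset_index(drop=True)

print(frames["data2"])
```

If you would rather sort each frame in place, `df.sort_values(by=["Day", "Time"], inplace=True)` inside the loop is the alternative the last comment refers to; in that case nothing needs to be assigned back.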