Python: difference between pandas.date_range and DataFrame.index

Using the Python prompt on Linux, I read data from a CSV file with pandas:

>>> import pandas as pd
>>> my_data = pd.read_csv('data.txt', header = None, index_col=5)

I get:

>>> my_data.index
Index([u'2018.09.20 19:00', u'2018.09.20 19:15', u'2018.09.20 19:30',
       u'2018.09.20 19:45', u'2018.09.20 20:00', u'2018.09.20 20:15',
       u'2018.09.20 20:30', u'2018.09.20 20:45', u'2018.09.20 21:00',
       u'2018.09.20 21:15', u'2018.09.20 21:30', u'2018.09.20 21:45',
       u'2018.09.20 22:00', u'2018.09.20 22:15', u'2018.09.20 22:30',
       u'2018.09.20 22:45', u'2018.09.20 23:00', u'2018.09.20 23:15',
       u'2018.09.20 23:30', u'2018.09.20 23:45'],
      dtype='object', name=5)

These are all dates and times from 2018.09.20 19:00 to 2018.09.20 23:45, at 15-minute intervals.
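
As a quick sanity check of that description, the strings can be parsed and their spacing verified. This is only a sketch: `raw_index` is a hypothetical in-memory stand-in for `my_data.index`, and the format string `%Y.%m.%d %H:%M` is an assumption based on the values printed above.

```python
import pandas as pd

# Hypothetical stand-in for my_data.index: the same strings, rebuilt in memory.
raw_index = pd.Index(
    pd.date_range("2018-09-20 19:00", "2018-09-20 23:45", freq="15min")
    .strftime("%Y.%m.%d %H:%M"),
    name=5,
)

# Parse the strings with an explicit format (assumed from the printed values).
parsed = pd.to_datetime(raw_index, format="%Y.%m.%d %H:%M")

# Gaps between consecutive timestamps; a regular index has a single step size.
steps = pd.Series(parsed).diff().dropna()
is_regular = bool((steps == pd.Timedelta(minutes=15)).all())

print(type(raw_index).__name__, type(parsed).__name__, is_regular)
```

The point of the check is that the index holds strings until it is parsed; only the parsed version is a `DatetimeIndex`.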

Now I want to check for missing dates and times.

If I use the same interval, it works:

>>> pd.date_range(start='2018.09.20 19:00', end = '2018.09.20 23:45', freq = '15min').difference(my_data.index)
DatetimeIndex([], dtype='datetime64[ns]', freq='15T')

If I change `start` in pd.date_range to 2018.09.20 20:00, I get:

>>> pd.date_range(start='2018.09.20 20:00', end = '2018.09.20 23:45', freq = '15min').difference(my_data.index)
DatetimeIndex(['2018-09-20 20:00:00', '2018-09-20 20:15:00',
               '2018-09-20 20:30:00', '2018-09-20 20:45:00',
               '2018-09-20 21:00:00', '2018-09-20 21:15:00',
               '2018-09-20 21:30:00', '2018-09-20 21:45:00',
               '2018-09-20 22:00:00', '2018-09-20 22:15:00',
               '2018-09-20 22:30:00', '2018-09-20 22:45:00',
               '2018-09-20 23:00:00', '2018-09-20 23:15:00',
               '2018-09-20 23:30:00', '2018-09-20 23:45:00'],
              dtype='datetime64[ns]', freq=None)

If I change `start` to 2018.09.20 18:00:

>>> pd.date_range(start='2018.09.20 18:00', end = '2018.09.20 23:45', freq = '15min').difference(my_data.index)
DatetimeIndex(['2018-09-20 18:00:00', '2018-09-20 18:15:00',
               '2018-09-20 18:30:00', '2018-09-20 18:45:00',
               '2018-09-20 19:00:00', '2018-09-20 19:15:00',
               '2018-09-20 19:30:00', '2018-09-20 19:45:00',
               '2018-09-20 20:00:00', '2018-09-20 20:15:00',
               '2018-09-20 20:30:00', '2018-09-20 20:45:00',
               '2018-09-20 21:00:00', '2018-09-20 21:15:00',
               '2018-09-20 21:30:00', '2018-09-20 21:45:00',
               '2018-09-20 22:00:00', '2018-09-20 22:15:00',
               '2018-09-20 22:30:00', '2018-09-20 22:45:00',
               '2018-09-20 23:00:00', '2018-09-20 23:15:00',
               '2018-09-20 23:30:00', '2018-09-20 23:45:00'],
              dtype='datetime64[ns]', freq=None)

I was not sure whether this behavior is caused by dtype='object' in my_data.index, so I tried converting the index to datetime, and that seems to work correctly:

>>> pd.date_range(start='2018.09.20 20:00', end = '2018.09.20 23:45', freq = '15min').difference(pd.to_datetime(my_data.index))
DatetimeIndex([], dtype='datetime64[ns]', freq=None)
>>> pd.date_range(start='2018.09.20 18:00', end = '2018.09.20 23:45', freq = '15min').difference(pd.to_datetime(my_data.index))
DatetimeIndex(['2018-09-20 18:00:00', '2018-09-20 18:15:00',
               '2018-09-20 18:30:00', '2018-09-20 18:45:00'],
              dtype='datetime64[ns]', freq=None)
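
The conversion step above can be wrapped in a small helper so that both sides of `difference` are always datetime indexes. This is a minimal sketch, not a definitive implementation: the function name `missing_timestamps`, its parameters, and the format string are hypothetical, and `raw_index` is an in-memory stand-in for `my_data.index`.

```python
import pandas as pd

def missing_timestamps(index, start, end, freq="15min", fmt="%Y.%m.%d %H:%M"):
    """Return the expected timestamps in [start, end] that are absent from index."""
    # Convert first, so datetimes are compared with datetimes rather than strings.
    observed = pd.to_datetime(index, format=fmt)
    expected = pd.date_range(start=start, end=end, freq=freq)
    return expected.difference(observed)

# Hypothetical stand-in for my_data.index (strings, 19:00 to 23:45).
raw_index = pd.Index(
    pd.date_range("2018-09-20 19:00", "2018-09-20 23:45", freq="15min")
    .strftime("%Y.%m.%d %H:%M")
)

none_missing = missing_timestamps(raw_index, "2018-09-20 20:00", "2018-09-20 23:45")
four_missing = missing_timestamps(raw_index, "2018-09-20 18:00", "2018-09-20 23:45")
print(len(none_missing), list(four_missing))
```

With the stand-in data, the two calls mirror the two converted cases above: nothing is missing from 20:00 onward, and only the four slots between 18:00 and 18:45 are missing when the range starts at 18:00.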
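
My questions:

  • Could the first "strange" behavior be due to the fact that my_data.index has dtype='object'?
  • Is using pd.to_datetime the correct way to solve this?
  • Why does the first case (the one with the same interval) work correctly even though I do not use pd.to_datetime?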

  • Yes, apply pd.to_datetime to your index first.
  • @ScottBoston Thanks. What about the third question? Does it work by chance?
  • I suspect the comparison needs matching dtypes.
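
Following the advice in the comments, the index can be converted once, right after reading the file, so that every later comparison uses matching dtypes. This is a sketch under stated assumptions: the real `data.txt` is not shown in the question, so `csv_text` and its first five columns are hypothetical; only the position of the timestamp column (index 5) comes from the question.

```python
import io
import pandas as pd

# Hypothetical file contents; only the timestamp column position is from the question.
csv_text = (
    "1,2,3,4,5,2018.09.20 19:00\n"
    "1,2,3,4,5,2018.09.20 19:15\n"
    "1,2,3,4,5,2018.09.20 19:30\n"
)

my_data = pd.read_csv(io.StringIO(csv_text), header=None, index_col=5)

# Without explicit parsing, read_csv leaves the index as strings.
was_datetime = isinstance(my_data.index, pd.DatetimeIndex)

# Convert the index in place, with an explicit format assumed from the data.
my_data.index = pd.to_datetime(my_data.index, format="%Y.%m.%d %H:%M")

expected = pd.date_range("2018-09-20 19:00", "2018-09-20 19:30", freq="15min")
missing = expected.difference(my_data.index)
print(was_datetime, isinstance(my_data.index, pd.DatetimeIndex), len(missing))
```

Converting once at load time avoids wrapping every later call in `pd.to_datetime`, and it makes the result independent of how a given pandas version handles mixed string and datetime indexes.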