Python: datetimelike index in pandas

I have a DataFrame containing a time series of cumulative rainfall:

df = pd.read_csv(csv_file, parse_dates=[['date', 'time']], dayfirst=True, index_col=0)
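I can't share the source data. It is read through an adapter object that presents the data as a text file with .csv content for read_csv, although the source file is in some proprietary format. That is not relevant to the question, though: the end result is a DataFrame with a datetime index and float values, and the data can be mocked.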

The rainfall is then converted to differences, resampled to one-minute intervals:

rainfall_differences = df['rainfall'].diff()
rainfall_differences = rainfall_differences.resample('1min', label='right', closed='right').sum()
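
Since the source data cannot be shared, here is a minimal sketch of the same two steps on mocked data. The timestamps and rainfall values are made up for illustration; only the `diff()` and `resample()` calls come from the question.

```python
import pandas as pd

# Hypothetical mocked readings: cumulative rainfall logged at irregular times.
index = pd.to_datetime([
    "2020-01-01 00:00:10",
    "2020-01-01 00:00:40",
    "2020-01-01 00:01:00",
    "2020-01-01 00:02:30",
])
df = pd.DataFrame({"rainfall": [0.0, 0.2, 0.5, 0.9]}, index=index)

# Increment since the previous reading (the first value is NaN).
rainfall_differences = df["rainfall"].diff()

# Sum the increments into one-minute bins. Each bin is closed on its
# right edge and labelled with that edge; sum() skips the leading NaN.
rainfall_differences = rainfall_differences.resample(
    "1min", label="right", closed="right"
).sum()

print(rainfall_differences)
```

The per-minute values add up to the total rainfall increase over the mocked period, which is a quick sanity check that no reading was dropped by the resampling.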
All of this works as expected. My question, however, is about the difference between these two statements:

x = rainfall_differences.rolling('90min').sum()
y = rainfall_differences.rolling('1.5h').sum()
The first works, but the second raises an exception:

  File "<<path>>/my_file.py", line 68, in load_rainfalls
    result[duration_label] = rainfall_differences.rolling(duration_label).sum()
  File "<<path>>\lib\site-packages\pandas\core\generic.py", line 10386, in rolling
    closed=closed,
  File "<<path>>\lib\site-packages\pandas\core\window\rolling.py", line 94, in __init__
    self.validate()
  File "<<path>>\lib\site-packages\pandas\core\window\rolling.py", line 1836, in validate
    freq = self._validate_freq()
  File "<<path>>\lib\site-packages\pandas\core\window\rolling.py", line 1888, in _validate_freq
    f"passed window {self.window} is not "
ValueError: passed window 1.5h is not compatible with a datetimelike index
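
Which window strings are accepted seems to depend on the installed pandas version, so it can help to probe them directly. The sketch below assumes a small mocked series; `window_is_accepted` is a hypothetical helper written for this illustration, not part of pandas.

```python
import warnings

import pandas as pd

# Small mocked series on a 10-minute datetime index.
rng = pd.date_range("2017-04-03", periods=5, freq="10min")
series = pd.Series(range(5), index=rng, dtype="float64")


def window_is_accepted(data, window):
    """Return True if data.rolling(window) accepts the window string."""
    try:
        with warnings.catch_warnings():
            # Some versions only warn about deprecated aliases; ignore that here.
            warnings.simplefilter("ignore")
            data.rolling(window).sum()
    except ValueError:
        return False
    return True


# The results for '1.5h' and '1.5H' may differ between pandas versions.
for window in ("90min", "1.5h", "1.5H"):
    print(window, window_is_accepted(series, window))
```

No expected output is shown on purpose: only `'90min'` is a spelling that should be accepted regardless of version.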

Converting the duration with pd.to_timedelta and passing whole minutes does work:

index_duration = str(int(pd.to_timedelta('1.5 hour').total_seconds() / 60)) + 'min'
y = rainfall_differences.rolling(index_duration).sum()

I think you need to change h to H:

y = rainfall_differences.rolling('1.5H').sum()

I think the reason is that h is not a valid offset alias:

Alias   Description
H       hourly frequency
T, min  minutely frequency
S       secondly frequency

Sample:

rng = pd.date_range('2017-04-03', periods=5, freq='10T')
rainfall_differences = pd.DataFrame({'a': range(5)}, index=rng)
print (rainfall_differences)
                     a
2017-04-03 00:00:00  0
2017-04-03 00:10:00  1
2017-04-03 00:20:00  2
2017-04-03 00:30:00  3
2017-04-03 00:40:00  4

y = rainfall_differences.rolling('1.5H').sum()
print (y)
                        a
2017-04-03 00:00:00   0.0
2017-04-03 00:10:00   1.0
2017-04-03 00:20:00   3.0
2017-04-03 00:30:00   6.0
2017-04-03 00:40:00  10.0

Ah, so the simple answer is that functions like to_timedelta accept many more variants, while the frequency string for rolling must follow a stricter format? Very unsatisfying, but you appear to be right: 1.5h doesn't work, 1.5H does...

@Grismar - I think so, yes. These are two different things, and here the frequency string is the "stricter" one.
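
One way to avoid depending on alias spelling at all is to parse the duration with `pd.to_timedelta` and pass the resulting `Timedelta` to `rolling()` directly. This matters because the aliases have changed over time: as far as I know, pandas 2.2 and later prefer the lowercase `h`, `min` and `s` and deprecate `H`, `T` and `S`. The following is a sketch under that assumption; `rolling_sum` is a hypothetical helper name, and the data is the mocked sample with `freq` spelled `'10min'`.

```python
import pandas as pd

# Same mocked sample, using the 'min' alias for the index frequency.
rng = pd.date_range("2017-04-03", periods=5, freq="10min")
rainfall_differences = pd.DataFrame({"a": range(5)}, index=rng)


def rolling_sum(data, duration):
    """Rolling sum over a duration given in any form pd.to_timedelta parses."""
    # Hypothetical helper: parse leniently, then hand rolling() a Timedelta
    # so that no offset-alias string has to be interpreted.
    window = pd.to_timedelta(duration)
    return data.rolling(window).sum()


y = rolling_sum(rainfall_differences, "1.5h")
print(y)
```

This should give the same values as `rolling('90min')`, without first turning the duration back into a minutes string.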