Python: pandas rolling time window fails when counting strings. Why?

When I use rolling() on a time index together with the count() method, I get an error. What am I missing?

Here is an example:

import numpy as np
import pandas as pd

# One string value per minute, with one missing value at 08:25
d = {'vv': {pd.Timestamp('2020-01-13 08:22:00', freq='T'): 'aa',
            pd.Timestamp('2020-01-13 08:23:00', freq='T'): 'bb',
            pd.Timestamp('2020-01-13 08:24:00', freq='T'): 'cc',
            pd.Timestamp('2020-01-13 08:25:00', freq='T'): np.nan,
            pd.Timestamp('2020-01-13 08:26:00', freq='T'): 'dd'}}

df = pd.DataFrame(d)

# Time-based window of 72 seconds on a string (object) column
df['vv'].rolling('72s').count()

I get this:

ValueError                                Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\window\rolling.py in _prep_values(self, values)
    354             try:
--> 355                 values = ensure_float64(values)
    356             except (ValueError, TypeError) as err:

pandas\_libs\algos_common_helper.pxi in pandas._libs.algos.ensure_float64()

ValueError: could not convert string to float: 'aa'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\window\rolling.py in _apply(self, func, center, require_min_periods, floor, is_weighted, name, use_numba_cache, **kwargs)
    535             try:
--> 536                 values = self._prep_values(b.values)
    537 

~\Anaconda3\lib\site-packages\pandas\core\window\rolling.py in _prep_values(self, values)
    356             except (ValueError, TypeError) as err:
--> 357                 raise TypeError(f"cannot handle this type -> {values.dtype}") from err
    358 

TypeError: cannot handle this type -> object

The above exception was the direct cause of the following exception:

DataError                                 Traceback (most recent call last)
<ipython-input-166-8676c880bf7e> in <module>
      7 df = pd.DataFrame(d)
      8 
----> 9 df['vv'].rolling('72s').count()

~\Anaconda3\lib\site-packages\pandas\core\window\rolling.py in count(self)
   2048         if self.is_freq_type or isinstance(self.window, BaseIndexer):
   2049             window_func = self._get_roll_func("roll_count")
-> 2050             return self._apply(window_func, center=self.center, name="count")
   2051 
   2052         return super().count()

~\Anaconda3\lib\site-packages\pandas\core\window\rolling.py in _apply(self, func, center, require_min_periods, floor, is_weighted, name, use_numba_cache, **kwargs)
    542                     continue
    543                 else:
--> 544                     raise DataError("No numeric types to aggregate") from err
    545 
    546             if values.size == 0:

DataError: No numeric types to aggregate

I think it is a bug. It works fine with an integer window:

s = df['vv'].rolling(4).count()
print (s)
2020-01-13 08:22:00    1.0
2020-01-13 08:23:00    2.0
2020-01-13 08:24:00    3.0
2020-01-13 08:25:00    3.0
2020-01-13 08:26:00    3.0
Name: vv, dtype: float64

One possible idea is to replace the non-missing values with 1 and then use count:

d = {'vv': {pd.Timestamp('2020-01-13 08:22:00', freq='T'): 'aa',
pd.Timestamp('2020-01-13 08:23:00', freq='T'): 'bb',
pd.Timestamp('2020-01-13 08:24:00', freq='T'): 'cc',
pd.Timestamp('2020-01-13 08:25:00', freq='T'): np.nan,
pd.Timestamp('2020-01-13 08:26:00', freq='T'): 'dd'}}

df = pd.DataFrame(d)

Details:

print (df['vv'].where(df['vv'].isna(), 1))
2020-01-13 08:22:00      1
2020-01-13 08:23:00      1
2020-01-13 08:24:00      1
2020-01-13 08:25:00    NaN
2020-01-13 08:26:00      1
Name: vv, dtype: object
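
The replacement shown above can be combined with a time-based count. The following is only a minimal sketch, not necessarily the exact line from the answer: it rebuilds the question's data with pd.date_range (an assumption, the question uses explicit Timestamp keys), forces object dtype so the strings behave as in the question, and adds an explicit astype(float) so the rolling window sees a numeric column.

```python
import numpy as np
import pandas as pd

# Same five one-minute rows as in the question (rebuilt with date_range
# for brevity; this construction is an assumption, not the original code).
idx = pd.date_range('2020-01-13 08:22:00', periods=5, freq='min')
s = pd.Series(['aa', 'bb', 'cc', np.nan, 'dd'],
              index=idx, name='vv', dtype=object)

# Keep NaN where the value is missing, put 1 everywhere else,
# then convert to float so the rolling machinery gets numeric input.
ones = s.where(s.isna(), 1).astype(float)

# count() now only has to count the non-NaN entries in each 72s window.
counts = ones.rolling('72s').count()
print(counts)
```

Because the column is numeric after the replacement, the conversion to float64 that fails in the traceback no longer has a string to trip over.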

The first idea was to test for non-missing values and use sum:

df = df['vv'].notna().rolling('72s').sum()
print (df)
2020-01-13 08:22:00    1.0
2020-01-13 08:23:00    2.0
2020-01-13 08:24:00    2.0
2020-01-13 08:25:00    1.0
2020-01-13 08:26:00    1.0
Name: vv, dtype: float64
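
To see where these numbers come from, the windows can be checked by hand. This sketch assumes the default behaviour of time-based windows, where each window is the half-open interval (t - 72s, t]; with rows one minute apart, a window can therefore hold at most two rows. The data is rebuilt with pd.date_range and the names window, manual and rolled are illustrative only.

```python
import numpy as np
import pandas as pd

# Same five one-minute rows as in the question (construction is an assumption).
idx = pd.date_range('2020-01-13 08:22:00', periods=5, freq='min')
s = pd.Series(['aa', 'bb', 'cc', np.nan, 'dd'],
              index=idx, name='vv', dtype=object)

window = pd.Timedelta('72s')

# Manual check: for each timestamp t, count the non-missing values whose
# index falls in (t - 72s, t], i.e. left-open and right-closed.
manual = []
for t in s.index:
    in_window = (s.index > t - window) & (s.index <= t)
    manual.append(float(s[in_window].notna().sum()))

# The workaround from the answer, with an explicit cast to float.
rolled = s.notna().astype(float).rolling('72s').sum()

print(manual)
print(rolled.tolist())
```

Both lists agree: at 08:24 the window covers 08:23 and 08:24 ('bb' and 'cc'), and at 08:25 and 08:26 the missing value at 08:25 leaves only one non-missing entry in the window.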

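Hmm, that makes sense, but for some reason the numbers don't look right. Shouldn't 8:24, for example, be 2 ('aa' and 'bb' occurred within 72 seconds)?

Thanks, ".notna().rolling('72s').sum()" also seems to work (I found this after your first version).

@EzerK - oops, so I was wrong, I tested for missing values. I have updated the answer.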