Calculating the mean, variance and standard deviation from time series data in Python

python pandas

The data I collect from a sensor looks like this:

sec    nanosec  value
1001   1        0.2
1001   2        0.2
1001   3        0.2
1002   1        0.1
1002   2        0.2
1002   3        0.1
1003   1        0.2
1003   2        0.2
1003   3        0.1
1004   1        0.2
1004   2        0.2
1004   3        0.2
1004   4        0.1

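I want to compute the mean, standard deviation, and some other statistics such as the max and min of a column over every 2 seconds. So the mean for (1001, 1002) is 0.167 and the mean for (1003, 1004) is 0.17.

From the tutorials, I think I should convert this to a time series and use pandas' rolling_mean, but I am not very familiar with time series data, so I am not sure that is the right approach. Also, how do I specify the frequency for the conversion here, given that the first second has fewer observations? In the real data I have fewer than 100 readings in second 1001, and then 100 observations per second from 1002 onward.

I could also do a simple groupby on the seconds, but that groups the readings per second rather than per 2 seconds. How would I then combine the observations of two consecutive groups and analyze them?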

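I think you can first convert the sec column with to_timedelta, call set_index, and then resample by 2 seconds (2S). You may need to change base in resample: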
print (df.value.resample('2S', base=1).mean())
sec
00:16:42    0.166667
00:16:44    0.171429
00:16:46         NaN
Freq: 2S, Name: value, dtype: float64

print (df.value.resample('2S', base=1).std())
sec
00:16:42    0.051640
00:16:44    0.048795
00:16:46         NaN
Freq: 2S, Name: value, dtype: float64

print (df.value.resample('2S', base=1).max())
sec
00:16:42    0.2
00:16:44    0.2
00:16:46    NaN
Freq: 2S, Name: value, dtype: float64
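The `base` argument used above was deprecated in pandas 1.1 and removed in 2.0, so those calls will not run on current releases. A minimal sketch of one possible equivalent follows. It is not the answerer's code: it assumes `sec` can be read as seconds since the Unix epoch, uses a hypothetical helper column named `time`, and relies on `origin="start"` (available from pandas 1.1) to anchor the first 2-second bin at the first reading. Bin labels will differ from the output shown above.

```python
import pandas as pd

# Sample readings copied from the question.
df = pd.DataFrame({
    "sec":     [1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003,
                1004, 1004, 1004, 1004],
    "nanosec": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4],
    "value":   [0.2, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2, 0.2, 0.1,
                0.2, 0.2, 0.2, 0.1],
})

# Assumption: interpret `sec` as seconds since the epoch to get a DatetimeIndex.
# The column name `time` is made up for this sketch.
df["time"] = pd.to_datetime(df["sec"], unit="s")
ts = df.set_index("time")

# origin="start" makes the first 2-second bin begin at the first reading (sec 1001),
# so the bins cover (1001, 1002) and (1003, 1004).
stats = ts["value"].resample("2s", origin="start").agg(["mean", "std", "min", "max"])
print(stats)  # mean is about 0.1667 for the first bin and 0.1714 for the second
```

Bins contain however many readings fall inside them, so a first second with fewer observations needs no special handling.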

Borrowing jezrael's code for the setup:
df['sec'] = pd.to_timedelta(df.sec, unit='s')
df.set_index('sec', inplace=True)
print (df)
          nanosec  value
sec                     
00:16:41        1    0.2
00:16:41        2    0.2
00:16:41        3    0.2
00:16:42        1    0.1
00:16:42        2    0.2
00:16:42        3    0.1
00:16:43        1    0.2
00:16:43        2    0.2
00:16:43        3    0.1
00:16:44        1    0.2
00:16:44        2    0.2
00:16:44        3    0.2
00:16:44        4    0.1

Then use pd.TimeGrouper('2S') together with describe().
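`pd.TimeGrouper` was deprecated in pandas 0.21 and is no longer available in current releases; `pd.Grouper(freq=...)` is the replacement. The following is a rough sketch of the same idea on a recent pandas, not the answerer's original code. It assumes `sec` is seconds since the Unix epoch, and the column name `time` is hypothetical.

```python
import pandas as pd

# Sample readings copied from the question.
df = pd.DataFrame({
    "sec":     [1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003,
                1004, 1004, 1004, 1004],
    "nanosec": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4],
    "value":   [0.2, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2, 0.2, 0.1,
                0.2, 0.2, 0.2, 0.1],
})

# Assumption: `sec` is seconds since the epoch; `time` is a made-up column name.
df["time"] = pd.to_datetime(df["sec"], unit="s")
ts = df.set_index("time")

# Group the index into 2-second bins that start at the first reading,
# then let describe() report count, mean, std, min, quartiles and max per bin.
summary = ts.groupby(pd.Grouper(freq="2s", origin="start"))["value"].describe()
print(summary)
```

`describe()` returns all the requested statistics in one table, which avoids calling `mean()`, `std()` and `max()` separately.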

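I think this would work, but I got a warning: Try using .loc[row_index,col_indexer] = value instead, raised for df1['header_stamp_secs'] = pd.to_timedelta(df1.header_stamp_secs, unit='s'). It was followed by the error: Only valid with DatetimeIndex or PeriodIndex.
Interesting. What is your pandas version?
I am using pandas 0.13.1.
Hmmm, the latest version is 0.18.1, I think you can upgrade pandas.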
The same calls with base=2:
print (df.value.resample('2S', base=2).mean())
sec
00:16:43    0.166667
00:16:45    0.171429
00:16:47         NaN
Freq: 2S, Name: value, dtype: float64

print (df.value.resample('2S', base=2).std())
sec
00:16:43    0.051640
00:16:45    0.048795
00:16:47         NaN
Freq: 2S, Name: value, dtype: float64

print (df.value.resample('2S', base=2).max())
sec
00:16:43    0.2
00:16:45    0.2
00:16:47    NaN
Freq: 2S, Name: value, dtype: float64

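# Grouping call from the second answer (written for pandas 0.18).
# pd.TimeGrouper was deprecated in pandas 0.21 and later removed; pd.Grouper(freq='2S') replaces it.
df.groupby(pd.TimeGrouper('2S')).describe()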
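For the plain groupby route mentioned in the question, a time index is not strictly required. The sketch below is an alternative technique, not taken from either answer: it maps each `sec` value to the start of its 2-second window with integer division and groups on that. The names `window` and `bin_start` are made up for illustration.

```python
import pandas as pd

# Sample readings copied from the question.
df = pd.DataFrame({
    "sec":     [1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003,
                1004, 1004, 1004, 1004],
    "nanosec": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4],
    "value":   [0.2, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2, 0.2, 0.1,
                0.2, 0.2, 0.2, 0.1],
})

window = 2  # bin width in seconds (hypothetical parameter name)
first = df["sec"].min()

# Map every reading to the first second of its window:
# 1001 and 1002 -> 1001, 1003 and 1004 -> 1003.
bin_start = first + (df["sec"] - first) // window * window

# Aggregate all readings that share a window start.
stats = df.groupby(bin_start)["value"].agg(["count", "mean", "std", "min", "max"])
print(stats)  # two rows, labelled 1001 and 1003, with 6 and 7 readings
```

This merges consecutive one-second groups directly, at the cost of losing the time-aware features that `resample` provides.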