Python: creating a 10-minute running average of data with uneven time steps

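I'm trying to create a 10-minute running average of wind speed so I can plot it on a graph. My data has many uneven time steps. I'm currently using the csv module to read and collect my data; I'm not very familiar with pandas and have had problems with it in the past.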

import csv
from datetime import datetime

import matplotlib.pyplot as plt

x = []  # timestamps
y = []  # wind speeds

with open('KART_201901010000_201912310000.txt') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    for row in plots:
        # Skip rows whose wind speed is recorded as 'M'
        if row[1] == 'M':
            continue
        x.append(datetime.strptime(row[0], '%Y-%m-%d %H:%M'))
        y.append(int(row[1]))

plt.plot(x, y, label='Wind Speed')
plt.xlabel('Date and Time')
plt.ylabel('Wind Speed (Kts)')
plt.title('Wind Speed\nVersus Time')
plt.legend()
plt.show()

Here is a snippet of my dataset showing one of the many uneven time steps:

2019-11-01 11:40,30
2019-11-01 11:45,35
2019-11-01 11:50,32
2019-11-01 11:55,34
2019-11-01 11:56,33
2019-11-01 12:00,33
2019-11-01 12:05,36
2019-11-01 12:10,31

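Obviously, the general idea is to use a for loop that keeps track of the values I need to average. The problem I'm having is how to account for the uneven steps. Is there a way to do this with datetime that I'm not aware of?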

Something along these lines:

import pandas as pd
df = pd.read_csv('KART_201901010000_201912310000.txt', header=1)
df.index = pd.to_datetime(df.index, format="%Y-%m-%d %H:%M")
df.rolling('10min', min_periods=1).mean()

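I haven't tested this, so the details may differ. I understand you're not familiar with pandas, but implementing this yourself would cost valuable time that I'm sure you'd rather invest elsewhere.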
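Since the outline above is untested, here is one way its details might be filled in. This is a sketch under assumptions, not the answerer's code: it assumes the file has no header row, that the timestamp is the first column, and that 'M' marks a value to skip (as the question's loop does). The column names `time` and `wind_speed` are made up for illustration, and an in-memory sample stands in for the real file.

```python
import io

import pandas as pd

# Stand-in for the real file; the 'M' row mimics the marker that the
# question's csv loop skips.
sample = io.StringIO(
    "2019-11-01 11:40,30\n"
    "2019-11-01 11:45,35\n"
    "2019-11-01 11:50,M\n"
    "2019-11-01 11:55,34\n"
)

df = pd.read_csv(
    sample,
    header=None,                   # assumption: no header row in the file
    names=["time", "wind_speed"],  # hypothetical column names
    index_col="time",              # use the timestamp column as the index
    na_values=["M"],               # read 'M' as NaN
)
df.index = pd.to_datetime(df.index, format="%Y-%m-%d %H:%M")
df = df.dropna()                   # drop the 'M' rows, like the csv loop

# Time-based window: each row averages the samples from the preceding 10 minutes.
rolling = df["wind_speed"].rolling("10min", min_periods=1).mean()
print(rolling)
```

To use it on the real data, pass the file name to `pd.read_csv` instead of `sample`; `rolling.index` and `rolling.values` can then go to `plt.plot` in place of `x` and `y`.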

This definitely works for the data you provided:

>>> series = pd.Series([30, 35, 32, 34, 33, 33, 36, 31],
                    index=[pd.Timestamp('2019-11-01 11:40'),
                        pd.Timestamp('2019-11-01 11:45'),
                        pd.Timestamp('2019-11-01 11:50'),
                        pd.Timestamp('2019-11-01 11:55'),
                        pd.Timestamp('2019-11-01 11:56'),
                        pd.Timestamp('2019-11-01 12:00'),
                        pd.Timestamp('2019-11-01 12:05'),
                        pd.Timestamp('2019-11-01 12:10')])
>>> series.rolling('10min', min_periods=1).mean()
Out:
2019-11-01 11:40:00    30.000000
2019-11-01 11:45:00    32.500000
2019-11-01 11:50:00    33.500000
2019-11-01 11:55:00    33.000000
2019-11-01 11:56:00    33.000000
2019-11-01 12:00:00    33.333333
2019-11-01 12:05:00    34.000000
2019-11-01 12:10:00    33.500000
dtype: float64
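For comparison, the same trailing window can be computed with only datetime and the standard library, which is what the question asked about. The following is a minimal sketch, not part of the original answer: the function name `rolling_mean` is hypothetical, the timestamps are assumed to be sorted, and the window is taken as (t - 10 min, t], which matches the pandas output shown above.

```python
from collections import deque
from datetime import datetime, timedelta


def rolling_mean(times, values, window=timedelta(minutes=10)):
    """Trailing mean over (t - window, t] for every timestamp t.

    Assumes `times` is sorted in ascending order.
    """
    buf = deque()  # (time, value) pairs currently inside the window
    total = 0      # running sum of the values in buf
    means = []
    for t, v in zip(times, values):
        buf.append((t, v))
        total += v
        # Drop samples that are a full window (or more) older than t.
        while buf[0][0] <= t - window:
            _, old_v = buf.popleft()
            total -= old_v
        means.append(total / len(buf))
    return means


rows = [
    ("2019-11-01 11:40", 30), ("2019-11-01 11:45", 35),
    ("2019-11-01 11:50", 32), ("2019-11-01 11:55", 34),
    ("2019-11-01 11:56", 33), ("2019-11-01 12:00", 33),
    ("2019-11-01 12:05", 36), ("2019-11-01 12:10", 31),
]
x = [datetime.strptime(ts, "%Y-%m-%d %H:%M") for ts, _ in rows]
y = [v for _, v in rows]
avg = rolling_mean(x, y)
```

Because the window is defined by timestamps rather than by a fixed number of rows, the extra 11:56 sample simply adds one more value to the windows that contain it. The `x` and `y` lists built by the csv loop in the question can be passed to `rolling_mean` directly.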

Take a look at this: