Python: plotting a graph with averages


I have the following DataFrame and want to create a chart with the date as its title, time on the x-axis, and μmoles on the y-axis:

0   2019-06-11  17:21:35    13.5
1   2019-06-11  17:22:35    13.1
2   2019-06-11  17:23:35    13.0
3   2019-06-11  17:24:35    11.8
4   2019-06-11  17:25:35    11.8
... ... ... ...
394 2019-06-11  23:55:38    0.0
395 2019-06-11  23:56:38    0.0
396 2019-06-11  23:57:38    0.0
397 2019-06-11  23:58:38    0.0
398 2019-06-11  23:59:38    0.0
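I have already written some DataFrames that split the data into time periods, and I compute the mean measurement for 5 pm, 6 pm, and so on. For example: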

seventeen = df.iloc[:39]  # seventeen (for 5pm)
seventeen["\u03bcmoles"].mean()

six_pm = df.iloc[39:99]   # six_pm (for 6pm)
six_pm["\u03bcmoles"].mean()
and so on.

I would like to plot a graph that uses these measurements, with code along these lines:

df.plot(x ='Timestamp', y='\u03bcmoles', kind = 'line')
datapoints = seventeen, six_pm, seven, twenty_hundred, twenty_one, twenty_two, twenty_three  # all the subsets for which I calculate the averages
plt.show()

Is there a way to do this?

Consider aggregating by hour with pandas' pd.Grouper instead of computing each hourly average separately:

fig, ax = plt.subplots(figsize=(12,6))
df.plot(x ='Timestamp', y='\u03bcmoles', kind = 'line', ax=ax)

agg = (df.groupby(pd.Grouper(key='Timestamp', freq='h'))['\u03bcmoles'].mean()
         .reset_index()
         .set_axis(['Timestamp', 'mean_\u03bcmoles'], axis='columns')
      )

agg.plot(x='Timestamp', y='mean_\u03bcmoles', xticks=agg['Timestamp'].tolist(),
         kind='line', marker='o', color='green', ax=ax)

plt.show()
If you only need specific hours, filter the aggregated data by hour with .loc and .isin:

(agg.loc[agg['Timestamp'].dt.hour.isin([17, 18, 20, 21, 22, 23])]
    .plot(x='Timestamp', y='mean_\u03bcmoles', 
          kind='line', marker='o', color='green', ax=ax)
)
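The aggregation and the hour filter can also be checked on their own, without plotting. The following is a minimal sketch on a tiny made-up frame (the five readings and the chosen hours are assumptions for illustration, not the asker's data); it uses Series.reset_index(name=...) to name the mean column in one step.

```python
import pandas as pd

# Hypothetical sample data: a few readings spread over three hours.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2019-06-11 17:21:35", "2019-06-11 17:22:35",
        "2019-06-11 18:05:00", "2019-06-11 18:35:00",
        "2019-06-11 19:10:00",
    ]),
    "\u03bcmoles": [13.5, 13.1, 10.0, 8.0, 5.0],
})

# One mean per calendar hour; each row is labelled with the start of its hour.
agg = (
    df.groupby(pd.Grouper(key="Timestamp", freq="h"))["\u03bcmoles"]
      .mean()
      .reset_index(name="mean_\u03bcmoles")
)

# Keep only the hours of interest (here 17:00 and 19:00).
wanted = agg.loc[agg["Timestamp"].dt.hour.isin([17, 19])]
print(wanted)
```

Because the grouping is driven by the timestamps themselves, there is no need to work out row positions such as iloc[:39] or iloc[39:99] by hand.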

To demonstrate with random data:

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

### DATA BUILD
np.random.seed(10262020)
df = pd.DataFrame({'Timestamp': pd.to_datetime(1603670400 + np.random.randint(1, 86400, 500), unit='s'),
                   '\u03bcmoles': np.random.uniform(50, 100, 500)
                  }).sort_values('Timestamp')

### AGGREGATION BUILD
agg = (df.groupby(pd.Grouper(key='Timestamp', freq='h'))['\u03bcmoles'].mean()
         .reset_index()
         .set_axis(['Timestamp', 'mean_\u03bcmoles'], axis='columns')
      )
      
### PLOT BUILD
fig, ax = plt.subplots(figsize=(12,6))
df.plot(x ='Timestamp', y='\u03bcmoles', kind = 'line', ax=ax)

agg.plot(x='Timestamp', y='mean_\u03bcmoles', 
         xticks=agg['Timestamp'].tolist() + [agg['Timestamp'].dt.ceil(freq='D').max()],
         kind='line', marker='o', color='green', ax=ax)

ax.xaxis.set_major_formatter(DateFormatter("%Y-%m-%d %H:%M:%S"))

plt.show()
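
The question also asks for the date as the chart title, which the code above does not set. One possible way is sketched below, assuming all rows fall on the same calendar day; the generated data, the axis labels, and the "%H:%M" tick format are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

# Hypothetical data: one reading per minute over a single evening.
rng = np.random.default_rng(0)
times = pd.date_range("2019-06-11 17:00:00", "2019-06-11 23:59:00", freq="min")
df = pd.DataFrame({"Timestamp": times,
                   "\u03bcmoles": rng.uniform(0, 15, len(times))})

# Hourly means, labelled with the start of each hour.
agg = (df.groupby(pd.Grouper(key="Timestamp", freq="h"))["\u03bcmoles"]
         .mean()
         .reset_index(name="mean_\u03bcmoles"))

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df["Timestamp"], df["\u03bcmoles"], label="\u03bcmoles")
ax.plot(agg["Timestamp"], agg["mean_\u03bcmoles"],
        marker="o", color="green", label="hourly mean")

# Use the calendar date of the first row as the title
# (assumption: the whole frame covers a single day).
ax.set_title(df["Timestamp"].dt.date.iloc[0].isoformat())

# The date is already in the title, so the ticks only need the time of day.
ax.xaxis.set_major_formatter(DateFormatter("%H:%M"))
ax.set_xlabel("Time")
ax.set_ylabel("\u03bcmoles")
ax.legend()

plt.show()
```

Note that each hourly mean is drawn at the start of its hour, so the green marker for 17:00 summarizes the readings from 17:00 up to, but not including, 18:00.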
