Python: plotting a graph with averages
I have the following dataframe and would like to create a chart with the date as the title, time on the x-axis, and μmoles on the y-axis:
0 2019-06-11 17:21:35 13.5
1 2019-06-11 17:22:35 13.1
2 2019-06-11 17:23:35 13.0
3 2019-06-11 17:24:35 11.8
4 2019-06-11 17:25:35 11.8
... ... ... ...
394 2019-06-11 23:55:38 0.0
395 2019-06-11 23:56:38 0.0
396 2019-06-11 23:57:38 0.0
397 2019-06-11 23:58:38 0.0
398 2019-06-11 23:59:38 0.0
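I have already written some dataframes that split the data into time periods, and calculated the average measurement for 5pm, 6pm, and so on. For example: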
seventeen = df.iloc[:39] # seventeen (for 5pm)
seventeen["\u03bcmoles"].mean()
six_pm = df.iloc[39:99] # six_pm (for 6pm)
six_pm["\u03bcmoles"].mean()
and so on.
I would like to plot a chart that uses these measurements, with code along these lines:
df.plot(x ='Timestamp', y='\u03bcmoles', kind = 'line')
# these are all the slices for which I calculate the averages
datapoints = [seventeen, six_pm, seven, twenty_hundred, twenty_one, twenty_two, twenty_three]
plt.show()
Is there a way to do this?

Consider aggregating by hour with pandas.Grouper instead of computing each hourly average separately:
fig, ax = plt.subplots(figsize=(12, 6))
# raw measurements
df.plot(x='Timestamp', y='\u03bcmoles', kind='line', ax=ax)

# one mean per clock hour; set_axis no longer accepts inplace in pandas 2.x
agg = (df.groupby(pd.Grouper(key='Timestamp', freq='h'))['\u03bcmoles'].mean()
         .reset_index()
         .set_axis(['Timestamp', 'mean_\u03bcmoles'], axis='columns')
      )

# hourly means drawn on the same axes
agg.plot(x='Timestamp', y='mean_\u03bcmoles', xticks=agg['Timestamp'].tolist(),
         kind='line', marker='o', color='green', ax=ax)
plt.show()
If you need specific hours only, use .loc on the aggregated data and filter by hour with .isin:
(agg.loc[agg['Timestamp'].dt.hour.isin([17, 18, 20, 21, 22, 23])]
    .plot(x='Timestamp', y='mean_\u03bcmoles',
          kind='line', marker='o', color='green', ax=ax)
)
To demonstrate with random data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
### DATA BUILD
np.random.seed(10262020)
df = pd.DataFrame({'Timestamp': pd.to_datetime(1603670400 + np.random.randint(1, 86400, 500), unit='s'),
                   '\u03bcmoles': np.random.uniform(50, 100, 500)
                  }).sort_values('Timestamp')
### AGGREGATION BUILD
# set_axis no longer accepts inplace in pandas 2.x; it returns a new frame by default
agg = (df.groupby(pd.Grouper(key='Timestamp', freq='h'))['\u03bcmoles'].mean()
         .reset_index()
         .set_axis(['Timestamp', 'mean_\u03bcmoles'], axis='columns')
      )
### PLOT BUILD
fig, ax = plt.subplots(figsize=(12,6))
df.plot(x ='Timestamp', y='\u03bcmoles', kind = 'line', ax=ax)
agg.plot(x='Timestamp', y='mean_\u03bcmoles',
         xticks=agg['Timestamp'].tolist() + [agg['Timestamp'].dt.ceil(freq='d').max()],
         kind='line', marker='o', color='green', ax=ax)
ax.xaxis.set_major_formatter(DateFormatter("%Y-%m-%d %H:%M:%S"))
plt.show()
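The question also asks for the date as the chart title, which the snippets above do not set. The following is only a minimal sketch of one way to do it, assuming the frame covers a single day and has a datetime `Timestamp` column next to `μmoles`. The helper names `hourly_means` and `plot_with_date_title` are hypothetical, and the handful of rows are copied from the sample in the question. It groups on `dt.hour` rather than `pd.Grouper`, which gives the means keyed by hour of day (17, 23, ...) and may be closer to the `seventeen` / `six_pm` slices in the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# A few rows shaped like the question's data (values copied from the sample).
df = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2019-06-11 17:21:35", "2019-06-11 17:22:35", "2019-06-11 17:23:35",
        "2019-06-11 17:24:35", "2019-06-11 17:25:35",
        "2019-06-11 23:55:38", "2019-06-11 23:56:38",
    ]),
    "\u03bcmoles": [13.5, 13.1, 13.0, 11.8, 11.8, 0.0, 0.0],
})


def hourly_means(frame):
    """Hypothetical helper: mean of μmoles per hour of day (index is 17, 18, ...)."""
    return frame.groupby(frame["Timestamp"].dt.hour)["\u03bcmoles"].mean()


def plot_with_date_title(frame):
    """Hypothetical helper: plot the raw series and use the date as the title.

    Assumes every row falls on the same calendar day, so the first row's
    date is representative.
    """
    fig, ax = plt.subplots(figsize=(12, 6))
    frame.plot(x="Timestamp", y="\u03bcmoles", kind="line", ax=ax)
    ax.set_title(str(frame["Timestamp"].dt.date.iloc[0]))
    return ax


means = hourly_means(df)  # hour 17 is about 12.64 for these rows, hour 23 is 0.0
```

Calling `plot_with_date_title(df)` followed by `plt.show()` would display the chart. If the data spans several days, grouping on `dt.hour` alone would merge the same hour from different days, so the `pd.Grouper` approach above is probably the safer choice in that case.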