Python: plotting multiple values as a range with matplotlib
I am trying to determine the most efficient way to produce a set of line plots that are displayed as a range. I am hoping to produce something similar to:
I'll try to explain as much as I can. Sorry if I've missed any information. I envisage the x-axis being the time range of the timestamps in hours (8am, 9am, 10am, etc.). The total range would be between 8:00:00 and 27:00:00. The y-axis would be the count of values occurring at any point in time. The range in the plot would represent the max, min, and average of the values occurring.
An example df is listed below:
import pandas as pd
import matplotlib.pyplot as plt
d = ({
'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})
df = pd.DataFrame(data = d)
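So this df represents 3 different sets of data. The times, the values occurring, and even the number of entries can vary.
Below is an initial attempt, although I'm not sure whether I need to rethink my approach. Would a rolling function work here? Something that evaluates the max, min, and avg of the values occurring in the df for each hour (e.g. 8:00:00-9:00:00)?
Here is the full initial attempt: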
import pandas as pd
import matplotlib.pyplot as plt
d = ({
'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})
df = pd.DataFrame(data = d)
fig, ax = plt.subplots(figsize = (10,6))
ax.plot(df['Time1'], df['Occurring1'])
ax.plot(df['Time2'], df['Occurring2'])
ax.plot(df['Time3'], df['Occurring3'])
plt.show()
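The per-hour max, min, and avg evaluation asked about above could be sketched without a rolling window by stacking the series into one long table and grouping by hour. This is only a minimal sketch under stated assumptions: the sample values, the two-series layout, and the names `long_df` and `hourly` are hypothetical, and times are parsed with `pd.to_timedelta` so that values past 24:00:00 (such as 27:00:00) remain valid.

```python
import pandas as pd

# Hypothetical sample in the same wide layout as the question (two series only)
d = {
    "Time1": ["8:00:00", "8:30:00", "9:40:00"],
    "Occurring1": ["1", "2", "3"],
    "Time2": ["8:10:00", "9:34:00", "9:48:00"],
    "Occurring2": ["1", "2", "2"],
}
df = pd.DataFrame(d)

# Stack the (TimeN, OccurringN) pairs into one long table
parts = []
for i in (1, 2):
    parts.append(pd.DataFrame({
        "time": pd.to_timedelta(df["Time%i" % i]),
        "occurring": df["Occurring%i" % i].astype(int),
    }))
long_df = pd.concat(parts, ignore_index=True)

# Assign each timestamp to its hour bin (8 means 8:00:00-8:59:59)
long_df["hour"] = (long_df["time"].dt.total_seconds() // 3600).astype(int)

# Aggregate all values that fall into the same hour
hourly = long_df.groupby("hour")["occurring"].agg(["min", "max", "mean"])
print(hourly)
```

The resulting `hourly` table has one row per hour, so its `min` and `max` columns could be passed to `fill_between` and its `mean` column to `plot`. Hours without any entries simply do not appear, which is one reason interpolating onto a regular grid may still be preferable.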
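To get the desired result, you need to jump through a few hoops. First, you need to create a regular time grid onto which you interpolate your y data (the occurrence values). Then you can get the min, max, and mean values of the interpolated data. The code below demonstrates how to do this: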
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import griddata
# Example data
d = ({
'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})
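# Create dataframe, explicitly convert dtypes
# (np.int no longer exists in recent NumPy, and recent pandas rejects unit-less np.datetime64 in astype)
df = pd.DataFrame(data=d)
for i in range(1, 4):
    # Time-only strings are parsed onto the default date 1900-01-01
    df["Time%i" % i] = pd.to_datetime(df["Time%i" % i], format="%H:%M:%S")
    df["Occurring%i" % i] = df["Occurring%i" % i].astype(int)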
# Create 1D vectors of time data
all_times = df[["Time1", "Time2", "Time3"]].values
# Representation of 1 minute in time
t_min = np.timedelta64(int(60*1e9), "ns")
# Create a regular time grid with 10 minute spacing
time_grid = np.arange(all_times.min(), all_times.max(), 10*t_min, dtype="datetime64")
# Storage buffer for interpolated occurring data
occurrences_grid = np.zeros((3, len(time_grid)))
# Loop over all occurrence data and interpolate to regular grid
for i in range(3):
    occurrences_grid[i] = griddata(
        points=df["Time%i" % (i+1)].values.astype("float"),
        values=df["Occurring%i" % (i+1)],
        xi=time_grid.astype("float"),
        method="linear"
    )
# Get min, max, and mean values of interpolated data
occ_min = np.min(occurrences_grid, axis=0)
occ_max = np.max(occurrences_grid, axis=0)
occ_mean = np.mean(occurrences_grid, axis=0)
# Plot interpolated data
plt.fill_between(time_grid, occ_min, occ_max, color="slategray")
plt.plot(time_grid, occ_mean, c="white")
plt.xticks(rotation=60)
plt.tight_layout()
plt.show()
Result (the x labels are not formatted correctly):
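One possible way to tidy up the x labels is to use the locators and formatters in `matplotlib.dates`. The sketch below is not part of the original answer. The arrays are hypothetical stand-ins for `time_grid`, `occ_min`, `occ_max`, and `occ_mean`, and the `%H:%M` label format and one-hour tick spacing are assumptions.

```python
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

# Hypothetical stand-ins for time_grid, occ_min, occ_max and occ_mean
time_grid = np.arange(
    np.datetime64("2000-01-01T08:00"),
    np.datetime64("2000-01-01T12:00"),
    np.timedelta64(10, "m"),
)
occ_mean = np.linspace(1.0, 5.0, len(time_grid))
occ_min = occ_mean - 0.5
occ_max = occ_mean + 0.5

fig, ax = plt.subplots(figsize=(10, 6))
ax.fill_between(time_grid, occ_min, occ_max, color="slategray")
ax.plot(time_grid, occ_mean, c="white")

# One tick per hour, labelled as hour:minute instead of a full timestamp
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

# Rotate the labels and keep them inside the figure
fig.autofmt_xdate(rotation=60)
fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure. The same locator and formatter calls should apply to the axes used in the answer if the plot is created with `fig, ax = plt.subplots()` rather than the `plt.*` functions.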
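Thanks @MPA. This is great. Could timedelta be used instead of datetime?
@user9639519 The times in the dataframe can be in any format, but the griddata function only accepts floats or integers as input, so you need to convert the input before interpolating.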
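As an illustration of that conversion, here is a minimal sketch that turns timedelta values into float seconds before interpolating. It assumes hypothetical sample data and swaps `griddata` for `np.interp`, which is sufficient for one-dimensional data with increasing x-coordinates.

```python
import numpy as np
import pandas as pd

# Hypothetical series; "25:00:00" is 1 am on the following day
times = pd.to_timedelta(pd.Series(["8:00:00", "9:30:00", "25:00:00"]))
values = np.array([1.0, 2.0, 4.0])

# Convert the timedeltas to plain floats (seconds) before interpolating
t_sec = times.dt.total_seconds().to_numpy()

# Regular 30-minute grid, also expressed in seconds
grid_sec = np.arange(t_sec.min(), t_sec.max() + 1, 30 * 60)

# np.interp needs increasing x-coordinates, which holds for this sample
interp = np.interp(grid_sec, t_sec, values)

# Convert the grid back to timedeltas for labelling or plotting
grid_td = pd.to_timedelta(grid_sec, unit="s")
```

Because timedeltas are not tied to a calendar date, times past midnight such as 27:00:00 need no special handling with this approach.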