Python: Subplots with groupby DataFrames in pandas?
I have a CSV that looks something like this:
date,name,area,score
10/15/2015,john,metallurgy,92
10/16/2015,john,metallurgy,84
10/16/2015,nancy,metallurgy,97
10/17/2015,nancy,metallurgy,76
10/18/2015,john,forestry,81
10/18/2015,john,forestry,46
10/19/2015,nancy,forestry,81
10/19/2015,nancy,forestry,74
10/23/2015,nancy,forestry,83
I would like to draw one figure per person (name), with each figure made up of subplots. I'd like it to look something like this:
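In addition, as shown in the figure, I'd like to be able to plot an exponentially weighted moving average (EWMA) curve, a series of linear regression lines, and so on, alongside the actual scores.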
I think I could do this in plain Python with vanilla matplotlib, using something like the following:
import pandas as pd
import matplotlib.pyplot as plt

plots_per_row = 3  # number of columns

df = pd.read_csv("data.csv")

# collect the scores for each (name, area) pair and count the plots per name
by_name_and_area = {}
nplots_by_name = {}
nrows_by_name = {}
for name, namerows in df.groupby('name'):
    by_name_and_area[name] = {}
    nplots_by_name[name] = 0
    for area, rows in namerows.groupby('area'):
        by_name_and_area[name][area] = rows['score']
        nplots_by_name[name] += 1
    # decide number of rows for this name (round up)
    nrows_by_name[name] = int(nplots_by_name[name] / plots_per_row)
    if nplots_by_name[name] % plots_per_row > 0:
        nrows_by_name[name] += 1

# create figure & subplots for each name and iterate through to plot each
for name, nrows in nrows_by_name.items():
    fig, axes = plt.subplots(nrows, plots_per_row, sharex=True, sharey=True, squeeze=True)
    # ... etc, etc
But I'd rather do this in pandas, since I'm trying to learn it, and Wes has usually figured everything out long before most of us mortals.
Any ideas?

See whether my answer to a similar question helps you.
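For what it's worth, here is a minimal sketch of one way this could be done by leaning on groupby for the grouping and on the pandas ewm method for the moving average, with matplotlib only drawing the lines. It is not necessarily what the linked answer does. The function name plot_scores_by_name and the ewm_span parameter are made up for illustration, the CSV from the question is inlined so the snippet runs on its own, and the Agg backend is selected only so that no display is required.

```python
import io
import math

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# The sample data from the question, inlined so the sketch is self-contained.
CSV = """date,name,area,score
10/15/2015,john,metallurgy,92
10/16/2015,john,metallurgy,84
10/16/2015,nancy,metallurgy,97
10/17/2015,nancy,metallurgy,76
10/18/2015,john,forestry,81
10/18/2015,john,forestry,46
10/19/2015,nancy,forestry,81
10/19/2015,nancy,forestry,74
10/23/2015,nancy,forestry,83
"""


def plot_scores_by_name(df, plots_per_row=3, ewm_span=3):
    """Hypothetical helper: one figure per name, one subplot per area.

    Returns a dict mapping each name to its matplotlib Figure.
    """
    figures = {}
    for name, name_rows in df.groupby("name"):
        areas = list(name_rows.groupby("area"))  # [(area, rows), ...]
        nrows = math.ceil(len(areas) / plots_per_row)
        # squeeze=False always gives a 2-D array of axes, even for one row
        fig, axes = plt.subplots(nrows, plots_per_row,
                                 sharex=True, sharey=True, squeeze=False,
                                 figsize=(4 * plots_per_row, 3 * nrows))
        flat_axes = axes.ravel()
        for ax, (area, rows) in zip(flat_axes, areas):
            # raw scores
            ax.plot(rows["date"], rows["score"], marker="o", label="score")
            # exponentially weighted moving average of the same scores
            ewma = rows["score"].ewm(span=ewm_span).mean()
            ax.plot(rows["date"], ewma, linestyle="--", label="ewma")
            ax.set_title(area)
            ax.legend()
        # hide the unused cells of the grid
        for ax in flat_axes[len(areas):]:
            ax.set_visible(False)
        fig.suptitle(name)
        fig.autofmt_xdate()  # slant the date tick labels
        figures[name] = fig
    return figures


df = pd.read_csv(io.StringIO(CSV), parse_dates=["date"])
figures = plot_scores_by_name(df)
```

Each figure can then be written out with something like figures["john"].savefig("john.png"). A regression line could be drawn onto the same ax in the same way as the EWMA curve.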