Python: Plotting error bars from a DataFrame using Seaborn FacetGrid

python matplotlib plot pandas

I want to plot error bars from a column in a pandas DataFrame on a Seaborn FacetGrid.

import matplotlib.pyplot as plt
import numpy as np  # needed for np.random below
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'] * 2,
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})
df

Example dataframe

    A       B        C           D
0   foo     one      0.445827   -0.311863
1   bar     one      0.862154   -0.229065
2   foo     two      0.290981   -0.835301
3   bar     three    0.995732    0.356807
4   foo     two      0.029311    0.631812
5   bar     two      0.023164   -0.468248
6   foo     one     -1.568248    2.508461
7   bar     three   -0.407807    0.319404

This code works for fixed-size error bars:

# height= replaces the size= argument, which was renamed in seaborn 0.9
g = sns.FacetGrid(df, col="A", hue="B", height=5)
g.map(plt.errorbar, "C", "D", yerr=0.5, fmt='o');

But I can't get it to work using values from the dataframe:

df['E'] = abs(df['D']*0.5)
# height= replaces the size= argument, which was renamed in seaborn 0.9
g = sns.FacetGrid(df, col="A", hue="B", height=5)
g.map(plt.errorbar, "C", "D", yerr=df['E']);

or the same call with yerr given as the column name, yerr='E'.

Both produce errors.

Edit:

After a lot of reading of the matplotlib docs and assorted Stack Overflow answers, here is a pure matplotlib solution:

# define a color palette index based on column 'B'
# (.codes replaces the Categorical.labels attribute, which pandas removed)
df['cind'] = pd.Categorical(df['B']).codes

# the sorted categories in column 'A'
cats = sorted(df['A'].unique())

# get the seaborn colour palette and convert to array
cp = sns.color_palette()
cpa = np.array(cp)

# draw a subplot for each category in column "A"
fig, axs = plt.subplots(nrows=1, ncols=len(cats), sharey=True)
for i, ax in enumerate(axs):
    df_sub = df[df['A'] == cats[i]]
    col = cpa[df_sub['cind'].to_numpy()]
    ax.scatter(df_sub['C'], df_sub['D'], c=col)
    # fmt='none' draws only the error bars, with no markers or connecting line
    eb = ax.errorbar(df_sub['C'], df_sub['D'], yerr=df_sub['E'], fmt='none')
    # eb.lines is (data line, cap lines, bar line collections); recolour the bars
    eb.lines[2][0].set_color(col)

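Apart from the labels and axis limits, it's OK. It plots a separate subplot for each category in column 'A', coloured by the category in column 'B'. (Note that the random data is different from the data above.)

I would still like a pandas/seaborn solution if anyone has any ideas.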

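You don't show what df['E'] actually is, or whether it is a list of the same length as df['C'] and df['D'].

The yerr keyword argument (kwarg) takes either a single value that is applied to every element of the lists for keys C and D from the dataframe, or a list of values of the same length as those lists.

So C, D, and E must all be associated with lists of the same length, or C and D must be lists of the same length and E must be associated with a single float or int. If that single float or int is inside a list, you must extract it, as in df['E'][0].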

See the example matplotlib code that uses yerr, and the bar plot API documentation, which describes yerr.
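To make the scalar-versus-list distinction concrete, here is a minimal sketch using plain matplotlib. The arrays x, y and per_point_err are made up for illustration and are not the question's data. The last call passes a yerr array that is one element short, which is the same kind of length mismatch that arises when a full-length column is passed to a plot of a data subset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only (not taken from the question)
x = np.arange(4)
y = np.array([1.0, 2.0, 1.5, 3.0])
per_point_err = np.abs(y) * 0.5  # one error value per point

fig, (ax_fixed, ax_varied) = plt.subplots(1, 2, sharey=True)

# A single number is applied to every point
ax_fixed.errorbar(x, y, yerr=0.5, fmt="o")

# An array must have the same length as x and y
ax_varied.errorbar(x, y, yerr=per_point_err, fmt="o")

# A yerr array of the wrong length is rejected
mismatch_raised = False
try:
    ax_varied.errorbar(x, y, yerr=per_point_err[:3], fmt="o")
except ValueError:
    mismatch_raised = True
```

If this reading is right, the question's yerr=df['E'] fails for the same reason: each facet plots only a subset of C and D, while yerr still has all eight values.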

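When using FacetGrid.map, anything that refers to the data DataFrame must be passed as a positional argument. This will work in your case because yerr is the third positional argument of plt.errorbar, though to demonstrate I'm going to use the tips dataset: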

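from scipy import stats
tips_all = sns.load_dataset("tips")
tips_grouped = tips_all.groupby(["smoker", "size"])
# numeric_only=True skips the categorical columns, which recent pandas will not average
tips = tips_grouped.mean(numeric_only=True)
# half-width of an approximate 95% confidence interval: 1.96 * standard error
tips["CI"] = tips_grouped.total_bill.apply(stats.sem) * 1.96
tips.reset_index(inplace=True)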
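As a small worked example of what the CI column holds, the sketch below computes 1.96 times the standard error of the mean per group on a made-up six-row frame. It assumes pandas' own groupby sem() as a stand-in for scipy.stats.sem (both use ddof=1 by default), and the names data, summary and manual are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up data: two groups of three observations each
data = pd.DataFrame({
    "group": list("aaabbb"),
    "value": [1.0, 2.0, 3.0, 2.0, 4.0, 6.0],
})

grouped = data.groupby("group")["value"]

# Per-group mean plus the half-width of an approximate 95% interval
summary = grouped.mean().to_frame("mean")
summary["CI"] = grouped.sem() * 1.96

# The same quantity written out: 1.96 * sample std / sqrt(n)
manual = grouped.std() / np.sqrt(grouped.count()) * 1.96
```

The factor 1.96 is the usual normal approximation for a 95% interval, so with only a handful of points per group the resulting bars are a rough guide at best.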

I can then plot using FacetGrid and errorbar:

# height= replaces the size= argument, which was renamed in seaborn 0.9
g = sns.FacetGrid(tips, col="smoker", height=5)
g.map(plt.errorbar, "size", "total_bill", "CI", marker="o")

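However, keep in mind that the seaborn plotting functions are meant to go from a full dataset to plots with error bars (using bootstrapping), so for a lot of applications this may not be necessary. For example, you could use factorplot (renamed catplot in seaborn 0.9):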

# factorplot was renamed catplot in seaborn 0.9
sns.catplot(data=tips_all, x="size", y="total_bill", col="smoker",
            kind="point")

or lmplot:

sns.lmplot(data=tips_all, x="size", y="total_bill", col="smoker",
           fit_reg=False, x_estimator=np.mean)

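df['E'] = abs(df['D']*0.5) is on the first line of the fourth code block. I think the problem is that seaborn's map function passes the whole df['E'] list to matplotlib's errorbar function, rather than just the part that applies to that subplot.

The positional argument bit was the key. In a measurement environment, type A (statistical) uncertainty is easily evaluated in factorplot and lmplot, although one has to dig into the API docs to check exactly which measure of the data distribution is being plotted and how it is calculated (68% confidence limits by bootstrapping?). It would be nice if this were more upfront in the docs. I needed to plot type B uncertainties, which I can do as you show. Thanks.

The default CI is 95% (which you can see in the function signature), but they all take a ci keyword argument, which you can set to 68 if you want the standard error.

@mwaskom Is there a solution for asymmetric error bars? Say I have two dataframe columns that give the CI min/max. Is there a way to pass that to plt.errorbar via g.map?

You should be able to write a wrapper function that takes vectors (x, y, err_lower, err_upper) and calls plt.errorbar properly.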