How do I plot certain rows of a DataFrame? (Python, pandas, Matplotlib, seaborn)


I have an example DataFrame:

      animal gender     name  first  second  third
0     dog      m      Ben      5       6      3
1     dog      f    Lilly      2       3      5
2     dog      m      Bob      3       2      1
3     cat      f     Puss      1       4      4
4     cat      m  Inboots      3       6      5
5    wolf      f     Lady    NaN       0      3
6    wolf      m   Summer      2       2      1
7    wolf      m     Grey      4       2      3
8    wolf      m     Wind      2       3      5
9    lion      f     Elsa      5       1      4
10   lion      m    Simba      3       3      3
11   lion      f     Nala      4       4      2
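
To make the example reproducible, the table above could be built roughly as follows. This is only a sketch: the variable name df is an assumption, and the missing value for Lady is entered as np.nan.

```python
import numpy as np
import pandas as pd

# Values copied row by row from the table above.
df = pd.DataFrame({
    "animal": ["dog"] * 3 + ["cat"] * 2 + ["wolf"] * 4 + ["lion"] * 3,
    "gender": ["m", "f", "m", "f", "m", "f", "m", "m", "m", "f", "m", "f"],
    "name": ["Ben", "Lilly", "Bob", "Puss", "Inboots", "Lady",
             "Summer", "Grey", "Wind", "Elsa", "Simba", "Nala"],
    "first": [5, 2, 3, 1, 3, np.nan, 2, 4, 2, 5, 3, 4],
    "second": [6, 3, 2, 4, 6, 0, 2, 2, 3, 1, 3, 4],
    "third": [3, 5, 1, 4, 5, 3, 1, 3, 5, 4, 3, 2],
})

print(df)
```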

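I think I probably need some hierarchical indexing for this, but I have not got that far in pandas yet. Still, I really need to do some (apparently too advanced) things with this data and have not figured out how. Basically, what I want in the end is a plot (probably a scatter plot, although a line plot would do just as well for now).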

1) I would like a figure with 4 subplots, one subplot per animal. The title of each subplot should be the animal.

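2) In each subplot, I would like to plot the numbers (for example, the number of cubs born each year), i.e. the values of "first", "second" and "third" for a given row, and give each line a label that shows the "name" in the legend. Within each subplot (each animal), I would like to plot males and females separately (e.g. males in blue and females in red) and, in addition, plot the animal's mean in black (i.e. the mean of each column for that animal).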

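3) Note: the values are plotted against 1, 2, 3, which refer to the column numbers. For example, for the first subplot, titled "dog", I would like to plot something like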

plt.plot(np.array([1,2,3]), x, 'b', np.array([1,2,3]), y, 'r', np.array([1,2,3]), np.mean([x, y], axis=0), 'k')

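where x would be (in the first case) 5,6,3 and the legend entry for the blue line would read "Ben", y would be 2,3,5 and the legend entry for the red line would read "Lilly", and the black line would be 3.5,4.5,4, which I would label "mean" in the legend (for each subplot).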

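I hope I have been clear enough. I know it may be hard to picture without seeing the final figure, but... if I knew how to make it, I would not be asking.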

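In summary, I would like to loop over the DataFrame at different levels, putting each animal on its own subplot and comparing males, females, and their mean within each subplot.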

My actual DataFrame is much larger, so ideally I would like a solution that is robust but easy to understand (for a programming beginner).

To give an idea of what a subplot should look like, here is one produced in Excel: [image not preserved]

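I am not sure I understood exactly what you mean. But I think you need to convert your DataFrame to a long (tidy) format. Many of the operations you will want to do on it are easier in that format, starting with plots based on categorical variables.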

With df as your DataFrame, converting it to tidy format takes just:

df2 = pd.melt(df, id_vars=["animal","gender","name"])
df2
  animal gender     name variable  value
0    dog      m      Ben    first    5.0
1    dog      f    Lilly    first    2.0
2    dog      m      Bob    first    3.0
3    cat      f     Puss    first    1.0
4    cat      m  Inboots    first    3.0
...
31   wolf     m     Grey    third    3.0
32   wolf     m     Wind    third    5.0
33   lion     f     Elsa    third    4.0
34   lion     m    Simba    third    3.0
35   lion     f     Nala    third    2.0
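
One thing the long format makes straightforward is the per-animal average the question asks for. The following is a minimal sketch, assuming the column names that melt produces by default (variable and value) and using only the dog and cat rows of the table to keep it short:

```python
import pandas as pd

# Dog and cat rows only, copied from the question's table.
df = pd.DataFrame({
    "animal": ["dog", "dog", "dog", "cat", "cat"],
    "gender": ["m", "f", "m", "f", "m"],
    "name": ["Ben", "Lilly", "Bob", "Puss", "Inboots"],
    "first": [5, 2, 3, 1, 3],
    "second": [6, 3, 2, 4, 6],
    "third": [3, 5, 1, 4, 5],
})

df2 = pd.melt(df, id_vars=["animal", "gender", "name"])

# Mean of "value" for every (animal, year column) pair.
# sort=False keeps groups in order of appearance; NaN values are skipped by default.
means = df2.groupby(["animal", "variable"], sort=False)["value"].mean()

print(means)
```

Note that this averages over all rows of an animal, so for the dogs it includes Bob as well as Ben and Lilly.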

After that, (almost) everything becomes easy. Just use seaborn, like this:

import seaborn as sns

# catplot is the current name of factorplot (renamed in seaborn 0.9)
g = sns.catplot(data=df2,          # the long-format DataFrame from the melt above
                col="animal",      # make one subplot (column facet) for each value of "animal"
                col_wrap=2,        # maximum number of subplots per row
                x="variable",      # x-axis: the categories in "variable" (created by the melt operation)
                y="value",         # the corresponding y values
                hue="gender",      # color according to the column "gender"
                kind="strip",      # the kind of plot; the closest to what you want is a strip plot
                legend_out=False,  # keep the legend inside the axes instead of outside the grid
                )

You can then improve the overall aesthetics:

g.set_xlabels("year")
g.set_titles(template="{col_name}") # otherwise it's "animal = dog", now it's just "dog"
sns.despine(trim=True) # trim the axis.

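To add the mean values you will have to do it manually, I am afraid. However, if you have more data, you might also consider a box plot or a violin plot, which you can draw on top of the strip plot, by the way. I encourage you to check seaborn's documentation for further improvements to your plot.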
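
The manual step for the mean could look something like the sketch below. It assumes a dict that maps each animal to its Matplotlib Axes; with a seaborn grid that would be g.axes_dict (available in seaborn 0.11 and later), but plain plt.subplots is used here so the example runs on its own. The helper name add_group_means and the cut-down data are illustrative only.

```python
import matplotlib.pyplot as plt
import pandas as pd

def add_group_means(axes_by_animal, long_df, order=("first", "second", "third")):
    """Draw each animal's per-column mean as a black line (hypothetical helper)."""
    # Rows: animal, columns: first/second/third, values: mean of "value".
    means = long_df.groupby(["animal", "variable"])["value"].mean().unstack("variable")
    for animal, ax in axes_by_animal.items():
        # Categorical x positions are 0, 1, 2, ... in the order given by `order`.
        ax.plot(range(len(order)), means.loc[animal, list(order)], "k-o", label="mean")
        ax.legend()

# Dog and cat rows only, copied from the question's table.
df = pd.DataFrame({
    "animal": ["dog", "dog", "dog", "cat", "cat"],
    "gender": ["m", "f", "m", "f", "m"],
    "name": ["Ben", "Lilly", "Bob", "Puss", "Inboots"],
    "first": [5, 2, 3, 1, 3],
    "second": [6, 3, 2, 4, 6],
    "third": [3, 5, 1, 4, 5],
})
df2 = pd.melt(df, id_vars=["animal", "gender", "name"])

fig, axes = plt.subplots(1, 2, sharey=True)
axes_by_animal = dict(zip(["dog", "cat"], axes))  # stand-in for g.axes_dict
add_group_means(axes_by_animal, df2)
# plt.show()  # uncomment to display the figure
```

Passing the Axes in as a dict keeps the helper independent of how the subplots were created.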

HTH

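Use for i, group in df.groupby('animal'): and plot inside the loop. Not posting this as an answer because I am in a bit of a hurry.

I think my question is partially answered here: but I am still not fully confident about indexing and looping over this kind of multi-dimensional data, especially plotting rows rather than columns...

Thanks Chinmay! When you have time, could you explain a bit more? (For example, how do I deal with the two variables in the groupby loop: what does "i" stand for? And how do I work with the rows of the grouped object?)

I just added a few rows to the DataFrame to make it more complicated, to make sure the solution works properly.

I was going to link to the docs, but they are not very descriptive. The i in the groupby loop is the "name" of the group, in this case wolf, lion, etc.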
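
Putting the comments together, a loop over groupby could be sketched as follows. In for animal, group in df.groupby("animal"), the first variable is the group key (the "i" above) and the second is the sub-DataFrame holding that group's rows; iterrows() then walks those rows one at a time. The colours, the 2x2 layout, and the variable names are assumptions taken from the question's description, not something tested on the real data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "animal": ["dog"] * 3 + ["cat"] * 2 + ["wolf"] * 4 + ["lion"] * 3,
    "gender": ["m", "f", "m", "f", "m", "f", "m", "m", "m", "f", "m", "f"],
    "name": ["Ben", "Lilly", "Bob", "Puss", "Inboots", "Lady",
             "Summer", "Grey", "Wind", "Elsa", "Simba", "Nala"],
    "first": [5, 2, 3, 1, 3, np.nan, 2, 4, 2, 5, 3, 4],
    "second": [6, 3, 2, 4, 6, 0, 2, 2, 3, 1, 3, 4],
    "third": [3, 5, 1, 4, 5, 3, 1, 3, 5, 4, 3, 2],
})

years = ["first", "second", "third"]
x = np.arange(1, len(years) + 1)   # plot against 1, 2, 3 as in the question
colors = {"m": "b", "f": "r"}      # males in blue, females in red

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))

# sort=False keeps the animals in their order of appearance in the table.
for ax, (animal, group) in zip(axes.flat, df.groupby("animal", sort=False)):
    # One line per row (per individual animal), labelled with its name.
    for _, row in group.iterrows():
        ax.plot(x, row[years].astype(float), colors[row["gender"]] + "o-",
                label=row["name"])
    # Column-wise mean over all rows of this animal; NaN is skipped.
    ax.plot(x, group[years].mean(), "ko-", label="mean")
    ax.set_title(animal)
    ax.set_xticks(x)
    ax.set_xticklabels(years)
    ax.legend(fontsize="small")

fig.tight_layout()
# plt.show()  # uncomment to display the figure
```

Because the mean is taken over every row of the group, the black line for "dog" averages Ben, Lilly and Bob, which differs from the two-row example in the question.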