Python: labelling and using correctly coloured legends in grouped Matplotlib plots

python pandas matplotlib

Suppose I have a data file, example.csv:

first,second,third,fourth,fifth,sixth
-42,11,3,L,D
4,21,40,L,Q
2,31,15,R,D
-42,122,50,S,L

Printing it with print(df.head()) gives:

   first  second  third fourth fifth  sixth
0    -42      11      3      L     D    NaN
1      4      21     40      L     Q    NaN
2      2      31     15      R     D    NaN
3    -42     122     50      S     L    NaN

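I want to draw the bars in groups, using the first and second columns as the identifiers. Each distinct number should get its own color.

What I am expecting is described below. Here is what I have working so far: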

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
filename = 'example.csv'
df = pd.read_csv(filename)
print(df.head())

first = df['first']
second = df['second']
third = df['third']
labels = df['third']
x = np.arange(len(labels))
width = 0.35
df.sort_values(by=['third'], axis=0, ascending=False)
fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, third, width, label='Parent 1')
rects2 = ax.bar(x + width/2, third, width, label='Parent 2')

ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()

def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')
autolabel(rects1)
autolabel(rects2)
fig.tight_layout()

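From the bar chart it is clear that the Y values come from the column named "third", which is what we want. Within each group, though, the labels need some changes. I drew on the figure so you can see what I am expecting.

The number on top of each bar should have its own color. For example, the first pair of bars carries the numbers (-42, 11), so we need to assign two different colors. If one of these numbers appears again on another bar, it should get the same color as before. In other words, every number has a unique bar color. The full list of bar colors can then be shown as a legend in the upper left corner, replacing the legend we have now.

The other identifier goes at the bottom of the bars. For example, (L, D) under the first pair comes from the fourth and fifth columns of the data file.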

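I also want to draw the bars in descending order of the third column. I applied the following command to sort the rows in descending order, but the plot does not seem to reflect it:

df.sort_values(by=['third'], axis=0, ascending=False)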

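There is a lot of customization here, so I think it is easier to loop over the rows and draw the bars one by one. Also, sort_values returns a copy by default; pass inplace=True to make it operate in place: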

# sort dataframe, notice `inplace`
df.sort_values(by=['third'], axis=0, ascending=False, inplace=True)

# we use this to change the colors with `cmap`
values = np.unique(df[['first','second']])

# scale the values to 0-1 for cmap
def scaled_value(val):
    return (val-values.min())/np.ptp(values)

# plt.get_cmap replaces cm.get_cmap, which was removed in Matplotlib 3.9
cmap = plt.get_cmap('viridis')

width = 0.35

fig, ax = plt.subplots()    
for i, idx in enumerate(df.index):
    row = df.loc[idx]
    # draw the first
    ax.bar(i-width/2,row['third'], 
           color=cmap(scaled_value(row['first'])),    # specify color here
           width=width, edgecolor='w',
           label='Parent 1' if i==0 else None)        # label first bar
    
    # draw the second
    ax.bar(i+width/2, row['third'], 
           color=cmap(scaled_value(row['second'])),
           width=width, edgecolor='w', hatch='//',
           label='Parent 2' if i==0 else None)

# set the ticks manually
ax.set_xticks([i + o for i in range(df.shape[0]) for o in [-width/2, width/2]])
ax.set_xticklabels(df[['fourth','fifth']].values.ravel())

ax.legend()

Output:
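As a side note on the sorting point in this answer, the following is a minimal sketch of why the original call appeared to do nothing. It uses a small toy frame (an assumption, holding only the `third` values from example.csv) and shows that `sort_values` leaves the frame untouched unless the result is assigned or `inplace=True` is passed.

```python
import pandas as pd

# Toy frame holding only the 'third' column values from example.csv
df = pd.DataFrame({"third": [3, 40, 15, 50]})

# Without assignment or inplace=True, the sorted copy is simply discarded
df.sort_values(by="third", ascending=False)
unchanged = df["third"].tolist()

# Assigning the result keeps the sorted frame
df_sorted = df.sort_values(by="third", ascending=False)
sorted_vals = df_sorted["third"].tolist()

print(unchanged)    # [3, 40, 15, 50]
print(sorted_vals)  # [50, 40, 15, 3]
```

Assigning the result is equivalent to `inplace=True` for this purpose and makes it harder to forget that the original frame is unchanged.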

I think you first need the right data structure in your DataFrame. I believe what you want is:

df['xaxis'] = df.fourth + ":" + df.fifth
df.groupby('xaxis').agg({'third':'sum','first':'sum'}).plot(kind='bar')

It outputs:

       third  first
xaxis              
L:D        3    -42
L:Q       40      4
R:D       15      2
S:L       50    -42

The plot looks like this:

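One question. Is there a simple way to give different letters different colors and list them as a legend in the upper right corner, instead of "Parent 1" and "Parent 2"? For example, if the first L is yellow, the second L should also be yellow, and the next letter, R, should not be blue if blue has already been used. I ask because we could then list all the colors as a legend in the upper right corner.

I thought you wanted the numbers as the color indicator: "this means every number has a unique bar color".

Thanks, I think that is better. In that case, the four number pairs we see are (-42, 11), (4, 21), (2, 31) and (-42, 122). Isn't -42 the only repeat? So apart from -42, every other bar should get a different color, and we can list all of them in the legend.

On a continuous colormap, 11 and 21 sit close together within the range -42 to 122, so their colors look similar. Of course, you can also encode the numbers by unique value instead.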
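The last suggestion, encoding the numbers by unique value, could be sketched roughly as follows. This is not the answerer's code: the `tab10` colormap, the `color_of` dictionary, and the proxy `Patch` legend handles are assumptions chosen for illustration, and the CSV is inlined so the sketch runs on its own.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.patches import Patch

# Inline copy of example.csv from the question
csv_text = """first,second,third,fourth,fifth,sixth
-42,11,3,L,D
4,21,40,L,Q
2,31,15,R,D
-42,122,50,S,L
"""
df = pd.read_csv(io.StringIO(csv_text))
df = df.sort_values(by="third", ascending=False).reset_index(drop=True)

# One discrete color per unique number found in 'first' and 'second'
values = np.unique(df[["first", "second"]].to_numpy())
cmap = plt.get_cmap("tab10")  # 10 distinct colors; more values would repeat
color_of = {int(v): cmap(i % cmap.N) for i, v in enumerate(values)}

width = 0.35
fig, ax = plt.subplots()
for i, row in df.iterrows():
    for offset, col in ((-width / 2, "first"), (width / 2, "second")):
        number = int(row[col])
        ax.bar(i + offset, row["third"], width=width,
               color=color_of[number], edgecolor="w")
        # Write the number itself above its bar
        ax.annotate(str(number), xy=(i + offset, row["third"]),
                    xytext=(0, 3), textcoords="offset points",
                    ha="center", va="bottom")

# Letters from 'fourth' and 'fifth' under each bar
ax.set_xticks([i + o for i in range(len(df)) for o in (-width / 2, width / 2)])
ax.set_xticklabels(df[["fourth", "fifth"]].to_numpy().ravel())

# Legend built from proxy patches, one entry per unique number
handles = [Patch(facecolor=c, label=str(v)) for v, c in color_of.items()]
ax.legend(handles=handles, title="value", loc="upper left")
fig.tight_layout()
```

Because the colors come from a lookup keyed on the number, both bars carrying -42 share one color, and nearby values such as 11 and 21 no longer look alike the way they do on a continuous colormap.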