
Python stacked bar chart is disconnected

python pandas matplotlib


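The data comes from this website.

My stacked bar chart is disconnected, and I don't know what is going on. None of my data contains null values. The values of the series are frequencies. Has anyone run into this before? I just want my bars to be connected.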

fig, ax = plt.subplots(nrows=1, figsize=(15,5))
x = clean_df['main_category'].value_counts().index


print("Number of unique main categories:", clean_df['main_category'].nunique())


for year in [2010, 2011, 2012, 2013, 2014, 2015, 2016]:    
    y = clean_df[clean_df['launched'].dt.year == year]['main_category'].value_counts()
    if year > 2010:
        bottom = clean_df[clean_df['launched'].dt.year <= year-1]['main_category'].value_counts()
    else:
        bottom = 0
        
    ax.set_xlabel("Main Catagories", fontsize=14)
    ax.set_ylabel("Frequency/Count", fontsize=14)
    ax.bar(x=x, height=y, width=0.9, bottom=bottom, label=str(year))
    ax.yaxis.grid(linestyle='-', linewidth=0.7)
    ax.set_xticklabels(x, rotation=45, ha='right')
    ax.legend(loc='upper right')
plt.tight_layout();

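The main problem is that clean_df[…]['main_category'].value_counts() returns its values ordered from the largest count to the smallest, and that order can differ from year to year. Adding [x] to y solves this: y is then effectively reordered by the desired index. (In recent pandas versions, indexing with a label that is missing raises a KeyError, so .reindex(x) is the safer way to do the same thing.)

To calculate the bottom of the bars, it is easier to accumulate the heights at the end of the loop. Initializing bottom = 0 and letting pandas broadcast the scalar ensures that bottom += y ends up with the desired values. Only when a year has no value for some category would a NaN be set for that category. Therefore, applying fillna(0) after y has been reordered by x prevents NaNs from accumulating.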
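The ordering issue can be checked without plotting anything. The following is only a minimal sketch on made-up toy data (the frame df and its values are hypothetical, not the asker's dataset); it shows that the per-year counts come back in a different order than the overall counts, and that reindexing by x followed by fillna(0) aligns them.

```python
import pandas as pd

# Hypothetical toy data, chosen so that no counts are tied.
df = pd.DataFrame({
    'main_category': ['a', 'a', 'a', 'b', 'b', 'c', 'b', 'b', 'c'],
    'year': [2010, 2010, 2010, 2010, 2010, 2010, 2011, 2011, 2011],
})

# Overall order of the categories (largest count first).
x = df['main_category'].value_counts().index

# Per-year counts: each one is sorted by its own counts, not by x.
counts_2010 = df[df['year'] == 2010]['main_category'].value_counts()
counts_2011 = df[df['year'] == 2011]['main_category'].value_counts()

# Align with x; category 'a' does not occur in 2011, so it becomes NaN
# and is then replaced by 0.
aligned_2011 = counts_2011.reindex(x).fillna(0)

print(list(x))                  # overall order
print(list(counts_2010.index))  # order within 2010
print(aligned_2011)
```

Passing the unaligned counts to ax.bar together with x pairs each height with the wrong category, which is what makes the stacked segments look disconnected.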
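A simplified example:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random stand-in data for the real clean_df
N = 100
clean_df = pd.DataFrame({'main_category': np.random.choice(list('abcdef'), N),
                         'year': np.random.randint(2010, 2017, N)})

# Fixed category order used for every year
x = clean_df['main_category'].value_counts().index

fig, ax = plt.subplots(nrows=1, figsize=(15, 5))
bottom = 0
for year in [2010, 2011, 2012, 2013, 2014, 2015, 2016]:
    # Reorder the counts by x; categories missing in this year become 0
    y = clean_df[clean_df['year'] == year]['main_category'].value_counts().reindex(x).fillna(0)
    ax.bar(x=x, height=y, width=0.9, bottom=bottom, label=str(year), alpha=0.8)
    # Accumulate the heights so the next year starts on top of this one
    bottom += y

ax.set_xlabel("Main Categories", fontsize=14)
ax.set_ylabel("Frequency/Count", fontsize=14)
ax.yaxis.grid(linestyle='-', linewidth=0.7)
ax.set_xticklabels(x, rotation=45, ha='right')
ax.legend(loc='upper right')
plt.tight_layout()
plt.show()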


PS: To create this plot using pandas:

df_plot = clean_df.groupby(['year', 'main_category']).size().reset_index().pivot(columns='year', index='main_category', values=0)
df_plot['total'] = df_plot.sum(axis=1)
df_plot.sort_values('total', ascending=False, inplace=True)
df_plot[df_plot.columns[:-1]].plot(kind='bar', stacked=True, rot=45)

Note that you might need to create a new column in clean_df that contains only the year.
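One way to add such a column, sketched under the assumption that launched is already a datetime column as in the question (the three rows below are made-up sample data):

```python
import pandas as pd

# Hypothetical sample rows standing in for the real clean_df.
clean_df = pd.DataFrame({
    'main_category': ['Film', 'Music', 'Film'],
    'launched': pd.to_datetime(['2010-05-01', '2011-07-15', '2011-01-20']),
})

# Extract the year from the datetime column into its own column.
clean_df['year'] = clean_df['launched'].dt.year

print(clean_df)
```

If launched were stored as strings instead, it would first need to be converted with pd.to_datetime before the .dt accessor can be used.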
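Every label in your plot has the same year (2016).

Why not use pandas for the stacked bar chart? It seems like that would be a lot easier.

The problem is probably that clean_df[…]['main_category'].value_counts() changes the order of the categories in each call (the order seems to go from the largest value count to the smallest, which is different every year). As a result, the labels don't correspond to the bars. If you set alpha to about 0.4, you will also see that the bars overlap. It would be better to create the plot with pandas or seaborn, as BigBen suggested.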