Python pandas (horizontal) stacked bars with per-bar segment ordering

python pandas matplotlib

I can generate horizontal stacked bars from a MultiIndex DataFrame with the following code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'baz', 'foo', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve'])]

s = abs(pd.Series(np.random.randn(12), index=arrays))

ax = s.unstack(level=1).plot.barh(stacked=True, colormap='Paired')

plt.show()

This produces a horizontal stacked bar chart.

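But I want the largest segment on each bar (regardless of category) to always appear at the bottom of the bar (i.e., on the left). I haven't found any barh() argument that does this, and sorting s within level 0 before unstacking doesn't help.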
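You can call barh directly. Note that in this snippet ax appears to be the unstacked DataFrame of an earlier two-category version of the question (columns one and two), not the Axes object:
barh(range(4), ax.sum(axis=1), color=['blue' if one else 'green' for one in ax.one == ax.max(axis=1)]);
barh(range(4), ax.max(axis=1), color=['green' if one else 'blue' for one in ax.one == ax.max(axis=1)]);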


Of course, you can use yticks and similar tools to make the ticks and labels prettier.
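The two barh calls above can be made runnable roughly as follows. This is only a sketch under assumptions: the two-column DataFrame df, its random values, and the names axes, total, and top are hypothetical stand-ins, since the earlier version of the question's data is not shown here. The full-length bar is drawn first and the larger segment is drawn on top of it, so the larger segment always starts at zero.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical two-category data (not from the original post).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((4, 2)),
                  index=["bar", "baz", "foo", "qux"],
                  columns=["one", "two"])

# True where column "one" holds the larger value of the row.
one_is_max = (df["one"] == df.max(axis=1)).to_numpy()
color_one, color_two = "green", "blue"

fig, axes = plt.subplots()
y = np.arange(len(df))

# Full-length bar first: the part left visible afterwards is the smaller segment.
total = axes.barh(y, df.sum(axis=1),
                  color=[color_two if m else color_one for m in one_is_max])
# Larger segment drawn on top, starting at zero (the left edge).
top = axes.barh(y, df.max(axis=1),
                color=[color_one if m else color_two for m in one_is_max])

# Label the bars with the row names instead of 0..3.
axes.set_yticks(y)
axes.set_yticklabels(df.index)
# Call plt.show() here when using an interactive backend.
```

Because the second call paints over the first, each category keeps a consistent color even though its position within the bar changes from row to row.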

Edit
For the general case, here is an outline of how to scale this up.
First, start from
d = s.unstack(level=1).to_numpy()  # as_matrix() was removed in pandas 1.0

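Now iterate until np.nansum(d) == 0

On each iteration, the bar lengths should be

np.nansum(d, axis=1)

To get the colors to draw, you can use

np.nanargmin(d, axis=1)

(you need to map these numbers to colors; note that np.nanargmin raises ValueError for a row that is already all NaN, so rows that run out of values early must be skipped). At the end of each iteration, blank out each row's own smallest remaining value with

d[np.arange(d.shape[0]), np.nanargmin(d, axis=1)] = np.nan

This draws shorter bars on top of longer ones, creating the illusion of stacked bars.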
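The outline above can be sketched as a small loop. This is one possible reading of the outline, not the answerer's own code: the helper name overdraw_layers, the -1 marker for exhausted rows, and the mapping from column index to a "Paired" colormap color are all assumptions made for illustration. The loop stops once every entry is NaN rather than testing the sum, which also copes with zero values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def overdraw_layers(values):
    """Yield (lengths, smallest) per layer, longest layer first.

    lengths[i] is the sum of the values still left in row i; smallest[i] is the
    column index of the smallest remaining value in row i, or -1 (hypothetical
    marker) once the row has no values left.
    """
    d = np.array(values, dtype=float)  # work on a copy
    rows = np.arange(d.shape[0])
    while not np.all(np.isnan(d)):
        active = ~np.all(np.isnan(d), axis=1)  # rows that still hold a value
        lengths = np.nansum(d, axis=1)
        smallest = np.full(d.shape[0], -1)
        smallest[active] = np.nanargmin(d[active], axis=1)
        yield lengths, smallest
        # Remove each active row's smallest value before the next layer.
        d[rows[active], smallest[active]] = np.nan


arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'baz',
                    'foo', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'three', 'four', 'five', 'six',
                    'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve'])]
np.random.seed(42)
s = abs(pd.Series(np.random.randn(12), index=arrays))
df = s.unstack(level=1)

# One RGBA color per column (an assumed mapping from column index to color).
colors = plt.get_cmap("Paired")(np.linspace(0, 1, df.shape[1]))

fig, ax = plt.subplots()
y = np.arange(len(df))
for lengths, smallest in overdraw_layers(df.to_numpy()):
    keep = smallest >= 0  # skip rows that are already exhausted
    ax.barh(y[keep], lengths[keep], color=colors[smallest[keep]])
ax.set_yticks(y)
ax.set_yticklabels(df.index)
# Call plt.show() here when using an interactive backend.
```

Each layer is colored by the smallest remaining value because that is the part of the layer that stays visible at the right end after the next, shorter layer is painted over it.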

Since the DataFrame is very sparse, i.e., each column holds only one value, you can sort the columns by that value.

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt

arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'baz', 'foo', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve'])]

s = abs(pd.Series(np.random.randn(12), index=arrays))
df = s.unstack(level=1)
df = df[df.sum().sort_values(ascending=False).index]  # reorder columns by their value, largest first
ax = df.plot.barh(stacked=True, colormap='Paired')

plt.show()

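Doesn't this defeat the purpose, since you're mixing groups within the bars?
What purpose? :) Also, in my particular data, the categories on each bar are different from those on the other bars (maybe I should change my example to make that clearer).
@cᴏʟᴅsᴘᴇᴇᴅ Example updated!
Love the colors… ;-) Thanks.
Each bar has up to a dozen different categories/colors. I find this doesn't scale well even with three categories, but maybe I'm wrong. Are you aware of something that doesn't depend on knowing the number of categories, only their sizes?
I really don't see a problem with scaling it (other than whether the human eye can keep up). In a loop, you just plot the sum of all columns, then of all but the smallest column (per row), then of all but the two smallest columns (per row), and adjust the colors accordingly. With 15.4K rep, you should be able to figure it out :-)
I partly agree with your other comment: I may have oversimplified the example. I've updated my question! Thanks.
For anyone seeing this answer, note that my original example was simpler and used the colors Ami used. I updated the example in the question to avoid further confusion, but this solution does work in that specific case!
I've updated the answer with an outline for the general case.
That's nice, but I think the extreme sparsity is just an example.
Well, feel free to change the use case after every answer you get.
Importantly, I'm not the author of the question. That was an upvote for your answer (which I like).
Haha, sorry. If you use numpy.cumsum, I think the principle of the other answer still holds in the more general case.
No problem. By the way, I don't quite see how cumsum would solve this, but I outlined how to solve it using np.nanargmin and np.nansum.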
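For readers wondering how numpy.cumsum could come into play, here is one possible interpretation, offered as a sketch rather than as what the commenter had in mind: sort each row's values in descending order, use the cumulative sum to compute where each segment starts, and pass those offsets to barh through its left parameter. The helper name sorted_segments and the column-to-color mapping are assumptions for illustration; missing values are treated as zero-width segments.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def sorted_segments(values):
    """Return (order, widths, lefts), each shaped like `values`.

    order[i, k] is the column index of the k-th largest value in row i,
    widths[i, k] is that value, and lefts[i, k] is where its segment starts.
    NaN entries are treated as 0 and therefore end up last with zero width.
    """
    d = np.nan_to_num(np.asarray(values, dtype=float), nan=0.0)
    order = np.argsort(-d, axis=1, kind="stable")  # descending per row
    widths = np.take_along_axis(d, order, axis=1)
    lefts = np.cumsum(widths, axis=1) - widths     # start = sum of earlier widths
    return order, widths, lefts


arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'baz',
                    'foo', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'three', 'four', 'five', 'six',
                    'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve'])]
np.random.seed(42)
s = abs(pd.Series(np.random.randn(12), index=arrays))
df = s.unstack(level=1)

# One RGBA color per column (an assumed mapping from column index to color).
colors = plt.get_cmap("Paired")(np.linspace(0, 1, df.shape[1]))
order, widths, lefts = sorted_segments(df.to_numpy())

fig, ax = plt.subplots()
y = np.arange(len(df))
for k in range(widths.shape[1]):
    # k-th largest segment of every row, colored by the column it came from.
    ax.barh(y, widths[:, k], left=lefts[:, k], color=colors[order[:, k]])
ax.set_yticks(y)
ax.set_yticklabels(df.index)
# Call plt.show() here when using an interactive backend.
```

Unlike the overdrawing approach, the segments here do not overlap, so transparency (alpha) and edge colors behave as they would in an ordinary stacked bar chart. The trade-off is that zero-width bars are still created for missing values.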