Python: stacked histogram fails for string values on the X axis

python, python-2.7, pandas, matplotlib

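I have code for a stacked histogram that works fine when FIELD is numeric. However, when I switch to FIELD_str, which holds abc1, abc2, abc3, etc. instead of 1, 2, 3, ..., it fails with the error TypeError: cannot concatenate 'str' and 'float' objects. How can I (directly or indirectly) replace the numbers on the X axis with string values? I need this to make the chart easier to read.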

Dataset:

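import pandas as pd

s_field1 = pd.Series(["5","5","5","8","8","9","10"])
s_field1_str = pd.Series(["abc1","abc1","abc1","abc2","abc2","abc3","abc4"])
s_cluster = pd.Series(["1","1","0","1","0","1","0"])

df = pd.concat([s_field1, s_field1_str, s_cluster], axis=1)
# name the columns so they can be referenced as df["FIELD_str"] etc.
df.columns = ['FIELD', 'FIELD_str', 'CLUSTER']
df

Edit: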

I tried to build a dictionary of counts, but I don't know how to feed it into the histogram:

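# collections.Counter is available since Python 2.7
import collections
# filter is a boolean mask over the rows; its definition is not shown here
yes = collections.Counter(df["FIELD_str"][filter])
no = collections.Counter(df["FIELD_str"][~filter])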
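
A minimal sketch of how the two counters could be turned into a table that a bar chart can use. It assumes that the mask selects the rows of cluster "1" (the mask's definition is not shown above), and it names the mask `mask` rather than `filter` so that the Python builtin is not shadowed. The name `table` is only illustrative.

```python
import collections

import pandas as pd

df = pd.DataFrame({
    "FIELD_str": ["abc1", "abc1", "abc1", "abc2", "abc2", "abc3", "abc4"],
    "CLUSTER": ["1", "1", "0", "1", "0", "1", "0"],
})

# Assumption: the mask selects the rows that belong to cluster "1".
mask = df["CLUSTER"] == "1"

yes = collections.Counter(df["FIELD_str"][mask])
no = collections.Counter(df["FIELD_str"][~mask])

# One column per counter, one row per string label.
# Labels that are missing from a counter become NaN, so fill them with 0.
table = pd.DataFrame({"1": pd.Series(yes), "0": pd.Series(no)})
table = table.fillna(0).astype(int)
print(table)
```

Calling `table.plot.bar(stacked=True)` on this table should then draw one stacked bar per string label.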

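You will probably have to use a bar chart instead of a histogram, because a histogram is defined for data on a numeric (interval) scale, not on a nominal (categorical) scale. You could try the following: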

import pandas as pd
import matplotlib.pyplot as plt
# IPython/Jupyter magic; remove this line when running as a plain script
%matplotlib inline

s_field1 = pd.Series(["5","5","5","8","8","9","10"])
s_field1_str = pd.Series(["abc1","abc1","abc1","abc2","abc2","abc3","abc4"]) 
s_cluster = pd.Series(["1","1","0","1","0","1","0"])

df = pd.concat([s_field1, s_field1_str, s_cluster], axis=1)
df.columns = ['FIELD', 'FIELD_str', 'CLUSTER']
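# count the rows for each (FIELD_str, CLUSTER) pair, then pivot CLUSTER into columns
counts = df.groupby(['FIELD_str', 'CLUSTER']).count().unstack()
# drop the outer 'FIELD' column level so the columns are just the cluster labels
counts.columns = counts.columns.get_level_values(1)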
counts.index.name = 'xaxis label here'
ax = counts.plot.bar(stacked=True, title='Some title here')
ax.set_ylabel("yaxis label here")
plt.tight_layout()
plt.savefig("stacked_barplot.png")
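
As an alternative way to get the same kind of count table, here is a sketch using `pd.crosstab`. This is not the method used in the answer above, only a possible substitute. One difference worth noting: combinations that never occur (for example abc4 in cluster "1") come out as 0 here, whereas `count().unstack()` leaves them as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "FIELD": ["5", "5", "5", "8", "8", "9", "10"],
    "FIELD_str": ["abc1", "abc1", "abc1", "abc2", "abc2", "abc3", "abc4"],
    "CLUSTER": ["1", "1", "0", "1", "0", "1", "0"],
})

# Rows are the string labels, columns are the clusters,
# and each cell holds the number of matching rows.
counts = pd.crosstab(df["FIELD_str"], df["CLUSTER"])
print(counts)
```

The resulting table can be passed to `counts.plot.bar(stacked=True)` in the same way as in the answer.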

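It is hard to help you without [...].

@GWW: please see my edit.

Your earlier solution using T worked well for me (otherwise the chart looks odd on the full dataset):
counts.columns = counts.columns.get_level_values(1)
counts.T.plot.bar(stacked=True, color=['#8A2BE2', '#EE3B3B'])
In this case, how can I add a title and an xaxis label?

I edited the answer to add a title and an xaxis label. You can also create a new axes with plain matplotlib calls, adjust it as needed, and ask pandas.plot to use that axes through the ax=… argument.

The new solution works too (at least for me). By choosing the right column order in groupby, I managed to avoid transposing the DataFrame counts.

Sorry, what about the title of the Y axis? Could you add that as well?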
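
The suggestion in the comments to create the axes with matplotlib and hand it to pandas through `ax=` could look roughly like the sketch below. The figure size and the label texts are placeholders, and the `Agg` backend is chosen only so that the script also runs without a display; none of these come from the original thread.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "FIELD_str": ["abc1", "abc1", "abc1", "abc2", "abc2", "abc3", "abc4"],
    "CLUSTER": ["1", "1", "0", "1", "0", "1", "0"],
})

# Count rows per (FIELD_str, CLUSTER) pair; missing pairs become 0.
counts = df.groupby(["FIELD_str", "CLUSTER"]).size().unstack(fill_value=0)

# Create the figure and axes first, then let pandas draw into that axes.
fig, ax = plt.subplots(figsize=(6, 4))
counts.plot.bar(stacked=True, ax=ax)

# Title and axis labels are set with ordinary matplotlib calls.
ax.set_title("Some title here")
ax.set_xlabel("xaxis label here")
ax.set_ylabel("yaxis label here")
fig.tight_layout()
```

Because the axes object is created up front, any further matplotlib adjustments (ticks, legend placement, saving with `fig.savefig(...)`) can be applied to `ax` and `fig` directly.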