Python 熊猫数据框|分组绘制|堆叠并排图
我来自R ggplot2背景,在matplotlib plot中有点困惑 这是我的数据框Python 熊猫数据框|分组绘制|堆叠并排图,python,pandas,matplotlib,Python,Pandas,Matplotlib,我来自R ggplot2背景,在matplotlib plot中有点困惑 这是我的数据框 languages = ['en','cs','es', 'pt', 'hi', 'en', 'es', 'es'] counties = ['us','ch','sp', 'br', 'in', 'fr', 'ar', 'pr'] count = [32, 432,43,55,6,23,455,23] df = pd.DataFrame({'language': languages,'county': c
languages = ['en','cs','es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us','ch','sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432,43,55,6,23,455,23]
df = pd.DataFrame({'language': languages,'county': counties, 'count' : count})
language county count
0 en us 32
1 cs ch 432
2 es sp 43
3 pt br 55
4 hi in 6
5 en fr 23
6 es ar 455
7 es pr 23
现在我想策划
ind = np.arange(df.languages.nunique())
width = 0.35
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
ax.bar(ind, df.languages, width, color='r')
ax.bar(ind, df.count, width,bottom=df.languages, color='b')
ax.set_ylabel('Count')
ax.set_title('Score y language and country')
ax.set_xticks(ind, df.languages)
ax.set_yticks(np.arange(0, 81, 10))
ax.legend(labels=[df.countries])
plt.show()
顺便说一下,我的panda pivot代码用于相同的绘图
df.pivot(index = "Language", columns = "Country", values = "count").plot.bar(figsize=(15,10))
plt.xticks(rotation = 0,fontsize=18)
plt.xlabel('Language' )
plt.ylabel('Count ')
plt.legend(fontsize='large', ncol=2,handleheight=1.5)
plt.show()
首先,将dataframe修改为低于dataframe
language country_count total_count
0 cs 1 432
1 en 2 55
2 es 3 521
3 hi 1 6
4 pt 1 55
这是情节:
由于country count的值很小,您无法清楚地看到堆叠的country count。我想要堆叠的条形图,用每种语言显示国家的数量谢谢,在这里我们可以更好地控制可视化效果。我有很多不同的例子来达到同样的效果,这让我很困惑。
sns.countplot(x=df.language,data=df)
import matplotlib.pyplot as plt
languages = ['en','cs','es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us','ch','sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432,43,55,6,23,455,23]
df = pd.DataFrame({'language': languages,'county': counties, 'count' : count})
modified = {}
modified['language'] = np.unique(df.language)
country_count = []
total_count = []
for x in modified['language']:
country_count.append(len(df[df['language']==x]))
total_count.append(df[df['language']==x]['count'].sum())
modified['country_count'] = country_count
modified['total_count'] = total_count
mod_df = pd.DataFrame(modified)
print(mod_df)
ind = mod_df.language
width = 0.35
p1 = plt.bar(ind,mod_df.total_count, width)
p2 = plt.bar(ind,mod_df.country_count, width,
bottom=mod_df.total_count)
plt.ylabel("Total count")
plt.xlabel("Languages")
plt.legend((p1[0], p2[0]), ('Total Count', 'Country Count'))
plt.show()
language country_count total_count
0 cs 1 432
1 en 2 55
2 es 3 521
3 hi 1 6
4 pt 1 55