Consistent colours in a Python Matplotlib groupby graph
Tags: python, pandas, matplotlib, graph
I am trying to create a scatter pie chart built from two columns, Column 1 and Column 2, where the colours inside each pie (when the values coincide) are determined by Column 3.
See the example below of where I currently am:
This chart shows Column 1 (x-axis) and Column 2 (y-axis). The colour is determined by Column 3.
With the code I am using, however, the colours are not consistent across the graph: if the same Column 3 value appears for different Column 1 or Column 2 values, it is assigned a different colour.
I have tried using a cmap and assigning the colours manually, but I cannot keep them consistent across Column 2.
See my current code below:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticks
from matplotlib.font_manager import FontProperties
import numpy as np
def draw_pie(dist,
             xpos,
             ypos,
             size,
             color,
             ax=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(70,60))
    # for incremental pie slices
    cumsum = np.cumsum(dist)
    cumsum = cumsum/ cumsum[-1]
    pie = [0] + cumsum.tolist()
    for r1, r2 in zip(pie[:-1], pie[1:]):
        angles = np.linspace(2 * np.pi * r1, 2 * np.pi * r2)
        x = [0] + np.cos(angles).tolist()
        y = [0] + np.sin(angles).tolist()
        xy = np.column_stack([x, y])
        ax.scatter([xpos], [ypos], marker=xy, s=size,c=color)
    return ax
colors = {'Group A':'red', 'Group B':'green', 'Group C':'blue', 'Group D':'yellow', 'Group E':'yellow', 'Group F':'yellow', 'Group G':'yellow', 'Group H':'yellow'}
fig, ax = plt.subplots(figsize=(94,70))
for (x,y), d in dataset.groupby(['Column 1','Column 2']):
    dist = d['Column 3'].value_counts()
    draw_pie(dist, x, y, 50000, ax=ax,color=dataset['Column 3'].map(colors))
params = {'legend.fontsize': 100}
plt.rcParams.update(params)
#plt.legend(dataset["Column 3"],markerscale=.4,frameon=True,framealpha=1,ncol=3,loc=(0.00, -0.3), bbox_to_anchor=(0.0, 0., 0.5, 1.25),handletextpad=1,markerfirst=True,facecolor='lightgrey',mode='expand',borderaxespad=-16)
ax.yaxis.set_major_locator(mticks.MultipleLocator(1))
full = plt.Rectangle((-0.05, 4.25), 2.10, 2, color='g', alpha=0.15)
partial = plt.Rectangle((-0.05, 2.25), 2.10, 2, color='orange', alpha=0.15)
low = plt.Rectangle((-0.05, 0.25), 2.10, 2, color='r', alpha=0.15)
ax.add_patch(full)
ax.add_patch(partial)
ax.add_patch(low)
plt.xticks(fontsize=120)
plt.yticks(fontsize=100)
plt.ylim([0, 6.75])
plt.tight_layout()
plt.show()
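Ideally, the output chart based on the data (which I copy below) should look like the following figure (I have put a number in each pie to indicate which colour it should have).
Here is the complete data used for the chart: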
Column 1       Column 3   Column 2   Colour Group Desired
First Line     Group A    6          1
First Line     Group A    6          1
First Line     Group A    6          1
First Line     Group C    6          3
First Line     Group B    6          2
First Line     Group B    6          2
First Line     Group B    6          2
First Line     Group A    6          1
First Line     Group A    6          1
First Line     Group C    6          3
First Line     Group A    6          1
Second Line    Group A    6          1
Second Line    Group A    6          1
Second Line    Group A    6          1
Second Line    Group C    6          3
Second Line    Group B    6          2
Second Line    Group B    6          2
Second Line    Group B    6          2
Second Line    Group A    4.5        1
Second Line    Group A    6          1
Second Line    Group C    6          3
Second Line    Group A    6          1
Third Line     Group A    1          1
Third Line     Group A    6          1
Third Line     Group A    1          1
Third Line     Group C    6          3
Third Line     Group B    3.5        2
Third Line     Group B    3.5        2
Third Line     Group B    3.5        2
Third Line     Group A    1          1
Third Line     Group A    1          1
Third Line     Group C    4          3
Third Line     Group A    1          1
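In addition, I would like to add a label to each section of the pie, showing the count of each distinct Column 3 value.

Answer:
For now, I have come up with the following solution to the colour problem: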
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticks
from matplotlib.font_manager import FontProperties
import numpy as np
def draw_pie(dist,
             xpos,
             ypos,
             size,
             ax=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(7,6))
    # The colors, corresponding to the values 1, 2 and 3:
    # 1 is tab:blue
    # 2 is tab:orange
    # 3 is tab:green
    # Of course, you can change this
    colors = ['tab:blue', 'tab:orange', 'tab:green']
    # for incremental pie slices
    cumsum = np.cumsum(dist)
    cumsum = cumsum/ cumsum[-1]
    pie = [0] + cumsum.tolist()
    for r1, r2, i in zip(pie[:-1], pie[1:], range(0, len(dist))):
        # If no counts present, skip this one
        if dist[i] == 0:
            continue
        angles = np.linspace(2 * np.pi * r1, 2 * np.pi * r2)
        x = [0] + np.cos(angles).tolist()
        y = [0] + np.sin(angles).tolist()
        xy = np.column_stack([x, y])
        ax.scatter([xpos], [ypos], marker=xy, s=size, color=colors[i])
    return ax
#colors = {'Group A':'red', 'Group B':'green', 'Group C':'blue', 'Group D':'yellow', 'Group E':'yellow', 'Group F':'yellow', 'Group G':'yellow', 'Group H':'yellow'}
# Read dataset
dataset = pd.read_csv('dataset.csv')
fig, ax = plt.subplots(figsize=(9,5))
for (x,y), d in dataset.groupby(['Column 1','Column 2']):
    # Only interested in the 'Colour Group Desired' column, as this
    # one contains the values 1-2-3
    d = d['Colour Group Desired']
    # Count how often each value (1-2-3) occurs and store
    # this in a list (count for value i located at list index
    # i-1)
    dist = list()
    for i in [1,2,3]:
        dist.append(d[d==i].count())
    # Call your draw_pie function
    draw_pie(dist, x, y, 500, ax=ax)
ax.yaxis.set_major_locator(mticks.MultipleLocator(1))
full = plt.Rectangle((-0.05, 4.25), 2.10, 2, color='g', alpha=0.15)
partial = plt.Rectangle((-0.05, 2.25), 2.10, 2, color='orange', alpha=0.15)
low = plt.Rectangle((-0.05, 0), 2.10, 2.25, color='r', alpha=0.15)
ax.add_patch(full)
ax.add_patch(partial)
ax.add_patch(low)
plt.xticks(fontsize=10)
plt.yticks(fontsize=8)
plt.ylim([0, 6.75])
plt.tight_layout()
plt.show()
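First of all, I scaled all sizes down by a factor of 10 so that the plot fits on my screen (e.g. figsize). You will probably want to use your original values again, but that is unrelated to the problem.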
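The first change I made is to the body of the for (x,y), d in dataset.groupby(['Column 1','Column 2']) loop. Instead of using dist = d['Column 3'].value_counts(), I create an empty list. I then loop over the values 1, 2 and 3. In each iteration I check how many rows match that value and append the result to the list. This gives a list of size 3, where the first element is the number of rows equal to 1, the second element is the number of rows equal to 2, and the third element is the number of rows equal to 3. The advantage is that I also keep track of values that occur 0 times.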
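Secondly, I changed the draw_pie function slightly. Since I do not fully understand what the colours mean, I commented out the colors dictionary. It seems that 1 always corresponds to Group A, 2 always corresponds to Group B, and 3 always corresponds to Group C. I used this observation to define another colors variable (inside the draw_pie function). Instead of a dictionary, colors is now a list (where the first element corresponds to the value 1, the second element to the value 2, and the third element to the value 3). I changed your for loop from for r1, r2 in zip(pie[:-1], pie[1:]) to for r1, r2, i in zip(pie[:-1], pie[1:], range(0, len(dist))). The advantage is that I can now use the iteration variable i to get the correct colour from the colors list. I also added a small if statement that checks whether a value occurs 0 times. If so, I skip the rest of the loop body and draw nothing (if you do not skip these cases, a very thin line is drawn; you can remove the check to try it yourself).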
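To make the index-based colour lookup and the zero-count check easier to inspect, the slice geometry can be separated from the plotting. The sketch below only computes the marker vertices and the colour of each slice; pie_wedges and COLORS are illustrative names, and the function assumes at least one non-zero count (which holds for any non-empty group).

```python
import numpy as np

# Index i holds the colour for value i + 1 (same palette as in the answer)
COLORS = ["tab:blue", "tab:orange", "tab:green"]

def pie_wedges(dist, colors=COLORS, points=50):
    """Return a (colour, vertices) pair for each non-empty slice of one pie."""
    fractions = np.cumsum(dist) / np.sum(dist)
    edges = [0.0] + fractions.tolist()
    wedges = []
    for i, (r1, r2) in enumerate(zip(edges[:-1], edges[1:])):
        if dist[i] == 0:
            continue  # an empty slice would otherwise be drawn as a thin line
        angles = np.linspace(2 * np.pi * r1, 2 * np.pi * r2, points)
        # Start at the centre so the polygon forms a closed wedge
        x = [0.0] + np.cos(angles).tolist()
        y = [0.0] + np.sin(angles).tolist()
        wedges.append((colors[i], np.column_stack([x, y])))
    return wedges

wedges = pie_wedges([3, 0, 1])
print([color for color, _ in wedges])  # ['tab:blue', 'tab:green']
```

Each vertex array can be passed as the marker argument of ax.scatter, as in the answer's draw_pie; the colour depends only on the index, not on how many slices a given pie happens to have.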
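If I run the code, I get the following result:
Unfortunately, I did not succeed in adding the labels. I tried one method for this, but could not get the labels into the correct position.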
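Edit: I decided to change the body of the draw_pie function. In this new version, we draw an Axes instance at the desired (xpos, ypos) location. This involves some transformations: first from data coordinates to display coordinates, and then from display coordinates to figure coordinates. The Matplotlib documentation on transformations explains how this works. The advantage is that we can now use the pie method of the created axes (ax_pie.pie in the code) to draw the pie chart inside them. This method has some nice options, such as adding labels.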
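The two-step transformation described above can be tried in isolation. The sketch below uses a hypothetical helper, data_to_figure, and the non-interactive Agg backend so it runs without a display; it checks that the centre of the data range maps to the centre of the axes box in figure coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

def data_to_figure(ax, x, y):
    """Convert a point from the data coordinates of ax to figure coordinates."""
    # Step 1: data coordinates -> display coordinates (pixels)
    display_xy = ax.transData.transform((x, y))
    # Step 2: display coordinates -> figure coordinates (0..1 in both directions)
    return ax.figure.transFigure.inverted().transform(display_xy)

fig, ax = plt.subplots(figsize=(9, 5))
ax.set_xlim(-0.2, 2.2)
ax.set_ylim(0, 6.75)

# The centre of the data range should land on the centre of the axes box
xfig, yfig = data_to_figure(ax, 1.0, 3.375)
box = ax.get_position()  # axes rectangle in figure coordinates
print(xfig, yfig)
print((box.x0 + box.x1) / 2, (box.y0 + box.y1) / 2)
```

The resulting pair is what fig.add_axes expects for the position of a new axes, which is how the pie axes end up at a data location.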
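However, there is one pitfall. Before we start drawing the pie charts, we need to fix the xlim and ylim values of the main axes. If we do not (and set them after drawing the pies), the pie charts will no longer be at the correct positions. I therefore moved the code that sets xlim and ylim to before the first call to draw_pie. I also removed the call to plt.tight_layout(), because this (unfortunately) also causes the pie charts to no longer be at the correct positions.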
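The pitfall can be reproduced in a few lines: the figure position of a data point is only valid for the axis limits in force when it was computed. This is a minimal sketch with made-up limits and a hypothetical helper, data_to_figure, not the code from the answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

def data_to_figure(ax, x, y):
    """Convert a point from the data coordinates of ax to figure coordinates."""
    display_xy = ax.transData.transform((x, y))
    return ax.figure.transFigure.inverted().transform(display_xy)

fig, ax = plt.subplots()
ax.set_xlim(0, 2)
ax.set_ylim(0, 6)

# Position computed with the original limits
before = data_to_figure(ax, 0.5, 3.0)

# Changing the x limits afterwards moves the same data point horizontally,
# so an axes placed at `before` would no longer sit on top of it
ax.set_xlim(0, 10)
after = data_to_figure(ax, 0.5, 3.0)

print(before, after)
```

An inset axes created with fig.add_axes keeps the figure position it was given, so it does not follow the data point when the limits change later.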
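As a small side note, I changed how the background colours are drawn. Instead of patches, I now use the axhspan method. With this method you still control the y positions, but the band extends indefinitely in the horizontal direction (which means the colours stay in place if you scroll left or right). If you do not want this, you can change it back :)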
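As a standalone illustration of the background bands, the sketch below draws the same three ranges with axhspan. The bands list is just a convenient way to hold the boundaries used in the answer; the wide x limits at the end are arbitrary and only show that the bands do not depend on the x range.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(9, 5))
ax.set_ylim(0, 6.75)

# (lower y, upper y, face colour) for the low, partial and full ranges
bands = [(0, 2.25, "r"), (2.25, 4.25, "orange"), (4.25, 6.75, "g")]
spans = [
    ax.axhspan(lower, upper, fc=color, ec=None, alpha=0.15)
    for lower, upper, color in bands
]

# The bands are defined in axes-relative x, so they still span the full width
ax.set_xlim(-100, 100)
fig.savefig("bands.png")
```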
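See the new version of the code below (again, I changed all sizes so that the plot fits on my computer screen; you may want to substitute your original sizes).
If you run this code, you get the following output: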
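Moreover, you can reduce the size of the pie charts by adjusting the size argument passed to the draw_pie function (I just happen to like the output above :)). Keep in mind that in that case you also need to reduce the fontsize specified in the textprops dictionary in the body of draw_pie, for example size=63 and fontsize=7.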
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticks
from matplotlib.font_manager import FontProperties
import numpy as np
def draw_pie(dist,
             xpos,
             ypos,
             size,
             ax=None,
             fig=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(7,6))
    if fig is None:
        # Fall back to the figure that owns the given axes
        fig = ax.figure
    # Transform xpos and ypos to figure coordinates by the
    # following steps:
    # 1. Transform xpos and ypos from data coordinates to display coordinates
    # 2. Transform the display coordinates to figure coordinates
    xfig, yfig = ax.transData.transform((xpos, ypos))
    trans_to_fig = fig.transFigure.inverted()
    xfig, yfig = trans_to_fig.transform((xfig, yfig))
    # Calculate figure coordinates from the desired pie chart size
    # given in pixels
    size = trans_to_fig.transform((size, 0))[0]
    # Add axes at data coordinates xpos and ypos. On these axes,
    # the pie chart will be plotted
    ax_pie = fig.add_axes([xfig-0.5*size, yfig-0.5*size, size, size])
    # Plot the pie chart (with some special options)
    textprops = {'color' : 'w',
                 'fontweight' : 'bold',
                 'fontsize' : 10,
                 'va' : 'center',
                 'ha' : 'center'
                 }
    labels = [str(i) if not i == 0 else "" for i in dist]
    labeldistance = 0.5
    if sum(not x==0 for x in dist) == 1: # Ensures we plot the label in the center
        labeldistance = 0.0              # if we have only one entry
    ax_pie.pie(dist, labels=labels, labeldistance=labeldistance, textprops=textprops)
    return ax_pie
# Read dataset
dataset = pd.read_csv('dataset.txt')
fig, ax = plt.subplots(figsize=(9,5))
# Important, limits must be set before calling draw_pie function!
# (Otherwise, the data coordinates will change, which will break
# the transform sequence inside the draw_pie function!)
ax.set_xlim([-0.2, 2.2]) # Tweak these values for the desired output
ax.set_ylim([0, 6.75])
# Make sure the string from 'Column 1' is displayed again
ax.set_xticks([0, 1, 2])
ax.set_xticklabels(['First Line', 'Second Line', 'Third Line'])
# Remainder of formatting
ax.yaxis.set_major_locator(mticks.MultipleLocator(1))
plt.xticks(fontsize=10)
plt.yticks(fontsize=8)
# Define float values for 'Column 1' (easier for transformation,
# we have already put the text back there using ax.set_xticklabels)
column1_to_float = {'First Line':0, 'Second Line':1, 'Third Line':2}
for (x,y), d in dataset.groupby(['Column 1','Column 2']):
    # Only interested in the 'Colour Group Desired' column, as this
    # one contains the values 1-2-3
    d = d['Colour Group Desired']
    # Count how often each value (1-2-3) occurs and store
    # this in a list (count for value i located at list index
    # i-1)
    dist = list()
    for i in [1,2,3]:
        dist.append(d[d==i].count())
    # Call your draw_pie function
    draw_pie(dist, column1_to_float[x], y, 100, ax=ax, fig=fig)
# Plot the colours (note: using axhspan, they extend the full
# horizontal direction, even while scrolling)
ax.axhspan(0 , 2.25, fc='r' , ec=None, alpha=0.15)
ax.axhspan(2.25, 4.25, fc='orange', ec=None, alpha=0.15)
ax.axhspan(4.25, 6.75, fc='g' , ec=None, alpha=0.15)
# Unfortunately, tight_layout can no longer be used. If we do use this,
# the pie charts will no longer be at the proper positions...
# plt.tight_layout()
plt.show()
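The label handling inside draw_pie can be checked on its own, without drawing anything. The sketch below pulls that logic into a hypothetical helper, pie_labels: slices with a zero count get an empty label, and when only one slice is non-zero the label is moved to the centre of the pie.

```python
def pie_labels(dist):
    """Return (labels, labeldistance) for one pie, given its counts."""
    # Empty string for zero counts so that nothing is printed for them
    labels = [str(n) if n != 0 else "" for n in dist]
    nonzero = sum(1 for n in dist if n != 0)
    # A single full-circle slice reads best with its label in the middle
    labeldistance = 0.0 if nonzero == 1 else 0.5
    return labels, labeldistance

print(pie_labels([3, 0, 1]))  # (['3', '', '1'], 0.5)
print(pie_labels([0, 4, 0]))  # (['', '4', ''], 0.0)
```

The two return values correspond to the labels and labeldistance arguments passed to ax_pie.pie in the code above.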