Python Matplotlib: consistent colours in a groupby graph


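I am trying to create a scatter pie plot built from two columns, Column 1 and Column 2, where the colour of each slice within a pie (if the number is the same) is determined by Column 3.

See below for an example of where I have got to:

This plot shows Column 1 (y-axis) against Column 2 (x-axis). The colour is determined by Column 3.

However, with the code I am using, the colours do not stay consistent across the graph: if the same Column 3 value appears at a different Column 2 or Column 3 value, it is assigned a different colour.

I have tried using a cmap and assigning the colours manually, but I cannot keep them consistent across Column 2.

See my current code below: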

import pandas as pd 
import matplotlib.pyplot as plt
import matplotlib.ticker as mticks
from matplotlib.font_manager import FontProperties
import numpy as np

def draw_pie(dist, 
             xpos, 
             ypos, 
             size, 
             color,
             ax=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(70,60))

    # for incremental pie slices
    cumsum = np.cumsum(dist)
    cumsum = cumsum/ cumsum[-1]
    pie = [0] + cumsum.tolist()

    for r1, r2 in zip(pie[:-1], pie[1:]):
        angles = np.linspace(2 * np.pi * r1, 2 * np.pi * r2)
        x = [0] + np.cos(angles).tolist()
        y = [0] + np.sin(angles).tolist()

        xy = np.column_stack([x, y])

        ax.scatter([xpos], [ypos], marker=xy, s=size,c=color)

    return ax


colors = {'Group A':'red', 'Group B':'green', 'Group C':'blue', 'Group D':'yellow', 'Group E':'yellow', 'Group F':'yellow', 'Group G':'yellow', 'Group H':'yellow'}


fig, ax = plt.subplots(figsize=(94,70))
for (x,y), d in dataset.groupby(['Column 1','Column 2']):
    dist = d['Column 3'].value_counts()
    draw_pie(dist, x, y, 50000, ax=ax,color=dataset['Column 3'].map(colors))


params = {'legend.fontsize': 100}
plt.rcParams.update(params)
#plt.legend(dataset["Column 3"],markerscale=.4,frameon=True,framealpha=1,ncol=3,loc=(0.00, -0.3), bbox_to_anchor=(0.0, 0., 0.5, 1.25),handletextpad=1,markerfirst=True,facecolor='lightgrey',mode='expand',borderaxespad=-16)


ax.yaxis.set_major_locator(mticks.MultipleLocator(1))
full = plt.Rectangle((-0.05, 4.25), 2.10, 2, color='g', alpha=0.15)
partial = plt.Rectangle((-0.05, 2.25), 2.10, 2, color='orange', alpha=0.15)
low = plt.Rectangle((-0.05, 0.25), 2.10, 2, color='r', alpha=0.15)
ax.add_patch(full)
ax.add_patch(partial)
ax.add_patch(low)
plt.xticks(fontsize=120)
plt.yticks(fontsize=100)
plt.ylim([0, 6.75]) 
plt.tight_layout()
plt.show()
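
Ideally, the output plot for the data (which I copy below) should look like the following figure (I have put a number in each pie to define the colour it should have).

Here is the full data used for the graph: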

Column 1    Column 3  Column 2  Colour Group Desired
First Line  Group A 6   1
First Line  Group A 6   1
First Line  Group A 6   1
First Line  Group C 6   3
First Line  Group B 6   2
First Line  Group B 6   2
First Line  Group B 6   2
First Line  Group A 6   1
First Line  Group A 6   1
First Line  Group C 6   3
First Line  Group A 6   1
Second Line Group A 6   1
Second Line Group A 6   1
Second Line Group A 6   1
Second Line Group C 6   3
Second Line Group B 6   2
Second Line Group B 6   2
Second Line Group B 6   2
Second Line Group A 4.5 1
Second Line Group A 6   1
Second Line Group C 6   3
Second Line Group A 6   1
Third Line  Group A 1   1
Third Line  Group A 6   1
Third Line  Group A 1   1
Third Line  Group C 6   3
Third Line  Group B 3.5 2
Third Line  Group B 3.5 2
Third Line  Group B 3.5 2
Third Line  Group A 1   1
Third Line  Group A 1   1
Third Line  Group C 4   3
Third Line  Group A 1   1

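In addition, I would like to add a label to each part of the pie chart, showing the distinct count of Column 3.

For now, I have come up with the following solution to the colour problem: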

import pandas as pd 
import matplotlib.pyplot as plt
import matplotlib.ticker as mticks
from matplotlib.font_manager import FontProperties
import numpy as np

def draw_pie(dist, 
             xpos, 
             ypos, 
             size, 
             ax=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(7,6))
    
    # The colors, corresponding to the values 1, 2 and 3:
    # 1 is tab:blue
    # 2 is tab:orange
    # 3 is tab:green
    # Of course, you can change this
    colors = ['tab:blue', 'tab:orange', 'tab:green']
    
    # for incremental pie slices
    cumsum = np.cumsum(dist)
    cumsum = cumsum/ cumsum[-1]
    pie = [0] + cumsum.tolist()

    for r1, r2, i in zip(pie[:-1], pie[1:], range(0, len(dist))):
    
        # If no counts present, skip this one
        if dist[i] == 0:
            continue
    
        angles = np.linspace(2 * np.pi * r1, 2 * np.pi * r2)
        x = [0] + np.cos(angles).tolist()
        y = [0] + np.sin(angles).tolist()

        xy = np.column_stack([x, y])
        ax.scatter([xpos], [ypos], marker=xy, s=size, color=colors[i])

    return ax
    
#colors = {'Group A':'red', 'Group B':'green', 'Group C':'blue', 'Group D':'yellow', 'Group E':'yellow', 'Group F':'yellow', 'Group G':'yellow', 'Group H':'yellow'}

# Read dataset
dataset = pd.read_csv('dataset.csv')

fig, ax = plt.subplots(figsize=(9,5))

for (x,y), d in dataset.groupby(['Column 1','Column 2']):

    # Only interested in the 'Colour Group Desired' column, as this one
    # contains the values 1-2-3
    d = d['Colour Group Desired']
    
    # Count how often each value (1-2-3) occurs and store
    # this in a list (count for value i located at list index 
    # i-1)   
    dist = list()
    for i in [1,2,3]:
        dist.append(d[d==i].count())
        
        
    # Call your draw_pie function
    draw_pie(dist, x, y, 500, ax=ax)
   
    
ax.yaxis.set_major_locator(mticks.MultipleLocator(1))
full = plt.Rectangle((-0.05, 4.25), 2.10, 2, color='g', alpha=0.15)
partial = plt.Rectangle((-0.05, 2.25), 2.10, 2, color='orange', alpha=0.15)
low = plt.Rectangle((-0.05, 0), 2.10, 2.25, color='r', alpha=0.15)
ax.add_patch(full)
ax.add_patch(partial)
ax.add_patch(low)
plt.xticks(fontsize=10)
plt.yticks(fontsize=8)
plt.ylim([0, 6.75]) 
plt.tight_layout()
plt.show()
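
First of all, I scaled all sizes down by a factor of 10 so that the plot fits on my screen (for example `figsize`). You may want to use the original values again, but that is unrelated to the problem.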

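The first change I made is to the body of the `for (x,y), d in dataset.groupby(['Column 1','Column 2'])` loop. Instead of using `dist = d['Column 3'].value_counts()`, I create an empty list. I then loop over the values `1`, `2` and `3`. In each iteration, I check how many rows match that value and append the result to the list. This way, I get a list of size 3, in which the first element is the number of rows equal to `1`, the second the number of rows equal to `2`, and the third the number of rows equal to `3`. The advantage is that I also keep track of values that occur 0 times.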
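
The fixed-order counting described here can also be sketched with pandas alone. This is only an illustration, assuming the values sit in a column named `Colour Group Desired` as in the data above; the helper name `fixed_order_counts` and the tiny sample frame are made up for the example.

```python
import pandas as pd

# Hypothetical helper: count each category in a fixed order and keep
# categories that do not occur at all (their count becomes 0).
def fixed_order_counts(values, categories=(1, 2, 3)):
    counts = values.value_counts()
    return counts.reindex(list(categories), fill_value=0).tolist()

# Small made-up sample, shaped like the data in the question
sample = pd.DataFrame({
    "Column 1": ["Third Line"] * 4,
    "Column 2": [1.0, 1.0, 3.5, 3.5],
    "Colour Group Desired": [1, 1, 2, 2],
})

for (x, y), d in sample.groupby(["Column 1", "Column 2"]):
    # Position i in the list always refers to the same category
    print(x, y, fixed_order_counts(d["Colour Group Desired"]))
```

Because every group yields a list with the same positions, position 0 always means value `1`, which is what makes a positional colour lookup safe.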

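Secondly, I slightly changed the `draw_pie` function. However, as I do not fully understand what the colours are meant to be, I commented out the `colors` dictionary. It seems that `1` always corresponds to `Group A`, `2` always corresponds to `Group B`, and `3` always corresponds to `Group C`. I used this observation to define another `colors` variable (inside the `draw_pie` function). Instead of a dictionary, `colors` is now a list (in which the first element corresponds to the value `1`, the second to the value `2`, and the third to the value `3`). I changed your `for` loop from `for r1, r2 in zip(pie[:-1], pie[1:])` to `for r1, r2, i in zip(pie[:-1], pie[1:], range(0, len(dist)))`. The advantage is that I can now use the iteration variable `i` to pick the right colour from the `colors` list. I also added a small `if` statement that checks whether a value has exactly 0 occurrences. If so, I skip the rest of the loop body and draw nothing (if you do not skip these cases, a very thin line is drawn; remove the check to try it yourself).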
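
As a rough sketch of why a fixed category order keeps the colours consistent, the slice boundaries and colours can be computed without drawing anything. The dictionary reuses the group names and colours from the question; the helper `wedges_with_colours` is a hypothetical name, and the sketch assumes at least one count is non-zero.

```python
import numpy as np

# Colour per group name, taken from the dictionary in the question
colors = {"Group A": "red", "Group B": "green", "Group C": "blue"}
order = list(colors)  # fixed category order shared by every pie

def wedges_with_colours(dist):
    """Return (start_fraction, end_fraction, colour) per non-empty slice.

    `dist` holds the counts in the same order as `order`.
    """
    cumsum = np.cumsum(dist) / np.sum(dist)
    bounds = [0.0] + cumsum.tolist()
    return [
        (r1, r2, colors[name])
        for r1, r2, name, count in zip(bounds[:-1], bounds[1:], order, dist)
        if count > 0  # skip empty slices instead of drawing a thin line
    ]

print(wedges_with_colours([3, 0, 1]))
```

Each slice takes its colour from its position in `order`, not from its position among the slices that happen to be present, so `Group C` stays blue even when `Group B` is missing.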

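If I run the code, I get the following result:

Unfortunately, I did not succeed in adding the labels. I tried one approach, but could not get the labels placed at the right positions.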


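Edit: I decided to change the body of the `draw_pie` function. In this new version, we draw an `Axes` instance at the desired `(xpos, ypos)` position. This involves some transformations: first from data coordinates to display coordinates, and then from display coordinates to figure coordinates. Matplotlib's transformations tutorial explains these coordinate systems. The advantage is that we can now use the `pie` method to draw the pie chart inside the created axes. This method has some nice options, such as adding labels.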
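
The two-step transformation can be sketched in isolation as follows. The helper name `data_to_figure`, the axes position and the pie values are assumptions made for the illustration, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for this sketch
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4))
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # left, bottom, width, height
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)

def data_to_figure(fig, ax, x, y):
    """Convert a point from data coordinates to figure coordinates (0..1)."""
    display_xy = ax.transData.transform((x, y))            # data -> display (pixels)
    return fig.transFigure.inverted().transform(display_xy)  # display -> figure

xfig, yfig = data_to_figure(fig, ax, 5, 5)
print(xfig, yfig)  # the centre of the axes, close to (0.5, 0.5)

# Place a small inset axes centred on that point and draw a pie in it
size = 0.2  # inset width and height as a fraction of the figure
ax_pie = fig.add_axes([xfig - size / 2, yfig - size / 2, size, size])
ax_pie.pie([2, 1, 1], colors=["tab:blue", "tab:orange", "tab:green"])
```

The inset is positioned in figure coordinates, so it does not follow the data if the limits of `ax` change afterwards.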

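However, there is one pitfall. Before we start drawing the pie charts, we first need to fix the `xlim` and `ylim` values of the main axes. If we do not (and set them after drawing the pies), the pie charts will no longer be at the right positions. I therefore moved the code that sets `xlim` and `ylim` to before the first call to `draw_pie`. I also removed the call to `plt.tight_layout()`, because this (unfortunately) also causes the pie charts to end up at the wrong positions.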
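
A small check makes this pitfall visible: the same data point maps to a different figure position once the limits change. This is a sketch with made-up limits and a hypothetical helper `figure_position`; the `Agg` backend is only there so it runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for this sketch
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.set_ylim(0, 6.75)
to_fig = fig.transFigure.inverted()

def figure_position(x, y):
    """Figure coordinates of a data point under the current axis limits."""
    return to_fig.transform(ax.transData.transform((x, y)))

ax.set_xlim(0, 1)
before = figure_position(0.5, 3.0)

ax.set_xlim(-0.2, 2.2)  # changing the limits afterwards...
after = figure_position(0.5, 3.0)

print(before, after)  # ...moves where x=0.5 lands horizontally
```

An inset axes placed using `before` would therefore sit in the wrong place after the second `set_xlim` call, which is why the limits are set first.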

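As a small side note, I changed the way the background colours are drawn. Instead of `patches`, I now use the `axhspan` method. With this method you can still control the y-position, but the band extends across the full width without limit (meaning the colour stays in place if you scroll left or right). If you do not want this, you can remove it again :)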
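
A minimal sketch of the `axhspan` approach, using the band boundaries from the code in this answer; the list name `bands` and the widened x-limits at the end are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for this sketch
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.set_xlim(-0.2, 2.2)
ax.set_ylim(0, 6.75)

# (ymin, ymax, colour) of each band, with y in data coordinates
bands = [(0, 2.25, "r"), (2.25, 4.25, "orange"), (4.25, 6.75, "g")]
spans = [
    ax.axhspan(ymin, ymax, fc=colour, ec="none", alpha=0.15)
    for ymin, ymax, colour in bands
]

# Widening the x-limits leaves no uncoloured gaps, because the horizontal
# extent of each span is expressed in axes coordinates (0..1)
ax.set_xlim(-5, 5)
```

A `plt.Rectangle` with a fixed data-space width, as in the earlier version, would only cover the x-range it was created for.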

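See the new version of the code below (again, I changed all sizes so that it fits my computer screen; you may want to substitute the original sizes again):

If you run this code, you get the following output: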

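What is more, you can reduce the size of the pie charts by adjusting the `size` argument passed to the `draw_pie` function (I just happen to like the output above :)). Keep in mind that in that case you also need to reduce the `fontsize` specified in the `textprops` dictionary in the body of `draw_pie`. For example, `size=63` with `fontsize=7`.
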
import pandas as pd 
import matplotlib.pyplot as plt
import matplotlib.ticker as mticks
from matplotlib.font_manager import FontProperties
import numpy as np

def draw_pie(dist, 
             xpos, 
             ypos, 
             size, 
             ax=None,
             fig=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(7,6))
    elif fig is None:
        # Fall back to the figure that owns the given axes
        fig = ax.figure
    
    # Transform xpos and ypos to figure coordinates by the
    # following steps:
    # 1. First transform xpos and ypos from data to display coordinates
    # 2. Transform the display coordinates to figure coordinates
    xfig, yfig   = ax.transData.transform((xpos, ypos))
    trans_to_fig = fig.transFigure.inverted()
    xfig, yfig   = trans_to_fig.transform((xfig, yfig))
    
    # Calculate figure coordinates from the desired pie chart size
    # given in pixels
    size = trans_to_fig.transform((size, 0))[0]
    
    # Add axes at data coordinates xpos and ypos. On these axes,
    # the pie chart will be plotted
    ax_pie = fig.add_axes([xfig-0.5*size, yfig-0.5*size, size, size])
    
    # Plot the pie chart (with some special options)
    textprops = {'color'      : 'w',
                 'fontweight' : 'bold', 
                 'fontsize'   :  10,
                 'va'         : 'center',
                 'ha'         : 'center'
                }            
    labels = [str(i) if not i == 0 else "" for i in dist]
    labeldistance = 0.5
    
    if sum(not x==0 for x in dist) == 1: # Ensures we plot the label in the center
        labeldistance = 0.0              # if we have only one entry
    
    ax_pie.pie(dist, labels=labels, labeldistance=labeldistance, textprops=textprops)

    return ax_pie
    

# Read dataset
dataset = pd.read_csv('dataset.txt')

fig, ax = plt.subplots(figsize=(9,5))

# Important, limits must be set before calling draw_pie function!
# (Otherwise, the data coordinates will change, which will break
# the transform sequence inside the draw_pie function!)
ax.set_xlim([-0.2, 2.2]) # Tweak these values for the desired output
ax.set_ylim([0, 6.75])

# Make sure the string from 'Column 1' is displayed again
ax.set_xticks([0, 1, 2])
ax.set_xticklabels(['First Line', 'Second Line', 'Third Line'])

# Remainder of formatting
ax.yaxis.set_major_locator(mticks.MultipleLocator(1))  
plt.xticks(fontsize=10)
plt.yticks(fontsize=8)

# Define float values for 'Column 1' (easier for transformation,
# we have already put the text back there using ax.set_xticklabels)
column1_to_float = {'First Line':0, 'Second Line':1, 'Third Line':2}

for (x,y), d in dataset.groupby(['Column 1','Column 2']):

    # Only interested in the 'Colour Group Desired' column, as this one
    # contains the values 1-2-3
    d = d['Colour Group Desired']
    
    # Count how often each value (1-2-3) occurs and store
    # this in a list (count for value i located at list index 
    # i-1)   
    dist = list()
    for i in [1,2,3]:
        dist.append(d[d==i].count())
        
    # Call your draw_pie function
    draw_pie(dist, column1_to_float[x], y, 100, ax=ax, fig=fig)

# Plot the colours (note: using axhspan, they extend the full 
# horizontal direction, even while scrolling)
ax.axhspan(0   , 2.25, fc='r'     , ec=None, alpha=0.15)
ax.axhspan(2.25, 4.25, fc='orange', ec=None, alpha=0.15)
ax.axhspan(4.25, 6.75, fc='g'     , ec=None, alpha=0.15)

# Unfortunately, tight_layout can no longer be used. If we do use this,
# the pie charts will no longer be at the proper positions...
# plt.tight_layout()
plt.show()