Pandas styled dataframe - coloured bars by categorical column


I have the following dataframe:

import pandas as pd
import numpy as np

# define categorical column.
grps = pd.DataFrame(['a', 'a', 'a', 'b', 'b', 'b'])

# generate dataframe.
df = pd.DataFrame(np.random.randn(18).reshape(6, 3))

# concatenate categorical column and dataframe.
df = pd.concat([grps, df], axis = 1)

# Assign column headers.
df.columns = ['group', 1, 2, 3]

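In general, my dataframe may contain a varying number of levels in the category column, i.e. 'a', 'b', 'c', 'd', etc.

I can then use the .bar() method to generate a styled pandas dataframe, which I write to an html file: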

# style the dataframe.
style_df = (df.style.bar(align = 'zero', color = '#FFA07A'))

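# write styled dataframe to html.
# Note: hide_index() and render() were removed in pandas 2.0;
# there, use style_df.hide(axis="index").to_html() instead.
df_html = style_df.hide_index().render()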
with open("style_df.html","w") as fp:
    fp.write(df_html)

How can I colour the bars in each numeric column according to the group category column?

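I tried using pd.IndexSlice to create subsets of the main dataframe by 'group' and then passing them to the .bar() method, as in a linked example. However, I get the following error:

IndexingError: Too many indexers

Even if this worked, it would not be ideal, because I would need to chain successive .bar() calls onto the styler by hand. Ideally, I would like the code to react to the different group levels of any given dataframe.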

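I think using the built-in Styler.apply method for conditional formatting may be the best option, but I could not get anything to work based on the examples I found. They all deal with cell background colours or with formatting the values themselves.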

Any hints would be much appreciated.

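I found a way by adapting the code from another post.

Colouring the cell background by group category makes more sense than colouring the bars by group category. I want to use it as a visual cue in larger tables.

I had to define a function to carry out this process, which I can then apply to the table through the pandas Styler methods.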

import pandas as pd
import numpy as np

# define categorical column.
grps = pd.DataFrame(['a', 'a', 'a', 'b', 'b', 'b']) 

# generate dataframe.
df = pd.DataFrame(np.random.randn(18).reshape(6, 3))

# concatenate categorical column and dataframe.
df = pd.concat([grps, df], axis = 1)

# Assign column headers.
df.columns = ['group', 1, 2, 3]

Function to highlight rows by class variable

def highlight_rows(x):
    """ Function to apply alternating colour scheme to table cells by rows
    according to groups.

    Parameters:
    x: dataframe to be styled. Must contain a 'group' column.

    Returns:
    Dataframe of CSS strings with the same shape as x, as expected by
    Styler.apply(..., axis=None).

    """
    # ----------------------------------------------------------------------- #
    ### Define row positions per experimental group.

    # Reset the index so that index labels equal row positions (0, 1, 2, ...).
    # This keeps the positional slicing below correct for any input index.
    df_pos = x.reset_index(drop=True)

    # Generate list of row positions, one entry per experimental group
    # (in sorted group order).
    indexers_lst = [grp.index.tolist() for _, grp in df_pos.groupby('group')]

    # ----------------------------------------------------------------------- #
    ### Set initial condition.

    # Build the dataframe of CSS strings to return. Default is no colour.
    # Starting from empty strings avoids writing strings into the numeric
    # columns of a copy of the input, and keeps the shape identical to the
    # input so that no 'ValueError' is raised by 'Styler.apply()'.
    df_cpy = pd.DataFrame('', index=x.index, columns=x.columns)

    # ----------------------------------------------------------------------- #
    ### Initiate 'try' block.

    try:
        # Set row colour by referencing elements of the list of row
        # positions. Each element holds the row positions of one group
        # class and is generated dynamically from the input group column.
        # Note: 4 experimental groups defined below in order to account
        # for higher number of group levels. The idea is that these should
        # always be in excess of total groups.

        # Group - 1.
        df_cpy.iloc[indexers_lst[0], :] = 'background-color: #A7CDDD'
        # Group - 2.
        df_cpy.iloc[indexers_lst[1], :] = 'background-color: #E3ECF8'
        # Group - 3.
        df_cpy.iloc[indexers_lst[2], :] = 'background-color: #A7CDDD'
        # Group - 4.
        df_cpy.iloc[indexers_lst[3], :] = 'background-color: #E3ECF8'

    # ----------------------------------------------------------------------- #
    ### Initiate 'except' block.

    # Catches index error generated when there are fewer experimental
    # groups than defined in preceding 'try' block. The rows coloured
    # so far are kept.
    except IndexError:
        pass

    # Return dataframe of CSS strings.
    return df_cpy

Pass the function to the styler and generate the styled html table

# style the dataframe.
style_df = (df.style
            .bar(align = 'zero', color = '#FFA07A')
            # Call 'highlight_rows()' function to colour rows by group class.
            .apply(highlight_rows, axis=None))

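# write styled dataframe to html.
# Note: hide_index() and render() were removed in pandas 2.0;
# there, use style_df.hide(axis="index").to_html() instead.
df_html = style_df.hide_index().render()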
with open("style_df.html","w") as fp:
    fp.write(df_html)

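While this works well for the kind of data I deal with (it rarely has more than 10 groups, so in my actual code the function has up to 10 indexers), it is not as elegant as having the function react dynamically to the number of groups.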
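One possible way to make the highlighting react to the number of groups is to cycle through a colour list instead of hard-coding one line per group. The following is only a sketch under assumptions: the name `highlight_rows_by_group` and the `group_col` and `colours` parameters are made up for illustration, and groups are coloured in order of first appearance rather than in sorted order.

```python
import itertools

import numpy as np
import pandas as pd


def highlight_rows_by_group(data, group_col="group",
                            colours=("#A7CDDD", "#E3ECF8")):
    """Return a dataframe of CSS strings, alternating colours per group.

    Hypothetical helper: 'group_col' and 'colours' are illustrative
    parameters, not part of the pandas API.
    """
    # Start with an empty style for every cell.
    styles = pd.DataFrame("", index=data.index, columns=data.columns)

    # Pair each group level (in order of first appearance) with a colour,
    # cycling through the colour list as often as needed.
    levels = pd.unique(data[group_col])
    for level, colour in zip(levels, itertools.cycle(colours)):
        styles.loc[data[group_col] == level, :] = f"background-color: {colour}"

    return styles


# Same layout as the example dataframe above.
grps = pd.DataFrame(["a", "a", "a", "b", "b", "b"])
df = pd.concat([grps, pd.DataFrame(np.random.randn(18).reshape(6, 3))], axis=1)
df.columns = ["group", 1, 2, 3]

css = highlight_rows_by_group(df)
```

It should be usable in the same place as the fixed version, e.g. `df.style.bar(align='zero', color='#FFA07A').apply(highlight_rows_by_group, axis=None)`, since `Styler.apply` with `axis=None` only needs a function that returns a dataframe of CSS strings with the same shape as its input.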

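I would still be interested if someone comes up with a way to do it; I just could not work it out. I hope this helps others with their styling.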
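For the original goal of colouring the bars themselves by group, one approach that may work is to chain one `.bar()` call per group and column in a loop, passing a two-dimensional `pd.IndexSlice` (rows, columns) as `subset`. This is a sketch, not a tested solution to the error above: the helper names `bar_plan_by_group` and `style_bars_by_group` are hypothetical, all columns other than the group column are assumed to be numeric, and `vmin`/`vmax` are fixed per column so that bars in different groups share one scale.

```python
import itertools

import numpy as np
import pandas as pd


def bar_plan_by_group(data, group_col="group",
                      colours=("#FFA07A", "#87CEFA")):
    """List (subset, colour, limit) triples, one per group level and column.

    Hypothetical helper; assumes every column except 'group_col' is numeric.
    """
    value_cols = [c for c in data.columns if c != group_col]
    levels = pd.unique(data[group_col])
    plan = []
    for level, colour in zip(levels, itertools.cycle(colours)):
        # Index labels of the rows belonging to this group level.
        rows = data.index[data[group_col] == level]
        for col in value_cols:
            # Scale against the whole column so groups stay comparable.
            limit = float(data[col].abs().max())
            plan.append((pd.IndexSlice[rows, [col]], colour, limit))
    return plan


def style_bars_by_group(data, **kwargs):
    """Chain one Styler.bar call per planned subset (Styler needs jinja2)."""
    styler = data.style
    for subset, colour, limit in bar_plan_by_group(data, **kwargs):
        styler = styler.bar(subset=subset, align="zero", color=colour,
                            vmin=-limit, vmax=limit)
    return styler


# Same layout as the example dataframe above.
grps = pd.DataFrame(["a", "a", "a", "b", "b", "b"])
df = pd.concat([grps, pd.DataFrame(np.random.randn(18).reshape(6, 3))], axis=1)
df.columns = ["group", 1, 2, 3]

plan = bar_plan_by_group(df)
```

The returned styler from `style_bars_by_group(df)` can then be written to html as before. Because the loop is driven by the levels found in the group column, no call has to be added by hand when the number of groups changes.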