Pandas styled DataFrame: colour bars by categorical column
I have the following dataframe:
import pandas as pd
import numpy as np
# define categorical column.
grps = pd.DataFrame(['a', 'a', 'a', 'b', 'b', 'b'])
# generate dataframe.
df = pd.DataFrame(np.random.randn(18).reshape(6, 3))
# concatenate categorical column and dataframe.
df = pd.concat([grps, df], axis = 1)
# Assign column headers.
df.columns = ['group', 1, 2, 3]
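In general, my dataframes may contain a varying number of levels in the category column, i.e. 'a', 'b', 'c', 'd', etc.
I can then use the .bar() method to generate a styled pandas dataframe and write it to an HTML file: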
# style the dataframe.
style_df = (df.style.bar(align = 'zero', color = '#FFA07A'))
# write styled dataframe to html.
df_html = style_df.hide_index().render()
with open("style_df.html","w") as fp:
    fp.write(df_html)
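A note on versions: `Styler.hide_index()` and `Styler.render()` were deprecated in pandas 1.4 and removed in 2.0, in favour of `Styler.hide(axis="index")` and `Styler.to_html()`. The helper below is a minimal sketch of one way to make the export step tolerant of either API. The name `styler_to_html` is hypothetical and not part of pandas.

```python
def styler_to_html(styler):
    """Hide the index and return the HTML string for a pandas Styler.

    Hypothetical helper: it prefers the newer `hide` / `to_html` methods
    and falls back to the older `hide_index` / `render` pair when the
    newer ones are not available.
    """
    # Newer pandas: Styler.hide(axis="index"); older pandas: Styler.hide_index().
    if hasattr(styler, "hide"):
        styler = styler.hide(axis="index")
    else:
        styler = styler.hide_index()

    # Newer pandas: Styler.to_html(); older pandas: Styler.render().
    if hasattr(styler, "to_html"):
        return styler.to_html()
    return styler.render()
```

With this in place, the write step above would become `fp.write(styler_to_html(style_df))`.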
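How can I colour the bars in each numeric column according to the group category column?
I tried using pd.IndexSlice to create subsets of the main dataframe by 'group' and then passing them to the .bar() method, following a linked example. However, I get the following error: IndexError: too many indexers. Even if this worked, it would not be ideal, because I would need to manually chain successive .bar() calls onto the Styler. Ideally, I would like the code to react to the different group levels of any given dataframe.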
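For reference, here is a sketch of how the subset idea might be made to react to the group levels: loop over the groups and call `.bar()` once per group and column, passing a `(row labels, column labels)` subset built with `pd.IndexSlice`. It assumes a unique index and no missing values in the group column. The function names and the second palette colour are made up for illustration, and `vmin`/`vmax` are set per column so that bars stay comparable across groups. It does not diagnose the IndexError mentioned above.

```python
from itertools import cycle

import pandas as pd


def assign_group_colours(groups, palette):
    """Map each distinct group label to a colour, cycling through the palette."""
    distinct = pd.Series(groups).dropna().unique()
    return dict(zip(distinct, cycle(palette)))


def bar_by_group(df, group_col="group", palette=("#FFA07A", "#87CEFA")):
    """Return a Styler whose bars are coloured according to `group_col`."""
    colours = assign_group_colours(df[group_col], palette)
    # Only numeric columns get bars; the group column itself is skipped.
    numeric = df.select_dtypes(include="number").columns
    value_cols = [col for col in numeric if col != group_col]

    styler = df.style
    for col in value_cols:
        # Shared limits per column, so every group is drawn on the same scale.
        vmin, vmax = float(df[col].min()), float(df[col].max())
        for group, colour in colours.items():
            # Row labels belonging to this group.
            rows = df.index[df[group_col] == group].tolist()
            styler = styler.bar(
                subset=pd.IndexSlice[rows, [col]],
                align="zero",
                color=colour,
                vmin=vmin,
                vmax=vmax,
            )
    return styler
```

Calling `bar_by_group(df)` returns a Styler that can be hidden and rendered in the same way as `style_df` above.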
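I think using the built-in Styler.apply method for conditional formatting may be the best option, but I could not get anything to work based on the examples I found. They are all based on cell background colour or on formatting the values themselves.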
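Any tips would be much appreciated.
I found a way by adapting code from another post. Colouring the cell backgrounds by group category makes more sense than colouring the bars by group category, since I want to use it as a visual cue in larger tables. I had to define a function to carry out this process, which I can then apply to the table through the pandas Styler.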
import pandas as pd
import numpy as np
# define categorical column.
grps = pd.DataFrame(['a', 'a', 'a', 'b', 'b', 'b'])
# generate dataframe.
df = pd.DataFrame(np.random.randn(18).reshape(6, 3))
# concatenate categorical column and dataframe.
df = pd.concat([grps, df], axis = 1)
# Assign column headers.
df.columns = ['group', 1, 2, 3]
Function to highlight rows by class variable:
def highlight_rows(x):
    """ Function to apply alternating colour scheme to table cells by rows
    according to groups.
    Parameters:
        x: dataframe to be styled.
    Returns:
        Styled dataframe
    """
    # ----------------------------------------------------------------------- #
    ### Set initial condition.
    # Generate copy of input dataframe. This will avoid chained indexing issues.
    df_cpy = x.copy()
    # ----------------------------------------------------------------------- #
    ### Define row index ranges per experimental group.
    # Reset numerical index in dataframe copy. Generates new column at
    # position 1 called 'index' and consisting of index positions.
    df_cpy = df_cpy.reset_index()
    # Generate dictionary of key:value pairs corresponding to
    # grouped experimental class:index range as numerical list, respectively.
    grp_indexers_dict = dict(tuple((df_cpy.groupby('group')['index'])))
    # Generate list of series from dictionary values.
    indexers_series_lst = list(grp_indexers_dict.values())
    # Drop first column - 'index'. This is necessary to avoid 'ValueError'
    # issue at a later stage. This is due to the extra column causing dataframe
    # mismatching when this function is called from 'style_df()' function.
    df_cpy = df_cpy.drop('index', axis = 1)
    # ----------------------------------------------------------------------- #
    ### Initiate 'try' block.
    try:
        # Set default color as no colour.
        df_cpy.loc[:,:] = ''
        # Set row colour by referencing elements of a list of series.
        # Each series corresponds to the numerical row index positions
        # for each group class. They therefore represent each class.
        # They are generated dynamically from the input dataframe group column
        # in the 'style_df()' function, from which this function is called.
        # Numerical series can be used to slice a dataframe and specifically
        # pass colour schemes to row subsets.
        # Note: 4 experimental groups defined below in order to account
        # for higher number of group levels. The idea is that these should
        # always be in excess of total groups.
        # Group - 1.
        df_cpy.iloc[indexers_series_lst[0], ] = 'background-color: #A7CDDD'
        # Group - 2.
        df_cpy.iloc[indexers_series_lst[1], ] = 'background-color: #E3ECF8'
        # Group - 3.
        df_cpy.iloc[indexers_series_lst[2], ] = 'background-color: #A7CDDD'
        # Group - 4.
        df_cpy.iloc[indexers_series_lst[3], ] = 'background-color: #E3ECF8'
        # Return styled dataframe if total experimental classes equal
        # to total defined groups above.
        return df_cpy
    # ----------------------------------------------------------------------- #
    ### Initiate 'except' block.
    # Catches index error generated when there are fewer experimental
    # groups than defined in preceding 'try' block.
    except IndexError:
        # Return styled dataframe.
        return df_cpy
Pass the function to the Styler and generate the styled HTML table:
# style the dataframe.
style_df = (df.style
            .bar(align = 'zero', color = '#FFA07A')
            # Call 'highlight_rows()' function to colour rows by group class.
            .apply(highlight_rows, axis=None))
# write styled dataframe to html.
df_html = style_df.hide_index().render()
with open("style_df.html","w") as fp:
    fp.write(df_html)
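While this works well for the kind of data I deal with (it rarely exceeds 10 groups, so in my actual code the function has at most 10 indexers), it is not as elegant as having the function react dynamically to the number of groups.
I would still be interested if anyone finds a way to do that; I could not work it out myself. I hope this helps others with their styling.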
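One possible way to make the function react to the number of groups is to derive an integer code per row with `pd.factorize` and cycle through a palette, instead of hard-coding one assignment per group. This is a minimal sketch, not a tested drop-in replacement. The name `highlight_rows_by_group` and its `group_col` and `palette` parameters are hypothetical, and rows with a missing group value are left unstyled.

```python
import numpy as np
import pandas as pd


def highlight_rows_by_group(x, group_col="group",
                            palette=("#A7CDDD", "#E3ECF8")):
    """Return a frame of CSS strings that colours rows by group.

    The result has the same index and columns as `x`, which is what
    `Styler.apply(..., axis=None)` expects.
    """
    # Integer code per row (0, 1, 2, ...) in order of first appearance;
    # missing values get the code -1.
    codes, _ = pd.factorize(x[group_col])

    # One CSS string per row, cycling through the palette so that any
    # number of groups is handled.
    css = np.array(
        ["" if code < 0
         else "background-color: " + palette[code % len(palette)]
         for code in codes],
        dtype=object,
    )

    # Repeat the per-row string across every column.
    return pd.DataFrame(
        np.repeat(css[:, None], x.shape[1], axis=1),
        index=x.index,
        columns=x.columns,
    )
```

It can be passed to the Styler in the same way as before, for example `df.style.bar(align='zero', color='#FFA07A').apply(highlight_rows_by_group, axis=None)`.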