Python 3.x: Highlight DataFrame cells based on multiple conditions in Python
Given a small dataset as follows:
   id    room   area           situation
0   1   A-102  world  under construction
1   2     NaN     24  under construction
2   3    B309    NaN                 NaN
3   4   C·102     25    under decoration
4   5  E_1089  hello    under decoration
5   6      27    NaN          under plan
6   7      27    NaN                 NaN
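To reproduce the examples, the table above can be rebuilt as a DataFrame. This is only a minimal sketch: it assumes that every column except id holds strings, that missing cells are np.nan, and that the value 27 in the last two rows belongs to the room column.

```python
import numpy as np
import pandas as pd

# Assumption: all values except id are strings; missing cells are np.nan.
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7],
    'room': ['A-102', np.nan, 'B309', 'C·102', 'E_1089', '27', '27'],
    'area': ['world', '24', np.nan, '25', 'hello', np.nan, np.nan],
    'situation': ['under construction', 'under construction', np.nan,
                  'under decoration', 'under decoration', 'under plan', np.nan],
})
print(df)
```

Keeping area as strings matters here, because the .str accessor used in the checks only works on string-like columns.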
Thanks to the code from @jezrael, I was able to get the desired result:
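import numpy as np
import pandas as pd

# None where the room name is valid, otherwise an error message
a = np.where(df.room.str.match(r'^[a-zA-Z\d\-]*$', na=False), None,
             'incorrect room name')
# Missing areas count as valid because of na=True
b = np.where(df.area.str.contains(r'^\d+$', na=True), None,
             'area is not a numbers')
c = np.where(df.situation.str.contains('under decoration', na=False),
             'decoration is in the content', None)

# Join the messages of each row; NaN when no check produced a message
f = (lambda x: '; '.join(y for y in x if pd.notna(y))
     if any(pd.notna(np.array(x))) else np.nan)
df['check'] = [f(x) for x in zip(a, b, c)]
print(df)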
   id    room   area           situation  \
0   1   A-102  world  under construction
1   2     NaN     24  under construction
2   3    B309    NaN                 NaN
3   4   C·102     25    under decoration
4   5  E_1089  hello    under decoration
5   6      27    NaN          under plan
6   7      27    NaN                 NaN

                                               check
0                              area is not a numbers
1                                incorrect room name
2                                                NaN
3   incorrect room name;decoration is in the content
4  incorrect room name;area is not a numbers;deco...
5                                                NaN
6                                                NaN
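But now I would like to go one step further: highlight the problematic cells in the room, area, and situation columns, then save the DataFrame as an Excel file.
How can I do that in Pandas (preferred) or with another Python package?
Thanks in advance.

The idea is to create a custom function that returns a DataFrame of styles, and to reuse the boolean masks m1, m2, m3: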
m1 = df.room.str.match(r'^[a-zA-Z\d\-]*$', na=False)
m2 = df.area.str.contains(r'^\d+$', na=True)
m3 = df.situation.str.contains('under decoration', na=False)

a = np.where(m1, None, 'incorrect room name')
b = np.where(m2, None, 'area is not a numbers')
c = np.where(m3, 'decoration is in the content', None)

f = (lambda x: '; '.join(y for y in x if pd.notna(y))
     if any(pd.notna(np.array(x))) else np.nan)
df['check'] = [f(x) for x in zip(a, b, c)]
print(df)

def highlight(x):
    # Build a frame of CSS strings with the same index and columns as x
    c1 = 'background-color: yellow'
    df1 = pd.DataFrame('', index=x.index, columns=x.columns)
    df1['room'] = np.where(m1, '', c1)
    df1['area'] = np.where(m2, '', c1)
    df1['situation'] = np.where(m3, c1, '')
    # print(df1)
    return df1

# axis=None passes the whole DataFrame to highlight in a single call
df.style.apply(highlight, axis=None).to_excel('test.xlsx', index=False)
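The highlight function above reads m1, m2, m3 from the enclosing scope, so it only works for the frame those masks were built from. As a possible variation (not the answerer's code), the masks can be recomputed inside the function from the frame it receives. The name highlight_problems and the small sample frame are hypothetical, used only for illustration.

```python
import numpy as np
import pandas as pd

def highlight_problems(x, color='background-color: yellow'):
    """Return a frame of CSS strings with the same index and columns as x."""
    # Recompute the masks from x itself, so no global state is needed.
    ok_room = x['room'].str.match(r'^[a-zA-Z\d\-]*$', na=False)
    ok_area = x['area'].str.contains(r'^\d+$', na=True)
    flagged = x['situation'].str.contains('under decoration', na=False)

    styles = pd.DataFrame('', index=x.index, columns=x.columns)
    styles['room'] = np.where(ok_room, '', color)
    styles['area'] = np.where(ok_area, '', color)
    styles['situation'] = np.where(flagged, color, '')
    return styles

# Hypothetical three-row frame, just to inspect the returned styles.
sample = pd.DataFrame({
    'room': ['A-102', 'E_1089', np.nan],
    'area': ['world', '25', np.nan],
    'situation': ['under plan', 'under decoration', np.nan],
})
styles = highlight_problems(sample)
print(styles)
```

Calling the function directly, as here, is a quick way to check which cells will be coloured before handing it to df.style.apply(..., axis=None) and writing the Excel file.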
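One more question: if one column has multiple masks, what should we do?
@ahbon - Could they be chained with | or &? Not sure, because it returns multiple check phrases separated by a delimiter.
@ahbon - Not sure I understand it.
@ahbon - Yes, but there is a problem when all the check strings are chained with ','.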
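One way to handle several masks for a single column, as raised in the comments, might be to keep each mask paired with its own message: the masks are combined with & to decide whether the cell is highlighted, while the messages are still collected one by one. This is only a sketch under assumptions; the length rule, the 'room name too long' message, and the sample values are hypothetical and do not come from the thread.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'room': ['A-102', 'E_1089', 'ABCDEFGHIJK', 'E_1089_long', np.nan],
})

# Each entry: (mask that is True where the cell passes, message otherwise).
# The second rule (at most 8 characters) is a hypothetical extra check.
room_checks = [
    (df['room'].str.match(r'^[a-zA-Z\d\-]*$', na=False), 'incorrect room name'),
    (df['room'].fillna('').str.len() <= 8, 'room name too long'),
]

# A cell is fine only if every check passes, so chain the masks with &.
room_ok = reduce(operator.and_, [mask for mask, _ in room_checks])

# Keep one message array per check, then join the failures row by row.
messages = [np.where(mask, None, msg) for mask, msg in room_checks]

def join_messages(parts):
    found = [p for p in parts if pd.notna(p)]
    return '; '.join(found) if found else np.nan

df['check'] = [join_messages(parts) for parts in zip(*messages)]

# Style strings for the room column, usable inside a highlight function.
room_style = np.where(room_ok, '', 'background-color: yellow')
print(df)
```

Chaining with & answers only the yes/no question of whether to colour the cell; the per-check message arrays are what keep the individual phrases available for the check column.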