R: Filter highlighted data in Excel by cell fill color using openxlsx

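I have a large Excel sheet (18k rows and 400 columns) in which some rows are highlighted in different colors. Is there a way to filter rows by color using openxlsx?

I started by loading the workbook: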

library(openxlsx)

wb <- loadWorkbook(file = "Items Comparison.xlsx")
getStyles(wb)                     # list the styles stored in the workbook
df <- read.xlsx(wb, sheet = 1)    # cell values only, without formatting

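How can I filter the data by fill color?

Update

Based on @Henrik's solution, I tried his code but kept getting errors. To understand what was going on, I printed the output of x$style$fill$fillFg: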
       rgb 
"FF384C70" 
       rgb 
"FF384C70" 
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
       rgb 
"FF384C70" 
NULL
NULL
NULL
       rgb 
"FFFFFF00" 
       rgb 
"FFFFFF00" 
 theme 
   "0" 
 theme 
   "0" 
       rgb 
"FFFFFF00" 
NULL
 theme 
   "2" 
                theme                  tint 
                  "4" "0.79998168889431442" 
 theme 
   "8" 
 theme 
   "8" 
       rgb 
"FFFFC000" 
       rgb 
"FFFFC000" 
                theme                  tint 
                  "5" "0.39997558519241921" 
                theme                  tint 
                  "5" "0.39997558519241921" 
                theme                  tint 
                  "9" "0.39997558519241921" 
                theme                  tint 
                  "5" "0.79998168889431442" 
       rgb 
"FFFFFF00" 
       rgb 
"FF384C70" 
NULL
NULL
NULL
       rgb 
"FF384C70" 
       rgb 
"FF384C70" 
[[1]]
       rgb 
"FF384C70" 

[[2]]
       rgb 
"FF384C70" 

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL

[[11]]
NULL

[[12]]
NULL

[[13]]
       rgb 
"FF384C70" 

[[14]]
NULL

[[15]]
NULL

[[16]]
NULL

[[17]]
       rgb 
"FFFFFF00" 

[[18]]
       rgb 
"FFFFFF00" 

[[19]]
 theme 
   "0" 

[[20]]
 theme 
   "0" 

[[21]]
       rgb 
"FFFFFF00" 

[[22]]
NULL

[[23]]
 theme 
   "2" 

[[24]]
                theme                  tint 
                  "4" "0.79998168889431442" 

[[25]]
 theme 
   "8" 

[[26]]
 theme 
   "8" 

[[27]]
       rgb 
"FFFFC000" 

[[28]]
       rgb 
"FFFFC000" 

[[29]]
                theme                  tint 
                  "5" "0.39997558519241921" 

[[30]]
                theme                  tint 
                  "5" "0.39997558519241921" 

[[31]]
                theme                  tint 
                  "9" "0.39997558519241921" 

[[32]]
                theme                  tint 
                  "5" "0.79998168889431442" 

[[33]]
       rgb 
"FFFFFF00" 

[[34]]
       rgb 
"FF384C70" 

[[35]]
NULL

[[36]]
NULL

[[37]]
NULL

[[38]]
       rgb 
"FF384C70" 

[[39]]
       rgb 
"FF384C70" 

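I still don't understand why there are only 39 items. The total number of rows varies, but it is not 39. I also don't understand how the operation works: is it by row or by column?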

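library(tidyxl)

The formatting is stored in the workbook object, in the styleObjects element. There you can dig out the fill color (style$fill$fillFg) and the rows element. Loop over the style objects (lapply), check whether the color is the desired one (e.g. red, "FFFF0000": x$style$fill$fillFg == "FFFF0000"), and get the row index (x$rows[1]).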
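The loop described above can be sketched in base R on a mock list, which also shows one way to avoid the "argument is of length zero" error when a style has no fill. This is only an illustration: style_objects, has_fill and rows_with_fill are hypothetical names, the mock values are invented, and the sketch assumes that each entry of styleObjects describes one style together with parallel rows and cols vectors (which would explain a list of 39 entries rather than one entry per row).

```r
# Mock of wb$styleObjects: one entry per style, with parallel rows/cols vectors.
# The fillFg shapes mirror the printed output in the question; values are invented.
style_objects <- list(
  list(style = list(fill = list(fillFg = c(rgb = "FF384C70"))),
       rows = c(1L, 1L), cols = c(1L, 2L)),
  list(style = list(fill = list(fillFg = NULL)),
       rows = 2L, cols = 1L),
  list(style = list(fill = list(fillFg = c(theme = "0"))),
       rows = 3L, cols = 1L),
  list(style = list(fill = list(fillFg = c(theme = "4", tint = "0.79998168889431442"))),
       rows = 4L, cols = 2L),
  list(style = list(fill = list(fillFg = c(rgb = "FFFFFF00"))),
       rows = c(5L, 7L), cols = c(1L, 1L))
)

# TRUE only when the fill has an explicit rgb entry equal to `argb`.
# NULL fills and theme-only fills give FALSE instead of raising an error.
has_fill <- function(x, argb) {
  fg <- x$style$fill$fillFg
  identical(unname(fg["rgb"]), argb)
}

# Collect the distinct row numbers of all cells whose style has the given fill.
rows_with_fill <- function(objs, argb) {
  hits <- Filter(function(x) has_fill(x, argb), objs)
  sort(unique(as.integer(unlist(lapply(hits, function(x) x$rows)))))
}

yellow_rows <- rows_with_fill(style_objects, "FFFFFF00")
red_rows    <- rows_with_fill(style_objects, "FFFF0000")
```

With a real workbook the same functions would be called on wb$styleObjects. Fills defined through a theme and tint carry no rgb entry, so this sketch does not match them.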

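A solution using the openxlsx package

The example below looks for the color "FFC000" (written as ARGB "FFFFC000" in the code) in columns 1 and 6.
The approach first determines which of the defined styles have the font color of interest, then goes through the style objects to see which cells those styles are applied to, and returns the indices of the rows that match both the color and the predefined column search. The result gives all rows in which at least one cell in the searched columns matches the specified color.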
library(openxlsx)

excelwb <- loadWorkbook(excel_file)  # excel_file: path to the .xlsx file
strikestyles <- getStyles(excelwb)
# indices of the styles whose font colour is the gold of interest (ARGB)
goldcolors <- which(sapply(strikestyles, '[[', 'fontColour') == "FFFFC000")
goldcols <- c(1, 6)  # columns that have the gold color of interest -- could also be 1:ncol
# for each matching style object, keep the rows of the cells that fall in goldcols
goldrows <- lapply(excelwb$styleObjects[goldcolors],
                   function(x) {
                     value_cols <- which(x$cols %in% goldcols)
                     if (length(value_cols) == 0) return(NULL)
                     x$rows[value_cols]
                   })
goldrows <- as.numeric(unlist(goldrows))
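Once the Excel row numbers are known, they still have to be mapped onto the data frame to do the actual filtering. The following is a minimal sketch with invented data. It assumes the header sits in Excel row 1, the data starts in row 2, and no rows were skipped on import (read.xlsx drops empty rows by default, which would shift the offset). The names df_rows and highlighted are hypothetical.

```r
# Hypothetical data standing in for df <- read.xlsx(wb, sheet = 1)
df <- data.frame(item = c("a", "b", "c", "d"),
                 qty  = c(10, 20, 30, 40),
                 stringsAsFactors = FALSE)

# Excel row numbers of highlighted cells, e.g. the goldrows vector; invented values
goldrows <- c(3, 5, 3)

# Assumption: Excel row 1 is the header, so Excel row n is data frame row n - 1
df_rows <- sort(unique(goldrows)) - 1
df_rows <- df_rows[df_rows >= 1 & df_rows <= nrow(df)]  # drop out-of-range rows

highlighted <- df[df_rows, , drop = FALSE]
```

If the sheet has leading blank rows or was read with startRow, the offset of 1 would need to be adjusted accordingly.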

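Does it have to be the openxlsx package, or are other packages an option?

Other packages are an option too. I just found tidyxl, which I am looking at now.

Good, that is what I am reading as well. Thanks. I'll let you know what I get out of it.

It turned out to be much simpler than this. I followed @nacnudus and saw how easy it is to filter based on color. My aim was simply to keep the cells that have any fill color: cells %>% behead("N", "Program") %>% select(address, row, col, character, Program, style, local_format, local_format_id) %>% filter(!is.na(fill[local_format_id])). I am accepting your answer because you pointed me in this direction.

I like your solution, but I get an error: Error in if (x$style$fill$fillFg == "FFFF0000") { : argument is of length zero. I have updated my question to show the output of x$style$fill$fillFg.

Thanks for your feedback. My second example was apparently not complex enough. Could you update your question with a truly minimal reproducible example that illustrates your problem?
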
library(tidyxl)

formats <- xlsx_formats( "./temp/test_file.xlsx" )
cells <- xlsx_cells( "./temp/test_file.xlsx" )

#what colors are used?
formats$local$fill$patternFill$fgColor$rgb
# [1] NA         "FFC00000" "FF00B0F0" NA  

#find rows of cells with a red background
cells[ cells$local_format_id %in%
         which( formats$local$fill$patternFill$fgColor$rgb == "FFC00000"), 
       "row" ]

# [1] 1
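The lookup in the tidyxl code rests on local_format_id being an index into the vector of fill colors. The sketch below replays that logic in base R on mock data, so it runs without an Excel file, and adds the "any fill color" filter mentioned in the comments. The vector fill_rgb and the data frame cells are invented stand-ins for formats$local$fill$patternFill$fgColor$rgb and the output of xlsx_cells().

```r
# Mock of formats$local$fill$patternFill$fgColor$rgb: one entry per local format id
fill_rgb <- c(NA, "FFC00000", "FF00B0F0", NA)

# Mock of the relevant columns of the xlsx_cells() output; values are invented
cells <- data.frame(
  row             = c(1L, 1L, 2L, 3L),
  col             = c(1L, 2L, 1L, 1L),
  local_format_id = c(2L, 1L, 3L, 4L)
)

# Rows containing at least one cell with the red fill
red_ids  <- which(fill_rgb == "FFC00000")   # which() drops the NA comparisons
red_rows <- unique(cells$row[cells$local_format_id %in% red_ids])

# Rows containing at least one cell with any explicit rgb fill
any_fill_rows <- unique(cells$row[!is.na(fill_rgb[cells$local_format_id])])
```

Indexing fill_rgb by local_format_id gives one color (or NA) per cell, which is why the same expression works for both a specific color and for "any color".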

library(openxlsx)

wb <- loadWorkbook(file = "foo.xlsx")
# first row index of each style object whose fill is red
unlist(lapply(wb$styleObjects, function(x){
  x$rows[1][x$style$fill$fillFg == "FFFF0000"]}))

# [1] 3

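# row and column index of every cell with a red fill
# note: if() fails with "argument is of length zero" when a style has no fill (fillFg is NULL)
l = lapply(wb$styleObjects, function(x){
  if(x$style$fill$fillFg == "FFFF0000"){
    data.frame(ri = x$rows, ci = x$cols, col = "FFFF0000")}})
l[lengths(l) > 0]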

# [[1]]
#   ri ci      col
# 1  1  2 FFFF0000
# 2  2  3 FFFF0000
# 3  3  1 FFFF0000
