Apache POI in R

Tags: java, r, apache-poi

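I'm trying to read an xlsx file into R and extract the Excel formulas. Apache POI seems to be the right tool for the job, but I can't get it to work. I found this list of POI components and their dependencies. I tried the following code: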

require(rJava)
.jinit()
.jaddClassPath("poi-3.11-20141221.jar")
.jaddClassPath("poi-ooxml-3.11-20141221.jar")
.jaddClassPath("poi-ooxml-schemas-3.11-20141221.jar")
.jaddClassPath("xmlbeans-2.6.0.jar")

inputStream <- .jnew("java/io/FileInputStream", path.expand(file.path))

xfile <- .jnew("org/apache/poi/xssf/eventusermodel/XSSFWorkbook", 
            .jcast(inputStream,"java/io/InputStream"))
wext <- .jnew("org/apache/poi/xssf/extractor/XSSFExcelExtractor", xfile)

text <- .jcall(wext, "Ljava/lang/String;", "getText")

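Update:

devtools::install_git("https://gitlab.com/hrbrmstr/xlsxtractr.git")

or

then:

It only extracts formulas for now, but it is a foundation for other features that the other excellent packages may be missing.

This can easily be turned into a small package or function that takes the path to an xlsx file and extracts the formulas from it:
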
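library(xml2)
library(purrr)
library(tibble)  # provides tibble(); data_frame() is deprecated

# The xlsx file must already be unzipped: this reads one worksheet XML file.
# Code to do the unzipping and to work with all the sheets from the xlsx
# file still needs to be written.

sheet <- read_xml("~/dir/wb/xl/worksheets/sheet1.xml")

# Rename the default namespace prefix (d1) to "x" for shorter XPath queries
ns <- xml_ns_rename(xml_ns(sheet), d1 = "x")

# Walk every row, then every cell (<c>), keeping cells that hold a formula (<f>)
xml_find_all(sheet, ".//x:row", ns) %>% 
  map_df(function(row) {
    xml_find_all(row, ".//x:c", ns) %>% 
      map_df(function(col) {
        f <- xml_text(xml_find_all(col, ".//x:f", ns))
        if (length(f) > 0) {
          tibble(cell = xml_attr(col, "r"), f = f)
        } else {
          NULL
        }
      })
  })
## # A tibble: 2 × 2
##    cell                     f
##   <chr>                 <chr>
## 1    B2            SUM(A1:A3)
## 2    C2 SUM(A1:A3)*SUM(A1:A3)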


This won't work if you have xls files, however.
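The unzipping and multi-sheet handling that the code comments leave as a to-do could look roughly like the sketch below. It is not the code from the xlsxtractr package: the function names `sheet_formulas` and `xlsx_formulas` are hypothetical, it returns a base `data.frame` instead of a tibble, and it matches elements with `local-name()` in XPath rather than renaming the namespace. It also assumes the worksheets live under `xl/worksheets/` inside the archive, as in the path used above.

```r
library(xml2)

# Hypothetical helper: formulas from one parsed worksheet document.
sheet_formulas <- function(sheet) {
  # Select every cell (<c>) that has a formula child (<f>); local-name()
  # makes the query independent of the namespace prefix.
  cells <- xml_find_all(sheet, ".//*[local-name()='c'][*[local-name()='f']]")
  data.frame(
    cell = xml_attr(cells, "r"),
    f = xml_text(xml_find_first(cells, "./*[local-name()='f']")),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: unzip an xlsx file and collect formulas per sheet.
xlsx_formulas <- function(path) {
  exdir <- tempfile("xlsx_")
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)

  # List the archive and keep only the worksheet XML parts.
  listing <- unzip(path, list = TRUE)$Name
  sheets <- grep("^xl/worksheets/[^/]+\\.xml$", listing, value = TRUE)
  unzip(path, files = sheets, exdir = exdir)

  out <- lapply(sheets, function(s) {
    res <- sheet_formulas(read_xml(file.path(exdir, s)))
    # "sheet" is the XML part name (e.g. sheet1.xml), not the tab name.
    data.frame(sheet = rep(basename(s), nrow(res)), res,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}
```

Calling `xlsx_formulas("~/dir/wb.xlsx")` (a made-up path) would then return one row per formula cell, with the worksheet part name in the first column. Like the original code, this only handles xlsx files.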
Thank you for your effort and excellent work. You should consider submitting your solution to the openxlsx package. There is currently a feature request for exactly this.