How to resolve encoding issues that may arise when converting PDF to text with pdftools' pdf_text() function in R


I tried the following code to read multiple journal-article PDFs from a directory, convert them to text, and store the results in a list in R:

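library(pdftools)   # pdf_text(), pdf_render_page()
library(tesseract)  # ocr() is assumed to come from the tesseract package

# Match only file names ending in ".pdf" (the bare pattern "pdf" matches
# any name that merely contains those letters)
myFiles <- list.files(path = ".",
                      pattern = "\\.pdf$",
                      full.names = TRUE)

parsedFiles <- lapply(myFiles, function(f) {
  print(f)
  # Extract the text layer, one string per page, collapsing runs of whitespace
  tryPDF <- gsub("\\s+", " ", pdf_text(f))
  # If every page after the first is empty, treat the file as a scan and OCR it.
  # Note: for a one-page PDF, tryPDF[-1] is empty, so all() is TRUE and OCR runs.
  if (all(tryPDF[-1] == "")) {
    compiledPDF <- do.call(c, lapply(seq_along(tryPDF), function(pg) {
      # Render the page to a numeric bitmap at 200 dpi
      bitmap <- pdf_render_page(pdf = f,
                                page = pg,
                                dpi = 200,
                                numeric = TRUE)
      # Write the page image to disk and run OCR on it
      tiff::writeTIFF(bitmap, "temp.tiff")
      ocr("temp.tiff")
    }))
    return(compiledPDF)
  }
  tryPDF
})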

myFiles
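
Since the question is about encoding problems in the extracted text, here is a minimal base-R sketch of one way to inspect and normalise the strings after extraction. It assumes the text is meant to be UTF-8; `normalise_text` is a hypothetical helper, not part of pdftools, and the `pages` vector is stand-in data rather than output from the code above.

```r
# Hypothetical helper (not part of pdftools): report and repair encoding
# problems in a character vector such as one element of parsedFiles.
normalise_text <- function(pages) {
  # Flag strings that contain byte sequences that are not valid UTF-8
  bad <- !validUTF8(pages)
  if (any(bad)) {
    message(sum(bad), " string(s) contain invalid UTF-8; dropping offending bytes")
  }
  # Re-encode as UTF-8; sub = "" drops bytes that cannot be converted
  cleaned <- iconv(pages, from = "UTF-8", to = "UTF-8", sub = "")
  # Collapse whitespace only after the strings are known to be valid
  trimws(gsub("\\s+", " ", cleaned))
}

# Stand-in data: one accented string, one with a stray byte, one plain
pages <- c("caf\u00e9 au lait", "bad byte: \xff here", "plain  text")
cleaned <- normalise_text(pages)
print(cleaned)
```

With the list from the question this could be applied as `lapply(parsedFiles, normalise_text)`. Dropping bytes silently loses information, so if many strings are flagged it may be better to find out which encoding the source actually uses and pass that as `from` instead.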