How can I use the Python pdf2image module (and therefore poppler) on Google Cloud Functions?


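I am trying to convert a PDF to JPEG on Google Cloud Functions using the Python module pdf2image, but I cannot work out how to resolve the errors "No such file or directory: 'pdfinfo'" and "Unable to get page count. Is poppler installed and in PATH?" inside the Cloud Function.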

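The error is very similar to the one reported in a related question. pdf2image is a wrapper around poppler's "pdftoppm" and "pdftocairo". But how do I install the poppler package on Google Cloud Functions and add it to PATH? I could not find any relevant reference. Is it even possible? If not, what can be done instead?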

There is also another post on this, but it did not help.

The code is shown below. The entry point is process_image.

import requests
from pdf2image import convert_from_path

def process_image(event, context):
    # Download sample pdf file
    url = 'https://www.adobe.com/support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf'
    r = requests.get(url, allow_redirects=True)
    open('/tmp/sample.pdf', 'wb').write(r.content)

    # The error occurs on this line
    pages = convert_from_path('/tmp/sample.pdf')

    # Save pages to /tmp
    for idx, page in enumerate(pages):
        output_file_path = f"/tmp/{str(idx)}.jpg"
        page.save(output_file_path, 'JPEG')
        # To be saved to cloud storage

requirements.txt:

requests==2.25.1
pdf2image==1.14.0

This is the error I get:

Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pdf2image/pdf2image.py", line 441, in pdfinfo_from_path
    proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE)
  File "/opt/python3.8/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/python3.8/lib/python3.8/subprocess.py", line 1706, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'pdfinfo'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/functions_framework/__init__.py", line 149, in view_func
    function(data, context)
  File "/workspace/main.py", line 11, in process_image
    pages = convert_from_path('/tmp/sample.pdf')
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pdf2image/pdf2image.py", line 97, in convert_from_path
    page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pdf2image/pdf2image.py", line 467, in pdfinfo_from_path
    raise PDFInfoNotInstalledError(
pdf2image.exceptions.PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

Thanks in advance for your help.

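This error occurs because the poppler package does not work in Cloud Functions: it needs to write some files to the system, and in serverless products such as Cloud Functions you cannot write to the file system (only /tmp is writable), so the pdfinfo executable that pdf2image calls is never found on PATH.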
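
To confirm that the missing executables are the cause before changing anything else, a small check like the one below can be run inside the function. This is a minimal sketch using only the standard library; the helper name missing_poppler_tools is made up for illustration and is not part of pdf2image.

```python
import shutil

# Command-line tools from poppler that pdf2image shells out to.
POPPLER_TOOLS = ("pdfinfo", "pdftoppm", "pdftocairo")


def missing_poppler_tools(tools=POPPLER_TOOLS, path=None):
    """Return the names of the tools that cannot be found.

    `path` is an optional search path string; when it is None,
    shutil.which falls back to the PATH environment variable.
    (Hypothetical helper, not part of pdf2image.)
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]
```

Logging the result of missing_poppler_tools() at the start of process_image shows which binaries the runtime lacks; an empty list would mean the problem lies elsewhere.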

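You may want to try the approach described in another thread, or consider using GCP Compute Engine, which gives you access to the whole system.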
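
Wherever poppler does end up being available (for example on a Compute Engine VM), pdf2image can be pointed at it explicitly through the poppler_path argument of convert_from_path, which also appears in the traceback above. The following is only a sketch under stated assumptions: the directory /workspace/poppler/bin and the helper names resolve_poppler_path and convert_pdf are hypothetical, and it has not been verified that poppler binaries placed there would run on Cloud Functions, since they also depend on shared libraries.

```python
import shutil


def resolve_poppler_path(candidates):
    """Return the first directory containing an executable pdfinfo, else None.

    (Hypothetical helper, not part of pdf2image.)
    """
    for directory in candidates:
        if shutil.which("pdfinfo", path=directory) is not None:
            return directory
    return None


def convert_pdf(pdf_path, candidates=("/workspace/poppler/bin",)):
    """Convert a PDF to a list of PIL images, preferring an explicit poppler dir.

    The default candidate directory is an assumption for illustration only.
    """
    # Imported lazily so the helper above can be used without pdf2image.
    from pdf2image import convert_from_path

    poppler_path = resolve_poppler_path(candidates)
    if poppler_path is None and shutil.which("pdfinfo") is None:
        raise RuntimeError(
            "pdfinfo was found neither in the candidate directories nor on PATH"
        )
    # poppler_path=None makes pdf2image fall back to PATH.
    return convert_from_path(pdf_path, poppler_path=poppler_path)
```

Failing early with an explicit message keeps the cause visible, instead of surfacing later as PDFInfoNotInstalledError from inside pdf2image.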