Python: Can't open a PDF file with pdfplumber.open

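I have been following a YouTube channel called “pythonicaccounter” and have been trying to reproduce tutorial 4, which teaches how to extract data from PDF invoices, but without success. I keep running into an error that I don't know how to fix. I am using Anaconda and a Jupyter notebook on OS X. My code looks like this: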

    import requests
    import pdfplumber

    def download_file(url):
        local_filename = url.split('/')[-1]

        with requests.get(url) as r:
            with open(local_filename, 'wb') as f:
            f.write(r.content)
        
        return local_filename

    invoice_url = 'http://www.k-billing.com/example_invoices/professionalblue_example.pdf'

    invoice = download_file(invoice_url)

    with pdfplumber.open(invoice) as pdf:
        page = pdf.pages[0]
        text = page.extract_text()
In the tutorial, the code runs fine. In my case, I get the following error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-6-de1887236e07> in <module>
    ----> 1 with pdfplumber.open(invoice) as pdf:
          2     page = pdf.pages[0]
          3     text = page.extract_text()

    AttributeError: module 'pdfplumber' has no attribute 'open'

I installed pdfplumber using pip install. I have searched online for this error, but I don't know what I am doing wrong.

The indentation here is wrong:

        with requests.get(url) as r:
            with open(local_filename, 'wb') as f:
            f.write(r.content)
It should be:

        with requests.get(url) as r:
            with open(local_filename, 'wb') as f:
                f.write(r.content)
After making that correction it worked for me, using Python 3.8.2, requests==2.22.0 and Ubuntu 20.04.
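
For reference, the corrected download step can be written a little more defensively. This is only a sketch, not the tutorial's code: the helper names `filename_from_url` and `looks_like_pdf`, the `timeout` value and the `%PDF-` header check are assumptions added for illustration, on the premise that a failed download (for example a saved HTML error page) would otherwise only show up later as a confusing parsing error.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Take the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def looks_like_pdf(path):
    """Return True if the file starts with the usual '%PDF-' header."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


def download_file(url, timeout=30):
    """Download url into the current directory and return the local file name."""
    # Imported lazily so the helpers above can be used without requests installed.
    import requests

    local_filename = filename_from_url(url)
    with requests.get(url, timeout=timeout) as r:
        r.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        with open(local_filename, "wb") as f:
            f.write(r.content)  # note: indented under the inner 'with'
    if not looks_like_pdf(local_filename):
        raise ValueError(f"{local_filename} does not look like a PDF")
    return local_filename
```

Calling `download_file(invoice_url)` with the URL from the question should behave like the original function, except that HTTP errors and non-PDF responses raise an exception right away.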


I am not on OS X, and the code does work for me. So here is how I installed pdfplumber; something about the versions involved may point you in the right direction:

✓ alirvah ~ $ pip3 install pdfplumber
Collecting pdfplumber
  Downloading pdfplumber-0.5.24.tar.gz (42 kB)
     |████████████████████████████████| 42 kB 630 kB/s 
Requirement already satisfied: Pillow>=7.0.0 in /usr/lib/python3/dist-packages (from pdfplumber) (7.0.0)
Collecting Wand
  Downloading Wand-0.6.3-py2.py3-none-any.whl (133 kB)
     |████████████████████████████████| 133 kB 1.7 MB/s 
Collecting pdfminer.six==20200517
  Downloading pdfminer.six-20200517-py3-none-any.whl (5.6 MB)
     |████████████████████████████████| 5.6 MB 2.1 MB/s 
Collecting sortedcontainers
  Downloading sortedcontainers-2.3.0-py2.py3-none-any.whl (29 kB)
Collecting pycryptodome
  Downloading pycryptodome-3.9.9-cp38-cp38-manylinux1_x86_64.whl (13.7 MB)
     |████████████████████████████████| 13.7 MB 3.8 MB/s 
Requirement already satisfied: chardet; python_version > "3.0" in /usr/lib/python3/dist-packages (from pdfminer.six==20200517->pdfplumber) (3.0.4)
Building wheels for collected packages: pdfplumber
  Building wheel for pdfplumber (setup.py) ... done
  Created wheel for pdfplumber: filename=pdfplumber-0.5.24-py3-none-any.whl size=31123 sha256=e8edc98ee33fbe2caf6161ba8b9081a0dd798c8c747d8ceedff4f248cadf8e07
  Stored in directory: /home/alirvah/.cache/pip/wheels/2b/02/eb/8e0c88d08e0675b767895d4bcf54c0d4da1b37579b00409e0e
Successfully built pdfplumber
Installing collected packages: Wand, sortedcontainers, pycryptodome, pdfminer.six, pdfplumber
Successfully installed Wand-0.6.3 pdfminer.six-20200517 pdfplumber-0.5.24 pycryptodome-3.9.9 sortedcontainers-2.3.0

Output of the script:

INVOICE
Invoice No. I1083
Account # C1006
Date 08-14-2008
Due By 08-31-2008
Demo Company
Phone : 111-222-3333 Terms None
1234 Main Street E-Mail : 333-444-4444 PO No. PO1234
Ashland, KY 41102 Web : http://www.ksoftware.net Sales Rep SalesPerson1
Bill To Ship To
Test Customer Test Customer
1234 Main Street 1234 Main Street
Ashland, KY 41101 Ashland,  41101
CCooddee DDeessccrriippttiioonn QTY Price Line Total
SKU1222 Test Import Name - Description Goes Here 1 $10.00 $10.00
Labor - Example labor item. Quantity is number of hours spent,  1.5 $100.00 $150.00
price is hourly rate. Quantity accepts decimal values.
Notes
An invoice note can go here. Multi-line and even multi-page notes are supported.
PPaayymmeenntt  DDeettaaiillss
Subtotal $160.00
Shipping$10.00 Tax $0.78
UPS Ground Total $170.78
Payments (-) $0.00
Balance Due $170.78
An invoice footer can go here
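
To compare versions on the asker's side, one option is to ask pip from inside the notebook itself, so the answer comes from the same interpreter the Jupyter kernel runs. The sketch below is an assumption about how to do that, not something from the tutorial; the function names `pip_show_command` and `pip_show` are made up for illustration.

```python
import subprocess
import sys


def pip_show_command(package):
    """Build a 'pip show' command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "show", package]


def pip_show(package):
    """Run 'pip show' for the package and return (exit_code, combined_output)."""
    result = subprocess.run(
        pip_show_command(package),
        capture_output=True,
        text=True,
        check=False,  # a missing package is reported via the exit code
    )
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    code, output = pip_show("pdfplumber")
    print(output if code == 0 else "pdfplumber not found by this interpreter's pip")
```

If the version printed here differs from the 0.5.24 shown in the install log above, or the interpreter path is not the one expected, the package was probably installed into a different environment than the one the notebook uses.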

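It turned out to be a problem with how I was installing the pdfplumber package. For some reason I was installing an old version (0.1.2). Once I sorted that out and installed the correct version (0.5.24), the script ran fine and I was able to complete the tutorial. Thank you for your contributions and help.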
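
A quick way to spot this kind of mismatch is to check, from the running interpreter, which version is installed and whether the module actually exposes the attribute being called. The following is a minimal sketch assuming Python 3.8 or newer (for `importlib.metadata`); the function names are hypothetical, and the reference version 0.5.24 is simply the one that worked in this thread, not a documented minimum.

```python
import importlib
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.5.24' into (0, 5, 24), keeping at most three numeric parts."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def diagnose(dist_name="pdfplumber", attr="open", reference="0.5.24"):
    """Describe the installed version and whether the module has the attribute."""
    version = installed_version(dist_name)
    if version is None:
        return f"{dist_name} is not installed in this environment"
    try:
        module = importlib.import_module(dist_name)
    except ImportError as exc:
        return f"{dist_name} {version} is installed but failed to import: {exc}"
    has_attr = hasattr(module, attr)
    older = version_tuple(version) < version_tuple(reference)
    return (
        f"{dist_name} {version}: has '{attr}' = {has_attr}, "
        f"older than {reference} = {older}"
    )
```

Running `print(diagnose())` in a notebook cell reports both facts at once. If the version is old, `pip install --upgrade pdfplumber` run against the same environment is the usual remedy, although the thread does not say exactly how the upgrade was done here.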

Can you confirm which version of pdfplumber you have installed?

@Tomerikoo Please be very careful when editing questions. It is unlikely, but what if the missing blank line you added was actually causing the problem? I have seen some strange things with Python... So, as I said... be careful with your changes.

@GhostCat I am fully aware of that. That is why I did not touch the syntax error under the second with statement. I believe adding a blank line changes nothing and only makes the code more readable for others trying to help.

Looking at Anaconda, it says I have version 0.1.2 of pdfplumber installed.

That is definitely not the problem; it is probably a typo. The OP gets the error on a line after that, so if it were the problem, that is clearly the error they would be getting...

That is why I provided the output of pip3 install pdfplumber, as a reference for the library versions I used.

Hi @Patrik, thanks for your input. I must apologize: the indentation was a typo introduced when I copied the code from Jupyter to Stack Overflow. My sincere apologies. I did notice that you seem to have a different version of pdfplumber installed. I have 0.1.2 installed, while yours appears to be 0.5.24. Maybe that is the problem.