Python NameError: name 'pytesseract' is not defined
I just joined and was hoping someone could explain my problem. I'm new to machine-learning code and to optical character recognition in Python, so I have a few questions. I'm trying to use pytesseract to read PDFs that contain isometric drawings; I only started this project this week. For things like handwriting, measurements, and so on, I'm following this tutorial to get started with pytesseract. My IDE is Google Colab. Here is my code and the error I get. Please excuse any bad practices; it has been a while since I last coded.
# Install Packages
!pip install pytesseract
!pip install pdfplumber
!pip install Pillow

# Imports
#import pytesseract
import pdfplumber
from google.colab import drive

# Mount Drive
drive.mount('/content/gdrive', force_remount=True)

# Functions
def pdf_reader():
    with pdfplumber.open(r'/content/gdrive/My Drive/Colab Notebooks/iso1_1.pdf') as pdf:
        iso1 = pdf.pages[0]
        print(iso1.extract_text())
    with pdfplumber.open(r'/content/gdrive/My Drive/Colab Notebooks/iso2_2.pdf') as pdf:
        iso2 = pdf.pages[0]
        print(iso2.extract_text())
    with pdfplumber.open(r'/content/gdrive/My Drive/Colab Notebooks/iso3_3.pdf') as pdf:
        iso3 = pdf.pages[0]
        print(iso3.extract_text())
    with pdfplumber.open(r'/content/gdrive/My Drive/Colab Notebooks/iso4_4.pdf') as pdf:
        iso4 = pdf.pages[0]
        print(iso4.extract_text())
    with pdfplumber.open(r'/content/gdrive/My Drive/Colab Notebooks/iso5_5.pdf') as pdf:
        iso5 = pdf.pages[0]
        print(iso5.extract_text())

# OCR Function
try:
    from PIL import Image
except ImportError:
    import pytesseract

def ocr_process(filename1):
    text1 = pytesseract.image_to_string(Image.open(filename1))
    return text1

print(ocr_process('/content/gdrive/My Drive/Colab Notebooks/iso1_1.pdf'))

pdf_reader()
The error I get is:
Requirement already satisfied: pytesseract in /usr/local/lib/python3.6/dist-packages (0.3.6)
Requirement already satisfied: Pillow in /usr/local/lib/python3.6/dist-packages (from pytesseract) (7.0.0)
Requirement already satisfied: pdfplumber in /usr/local/lib/python3.6/dist-packages (0.5.23)
Requirement already satisfied: Pillow>=7.0.0 in /usr/local/lib/python3.6/dist-packages (from pdfplumber) (7.0.0)
Requirement already satisfied: Wand in /usr/local/lib/python3.6/dist-packages (from pdfplumber) (0.6.3)
Requirement already satisfied: pdfminer.six==20200517 in /usr/local/lib/python3.6/dist-packages (from pdfplumber) (20200517)
Requirement already satisfied: pycryptodome in /usr/local/lib/python3.6/dist-packages (from pdfminer.six==20200517->pdfplumber) (3.9.8)
Requirement already satisfied: chardet; python_version > "3.0" in /usr/local/lib/python3.6/dist-packages (from pdfminer.six==20200517->pdfplumber) (3.0.4)
Requirement already satisfied: sortedcontainers in /usr/local/lib/python3.6/dist-packages (from pdfminer.six==20200517->pdfplumber) (2.2.2)
Requirement already satisfied: Pillow in /usr/local/lib/python3.6/dist-packages (7.0.0)
Mounted at /content/gdrive
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-10-8559cf424168> in <module>()
48
49
---> 50 print(ocr_process('/content/gdrive/My Drive/Colab Notebooks/iso1_1.pdf'))
51
52 pdf_reader()
<ipython-input-10-8559cf424168> in ocr_process(filename1)
44
45 def ocr_process(filename1):
---> 46 text1 = pytesseract.image_to_string(Image.open(filename1))
47 return text1
48
NameError: name 'pytesseract' is not defined
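Apart from this thread here, I searched but didn't find much. I tried changing the code the way that thread suggests, but got the same error, except that this time it was image_to_string that was not defined.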
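Thanks in advance.

ocr_process tries to use pytesseract whether or not it was actually imported by the try/except above it. You should only use a library once you know it has been imported. As written, pytesseract is imported only when the PIL import fails; since the pytesseract import evidently never ran, it looks like you do have PIL.

Also, because you use PIL's Image as well, your example has to import both libraries.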
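
To make the point above concrete, here is a minimal sketch of the same import pattern using standard-library stand-ins, so it runs without PIL or pytesseract installed. The choice of json (playing the role of PIL) and decimal (playing the role of pytesseract) is purely illustrative, as are the names question_pattern_failed and fixed_value.

```python
# Pattern from the question: the second import sits in the except branch.
try:
    import json      # stand-in for PIL: it is installed, so this succeeds
except ImportError:
    import decimal   # stand-in for pytesseract: skipped, nothing failed above

# Using the second module now fails, just like pytesseract in the question.
try:
    decimal.Decimal("1.5")
    question_pattern_failed = False
except NameError:
    question_pattern_failed = True

print("question pattern raised NameError:", question_pattern_failed)

# Corrected pattern: import both modules unconditionally.
import json
import decimal

fixed_value = decimal.Decimal("1.5") + 1
print("corrected pattern works:", fixed_value)
```

In the notebook itself, the equivalent change would be to place `from PIL import Image` and `import pytesseract` as two separate, unconditional lines in the imports cell, and to drop the try/except.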