TypeError: __init__() takes 1 positional argument but 2 were given (Python multiprocessing with Pytesseract)


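I get the error message below when I try to use Python's multiprocessing library together with pytesseract and pdf2image. I'm not sure what it means or how to correct it. I have seen similar messages in other posts where self was being passed as an argument to a class method, but I am not creating a class in this case.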

C:\Users\erik7>python "C:\Users\erik7\Documents\Python Projects\multiprocess_test2.py"
0
Exception in thread Thread-11:
Traceback (most recent call last):
  File "C:\Users\erik7\AppData\Local\Programs\Python\Python38-32\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\erik7\AppData\Local\Programs\Python\Python38-32\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\erik7\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 576, in _handle_results
    task = get()
  File "C:\Users\erik7\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() takes 1 positional argument but 2 were given
1
2
3
4
5
6
7
8
9
My code:

import pytesseract
import pdf2image
import multiprocessing


def extract(img, page_num):
    
    print(page_num)
    
    return pytesseract.image_to_osd(img, output_type = pytesseract.Output.DICT)['orientation']


if __name__ == "__main__":

    pdf_path = r"C:/Users/erik7/Documents/Late Scans for Testing/scans_template2.pdf"
    output_fmt = 'jpeg'
    img_dpi = 300
    pop_path = r"C:\Users\erik7\Downloads\poppler-0.90.1\bin"
    output_path = r"C:\Users\erik7\Downloads"
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    converted_path = r"C:\Users\erik7\Downloads\converted_images"
    converted = pdf2image.convert_from_path(pdf_path = pdf_path, fmt = output_fmt, dpi = img_dpi, poppler_path = pop_path, output_folder = converted_path, grayscale = True, thread_count = 2)

    results = [] 
    
    iterable = [[img, page_num] for page_num, img in enumerate(converted)]
    p = multiprocessing.Pool()
    r = p.starmap(extract, iterable)
    results.append(r)
    p.close()
    
    print("\n**PROCESS COMPLETED SUCCESSFULLY")

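Got it working. I needed to move pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe" into my extract function. On Windows each worker process re-imports the script, and code under if __name__ == "__main__" does not run there, so the setting made in the main block presumably never reached the workers. With the assignment inside extract, the program runs successfully with multiprocessing: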

import pytesseract
import pdf2image
import multiprocessing


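def extract(img, page_num):
    # Set the Tesseract path inside the worker: on Windows each pool process
    # re-imports this module, so a value assigned under __main__ is not seen here.
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

    print(page_num)

    return pytesseract.image_to_osd(img, output_type = pytesseract.Output.DICT)['orientation']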


if __name__ == "__main__":

    pdf_path = r"C:/Users/erik7/Documents/Late Scans for Testing/scans_template2.pdf"
    output_fmt = 'jpeg'
    img_dpi = 300
    pop_path = r"C:\Users\erik7\Downloads\poppler-0.90.1\bin"
    output_path = r"C:\Users\erik7\Downloads"
    
    converted_path = r"C:\Users\erik7\Downloads\converted_images"
    converted = pdf2image.convert_from_path(pdf_path = pdf_path, fmt = output_fmt, dpi = img_dpi, poppler_path = pop_path, output_folder = converted_path, grayscale = True, thread_count = 2)

    results = [] 
    
    iterable = [[img, page_num] for page_num, img in enumerate(converted)]
    p = multiprocessing.Pool()
    r = p.starmap(extract, iterable)
    results.append(r)
    p.close()
    
    print("\n**PROCESS COMPLETED SUCCESSFULLY")
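As for why the original failure showed up as a TypeError about __init__() rather than a "Tesseract not found" message, a likely explanation is that the worker raised an exception whose __init__ takes no arguments, and the pool could not rebuild it in the parent process. That pytesseract's exception is defined this way is an assumption here, not something confirmed in the post. The sketch below uses a hypothetical ToolNotFoundError as a stand-in and imitates what unpickling does, namely calling the class again with the stored args:

```python
class ToolNotFoundError(Exception):
    """Hypothetical stand-in for an exception whose __init__ takes no arguments."""

    def __init__(self):
        # The message is built internally, so self.args becomes (message,)
        super().__init__("tool is not installed or not on PATH")


def rebuild_like_pickle(exc):
    """Recreate an exception the way unpickling does: cls(*args)."""
    cls, args = exc.__reduce__()[:2]
    return cls(*args)


def describe_round_trip(exc):
    """Return "ok" if the exception can be rebuilt, else the TypeError text."""
    try:
        rebuild_like_pickle(exc)
    except TypeError as err:
        return f"TypeError: {err}"
    return "ok"


if __name__ == "__main__":
    print(describe_round_trip(ValueError("bad value")))
    print(describe_round_trip(ToolNotFoundError()))
```

An ordinary exception such as ValueError rebuilds cleanly, while the stand-in fails with the same "takes 1 positional argument but 2 were given" wording seen in the traceback. If that is what happened here, the TypeError was hiding the real error raised in the worker, which fits with the fix of setting tesseract_cmd inside extract.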