Python: Using Wand to reduce image file size to improve OCR performance?


python jupyter-notebook imagemagick


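I am trying to write code that uses Wand, a simple MagickWand API binding for Python, to extract pages from a PDF, stitch them together into one longer ("taller") image, and pass that image to Google Cloud Vision for OCR text detection. I keep running into Google Cloud Vision's 10 MB file size limit.

I think a good way to bring the file size down might be to eliminate all color channels and give Google a black-and-white image only. I know how to get grayscale, but how can I turn my color image into a black-and-white ("bilevel") one? I am also open to other suggestions for reducing the file size. Thanks in advance.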

from wand.image import Image

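selected_pages = [0, 1]

# pdf_filepath is assumed to be defined earlier (path to the source PDF).
# Build an ImageMagick page selector such as "[0,1]", without spaces.
page_spec = '[' + ','.join(str(p) for p in selected_pages) + ']'
imageFromPdf = Image(filename=pdf_filepath + page_spec, resolution=600)
pages = len(imageFromPdf.sequence)

# Blank canvas tall enough to hold all pages stacked vertically.
image = Image(
    width=imageFromPdf.width,
    height=imageFromPdf.height * pages
)
for i in range(pages):
    image.composite(
        imageFromPdf.sequence[i],
        top=imageFromPdf.height * i,
        left=0
    )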

image.colorspace = 'gray' 
image.alpha_channel = False
image.format = 'png'

image

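Here are several ways to get bilevel output from Python Wand (0.5.7). The last one requires IM 7 to work. One thing to note from my testing is that in IM 7 the first two results are swapped with respect to dithering versus no dithering. I have already reported this to the Python Wand developer.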

Input:

First output, using IM 6:

Second output, using IM 7:

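You can call one of the threshold methods to make the image bilevel. A resolution of 600 would produce a large raster of over 30 megapixels (assuming US Letter size). You may want to reduce the resolution to 120 and then call Image.transform_colorspace('gray').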
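To put numbers on the resolution advice, here is a rough back-of-the-envelope sketch. It assumes US Letter pages (8.5 x 11 in) and reports uncompressed sizes only; the helper name `raster_stats` is made up for illustration, and a real PNG will usually be smaller because of compression.

```python
# Rough, uncompressed size estimate for pages stacked vertically.
# Assumption: US Letter pages (8.5 x 11 inches); PNG compression is ignored.

def raster_stats(dpi, pages=1, bits_per_pixel=24, width_in=8.5, height_in=11.0):
    """Return (width_px, height_px, megapixels, raw_megabytes)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi) * pages
    pixels = width_px * height_px
    raw_bytes = pixels * bits_per_pixel / 8
    return width_px, height_px, pixels / 1e6, raw_bytes / 1e6


if __name__ == "__main__":
    for dpi in (600, 120):
        for label, bpp in (("RGB", 24), ("gray", 8), ("bilevel", 1)):
            w, h, mp, mb = raster_stats(dpi, pages=2, bits_per_pixel=bpp)
            print(f"{dpi:>3} dpi {label:<7} {w}x{h} px, {mp:6.2f} MP, ~{mb:7.2f} MB raw")
```

Going from 600 to 120 dpi cuts the pixel count by a factor of 25, which is a much bigger saving than dropping color channels alone. Whether 120 dpi is still enough for good OCR depends on the font size in the PDF, so it is worth trying a few values.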

This is very helpful! Thanks, fmw42!

from wand.image import Image
from wand.display import display

# Using Wand 0.5.7, all images are not dithered in IM 6 and all images are dithered in IM 7
with Image(filename='lena.jpg') as img:
    # 1. Quantize to two gray levels, dither=False
    with img.clone() as img_copy1:
        img_copy1.quantize(number_colors=2, colorspace_type='gray', treedepth=0, dither=False, measure_error=False)
        img_copy1.auto_level()
        img_copy1.save(filename='lena_monochrome_no_dither.jpg')
        display(img_copy1)
    # 2. Quantize to two gray levels, dither=True
    with img.clone() as img_copy2:
        img_copy2.quantize(number_colors=2, colorspace_type='gray', treedepth=0, dither=True, measure_error=False)
        img_copy2.auto_level()
        img_copy2.save(filename='lena_monochrome_dither.jpg')
        display(img_copy2)
    # 3. Fixed threshold at 50% of the quantum range
    with img.clone() as img_copy3:
        img_copy3.threshold(threshold=0.5)
        img_copy3.save(filename='lena_threshold.jpg')
        display(img_copy3)
    # 4. Automatic (Otsu) threshold; only works in IM 7
    with img.clone() as img_copy4:
        img_copy4.auto_threshold(method='otsu')
        img_copy4.save(filename='lena_threshold_otsu.jpg')
        display(img_copy4)
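For anyone curious what the Otsu option is doing conceptually, here is a minimal pure-Python sketch of Otsu's method on a flat list of 8-bit gray values. This is only an illustration of the idea, not ImageMagick's actual implementation, and the function names `otsu_threshold` and `to_bilevel` are made up for the example. Note that Wand's `threshold()` takes a normalized value between 0 and 1, while this sketch works on raw 0..255 levels.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray values at or below the candidate threshold
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # no pixels in the dark class yet
        if w1 == 0:
            break           # every pixel is in the dark class; nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def to_bilevel(pixels, t):
    """Map every pixel to 0 (at or below t) or 255 (above t)."""
    return [255 if p > t else 0 for p in pixels]


if __name__ == "__main__":
    # Tiny synthetic "image": dark text-like values and bright paper-like values.
    sample = [10, 12, 11, 200, 210, 205, 13, 198]
    t = otsu_threshold(sample)
    print("threshold:", t)
    print("bilevel:  ", to_bilevel(sample, t))
```

The practical difference from a fixed 0.5 threshold is that the cut point adapts to the histogram, which tends to help with scans that have a gray or unevenly lit background.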