Can I get data from all the images in a folder with Python Tesseract?


python image path


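I don't want to read just one image; I want to read all the images in a folder. If possible, I'd like to process them quickly one after another (say, a 1-second cooldown between images, 100 images in total).

[My other, probably bad, idea is to watch the folder live: when a photo arrives in the folder, the program reads it and types out the text. Live watching would be nice, but it isn't strictly required.]

Can anyone help me?

Thanks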

{https://towardsdatascience.com/how-to-extract-text-from-images-with-python-db9b87fe432b}

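Edit:

Extracting text from all the images in a folder

I found this code. It reads the images, creates a text file, and writes the extracted data into it.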

To scan a folder and collect all the files in it, you can use glob or os.walk.

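import glob, os

folder = "your/folder/path"

# To get all *.png files directly under the folder:
files = glob.glob(folder + "/*.png")
# files will be a list of all *.png files under the folder, excluding subfolders.

# Or use os.walk:
result = []
for root, _, filenames in os.walk(folder):
    # os.walk yields a list of file names per directory, so loop over it
    for filename in filenames:
        if filename.endswith('.png'):
            result.append(os.path.join(root, filename))
# result will be a list of all *.png files in the folder, including subfolders.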
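To connect the file listing to the original question (OCR on every image, with about one second between images), here is a minimal sketch. The names `ocr_folder` and `tesseract_ocr` are made up for illustration, and the OCR step is passed in as a function so the loop can be tried without Tesseract installed. It assumes pytesseract and Pillow are available when `tesseract_ocr` is actually called.

```python
import glob
import os
import time


def ocr_folder(folder, ocr, pattern="*.png", delay=1.0):
    """Run ocr(path) on every matching file and return {path: text}.

    `ocr` is any callable that takes a file path and returns a string.
    `delay` is the pause in seconds between two images.
    """
    results = {}
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    for index, path in enumerate(paths):
        results[path] = ocr(path)
        # Pause between images, but not after the last one.
        if delay and index < len(paths) - 1:
            time.sleep(delay)
    return results


def tesseract_ocr(path):
    """OCR one image with pytesseract (imported lazily, on first use)."""
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang="eng")


if __name__ == "__main__":
    # "your/folder/path" is a placeholder, as in the snippet above.
    for path, text in ocr_folder("your/folder/path", tesseract_ocr).items():
        print(path)
        print(text)
```

Passing the OCR function as an argument is only a design convenience; calling `pytesseract.image_to_string` directly inside the loop works just as well.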

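If you want to monitor the folder live and trigger some action whenever a new .png file is written to it, there are two options.

If you don't need an immediate response when a file is created, and the folder isn't very crowded, the simplest thing you can do is scan the same folder every few seconds, compare the new file list with the old one, and process the new files.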
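The scan-and-compare idea can be sketched with the standard library alone. The function names `new_png_files` and `poll`, the 2-second interval, and the `rounds` parameter are assumptions made for this example, not part of any library.

```python
import os
import time


def new_png_files(folder, seen):
    """Return the .png paths in `folder` that are not in `seen` yet.

    `seen` is a set that is updated in place, so the next call only
    reports files that appeared in the meantime.
    """
    current = {
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".png")
    }
    fresh = sorted(current - seen)
    seen.update(fresh)
    return fresh


def poll(folder, handle, interval=2.0, rounds=None):
    """Call handle(path) for each new file; rounds=None polls forever."""
    seen = set()
    done = 0
    while rounds is None or done < rounds:
        for path in new_png_files(folder, seen):
            handle(path)
        done += 1
        # Sleep only if another scan is going to follow.
        if rounds is None or done < rounds:
            time.sleep(interval)
```

For example, `poll("your/folder/path", print)` would print each new file path as it shows up. A file may be picked up while it is still being written, so a real handler might need to retry or wait briefly before reading it.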

If you need an event-listener style of response, where the action fires as soon as a file is created, take a look at the Python library watchdog, which is available on PyPI.

With watchdog you can create a file monitor like this:

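from watchdog.events import PatternMatchingEventHandler
from watchdog.observers import Observer

class PNG_Handler(PatternMatchingEventHandler):
    def __init__(self):
        # Only react to *.png files and ignore directory events.
        super().__init__(patterns=["*.png"], ignore_directories=True)

    def on_created(self, event):
        newfilepath = event.src_path
        # newfilepath is the path of the newly created .png file.
        # You can implement your handler method here.

    # The other methods work on the same principle.
    def on_deleted(self, event):
        pass

    def on_modified(self, event):
        pass

    def on_moved(self, event):
        pass

observer = Observer()
observer.schedule(PNG_Handler(), "path/to/folder", recursive=True)
# The observer runs in a background thread; start it and keep the
# main program alive (for example with a sleep loop) to receive events.
observer.start()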

Whenever a "*.png" file is created, the on_created function is called.

import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
print(pytesseract.image_to_string(r'D:\examplepdf2image.png'))

# storing the text in a single file 
from PIL import Image 
import pytesseract as pt 
import os  

def main(): 
    # path for the folder for getting the raw images 
    path ="C:\\Users\\USER\\Desktop\\Masaüstü\\Test\\Input"
  
    # link to the file in which output needs to be kept 
    fullTempPath ="C:\\Users\\USER\\Desktop\\Masaüstü\\Test\\Output\\outputFile.txt"
  
    # iterating the images inside the folder 
    for imageName in os.listdir(path): 
        inputPath = os.path.join(path, imageName) 
        img = Image.open(inputPath) 

        # applying ocr using pytesseract for python 
        text = pt.image_to_string(img, lang ="eng") 
  
        # saving the  text for appending it to the output.txt file 
        # a + parameter used for creating the file if not present 
        # and if present then append the text content 
        file1 = open(fullTempPath, "a+") 
  
        # providing the name of the image 
        file1.write(imageName+"\n") 
  
        # providing the content in the image 
        file1.write(text+"\n") 
        file1.close()  
  
    # for printing the output file 
    file2 = open(fullTempPath, 'r') 
    print(file2.read()) 
    file2.close()         

if __name__ == '__main__': 
    main()