Extracting images from a presentation file with Python (python, python-2.7, python-pptx)


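I am working with the python-pptx package. For my code, I need to extract all the images from a presentation file. Can anyone help me with this?

Thanks in advance for your help.

My code looks like this: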

import pptx

prs = pptx.Presentation(filename)

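When I check shape_type, it reports PICTURE (13) for the pictures in the presentation. But I want to extract those pictures into the folder where the code lives.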

Please use this as a reference:

ppt = PPTExtractor("some/PowerPointFile")
# found images
len(ppt)
# image list
images = ppt.namelist()
# extract image
ppt.extract(images[0])

# save image with different name
ppt.extract(images[0], "nuevo-nombre.png")
# extract all images
ppt.extractall()

To save the images in a different directory:

ppt.extract("image.png", path="/another/directory")
ppt.extractall(path="/another/directory")
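The PPTExtractor class used above is not shown in this thread, so its internals are unknown. As a rough standard-library sketch of a similar effect, assuming the file is a .pptx (which is a zip archive that stores embedded media under ppt/media/) and not a legacy binary .ppt, something like the following could work. The function name extract_media is hypothetical and chosen only for illustration.

```python
import os
import zipfile


def extract_media(pptx_path, out_dir="."):
    """Copy every file stored under ppt/media/ in the archive into out_dir.

    Hypothetical helper: returns the list of paths that were written.
    """
    written = []
    with zipfile.ZipFile(pptx_path) as archive:
        for name in archive.namelist():
            # Embedded media lives under ppt/media/; skip directory entries.
            if name.startswith("ppt/media/") and not name.endswith("/"):
                target = os.path.join(out_dir, os.path.basename(name))
                with open(target, "wb") as f:
                    f.write(archive.read(name))
                written.append(target)
    return written
```

Note that this copies everything in ppt/media/, which may include audio or video files as well as images, and it says nothing about which slide uses which file.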

A Picture (shape) object in python-pptx provides access to the image it displays:

from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

def iter_picture_shapes(prs):
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                yield shape

for picture in iter_picture_shapes(Presentation(filename)):
    image = picture.image
    # ---get image "file" contents---
    image_bytes = image.blob
    # ---make up a name for the file, e.g. 'image.jpg'---
    image_filename = 'image.%s' % image.ext
    with open(image_filename, 'wb') as f:
        f.write(image_bytes)

Generating unique file names is left to you as an exercise. Everything else you need is here.

More details on the Image object are available in the python-pptx documentation.
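One possible way to approach the unique-file-name exercise is to derive the name from the image content, so that a picture reused on several slides is written only once. This is a minimal sketch, not part of the answer above: the helper name save_image_once and the "image-<hash>" naming scheme are assumptions made for illustration.

```python
import hashlib
import os


def save_image_once(image_bytes, ext, out_dir="."):
    """Write image_bytes to out_dir under a content-derived name.

    Hypothetical helper: returns the path, whether it was just written
    or already existed from an earlier call with identical bytes.
    """
    # The first 12 hex digits of the SHA-1 digest keep names short;
    # this is an arbitrary choice, use the full digest if preferred.
    digest = hashlib.sha1(image_bytes).hexdigest()[:12]
    path = os.path.join(out_dir, "image-%s.%s" % (digest, ext))
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(image_bytes)
    return path
```

Inside the loop shown earlier, this would be called as save_image_once(image.blob, image.ext) in place of the open/write lines.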

scanny's solution did not work for me because I have picture elements inside group elements. This worked for me:

from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

n=0
def write_image(shape):
    global n
    image = shape.image
    # ---get image "file" contents---
    image_bytes = image.blob
    # ---make up a name for the file, e.g. 'image.jpg'---
    image_filename = 'image{:03d}.{}'.format(n, image.ext)
    n += 1
    print(image_filename)
    with open(image_filename, 'wb') as f:
        f.write(image_bytes)

def visitor(shape):
    if shape.shape_type == MSO_SHAPE_TYPE.GROUP:
        for s in shape.shapes:
            visitor(s)
    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
        write_image(shape)

def iter_picture_shapes(prs):
    for slide in prs.slides:
        for shape in slide.shapes:
            visitor(shape)

iter_picture_shapes(Presentation(filename))
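The same recursive walk can also be written as a generator, which avoids the global counter and leaves the naming to the caller. This is a sketch under stated assumptions: the function name iter_pictures is made up for illustration, and the defaults 6 and 13 are assumed to be the integer values behind MSO_SHAPE_TYPE.GROUP and MSO_SHAPE_TYPE.PICTURE (the question itself reports 13 for pictures). Passing the enum members explicitly is the safer choice.

```python
def iter_pictures(shapes, group_type=6, picture_type=13):
    """Yield picture shapes from shapes, descending into nested groups.

    group_type / picture_type: assumed numeric shape-type codes; pass
    MSO_SHAPE_TYPE.GROUP and MSO_SHAPE_TYPE.PICTURE when using python-pptx.
    """
    for shape in shapes:
        if shape.shape_type == group_type:
            # A group has its own .shapes collection; recurse into it.
            for inner in iter_pictures(shape.shapes, group_type, picture_type):
                yield inner
        elif shape.shape_type == picture_type:
            yield shape
```

With python-pptx, one would loop over prs.slides, call iter_pictures(slide.shapes, MSO_SHAPE_TYPE.GROUP, MSO_SHAPE_TYPE.PICTURE) for each slide, and number the results with enumerate to build file names.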

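Please add your code to the question above.

I really like your library; it is very useful. Could you also write documentation for video objects? I dug through the docs and your GitHub repo trying to work out how to get at video/audio files and extract them, but had no luck, so I would be very glad if you were willing to help.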
