Python: How to extract font size and name using pdfminer.pdffont as a library


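I want to extract the font and font size of every word using pdfminer. Below is the code I use to extract the PDF layout with pdfminer. What should I do to extract the PDFFont information? Please don't tell me to use the command line; I want to do this from code.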

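from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBoxHorizontal
from pdfminer.pdfdocument import PDFDocument, PDFTextExtractionNotAllowed
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfparser import PDFParser

# The original snippet does not show where these dictionaries are defined;
# module-level dicts are assumed here.
text_dict = {}       # box id -> text of the box
text_prop_dict = {}  # box id -> layout object, for the page being processed
page_dict = {}       # page number -> {box id: layout object}


def read_pdf_miner(fileObj):
    """
    This function takes the path of a PDF file, reads the file content and stores it in dictionaries for processing

    :param fileObj: Path of the PDF file to read
    :return: None
    """
    # Keep the file open while pdfminer reads from it, then close it
    with open(fileObj, 'rb') as file_pointer:
        parser = PDFParser(file_pointer)
        document = PDFDocument(parser)

        if not document.is_extractable:
            raise PDFTextExtractionNotAllowed

        rsrcmgr = PDFResourceManager()
        laparams = LAParams()
        # The aggregator device collects the layout objects of each page
        device = PDFPageAggregator(rsrcmgr, laparams=laparams)
        interpreter = PDFPageInterpreter(rsrcmgr, device)

        page_num = 1
        box_id = 0  # renamed from `id` to avoid shadowing the built-in
        for page in PDFPage.create_pages(document):
            interpreter.process_page(page)

            layout = device.get_result()
            for lt_obj in layout:
                if isinstance(lt_obj, LTTextBoxHorizontal):
                    text_dict[box_id] = lt_obj.get_text()
                    text_prop_dict[box_id] = lt_obj
                    box_id += 1
            page_dict[page_num] = text_prop_dict.copy()
            text_prop_dict.clear()
            page_num += 1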

Did you figure it out? Yes, I figured it out. Thanks for posting the answer!
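
One possible approach, which is not taken from the thread itself: rather than going through pdfminer.pdffont directly, the layout objects returned by `device.get_result()` can be walked down to individual characters. Each `LTTextBoxHorizontal` contains text lines, and each line contains `LTChar` objects that carry `fontname` and `size` attributes, plus `LTAnno` objects (inserted spaces and newlines) that do not. The sketch below groups those characters into words and reports the most frequent font per word. The helper names `words_with_fonts` and `fonts_in_layout` are made up for illustration. Note that `size` is derived from the glyph's bounding box, so it may differ slightly from the nominal point size set in the PDF.

```python
from collections import Counter


def words_with_fonts(chars):
    """Group layout characters into (word, fontname, size) tuples.

    `chars` is any iterable of objects with get_text(). Objects that also
    have `fontname` and `size` (as pdfminer's LTChar does) count as letters;
    whitespace and objects without font info (such as LTAnno) end a word.
    """
    words, letters, fonts = [], [], Counter()

    def flush():
        # Close the current word, using its most frequent (fontname, size)
        if letters:
            (fontname, size), _ = fonts.most_common(1)[0]
            words.append(("".join(letters), fontname, size))
        letters.clear()
        fonts.clear()

    for ch in chars:
        text = ch.get_text()
        if text.isspace() or not hasattr(ch, "fontname"):
            flush()
            continue
        letters.append(text)
        # Round so that tiny floating-point differences do not split a font
        fonts[(ch.fontname, round(ch.size, 1))] += 1
    flush()
    return words


def fonts_in_layout(layout):
    """Yield (word, fontname, size) for every word in a page layout.

    `layout` is assumed to be the object returned by
    PDFPageAggregator.get_result() for one page.
    """
    # Imported here so words_with_fonts() works without pdfminer installed
    from pdfminer.layout import LTTextBoxHorizontal, LTTextLine

    for box in layout:
        if not isinstance(box, LTTextBoxHorizontal):
            continue
        for line in box:
            if isinstance(line, LTTextLine):
                yield from words_with_fonts(line)
```

In the question's code, this could be called right after `layout = device.get_result()`, for example by looping over `fonts_in_layout(layout)` and storing each `(word, fontname, size)` tuple alongside the text of the box.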