Pdf iText: getting the content size

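I just spent a few hours browsing the web. Other people seem to have this problem too, but I could not find an answer.

I have a whole bunch of PDF files and I need their measurements, i.e. the height and width of the content on each page.

In Adobe Illustrator, when importing a PDF, you can choose to crop to the "bounding box". That is exactly what I need.

I tried a lot of things; here is the hodgepodge: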

Dim pdfStream = IO.File.OpenRead(FilePath)
Dim img = PdfImages(pdfStream)
Dim pdfReader = New PdfReader(pdfStream)
Dim pdfDictionary = pdfReader.GetPageN(1)
Dim mediaBox = pdfDictionary.GetAsArray(PdfName.MEDIABOX)
Dim b = pdfReader.GetPageSize(pdfDictionary)
Dim ms = New MemoryStream
Dim document = New Document(pdfReader.GetPageSizeWithRotation(1))
Dim writer = PdfWriter.GetInstance(document, ms)
document.Open()
document.SetPageSize(pdfReader.GetPageSize(1))
document.NewPage()
Dim cb = writer.DirectContent
cb.Clip()
Dim pageImport = writer.GetImportedPage(pdfReader, 1)
pdfReader.Close()
pdfStream.Close()
All I can get is the page size, which is useless. I tried this on a whole bunch of PDFs, so it is not a corrupted file or anything like that.

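To achieve your goal

  crop to the "bounding box". That is exactly what I need

you actually have to solve two problems:

  • You have to change the crop box of the individual pages of a PDF document.
  • You have to determine the bounding box of a page, i.e. (I assume) the smallest box (with horizontal and vertical edges) that contains all visible content of the page.

  Ad 1) Changing the crop box of the individual pages

    You should not use the code you found for that task. Manipulating a single document is almost always best done with a PdfStamper rather than a PdfWriter.

    An example illustrates how to do this. Its central method: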

    public byte[] ManipulatePdf(byte[] src)
    {
      PdfReader reader = new PdfReader(src);
      int n = reader.NumberOfPages;
      PdfDictionary pageDict;
      PdfRectangle rect = new PdfRectangle(55, 76, 560, 816);
      for (int i = 1; i <= n; i++)
      {
        pageDict = reader.GetPageN(i);
        pageDict.Put(PdfName.CROPBOX, rect);
      }
      using (MemoryStream ms = new MemoryStream())
      {
        using (PdfStamper stamper = new PdfStamper(reader, ms))
        {
        }
        return ms.ToArray();
      }
    }
    
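  Ad 2) Determining the bounding box of a page

    Here finder is a TextMarginFinder that was passed to the parser's ProcessContent method for page i. After ProcessContent has been executed, finder.GetLlx(), finder.GetLly(), finder.GetUrx(), and finder.GetUry() provide the coordinates of the lower-left and upper-right corners of the bounding box of page i (ignoring vector graphics). You can use these values to construct a rectangle to feed into pageDict.Put(PdfName.CROPBOX, rect) in the code above.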
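
    As a rough illustration of turning the four finder values into a crop rectangle and into physical measurements, here is a minimal plain-Java sketch. It does not call iText; the class name CropBoxSketch, the margin parameter, and the A4 page box are assumptions made for the example. It also assumes the default user space unit of 1/72 inch, which need not hold for every PDF.

```java
import java.util.Locale;

/** Plain-Java helper for illustration only; it does not call iText. */
final class CropBoxSketch {
    static final float POINTS_PER_INCH = 72f;   // default PDF user space unit
    static final float MM_PER_INCH = 25.4f;

    /**
     * Returns {llx, lly, urx, ury}: the content box grown by a margin
     * and clamped so that it never leaves the page box.
     */
    static float[] cropRect(float llx, float lly, float urx, float ury,
                            float margin, float[] pageBox) {
        return new float[] {
            Math.max(pageBox[0], llx - margin),
            Math.max(pageBox[1], lly - margin),
            Math.min(pageBox[2], urx + margin),
            Math.min(pageBox[3], ury + margin)
        };
    }

    /** Converts a length in points to millimetres. */
    static float pointsToMm(float points) {
        return points / POINTS_PER_INCH * MM_PER_INCH;
    }

    public static void main(String[] args) {
        float[] pageBox = {0f, 0f, 595f, 842f};           // assumed A4 page
        // Hypothetical values as they might come from the four finder getters.
        float[] crop = cropRect(55f, 76f, 560f, 816f, 10f, pageBox);
        float width = crop[2] - crop[0];
        float height = crop[3] - crop[1];
        System.out.printf(Locale.ROOT, "crop box: [%.1f %.1f %.1f %.1f]%n",
                crop[0], crop[1], crop[2], crop[3]);
        System.out.printf(Locale.ROOT, "size: %.1f x %.1f pt (%.1f x %.1f mm)%n",
                width, height, pointsToMm(width), pointsToMm(height));
    }
}
```

    Clamping to the page box keeps a generous margin from producing a crop box larger than the page itself.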

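    If vector graphics also have to be taken into account, however, you have to extend the classes in the parser namespace a bit so that they raise parsing events for vector graphics operators, and TextMarginFinder has to take those events into account as well.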
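
    The bookkeeping behind such an extended finder can be sketched independently of iText: every event, whether it comes from text or from a path, contributes points to one running minimum and maximum. The class below is a hypothetical plain-Java illustration, not part of the iText parser API. It assumes all coordinates have already been transformed into the page's coordinate system, and it ignores clipping and line widths.

```java
import java.util.Arrays;

/** Plain-Java sketch for illustration; it does not use iText classes. */
final class BoundingBoxAccumulator {
    private float llx = Float.POSITIVE_INFINITY;
    private float lly = Float.POSITIVE_INFINITY;
    private float urx = Float.NEGATIVE_INFINITY;
    private float ury = Float.NEGATIVE_INFINITY;

    /** Grows the box so that it contains the point (x, y). */
    void include(float x, float y) {
        llx = Math.min(llx, x);
        lly = Math.min(lly, y);
        urx = Math.max(urx, x);
        ury = Math.max(ury, y);
    }

    /** Grows the box so that it contains a rectangle given by two opposite corners. */
    void includeRect(float x1, float y1, float x2, float y2) {
        include(x1, y1);
        include(x2, y2);
    }

    /** True as long as no point has been included. */
    boolean isEmpty() {
        return llx > urx;
    }

    /** Returns {llx, lly, urx, ury}. */
    float[] toArray() {
        return new float[] {llx, lly, urx, ury};
    }

    public static void main(String[] args) {
        BoundingBoxAccumulator box = new BoundingBoxAccumulator();
        box.includeRect(72f, 700f, 300f, 712f); // e.g. the box of a line of text
        box.include(50f, 100f);                 // e.g. start point of a path
        box.include(540f, 400f);                // e.g. end point of a path
        System.out.println(Arrays.toString(box.toArray())); // [50.0, 100.0, 540.0, 712.0]
    }
}
```

    For curves, including the control points gives a conservative box, because a Bézier curve never leaves the convex hull of its control points.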

    Voodoo:
    mkl's code put into practice (just place some small white text at the top-left and bottom-right corners of the vector graphics). The C# listing (StartManipulation, ManipulatePdf, FindBoundingBox) follows after the Java example below.

    I have some code that may help you:

    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.HashMap;
    import java.util.Map;
    
    import org.apache.commons.io.FileUtils;
    
    import com.itextpdf.text.BaseColor;
    import com.itextpdf.text.DocumentException;
    import com.itextpdf.text.Element;
    import com.itextpdf.text.Font;
    import com.itextpdf.text.Font.FontFamily;
    import com.itextpdf.text.FontFactory;
    import com.itextpdf.text.Image;
    import com.itextpdf.text.Phrase;
    import com.itextpdf.text.Rectangle;
    import com.itextpdf.text.pdf.BarcodeQRCode;
    import com.itextpdf.text.pdf.ColumnText;
    import com.itextpdf.text.pdf.PdfArray;
    import com.itextpdf.text.pdf.PdfContentByte;
    import com.itextpdf.text.pdf.PdfDictionary;
    import com.itextpdf.text.pdf.PdfDocument;
    import com.itextpdf.text.pdf.PdfGState;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfNumber;
    import com.itextpdf.text.pdf.PdfReader;
    import com.itextpdf.text.pdf.PdfRectangle;
    import com.itextpdf.text.pdf.PdfStamper;
    import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
    import com.itextpdf.text.pdf.parser.TextMarginFinder;
    import com.itextpdf.text.pdf.qrcode.EncodeHintType;
    
    public static void sign(String src){
            try {
                String line1 = "Sign By: (VINICIUS)";
                String line2 = "Security Seal Number: 123545678";
    
    
                ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                byte[] array = Files.readAllBytes(new File(src).toPath());
                int size = 36;
                String docUrl = "https://website.com";
                Map<EncodeHintType, Object> hints = new HashMap<EncodeHintType, Object>();
                BarcodeQRCode qrCode = new BarcodeQRCode(docUrl, size, size, hints);
                PdfReader reader = new PdfReader(array);
                PdfStamper stamper = new PdfStamper(reader, outputStream);
                PdfGState gs1 = new PdfGState();
                gs1.setFillOpacity(0.5f);
                int pageCount = reader.getNumberOfPages();
    
                Float y1 = 30f;
                Float y2 = 20f;
                Float y3 = 10f;
                PdfArray cropbox;
                PdfDictionary pageDict = null;
                float resultX = 30 + size;
                float imgX = 15f;
                for (int i = 1; i <= pageCount; i++) {
                    PdfContentByte contentByte = stamper.getOverContent(i);
                    Rectangle pgSize = reader.getPageSizeWithRotation(i);
                    if(pgSize.getHeight() > 842){
                        y1 = (float) (pgSize.getHeight() - 812);
                        y2 = (float) (pgSize.getHeight() - 822);
                        y3 = (float) (pgSize.getHeight() - 832);
                    }
                    pageDict = reader.getPageN(i);
                    cropbox = pageDict.getAsArray(PdfName.CROPBOX);
                    if(cropbox != null){
                        float wDoc     = pgSize.getWidth();
                        float hDoc     = pgSize.getHeight();
                        PdfNumber wCropboxNumber = cropbox.getAsNumber(2);
                        PdfNumber hCropboxNumber = cropbox.getAsNumber(3);
                        float wCropbox = wCropboxNumber.floatValue();
                        float hCropbox = hCropboxNumber.floatValue();
                        resultX = (wDoc - wCropbox)+30+size;
                        y1   = (hDoc - hCropbox) + 30;
                        y2   = (hDoc - hCropbox) + 20;
                        y3   = (hDoc - hCropbox) + 10;
                        imgX = (wDoc - wCropbox) + 15; 
                    }
    
    
                    contentByte.beginText();
                    contentByte.setFontAndSize(FontFactory.getFont(FontFactory.HELVETICA).getBaseFont(), 7);
                    contentByte.setColorFill(BaseColor.DARK_GRAY);
                    contentByte.showTextAligned(Element.ALIGN_LEFT, line1, resultX, y1 , 0); // 30
                    contentByte.showTextAligned(Element.ALIGN_LEFT, line2, resultX, y2 , 0); // 20
    
                    //contentByte.showTextAligned(Element.ALIGN_LEFT, line1, resultX, y1 , 0); // 30 
    
    
                    contentByte.endText();
    
                    Image image = qrCode.getImage();
                    image.setScaleToFitHeight(true);
                    image.setAbsolutePosition(imgX , y3); // 10
                    image.setBorder(Image.NO_BORDER);
                    image.setSpacingAfter(0);
                    image.setSpacingBefore(0);
                    contentByte.addImage(image);
                }
    
                stamper.close();
    
                File assinado = new File("sign.pdf");
                if(assinado.exists()){
                    assinado.delete();
                }
    
                FileUtils.writeByteArrayToFile(new File("sign.pdf"), outputStream.toByteArray());
    
            } catch (Exception e) {
                e.printStackTrace();
            }
    
        }
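
    One detail about the crop box read in the loop above: a PDF box array holds the coordinates of two opposite corners, normally [llx lly urx ury], so entries 2 and 3 are the upper-right corner rather than a width and a height. The following plain-Java sketch (the class name PdfBoxArraySketch is made up for this example, and it does not use iText) shows how width and height could be derived from such an array.

```java
/** Plain-Java sketch; the four floats stand in for the entries of a PDF box array. */
final class PdfBoxArraySketch {

    /** Returns {width, height} for a box given as two opposite corners {x1, y1, x2, y2}. */
    static float[] size(float[] box) {
        if (box == null || box.length != 4) {
            throw new IllegalArgumentException("a box needs exactly four numbers");
        }
        // Math.abs makes the result independent of the order of the two corners.
        float width = Math.abs(box[2] - box[0]);
        float height = Math.abs(box[3] - box[1]);
        return new float[] {width, height};
    }

    public static void main(String[] args) {
        float[] cropBox = {55f, 76f, 560f, 816f};  // example values
        float[] size = size(cropBox);
        System.out.println(size[0] + " x " + size[1]); // 505.0 x 740.0
    }
}
```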
    
    public static void StartManipulation()
    {
        byte[] ba = System.IO.File.ReadAllBytes(@"D:\username\Documents\Downloads\itextsharp-master\itextsharp-master\src\CropTest\Files\dwg305.pdf");
        // FindBoundingBox(ba);
        ba = ManipulatePdf(ba);
        System.IO.File.WriteAllBytes(@"D:\username\Downloads\mysizedpdf.pdf", ba);
    } // End Sub StartManipulation
    
    
    
    public static byte[] ManipulatePdf(byte[] src)
    {
        byte[] byteBuffer = null;
    
        using (iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(src))
        {
            iTextSharp.text.pdf.parser.PdfReaderContentParser parser = new iTextSharp.text.pdf.parser.PdfReaderContentParser(reader);
            int n = reader.NumberOfPages;
            iTextSharp.text.pdf.PdfDictionary pageDict;
    
            for (int pageNumber = 1; pageNumber <= n; pageNumber++)
            {
                pageDict = reader.GetPageN(pageNumber);
    
                iTextSharp.text.pdf.parser.TextMarginFinder finder = parser.ProcessContent(pageNumber, new iTextSharp.text.pdf.parser.TextMarginFinder());
    
                // iTextSharp.text.Rectangle pageSize = reader.GetPageSize(pageNumber);
    
                // Get Content Size
                float Llx = finder.GetLlx();
                float Lly = finder.GetLly();
                float Urx = finder.GetUrx();
                float Ury = finder.GetUry();
                //iTextSharp.text.pdf.PdfRectangle rect = new iTextSharp.text.pdf.PdfRectangle(55, 76, 560, 816);
                //iTextSharp.text.pdf.PdfRectangle rectTextContentSize = new iTextSharp.text.pdf.PdfRectangle(Llx, Lly, Urx, Ury);
    
                int SafetyMargin = 100;
                iTextSharp.text.pdf.PdfRectangle rectTextContentSize = new iTextSharp.text.pdf.PdfRectangle(Llx - SafetyMargin, Lly - SafetyMargin, Urx + SafetyMargin, Ury + SafetyMargin);
    
                pageDict.Put(iTextSharp.text.pdf.PdfName.CROPBOX, rectTextContentSize);
            } // Next i 
    
            using (System.IO.MemoryStream ms = new System.IO.MemoryStream())
            {
                using (iTextSharp.text.pdf.PdfStamper stamper = new iTextSharp.text.pdf.PdfStamper(reader, ms))
                { }
    
                byteBuffer = ms.ToArray();
            } // End Using ms
    
        } // End Using reader 
    
        return byteBuffer;
    } // End Function ManipulatePdf 
    
    
    public static System.Drawing.Size FindBoundingBox(byte[] src)
    {
        System.Drawing.Size sze = default(System.Drawing.Size);
        // iTextSharp.text.pdf
        // iTextSharp.text.pdf.parser
    
        using (iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(src))
        {
            iTextSharp.text.pdf.parser.PdfReaderContentParser parser = new iTextSharp.text.pdf.parser.PdfReaderContentParser(reader);
    
            for (int pageNumber = 1; pageNumber <= reader.NumberOfPages; pageNumber++)
            {
                iTextSharp.text.pdf.parser.TextMarginFinder finder = parser.ProcessContent(pageNumber, new iTextSharp.text.pdf.parser.TextMarginFinder());
    
                iTextSharp.text.Rectangle pageSize = reader.GetPageSize(pageNumber);
                float Llx = finder.GetLlx();
                float Lly = finder.GetLly();
                float Urx = finder.GetUrx();
                float Ury = finder.GetUry();
    
                float PdfSharpLly = pageSize.Height - Lly;
                float PdfSharpUry = pageSize.Height - Ury;
    
    
                sze = new System.Drawing.Size((int)(Urx - Llx), (int)(Ury - Lly));
    
    
                System.Console.WriteLine("Width: {0}\r\nHeight: {1}", pageSize.Width, pageSize.Height);
                System.Console.WriteLine("Llx: {0}\r\nLly: {1}\r\nUrx: {2}\r\nUry: {3}\r\n", Llx, Lly, Urx, Ury);
            } // Next pageNumber 
    
        } // End Using reader 
    
        return sze;
    } // End Function FindBoundingBox 
    