C#: Extracting images from a PDF in the correct order using iTextSharp


c# image pdf


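I need to extract one image per page from a PDF. I am using code taken from another Stack Overflow question to extract the images. Sometimes it works very well, but sometimes the images are not extracted in the order I want: the first extracted image should correspond to the first page of the PDF file, the second image to the second page, and so on. The code is as follows: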

 private static int WriteImageFile(string pdf, string path)
        {
            int nfotos = 0;
            try
            {
                // Get a List of Image
                List<System.Drawing.Image> ListImage = ExtractImages(pdf);
                nfotos = ListImage.Count;
                for (int i = 0; i < ListImage.Count; i++)
                {
                    try
                    {
                        // Save each image as Image<i>.bmp inside the target folder
                        ListImage[i].Save(System.IO.Path.Combine(path, "Image" + i + ".bmp"), System.Drawing.Imaging.ImageFormat.Bmp);
                    }
                    catch (Exception e)
                    { MessageBox.Show(e.Message); }
                }

            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.Message);
            }
            return nfotos;
        }


 private static List<System.Drawing.Image> ExtractImages(String PDFSourcePath)
        {
            List<System.Drawing.Image> ImgList = new List<System.Drawing.Image>();

            iTextSharp.text.pdf.RandomAccessFileOrArray RAFObj = null;
            iTextSharp.text.pdf.PdfReader PDFReaderObj = null;
            iTextSharp.text.pdf.PdfObject PDFObj = null;
            iTextSharp.text.pdf.PdfStream PDFStremObj = null;

            try
            {
                RAFObj = new iTextSharp.text.pdf.RandomAccessFileOrArray(PDFSourcePath);
                PDFReaderObj = new iTextSharp.text.pdf.PdfReader(RAFObj, null);
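                // Walk every entry of the cross-reference table. This visits objects in
                // object-number order, which is not necessarily the order of the pages.
                for (int i = 0; i <= PDFReaderObj.XrefSize - 1; i++)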
                {
                    PDFObj = PDFReaderObj.GetPdfObject(i);

                    if ((PDFObj != null) && PDFObj.IsStream())
                    {
                        PDFStremObj = (iTextSharp.text.pdf.PdfStream)PDFObj;
                        iTextSharp.text.pdf.PdfObject subtype = PDFStremObj.Get(iTextSharp.text.pdf.PdfName.SUBTYPE);

                        if ((subtype != null) && subtype.ToString() == iTextSharp.text.pdf.PdfName.IMAGE.ToString())
                        {
                            try
                            {

                                iTextSharp.text.pdf.parser.PdfImageObject PdfImageObj =
                         new iTextSharp.text.pdf.parser.PdfImageObject((iTextSharp.text.pdf.PRStream)PDFStremObj);

                                System.Drawing.Image ImgPDF = PdfImageObj.GetDrawingImage();


                                ImgList.Add(ImgPDF);
                            }
                            catch (Exception)
                            {
                                // Images that cannot be decoded are silently skipped
                            }
                        }
                    }
                }
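            }
            finally
            {
                // Close the reader even when extraction fails; any exception
                // propagates to the caller with its original stack trace
                if (PDFReaderObj != null)
                {
                    PDFReaderObj.Close();
                }
            }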
            return ImgList;
        }

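You're doing it wrong. You are using brute force to loop over the objects in the PDF. You aren't taking into account that one image can appear on multiple pages. You aren't examining the content stream of each page to find out the coordinates of each image. Please read the example I linked.

Thanks for the link!! I'm looking at it now, but there is one thing I don't understand. When he does MyImageRenderListener listener = new MyImageRenderListener(RESULT); RESULT is a string, while the function public void renderImage(ImageRenderInfo renderInfo) needs a renderInfo. Can you explain that to me?

The constructor of MyImageRenderListener takes the path and file name format (String.format()) used to save the images. The renderImage method then uses the provided string to write the image bytes to disk. The parser supplies the ImageRenderInfo each time it encounters an image while processing a page.

@Chris Haas is correct. Ideally you should read the book the examples were written for, but since you don't have the book, you need to experiment with the examples. You need to learn.

I just found this question, whose answers should also clear some things up.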
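The approach described in the comments (parsing each page's content stream with a render listener instead of looping over the cross-reference table) could look roughly like the sketch below in C#. It assumes iTextSharp 5.x and System.Drawing. The names PageImageListener, PdfImageExtractor and SaveImagesInPageOrder are hypothetical and chosen only for illustration; they are not part of iTextSharp or of the linked example. The file name format is assumed to contain two placeholders: {0} for the page number and {1} for the image index on that page.

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical listener: collects the images drawn on a single page,
// in the order in which the page's content stream draws them.
public class PageImageListener : IRenderListener
{
    private readonly List<Image> images = new List<Image>();

    public List<Image> Images
    {
        get { return images; }
    }

    // Text events are not needed for image extraction.
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderText(TextRenderInfo renderInfo) { }

    // Called by the parser each time the page draws an image.
    public void RenderImage(ImageRenderInfo renderInfo)
    {
        PdfImageObject pdfImage = renderInfo.GetImage();
        if (pdfImage != null)
        {
            // May throw for image encodings that cannot be converted
            images.Add(pdfImage.GetDrawingImage());
        }
    }
}

public static class PdfImageExtractor
{
    // Hypothetical helper: saves images page by page and returns how many were written.
    // fileNameFormat example (an assumption): @"C:\out\Page{0}_Image{1}.bmp"
    public static int SaveImagesInPageOrder(string pdfPath, string fileNameFormat)
    {
        int saved = 0;
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            PdfReaderContentParser parser = new PdfReaderContentParser(reader);

            // PDF page numbers are 1-based
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                PageImageListener listener = new PageImageListener();
                parser.ProcessContent(page, listener);

                for (int i = 0; i < listener.Images.Count; i++)
                {
                    using (Image image = listener.Images[i])
                    {
                        image.Save(string.Format(fileNameFormat, page, i), ImageFormat.Bmp);
                    }
                    saved++;
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return saved;
    }
}
```

Because the outer loop runs over page numbers, an image that appears on several pages is saved once for each page that draws it, and the page number in the file name ties every output file to its page. If only the first image of each page is wanted, the inner loop can be limited to index 0.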