C#: Extracting data from an image using iTextSharp

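Hello, I am trying to extract data from scanned documents or PDF files that contain images of some data. For ordinary PDF files I can read the data successfully, but for images, or for PDFs that contain pictures, it fails. I have tried converting the image file to a PDF first, but that gave no result. Here is my small piece of code: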

    // Requires: using System; using System.IO;
    //           using iTextSharp.text; using iTextSharp.text.pdf;
    public bool ConvertImageToPdf(string ImageIn, string PDFOut)
    {
        try
        {
            Image image = Image.GetInstance(File.ReadAllBytes(ImageIn));

            using (FileStream fs = new FileStream(PDFOut, FileMode.Create, FileAccess.Write, FileShare.None))
            using (Document doc = new Document(image)) // page size taken from the image
            using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
            {
                doc.Open();

                // Place the image once, at the lower-left corner of the page.
                // This only embeds the picture; the PDF gets no text layer.
                image.SetAbsolutePosition(0, 0);
                doc.Add(image);

                doc.Close();
                return true;
            }
        }
        catch (Exception)
        {
            // Any I/O or iTextSharp failure is reported as "false".
            return false;
        }
    }

To extract the data, I use the following code:

    for (int page = 1; page <= reader.NumberOfPages; page++)
    {
        // ExtractTextFromPDFBytes is a helper that is not shown in this post.
        outFile.Write(ExtractTextFromPDFBytes(reader.GetPageContent(page)) + " ");
    }
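
A quick way to see whether a page carries any extractable text at all, rather than only a picture, is to run iTextSharp's own `PdfTextExtractor` over it and check for an empty result. The following is a minimal sketch, assuming iTextSharp 5.x; the class name `PdfTextProbe` and its two methods are hypothetical names chosen for illustration and do not come from the post.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper, not part of iTextSharp.
public static class PdfTextProbe
{
    // True when the extracted string holds no visible characters.
    public static bool IsBlank(string text)
    {
        return string.IsNullOrWhiteSpace(text);
    }

    // Returns the 1-based numbers of the pages for which iTextSharp finds no text.
    public static List<int> PagesWithoutText(string pdfPath)
    {
        var result = new List<int>();
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string text = PdfTextExtractor.GetTextFromPage(reader, page);
                if (IsBlank(text))
                {
                    result.Add(page);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return result;
    }
}
```

Run against a PDF produced by `ConvertImageToPdf`, every page would be expected to show up in the returned list, because such a PDF contains only image data.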

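Putting an image into a PDF produces a PDF that contains an image. It does not automatically perform any OCR on the image to extract text.

What @devnull69 says is correct: you need OCR, and iText does not perform OCR.

You are right, @devnull69. I thought that once I added the image, it would be treated as a PDF file and the data could be extracted. Are there any exceptions? Which OCR can I use?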
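
As one possible illustration of the OCR step the comments call for, the sketch below reads the text straight from the image file, skipping the PDF conversion. It assumes the open-source Tesseract engine through the `Tesseract` NuGet wrapper, which is only one of several OCR options and is not named anywhere in the thread. The class `ImageOcr`, the method `ReadText`, and the `tessdataDir` folder (which would have to contain `eng.traineddata`) are hypothetical names used for illustration.

```csharp
using System;
using Tesseract; // NuGet package "Tesseract" (assumed to be installed)

// Hypothetical helper, not part of iTextSharp or of the original post.
public static class ImageOcr
{
    // imagePath:   the scanned image (for example a PNG or TIFF file).
    // tessdataDir: assumed folder holding the English language data file.
    public static string ReadText(string imagePath, string tessdataDir)
    {
        using (var engine = new TesseractEngine(tessdataDir, "eng", EngineMode.Default))
        using (Pix img = Pix.LoadFromFile(imagePath))
        using (Page page = engine.Process(img))
        {
            // Recognized text for the whole image as a single string.
            return page.GetText();
        }
    }
}
```

A call would look like `string text = ImageOcr.ReadText("scan.png", "./tessdata");`. Recognition quality depends heavily on scan resolution and contrast, so the output should be treated as approximate.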