C#: Tessnet2 with the Tesseract engine - why is the output so poor?


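I am trying to use Tessnet2, which wraps the Tesseract engine, from C#. For many of the test images I feed to Tessnet2 the output is very bad; almost nothing is recognized correctly.

Here is my code, from the Program.cs class of a C# console project: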

    static void Main(string[] args)
    {
        try
        {
            Bitmap image = new Bitmap(@"C:\Users\hp\Desktop\eurotext.tif");
            var ocr = new Tesseract();

            // when I tried to add the SetVariable(...), it didn't change the output much

            ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", true);

            var result = ocr.DoOCR(image, Rectangle.Empty);
            foreach (Word word in result)
                Console.WriteLine("{0} : {1}", word.Confidence, word.Text);

            Console.ReadLine();
        }
        catch (Exception exception)
        {
            Console.WriteLine("Error");
        }
    }

For example, here is a sample test image (large, binary, 300 dpi), "eurotext.tif":

And this is the Tessnet2 output for that image:

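I have been following this website to learn the steps for using Tessnet2:

I used this website to try to apply the SetVariable(...) function correctly so that it does what I want, but without luck; the output barely changed:

I also found the Tesseract guide on reducing engine errors: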

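It says "Tesseract works best on text with a DPI of at least 300 dpi". This sample image is 300 dpi.
The sample image is also binary, which should give better output, as many people on various sites suggest.

I have looked everywhere for a way to improve the accuracy and found many posts from people with similar problems, but no working solution.

What could be causing this problem, and how can I fix it?

I am a beginner in this area, so please bear with me if the solution turns out to be trivial.

Thanks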

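To get the text recognized, you have to change:

ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", true);

to:

ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", false);

In tessnet2 the third argument of Init is numericMode. With true, the engine is restricted to recognizing digits, which is why ordinary text comes out almost entirely wrong.

Does the trained data (the .traineddata file) correspond to the font used in your sample? You could also add a whitelist (the characters to be recognized) to give Tesseract better hints.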
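Putting the two suggestions together, here is a minimal sketch of the corrected program, assuming the tessnet2 assembly is referenced and that the paths from the question are valid on your machine. The whitelist string is only a hypothetical example (letters, digits and basic punctuation); it is not something the original posts specify, and the call can simply be dropped. SetVariable is called before Init here, which is the order used in the tessnet2 sample code.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using tessnet2;

class Program
{
    static void Main(string[] args)
    {
        try
        {
            // Paths are the ones from the question; adjust them for your setup.
            using (var image = new Bitmap(@"C:\Users\hp\Desktop\eurotext.tif"))
            {
                var ocr = new Tesseract();

                // Hypothetical whitelist: restrict recognition to these characters.
                // This is an illustrative assumption; remove the line to allow everything.
                ocr.SetVariable("tessedit_char_whitelist",
                    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.,!?-");

                // Third argument is numericMode: false, so text is recognized, not only digits.
                ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", false);

                List<Word> result = ocr.DoOCR(image, Rectangle.Empty);
                foreach (Word word in result)
                    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
            }

            Console.ReadLine();
        }
        catch (Exception exception)
        {
            // Print the actual message instead of a bare "Error" so failures can be diagnosed.
            Console.WriteLine("Error: " + exception.Message);
        }
    }
}
```

Printing exception.Message instead of the fixed string "Error" is a small addition that makes problems such as a wrong tessdata path visible.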