Tesseract OCR works terribly on iOS (7)

I don't know whether the problem is with me or with the Tesseract library, but the recognition results are terrible.

Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseract setVariableValue:@"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZéèô" forKey:@"tessedit_char_whitelist"]; // limit recognition to these characters
[tesseract setImage:[UIImage imageNamed:@"sampledoc.jpg"]]; // image to recognize
[tesseract recognize];

NSLog(@"%@", [tesseract recognizedText]);

[tesseract clear];

This is the sample image I want to extract text from:

This is what I get after running it:

THE SILVER CHAIR
by r 5 Lawn
CHAPTER ow
BEHIND THE cm

lr W1C a dull aulumn day and llll Pole vmscrylng ulmo mo gym
She ms clymg because Illey had been bullymg her Hus Is not gmng In baa school oolyl se I
shall say 15 lane is poslble Ibvlll lllrs schwll which lsnol 1 plusinl subjzrl II was Tcr
eduummlr o sdsooV rm bolh boysuld glrlsl Mm used no he cnllcd o wmxodl schonll some
said on wax ml nculy so mixed as an mlndsohhe people whn an n These penple had um mu
m boyund glrlsshauld loeullma mdn who my mo And unlonunalcb mm ml or
mom aflhc hlggzsl bays mo girls liked best was bullying Ihe mm All suns orlllmgsl hound
mmgso went on Much u an nvdmlry saloon wnuld mm bum flwnd om ma snowed m lulfn
R1my hm al Ilus school xhcy vlucrfl Or mu Iflhcy mo mo people who am am wxc not
expellad m pomsloa The mm no they Mile lntntesilng psycholoycnl msxs mdsaul for
them and mm mlhem for hnun Mo Ifyml knew lhe nghl sorlofdnngxmsay In mo um
mo maul result wos um vou became mlhev 1 fmounlelhan olllnrwlsc
no mswmy ml Pole W crymg on ml dull autumn my on me dlmp Vmlc pith Much runs
bellman um um arm gym ma Ihe lhvubbezy mm ole mam nearly nmulea her ay whan
boy came round Ihz oomuonhogym Mxmlmg mm ms lnmlds m ms pocktu I12 mm In
lmo nu
 CuIV yuu look when yolfre gomw ma JIH Fob
Mu nglur sud me km won mam man a and am he mom hen rm ll WV Polef he
not was upv
ml only mndc lung the am you mm mo yodic llymg oo my somclhmg um um Ihn lfyou
spnk you1l smrl ctymg owl
 lfs mum I suww l as mualr sand me hwy Mlmlbx ouggmg ms hlnds nmm mm ms vovkals
ml waded Them wlsw moo forhurm sly llH1hVlIgoCVOllWiIE ooolo have Said u They both
knew
wow laok has said the beyl Wherek no gond us all r
He mezm WEIL am he am mlk mum mo mlnmne begmnmg n lecmne ml suddenly liew mm a
lmxpcr hvmdl Isqnllc Illkcly llllng Io hlppen Ifyou law been mmrupled in n cryl

I

What should I do?

The question about resolution refers to the pixel density (PPI), not the image dimensions.


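I rescaled the image from 96 DPI to 300 DPI, and almost all of the text was recognized correctly. The image definitely needs preprocessing before the OCR step.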
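
The rescaling described above could look roughly like the sketch below. `RescaleImageForOCR` is a hypothetical helper, not part of the Tesseract wrapper. It assumes the source and target DPI are known (96 and 300 here) and simply redraws the image enlarged by their ratio using standard UIKit drawing calls.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper (an assumption, not part of the Tesseract wrapper):
// redraws `image` enlarged by targetDPI / sourceDPI, e.g. 300 / 96 = 3.125.
static UIImage *RescaleImageForOCR(UIImage *image, CGFloat sourceDPI, CGFloat targetDPI) {
    if (image == nil || sourceDPI <= 0 || targetDPI <= 0) {
        return image;
    }
    CGFloat factor = targetDPI / sourceDPI;

    // image.size is in points; multiply by image.scale to work in pixels.
    CGSize newSize = CGSizeMake(floor(image.size.width * image.scale * factor),
                                floor(image.size.height * image.scale * factor));

    // Context scale 1.0 so the resulting bitmap has exactly newSize pixels.
    UIGraphicsBeginImageContextWithOptions(newSize, NO, 1.0);
    CGContextSetInterpolationQuality(UIGraphicsGetCurrentContext(), kCGInterpolationHigh);
    [image drawInRect:CGRectMake(0, 0, newSize.width, newSize.height)];
    UIImage *scaled = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return scaled;
}

// Possible usage with the code from the question:
//   UIImage *source = [UIImage imageNamed:@"sampledoc.jpg"];
//   [tesseract setImage:RescaleImageForOCR(source, 96.0, 300.0)];
```

Enlarging by a fixed ratio only makes sense when the source really is low resolution. A photo that is already several megapixels would become very large in memory if scaled up this way.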

Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseract setImage:chosenImage];
[tesseract recognize];

NSLog(@"%@",[tesseract recognizedText]);

What is the resolution of the actual sample image you are using?

1936 × 2592. I took it with my iPad.

None of the photos I take with an iPhone or iPad are recognized. What should I do? I want to extract characters from images taken with iOS devices. How do I rescale an image's DPI in Objective-C? Or should I?

Try searching SO for "resize UIImage".
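
For photos taken with an iPhone or iPad, one preprocessing step that might be worth trying before OCR is redrawing the photo as an upright grayscale bitmap. This is only a sketch of one possible step, not something the answers above confirm: `GrayscaleImageForOCR` is a hypothetical helper, and choosing grayscale conversion as the preprocessing is an assumption.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper (an assumption, not part of the Tesseract wrapper):
// redraws `image` into an 8-bit grayscale bitmap. Drawing through UIKit also
// applies the photo's imageOrientation, so the result is stored upright.
static UIImage *GrayscaleImageForOCR(UIImage *image) {
    if (image == nil) {
        return nil;
    }
    size_t width = (size_t)(image.size.width * image.scale);
    size_t height = (size_t)(image.size.height * image.scale);

    CGColorSpaceRef gray = CGColorSpaceCreateDeviceGray();
    CGContextRef ctx = CGBitmapContextCreate(NULL, width, height, 8, 0, gray,
                                             (CGBitmapInfo)kCGImageAlphaNone);
    CGColorSpaceRelease(gray);
    if (ctx == NULL) {
        return image; // fall back to the original if no context could be made
    }

    // Flip the coordinate system so UIKit drawing is not upside down.
    CGContextTranslateCTM(ctx, 0, height);
    CGContextScaleCTM(ctx, 1.0, -1.0);

    UIGraphicsPushContext(ctx);
    [image drawInRect:CGRectMake(0, 0, width, height)];
    UIGraphicsPopContext();

    CGImageRef grayRef = CGBitmapContextCreateImage(ctx);
    CGContextRelease(ctx);
    UIImage *result = [UIImage imageWithCGImage:grayRef];
    CGImageRelease(grayRef);
    return result;
}

// Possible usage with the code from the answer:
//   [tesseract setImage:GrayscaleImageForOCR(chosenImage)];
```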