Azure OCR [printed text] does not read receipt lines in the correct order


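Application goal: read a receipt image and extract the store/organization name and the total amount paid, then feed these into a web form so it can be filled in and submitted automatically.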

POST request:

https://*.cognitiveservices.azure.com/vision/v2.0/recognizeText?{params}

GET request:

https://*.cognitiveservices.azure.com/vision/v2.0/textOperations/{operationId}
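The two calls above form an asynchronous submit-then-poll flow. A minimal Python sketch of that flow follows. It assumes the v2.0 behaviour where the POST answers with an Operation-Location header pointing at textOperations/{operationId}, that mode=Printed is the query parameter behind {params}, and that the image is passed by URL. ENDPOINT, SUBSCRIPTION_KEY and all function names are placeholders introduced here, not part of the original post.

```python
import json
import time
import urllib.request

# Placeholder values (assumptions): substitute your own resource endpoint and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
SUBSCRIPTION_KEY = "<your-key>"


def submit_image(image_url, mode="Printed"):
    """POST the image URL and return the operation URL to poll."""
    request = urllib.request.Request(
        f"{ENDPOINT}/vision/v2.0/recognizeText?mode={mode}",
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # The service accepts the job and points at textOperations/{operationId}.
        return response.headers["Operation-Location"]


def fetch_operation(operation_url):
    """GET the current state of a text operation as a dict."""
    request = urllib.request.Request(
        operation_url,
        headers={"Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(fetch, interval=1.0, max_attempts=30):
    """Call fetch() until the operation reports Succeeded or Failed."""
    for _ in range(max_attempts):
        result = fetch()
        if result.get("status") in ("Succeeded", "Failed"):
            return result
        time.sleep(interval)
    raise TimeoutError("operation did not finish in time")


# Usage (requires a real endpoint and key):
#   operation_url = submit_image("https://example.com/receipt.jpg")
#   result = wait_for_result(lambda: fetch_operation(operation_url))
```

Passing the fetch step in as a callable keeps the polling loop independent of the HTTP client, so it can be exercised with canned responses.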

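However, when I get the results back, the line order is sometimes mixed up (see the image below [similar result in the JSON response]).

This mix-up causes the total amount to be read as $0.88.

2 of my 9 test receipts show a similar problem.

Q: Why does it work for some receipts, both similarly and differently structured, but for some reason not for all of them? Also, any ideas on how to work around it?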

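I had a quick look at your case.

OCR result

As you mentioned, the results are not ordered the way you would expect. I had a quick look at the bounding box values but could not work out how they are ordered. You could try to merge fields based on them, but there is a service that already does this for you.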
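For anyone who still wants to try the bounding-box route, here is a rough Python sketch of one way to re-order the OCR lines: group lines whose vertical centres are close into rows, then read each row left to right. This is a heuristic, not how the service orders its output. It assumes a roughly upright image and the 8-number boundingBox layout seen in the response; order_lines and row_tolerance are names made up for this example.

```python
def order_lines(lines, row_tolerance=0.5):
    """Return OCR lines sorted top-to-bottom, then left-to-right within a row.

    Each line is a dict with "boundingBox" (8 numbers: x1, y1, ..., x4, y4)
    and "text", as in the recognizeText response.
    """
    def box_stats(line):
        xs = line["boundingBox"][0::2]
        ys = line["boundingBox"][1::2]
        # left edge, vertical centre, height
        return min(xs), sum(ys) / 4.0, max(ys) - min(ys)

    rows = []  # each row is a list of (left, centre_y, height, line)
    for line in sorted(lines, key=lambda l: box_stats(l)[1]):
        left, centre_y, height = box_stats(line)
        if rows:
            _, prev_y, prev_h, _ = rows[-1][-1]
            # Same row if the centres differ by less than a fraction of the line height.
            if abs(centre_y - prev_y) <= row_tolerance * max(height, prev_h):
                rows[-1].append((left, centre_y, height, line))
                continue
        rows.append([(left, centre_y, height, line)])

    ordered = []
    for row in rows:
        ordered.extend(item[3] for item in sorted(row, key=lambda item: item[0]))
    return ordered
```

With this ordering a label such as "Total" and the amount printed on the same row end up adjacent, which makes pairing them easier. Skewed or crumpled receipts will likely need a larger tolerance or a deskew step first.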

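Form Recognizer:

Using Form Recognizer with your image, I got the following result for your receipt.

As shown below, understandingResults contains Total with its value ("value": 9.11), MerchantName ("Chick-fil-a") and other fields.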
{
    "status": "Succeeded",
    "recognitionResults": [
        {
            "page": 1,
            "clockwiseOrientation": 0.17,
            "width": 404,
            "height": 1226,
            "unit": "pixel",
            "lines": [
                {
                    "boundingBox": [
                        108,
                        55,
                        297,
                        56,
                        296,
                        71,
                        107,
                        70
                    ],
                    "text": "Welcome to Chick-fil-a",
                    "words": [
                        {
                            "boundingBox": [
                                108,
                                56,
                                169,
                                56,
                                169,
                                71,
                                108,
                                71
                            ],
                            "text": "Welcome",
                            "confidence": "Low"
                        },
                        {
                            "boundingBox": [
                                177,
                                56,
                                194,
                                56,
                                194,
                                71,
                                177,
                                71
                            ],
                            "text": "to"
                        },
                        {
                            "boundingBox": [
                                201,
                                56,
                                296,
                                57,
                                296,
                                71,
                                201,
                                71
                            ],
                            "text": "Chick-fil-a"
                        }
                    ]
                },
...
OTHER LINES CUT FOR DISPLAY
...
            ]
        }
    ],
    "understandingResults": [
        {
            "pages": [
                1
            ],
            "fields": {
                "Subtotal": null,
                "Total": {
                    "valueType": "numberValue",
                    "value": 9.11,
                    "text": "$9.11",
                    "elements": [
                        {
                            "$ref": "#/recognitionResults/0/lines/32/words/0"
                        },
                        {
                            "$ref": "#/recognitionResults/0/lines/32/words/1"
                        }
                    ]
                },
                "Tax": {
                    "valueType": "numberValue",
                    "value": 0.88,
                    "text": "$0.88",
                    "elements": [
                        {
                            "$ref": "#/recognitionResults/0/lines/31/words/0"
                        },
                        {
                            "$ref": "#/recognitionResults/0/lines/31/words/1"
                        },
                        {
                            "$ref": "#/recognitionResults/0/lines/31/words/2"
                        }
                    ]
                },
                "MerchantAddress": null,
                "MerchantName": {
                    "valueType": "stringValue",
                    "value": "Chick-fil-a",
                    "text": "Chick-fil-a",
                    "elements": [
                        {
                            "$ref": "#/recognitionResults/0/lines/0/words/2"
                        }
                    ]
                },
                "MerchantPhoneNumber": {
                    "valueType": "stringValue",
                    "value": "+13092689500",
                    "text": "309-268-9500",
                    "elements": [
                        {
                            "$ref": "#/recognitionResults/0/lines/4/words/0"
                        }
                    ]
                },
                "TransactionDate": {
                    "valueType": "stringValue",
                    "value": "2019-06-21",
                    "text": "6/21/2019",
                    "elements": [
                        {
                            "$ref": "#/recognitionResults/0/lines/6/words/0"
                        }
                    ]
                },
                "TransactionTime": {
                    "valueType": "stringValue",
                    "value": "13:00:57",
                    "text": "1:00:57 PM",
                    "elements": [
                        {
                            "$ref": "#/recognitionResults/0/lines/6/words/1"
                        },
                        {
                            "$ref": "#/recognitionResults/0/lines/6/words/2"
                        }
                    ]
                }
            }
        }
    ]
}
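To get from this response to the two values the question asks for, something along these lines should work. It is a small Python sketch that assumes the response shape shown above (an understandingResults list whose entries carry a fields object, with null for fields that were not found). receipt_summary and resolve_ref are illustrative helper names, not part of any SDK.

```python
def resolve_ref(response, ref):
    """Follow a "#/a/0/b" style reference from the root of the response."""
    node = response
    for part in ref.lstrip("#/").split("/"):
        # List levels are indexed by number, object levels by key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def receipt_summary(response):
    """Return (merchant name, total) from the first understanding result.

    A field that is missing or null in the response comes back as None.
    """
    fields = response["understandingResults"][0]["fields"]
    merchant = fields.get("MerchantName") or {}
    total = fields.get("Total") or {}
    return merchant.get("value"), total.get("value")
```

resolve_ref is only needed when the position of a field on the page matters, for example to highlight the word behind MerchantName by following its "$ref" into recognitionResults.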

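More details about Form Recognizer:

Have you tried Form Recognizer with the receipt model? It seems to match your case exactly (getting the organization name and the total paid). You can also still build your own logic on top of the OCR results, based on the bounding boxes.

I did try with the bounding boxes, but unfortunately that did not make it any better. Hmm, interesting, I actually had not thought of using Form Recognizer in this case (because it gave completely wrong results in an earlier PoC I was working on). I will accept this answer for now and update the post with more details on the Form Recognizer results on Monday. Thanks.

Regarding Form Recognizer: Microsoft released a dedicated receipt model after the service was first released.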