Almost Successful: Amazon Textract for Automated Invoice Processing

victor_palacios · April 1, 2022, 4:00pm

Just wondering if you tried using the workflow I made previously with Tess4J? Since this is another table, I worry that OCR software that doesn’t specialize in tables will return poor results (even Textract, which can deal with tables, gives errors, as you mentioned). I tried investigating open source OCR software, but most had subpar results for anything in a table format.

For the 2 becoming a colon, if that problem is limited to the digit 2 and to that specific date format, you can run a regex that matches only that type of error and corrects it.
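
Here is a rough sketch of what that regex cleanup could look like in Python. I don't know your exact date format, so this assumes slash-separated dates like `04/01/2022`, and it assumes a colon inside a date-shaped token is always a misread 2. The function name `fix_colon_dates` and the sample string are made up for illustration, so adjust the pattern to whatever your invoices actually contain.

```python
import re

# Assumption: dates look like DD/MM/YYYY (or MM/DD/YYYY), and OCR sometimes
# returns ":" where the digit "2" should be. Each date part may therefore
# contain digits or colons. The lookarounds keep the match from starting or
# ending in the middle of a longer run of digits, colons, or slashes.
DATE_WITH_COLONS = re.compile(
    r"(?<![\d:/])[\d:]{1,2}/[\d:]{1,2}/[\d:]{4}(?![\d:/])"
)


def fix_colon_dates(text: str) -> str:
    """Replace ':' with '2', but only inside date-shaped tokens."""
    return DATE_WITH_COLONS.sub(lambda m: m.group(0).replace(":", "2"), text)


if __name__ == "__main__":
    # Hypothetical OCR output: the date is damaged, the time is not.
    sample = "Invoice date: 04/01/:0:2 total 12:30"
    print(fix_colon_dates(sample))
```

Because the pattern requires two slashes, ordinary colons (such as a time like `12:30` or a label like `Invoice date:`) are left alone. If other digits also get misread, this approach won't help and you'd need something more general.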
