In 1914, Emanuel Goldberg developed an early form of Optical Character Recognition (OCR): a machine that could read characters and convert them into standard telegraph code. More than 100 years later, businesses still use OCR but struggle to turn its output into meaningful data that can power their operations. Most OCR solutions successfully extract machine-readable text from documents, but businesses cannot categorize that content in meaningful ways without manual intervention. This leaves accounts payable departments with a semi-successful solution.
Standard Invoice Solutions:
The contributors to the EZ Cloud framework have worked as consultants implementing autonomous accounts payable (AP) solutions for customers for over 10 years. Standard invoice solutions for parsing OCR output and mapping it to metadata fields usually consist of basic regex code. This code simply searches the document for any text that matches a set of predefined rules. For example, a rule might state that a phone number must be 9 digits long.
More advanced solutions may offer a ‘masking’ feature. Masking hides a portion of the invoice so that the regex code is not applied to it. The issue with this approach is that not all invoices are laid out the same way. The PO number may appear at the top of most invoices, but this is not consistent across vendors. Combined with broad rules such as “pull back any number that is 9 digits as a phone number,” this leads to incorrect data being extracted, if any data matches the rule conditions at all.
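To make the problem concrete, here is a minimal Python sketch of a broad regex rule combined with a simple mask. The sample invoice text, the `find_phone_candidates` and `apply_mask` helpers, and the "keep the top N lines" mask are all hypothetical and made up for illustration. They are not taken from any particular product.

```python
import re
from typing import List

# Hypothetical OCR output for two vendors; every value is invented.
vendor_a = """ACME Supply Co.
Phone: 555123456
Invoice No: 100200300
PO Number: 400500600
Total: 1,250.00"""

vendor_b = """Globex Ltd.
Invoice Date: 2021-05-01
Total: 1,250.00
Phone: 555123456"""

# Broad rule: treat any standalone 9-digit number as a phone number.
PHONE_RULE = re.compile(r"\b\d{9}\b")


def find_phone_candidates(text: str) -> List[str]:
    """Return every substring that satisfies the broad 9-digit rule."""
    return PHONE_RULE.findall(text)


def apply_mask(text: str, keep_lines: int) -> str:
    """Assumed mask: keep only the first `keep_lines` lines of the page."""
    return "\n".join(text.splitlines()[:keep_lines])


# Without a mask, the invoice and PO numbers are also returned as "phones".
unmasked_a = find_phone_candidates(vendor_a)
print(unmasked_a)  # ['555123456', '100200300', '400500600']

# A mask tuned to vendor A's layout fixes vendor A...
masked_a = find_phone_candidates(apply_mask(vendor_a, 2))
print(masked_a)  # ['555123456']

# ...but finds nothing for vendor B, whose phone number is at the bottom.
masked_b = find_phone_candidates(apply_mask(vendor_b, 2))
print(masked_b)  # []
```

The rule itself never changes, yet it returns too much for one layout and nothing for another, which is why this approach tends to require manual correction.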
The EZ Cloud extraction process was built around a framework that can support machine learning at multiple points. Rather than focusing on locating all numbers within a document that could potentially map to a metadata field, the EZ Cloud extraction process shifts its focus to the actual invoice.
If a person were to pick up an invoice and look for the PO number, their eyes would likely scan the document for words such as ‘PO’ or ‘PO Number.’ Once they found the label, they would look right next to it for the number itself. Our patent-pending extraction engine searches for data in much the same way.
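The label-first idea can be sketched in a few lines of Python. This is only an illustration of searching for a label and then reading the adjacent value. It is not the EZ Cloud engine, whose internals are not described here. The label variants, the `find_po_number` helper, and the sample invoices are assumptions.

```python
import re
from typing import Optional

# Assumed label variants: "PO", "PO Number", "PO No.", "PO#".
# After the label, allow an optional colon or dash and any whitespace
# (including a line break, so a value printed just below the label counts).
PO_PATTERN = re.compile(
    r"\bPO(?:\s+(?:Number|No\.?)|\s*#)?\s*[:\-]?\s*(\d+)",
    re.IGNORECASE,
)


def find_po_number(text: str) -> Optional[str]:
    """Find a PO label, then return the number adjacent to it, if any."""
    match = PO_PATTERN.search(text)
    return match.group(1) if match else None


# Hypothetical OCR output; the PO number sits in different places.
invoice_top = """ACME Supply Co.
Phone: 555123456
PO Number: 400500600
Invoice No: 100200300"""

invoice_bottom = """Globex Ltd.
Invoice No: 100200300
Phone: 555123456
Total: 1,250.00
PO# 778899"""

print(find_po_number(invoice_top))     # 400500600
print(find_po_number(invoice_bottom))  # 778899
print(find_po_number("Thank you for your business"))  # None
```

Because the search is anchored on the label rather than on page position or digit count, the other 9-digit numbers on the page are ignored and the layout differences between the two vendors do not matter.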
EZ Cloud launched in May 2021, and we have many more ideas planned for the near future to further improve our extraction and, with it, the overall automation of the product. Popular products in enterprise settings typically report an extraction rate of around 70%. In our initial testing, we obtained an 82% extraction rate.
Using machine learning, the EZ Cloud process ‘learns’ your organization’s process and continues to improve beyond the initial 82% rate. In essence, the more EZ Cloud is used, the better it works for your organization.