
Debunking the Myths of Optical Character Recognition: What It Is and How It Affects Accounts Payable

09/14/2018

Spend Matters welcomes this guest post from Justin Holden, vice president of sales, Yooz North America.

OCR: Another one of those technical buzzwords that we’re hearing a lot about these days. It’s of particular interest in fintech (another new buzzword that stands for “financial technology”) and even more specifically in accounts payable. But what is it? More important, what isn’t it? And how does it make a difference in the accounts payable (AP) workflow?

OCR stands for optical character recognition. It is the mechanical or electronic conversion of images of typed, printed or even handwritten text into machine-encoded text. The text can come from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo), or subtitle text superimposed on an image (as in a television broadcast).

It looks like this¹:

Simply put, it’s a computer looking at an image or file and being able to identify what is on it. And herein lies the biggest misconception. Many confuse OCR with data extraction.
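To make "turning a picture into words" concrete, here is a deliberately tiny sketch. It assumes a made-up image format (five rows of `#` and `.` characters) and a hypothetical table of 3x5 glyph bitmaps, and it matches each character cell to the closest known glyph. Real OCR engines are far more sophisticated; the point is only that the output is plain characters with no notion of what they mean.

```python
# Toy illustration only: the image format and glyph table are assumptions.
# Each glyph is a 3x5 bitmap written as five 3-character rows ('#' = ink).
GLYPHS = {
    "0": ("###", "#.#", "#.#", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
    "2": ("###", "..#", "###", "#..", "###"),
    "7": ("###", "..#", ".#.", ".#.", ".#."),
}


def split_glyphs(image_rows):
    """Cut a 5-row image into 3-column cells separated by one blank column."""
    width = len(image_rows[0])
    cells = []
    for start in range(0, width, 4):  # 3 glyph columns + 1 spacing column
        cells.append(tuple(row[start:start + 3] for row in image_rows))
    return cells


def recognize(cell):
    """Return the known character whose bitmap differs in the fewest pixels."""
    def distance(a, b):
        return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return min(GLYPHS, key=lambda ch: distance(GLYPHS[ch], cell))


def ocr(image_rows):
    """Convert the bitmap image into a string of characters."""
    return "".join(recognize(cell) for cell in split_glyphs(image_rows))


image = [
    "###..#..###.###",
    "..#.##..#.#...#",
    "###..#..#.#..#.",
    "#....#..#.#..#.",
    "###.###.###..#.",
]
print(ocr(image))  # prints 2107
```

The result is just the string `2107`. Nothing in it says whether that is an invoice number, a total, or a postal code, which is exactly the gap that data extraction has to fill.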

So, what is OCR exactly? OCR is a technology that turns a picture into words. The next layer, smart data extraction, understands and processes the text from the OCR to transform it into relevant data. As many of you are exploring AP automation providers, you may ask, “Do you have OCR technology?” This is an important question. But what you really want to know is whether the solution offers a complete technology that combines OCR, smart data extraction and machine learning. Today, there are three predominant types of extraction technology:

  1. Human-verified or outsourced extraction
  2. Zonal-based extraction that utilizes predefined templates
  3. Systems based upon artificial intelligence (AI) or machine learning

These are necessary because OCR by itself does not know what to do with the information it reads. Some providers use OCR but then rely on human extraction outsourced to a third party, also called third-party verification. In this approach, people place the data read by the OCR into predefined fields. Because the data entry is done manually by an outsourced firm, it takes time, typically 24 to 72 business hours.

In a template-based data extraction tool, a user has to predefine specifically “where” on a document a specific piece of information can be found and “what” the tool should do with the data it finds. The extraction process can be done fairly quickly; however, it can also become an administrative burden, as templates must be managed and updated as documents change. In this scenario, humans have to constantly manage the templates: read them, interpret them, and update them. This might defeat the intent of transitioning to AP automation, because you are not saving time. It might even be more time-intensive and, in some respects, duplicate effort.
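The “where” and “what” of a template can be sketched in a few lines. The example below assumes the OCR step returns words with pixel coordinates; the `Word` class, the `TEMPLATE` zones and all coordinates are hypothetical, not any vendor's format. It also shows why templates need upkeep: once the layout shifts, the zone comes back empty.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: int  # left edge of the word on the page, in pixels
    y: int  # top edge of the word on the page, in pixels


# Hypothetical template: field name -> zone (x_min, y_min, x_max, y_max).
TEMPLATE = {
    "invoice_number": (400, 40, 600, 70),
    "total": (400, 700, 600, 730),
}


def extract_by_zone(words, template):
    """Collect the OCR words that fall inside each predefined zone."""
    fields = {}
    for name, (x0, y0, x1, y1) in template.items():
        hits = [w for w in words if x0 <= w.x <= x1 and y0 <= w.y <= y1]
        hits.sort(key=lambda w: (w.y, w.x))  # top-to-bottom, left-to-right
        fields[name] = " ".join(w.text for w in hits)
    return fields


original_layout = [
    Word("Invoice", 300, 50), Word("INV-1042", 420, 50),
    Word("Total", 300, 710), Word("1,250.00", 430, 710),
]
# Same invoice after an extra line item pushed the total 60 pixels down.
shifted_layout = [
    Word("Invoice", 300, 50), Word("INV-1042", 420, 50),
    Word("Total", 300, 770), Word("1,250.00", 430, 770),
]

print(extract_by_zone(original_layout, TEMPLATE))
print(extract_by_zone(shifted_layout, TEMPLATE))
```

On the original layout both fields are filled. On the shifted layout the total is an empty string until someone edits the template, which is the maintenance burden described above.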

Data extraction using AI or machine learning is able to “understand” what information on a document needs to be used and, more importantly, what should be done with that information to make it relevant data. For example, technology utilizing machine learning is able to populate the total amount of an invoice without being taught or shown where to grab the data. Because the tool has seen thousands of examples, it is able to draw on past experience to reach conclusions.
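As a rough illustration of learning from examples instead of from fixed positions, the toy sketch below counts which word tends to appear right before the labeled total in a few sample invoices, then uses those counts on a layout it has not seen. It is a stand-in for real machine learning, which uses far richer models and far more data; the sample texts and function names are invented for this example.

```python
from collections import Counter
import re

AMOUNT = re.compile(r"^\d[\d,]*\.\d{2}$")  # matches amounts such as 1,250.00


def tokens(text):
    """Split OCR text into words, treating colons as separators."""
    return text.replace(":", " ").split()


def train(examples):
    """Count which word appears immediately before each labeled total."""
    cues = Counter()
    for text, total in examples:
        toks = tokens(text)
        for i, tok in enumerate(toks):
            if tok == total and i > 0:
                cues[toks[i - 1].lower()] += 1
    return cues


def predict_total(text, cues):
    """Pick the amount whose preceding word was most often seen before a total."""
    toks = tokens(text)
    best, best_score = None, 0
    for i, tok in enumerate(toks):
        if i > 0 and AMOUNT.match(tok):
            score = cues[toks[i - 1].lower()]
            if score > best_score:
                best, best_score = tok, score
    return best  # None when no learned cue is present


# Made-up training invoices, each paired with its known total.
EXAMPLES = [
    ("Invoice 1001 Subtotal: 90.00 Tax: 10.00 Total: 100.00", "100.00"),
    ("Amount Due: 75.50 Invoice 1002 Shipping: 5.50", "75.50"),
    ("Subtotal 40.00 Tax 4.00 Total 44.00", "44.00"),
]

cues = train(EXAMPLES)
print(predict_total("Total 980.00 Invoice 77 Subtotal 900.00 Tax 80.00", cues))
print(predict_total("Invoice 2044 Subtotal: 200.00 Tax: 16.00 Balance Due: 216.00", cues))
```

The two test invoices place the total at opposite ends of the text, yet both are found (`980.00` and `216.00`) because the sketch relies on learned context, not on a position someone configured.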

In some AP automation solutions, smart data extraction technology leverages OCR to read information from scans or photos of paper invoices, or from invoice images received via email. It then interprets the information, extracts the relevant data, and applies it to the appropriate field in the application, where it is reviewed and sent for approval. Finally, the data is exported to an ERP. If there are pieces of data that cannot be interpreted or read, the technology learns over time how to extract those missing pieces. This is referred to as machine learning and is powered by AI.
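The flow described above (read, interpret, populate fields, review, export) can be sketched as a small pipeline. In this sketch the OCR output is a hard-coded string, simple regular expressions stand in for the smart extraction layer, and the JSON payload is a hypothetical export format; none of it reflects a specific product or ERP.

```python
import json
import re

# Hypothetical field patterns standing in for the smart extraction layer.
PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)?\s*:?\s*([A-Z]+-\d+)"),
    "invoice_date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"\bTotal\s*:?\s*\$?([\d,]+\.\d{2})"),
}


def extract_fields(ocr_text):
    """Turn raw OCR text into named fields; unreadable fields become None."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


def needs_review(fields):
    """List the fields a reviewer has to fill in by hand."""
    return [name for name, value in fields.items() if value is None]


def export_payload(fields):
    """Serialize approved fields as JSON for a hypothetical ERP import."""
    return json.dumps(fields, sort_keys=True)


# Stand-in for what an OCR engine might return for a scanned invoice.
ocr_text = (
    "ACME Corp\nInvoice No: INV-1042\nDate: 2018-09-14\n"
    "Subtotal: $1,150.00\nTotal: $1,250.00"
)

fields = extract_fields(ocr_text)
if needs_review(fields):
    print("Send to reviewer, missing:", needs_review(fields))
else:
    print(export_payload(fields))
```

If the date line were smudged and missing from the OCR text, `needs_review` would return `["invoice_date"]` and the invoice would go to a person instead of straight to export.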

With constant enhancements, no end user ever has to teach the software. The staff transitions from manual data entry and third-party verification to simply reviewing the extracted data for accuracy. If there is a miss, the reviewer can click inside the application to quickly correct it and flag the miss. Through machine learning optimizations, the system becomes more intelligent over time, reducing the number of mistakes.

Today’s organizations are focused on speed, efficiency, and leveraging technology to solve business problems. When looking at options, take the time to first set your business goals and determine what challenges need to be solved. Then find a solution that solves as many or all of those critical needs. Sure, you can ask, “Do you have OCR?” But don’t stop there. Keep digging until you have a complete understanding of each solution you are considering and, more importantly, what best suits your business.

¹ Hewlett Packard Enterprise Development LP. 2018. Retrieved June 29, 2018, from https://dev.havenondemand.com/apis/ocrdocument#overview