Using Salesforce and Google Cloud to Build an OCR App

Salesforce is a great platform. It provides everything you need to build robust applications and data models. However, modern enterprises are heterogeneous, and sometimes the process you need isn’t part of the platform. In these cases, you have to reach outside it. In this post, we summarize a demo that uses Salesforce Apex to make a remote call from Salesforce to Google Cloud. For this demo, we built an Optical Character Recognition (OCR) app that parses text from a picture of a driver’s license, creates a DriversLicense record in Salesforce, and associates it with a Contact. Of course, the data returned from Google Cloud can then be used to populate any object or trigger other actions.

High-level steps

  1. Use a Lightning Component to load the driver’s license image
  2. Send the image to a URL that delegates to Google Cloud to read the data
  3. Receive the JSON payload from the URL call
  4. Parse the JSON and store the values in a DriversLicense object
  5. Associate the DriversLicense object with a Contact and save
  6. Store the image as an attachment to the Contact

Discussion

First, let’s assume the data we get back from Google Cloud will populate a Salesforce object called DriversLicense (see Figure 1). Second, let’s assume the process exposes a URL that accepts a base64-encoded string representing an image of a driver’s license. Details of the Google Cloud process are beyond the scope of this write-up; in the SpringML demo, the process is exposed as a URL endpoint using Python and Flask, hosted on an application server.

Figure 1


Next, we need to load the driver’s license image into Salesforce. We can do this by creating a Lightning Component. This component should do three main things:

  1. It needs to define the visual widgets for the user to interact with (buttons, spinners, image, etc.).
  2. It needs to define event hooks for loading an image and getting an image. Handle the load-image event to read the selected file and send the data to your remote process. Use the get-image event to read an existing image from Salesforce and display it with the Contact object.
  3. Finally, it needs to specify an Apex class. This is where the real work takes place. The Apex class handles the call to the URL endpoint and its response. It is also responsible for retrieving an existing image and storing a parsed image back into Salesforce.

Figure 2 – Driver’s License component before loading an image

The Apex class has two methods: getDriversLicenseDetails() and getImageUrlFromAttachment(). getDriversLicenseDetails() calls the Google Cloud process and, once the data is retrieved successfully, stores the image in Salesforce. The Lightning Component calls getImageUrlFromAttachment() to get a previously stored image to display. We will get into more details of this class later.

For the moment, let’s go back to the component. The component needs to provide an interface for the end user to load or reload a driver’s license image, and it needs to display the loaded image. We are not going to get into the details of binding events to helper methods here, since the goal of this discussion is not to provide a walkthrough of how to create a component. For more information, see Salesforce Trailhead. To see what our component looks like to the user, see Figures 2 and 3. Our component is labeled “Upload a drivers license image.”

 

Figure 3 – Driver’s License component after loading an image. Note the DriversLicense object details are populated. This data was read from the image and a new DriversLicense object was created with that data.

At this point, we have a Lightning Component that lets the user select an image to be loaded into Salesforce. Now we need to read the file and send it to the Apex class method that will call the Google Cloud process. To do this, we bind the onChange event of the component’s file input element to an onReadImage controller method. This controller method reads the file from the user’s storage, base64-encodes it, and delegates the processing of the image to an Apex class that sends it to Google Cloud.
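
The read-and-encode step can be sketched in JavaScript, the language of Lightning component controllers. This is a minimal sketch, not the demo's actual controller: the helper names extractBase64 and readImageAsBase64 are hypothetical, and it assumes the file is read with the browser's FileReader.readAsDataURL, which returns a data URL whose prefix has to be stripped before the base64 text is sent on.

```javascript
// Hypothetical helper names; the original post does not show its controller code.

// FileReader.readAsDataURL yields "data:<mime>;base64,<payload>".
// Only the payload is wanted, so strip everything up to and including the marker.
function extractBase64(dataUrl) {
  const marker = ';base64,';
  const idx = dataUrl.indexOf(marker);
  if (idx === -1) {
    throw new Error('Expected a base64 data URL');
  }
  return dataUrl.slice(idx + marker.length);
}

// Browser-only wiring: read the selected file, then hand the base64 text
// to a callback (for example, one that invokes the Apex method).
function readImageAsBase64(file, onDone) {
  const reader = new FileReader();
  reader.onload = function () {
    onDone(extractBase64(reader.result));
  };
  reader.readAsDataURL(file);
}
```

In a component, the onReadImage handler would pass the selected file and a callback to readImageAsBase64. Keeping the prefix stripping in its own function makes that part easy to check without a browser.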

The Apex class method that handles the call to Google Cloud, which we called getDriversLicenseDetails, needs to accomplish a few things:

  1. Make an HTTP POST call to the URL endpoint
  2. Handle the response from the URL endpoint
  3. Parse the payload
  4. Create or update the DriversLicense object for the Contact
  5. Write the file to a Salesforce attachment for the Contact

To make the HTTP request call you will want to use the HttpRequest object. To handle the response you will use the HttpResponse object. They are both readily available to use in your Apex class. Below is a basic template for making an HTTP call with a JSON payload.

Basic HTTP Request/Response Template

Http http = new Http();
HttpRequest request = new HttpRequest();
// endpointUrl is a placeholder for the URL of your OCR service
request.setEndpoint(endpointUrl);
request.setMethod('POST');
request.setHeader('Content-Type', 'application/json;charset=UTF-8');
// base64 holds the base64-encoded image passed in from the component
request.setBody('{"image_content": "' + base64 + '"}');

HttpResponse response = http.send(request);

if (response.getStatusCode() != 200) {
    System.debug('The status code returned was not expected: ' +
        response.getStatusCode() + ' ' + response.getStatus());
}

//handle response payload here
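
The parse-and-populate steps can be illustrated with a small sketch. It is written in JavaScript only to show the mapping logic; in the Apex class, the same idea would start from JSON.deserializeUntyped(response.getBody()). The payload keys (license_number, first_name, and so on) and the custom field names are assumptions, since the original post does not show the shape of the JSON returned by the endpoint or the fields on DriversLicense.

```javascript
// Hypothetical payload keys and custom field names, for illustration only.
const FIELD_MAP = {
  license_number: 'License_Number__c',
  first_name: 'First_Name__c',
  last_name: 'Last_Name__c',
  date_of_birth: 'Date_Of_Birth__c',
  expiration_date: 'Expiration_Date__c'
};

// Turn the endpoint's JSON response into field values for a DriversLicense
// record linked to the given Contact.
function toDriversLicenseFields(responseBody, contactId) {
  const payload = JSON.parse(responseBody); // throws SyntaxError on malformed JSON
  const record = { Contact__c: contactId };
  for (const key of Object.keys(FIELD_MAP)) {
    const value = payload[key];
    // Skip values the OCR step could not read rather than storing blanks.
    if (typeof value === 'string' && value.trim() !== '') {
      record[FIELD_MAP[key]] = value.trim();
    }
  }
  return record;
}
```

Driving the mapping from a single table keeps the payload-to-field correspondence in one place, which helps if the endpoint later returns additional values.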

The last thing we need to cover is the second Apex class method, getImageUrlFromAttachment. This method reads an image previously stored by the getDriversLicenseDetails method and passes it back to the calling routine. In our case, the calling routine is an Apex handler bound to the onLoad event of the component. As a result, when the component loads or refreshes, it calls a controller method that ultimately calls getImageUrlFromAttachment and then writes the result to an image element defined in the component.
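
As a rough illustration of what such an image URL might look like, the JavaScript sketch below builds a download URL from an Attachment Id. The function name attachmentImageUrl is hypothetical, and the sketch assumes the image is stored as a classic Attachment served from the /servlet/servlet.FileDownload path; the original post does not show how the demo builds its URL.

```javascript
// Hypothetical helper; assumes classic Attachment records served from
// the /servlet/servlet.FileDownload path.
function attachmentImageUrl(attachmentId) {
  // Salesforce record Ids are 15 or 18 alphanumeric characters, and
  // Attachment Ids start with the key prefix "00P".
  if (!/^00P[a-zA-Z0-9]{12}([a-zA-Z0-9]{3})?$/.test(attachmentId)) {
    throw new Error('Not an Attachment Id: ' + attachmentId);
  }
  return '/servlet/servlet.FileDownload?file=' + encodeURIComponent(attachmentId);
}
```

The returned string can be assigned to the src of the image element defined in the component.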

Summary

In summary, Salesforce gives us the ability to reach outside the platform to access resources. In this discussion, we showed how to make a remote call to Google Cloud to build an Optical Character Recognition (OCR) app that saves users the time of manually entering numerous driver’s license records. Instead, all the user needs to do is take a picture! We outlined what is needed to call an external resource and use the response from that resource to populate a Salesforce object. Although this is just one example, it’s easy to see how Salesforce Apex opens up more options for interacting with other platforms.

Be sure to watch the SpringML demo, “Using Machine Learning to Deploy OCR in Salesforce.”

If you still have questions, drop us a line at info@SpringML.com or tweet us @springmlinc.

Special Thanks:  Nicolai Johnson-Borelli also contributed to this blog post.