Enabling Natural Language Conversations in IVR

Chatbots have the potential to disrupt customer support. They are built to understand questions asked by users in natural language. In other words, users can speak as if they were speaking to a human agent, and the chatbot is able to understand and react to those queries. Chatbots help automate customer support so that human agents can spend time on more complex issues.

Interactive voice response (IVR) systems were developed in the 1970s and gained popularity in the call center industry in the 1990s. IVR is a technology that allows a computer to interact with humans through voice and through tones entered on a keypad. If you’ve ever called a call center (in retail, banking, insurance, etc.) and heard a message like “press 1 for X, 2 for Y,” then you know what an IVR is. IVR systems usually respond with pre-recorded audio to further direct users on how to proceed. The primary intent of IVR was also to ease the load on human agents, either by collecting some initial information or by routing calls to the right agents. Unfortunately, most IVR systems are complex to maintain and don’t provide the best user experience.

One thing in IVR’s favor is that it’s already deployed in production in most call centers, and users are used to calling the 800 number. In this post we explore how old IVR technology can be integrated with cutting-edge chatbots. Imagine calling an 800 number and being greeted with a question like “tell us in your own words how we can help you.” Handling such requests is what chatbots are good at. Integrating IVR with a chatbot can help improve the user experience for customers who are used to calling in for support.

Here’s a high-level architecture showing the end-to-end flow. It shows Cisco’s IVR system along with Google’s Dialogflow for the chatbot.

Here’s the description of the flow.

  1. IVR systems with VXML Gateway
  2. ASR / TTS components (e.g. Nuance) convert speech to text and vice versa
  3. Use Cisco Call Studio to build a CustomElement (in Java) that makes web service calls to send requests to Dialogflow
  4. Use AppEngine to serve an HTTPS front end to
  5. TTS converts the response from Dialogflow to speech and relays it to users

Let’s see how we can build a demo IVR system using Twilio. Here’s a quick demo of such an IVR call.

Sign up for a Free Trial on Twilio. You can get free credits from GCP if you sign up here: http://ahoy.twilio.com/googlecloudplatform

Go through the setup for Programmable Voice. Here’s what the configuration looks like once you have your own number.

This screen shows the configuration in more detail. The webhook points to an endpoint that accepts HTTP POST requests; I describe what this endpoint looks like below. Whenever a user dials this number, Twilio invokes the webhook.

For this demo this endpoint was written using Python Flask. Here’s the snippet of code that gets invoked when the user first calls.

The <Gather> verb collects digits or transcribes speech from a caller. When the caller is done entering digits or speaking, Twilio submits that data to the provided ‘action’ URL in an HTTP GET or POST request, just like a web browser submits data from an HTML form.
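As a rough sketch of what the webhook’s first response could contain, the function below builds the TwiML using only Python’s standard library. The function name, the /handle-speech path and the prompt wording are assumptions made for illustration, not the demo’s actual code.

```python
import xml.etree.ElementTree as ET


def build_greeting_twiml(action_url: str) -> str:
    """Return TwiML that greets the caller and gathers a spoken reply."""
    response = ET.Element("Response")
    # input="speech" asks Twilio to transcribe what the caller says;
    # the result is submitted to action_url once the caller stops speaking.
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "action": action_url, "method": "POST"},
    )
    prompt = ET.SubElement(gather, "Say")
    prompt.text = "Tell us in your own words how we can help you."
    # This <Say> is reached only if the Gather ends without any input.
    fallback = ET.SubElement(response, "Say")
    fallback.text = "Sorry, we did not hear anything. Goodbye."
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # "/handle-speech" is a hypothetical route name.
    print(build_greeting_twiml("/handle-speech"))
```

In a Flask app, the route behind the webhook would return this string with a text/xml content type.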

The snippet below captures what the user said and calls the detect_intent_texts function.
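One way this step could look is sketched below. For a speech Gather, Twilio posts the transcript in the SpeechResult field and the call’s identifier in CallSid. The handle_speech name and the two-argument detect_intent callable are hypothetical stand-ins, so the sketch runs without a Dialogflow account.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Mapping


def handle_speech(form: Mapping[str, str],
                  detect_intent: Callable[[str, str], str]) -> str:
    """Turn the form fields Twilio posts into a spoken TwiML reply."""
    # SpeechResult holds the transcript; CallSid identifies the call.
    spoken_text = form.get("SpeechResult", "").strip()
    call_sid = form.get("CallSid", "unknown-call")

    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    if spoken_text:
        # Assumption: reuse the call ID as the chatbot session ID so that
        # one phone call maps to one conversation.
        say.text = detect_intent(call_sid, spoken_text)
    else:
        say.text = "Sorry, I did not catch that."
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # A canned reply stands in for the real chatbot call.
    def canned(session_id: str, text: str) -> str:
        return f"You said: {text}"

    print(handle_speech({"SpeechResult": "where is my order",
                         "CallSid": "CA123"}, canned))
```

Passing the chatbot call in as a parameter keeps the webhook logic easy to test on its own; in the Flask endpoint, request.form would be passed as the first argument.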

This function calls the Dialogflow API to send the user’s query. The API matches the query against an intent and returns the configured response. See this link for more information.
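To make the exchange with Dialogflow concrete, here is a minimal sketch of the request and response shapes in the Dialogflow ES v2 REST detectIntent call. It is not the demo’s detect_intent_texts function: the helper names are hypothetical, and authentication (an OAuth bearer token) and the HTTP call itself are left out.

```python
import json
from urllib.parse import quote


def build_detect_intent_request(project_id: str, session_id: str,
                                text: str, language_code: str = "en-US"):
    """Return the (url, json_body) pair for a v2 detectIntent call."""
    url = (
        "https://dialogflow.googleapis.com/v2/projects/"
        f"{quote(project_id, safe='')}/agent/sessions/"
        f"{quote(session_id, safe='')}:detectIntent"
    )
    # The caller's transcript is sent as a text query input.
    body = {"queryInput": {"text": {"text": text,
                                    "languageCode": language_code}}}
    return url, json.dumps(body)


def extract_fulfillment_text(response_json: str) -> str:
    """Pull the configured reply out of a detectIntent response body."""
    result = json.loads(response_json).get("queryResult", {})
    return result.get("fulfillmentText", "")


if __name__ == "__main__":
    # "my-project" and "CA123" are placeholder values.
    url, body = build_detect_intent_request("my-project", "CA123",
                                            "where is my order")
    print(url)
    print(body)
```

The string returned by extract_fulfillment_text is what would be read back to the caller.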

When these individual components are brought together, you have an IVR system that’s integrated with a chatbot. This allows users who are familiar with calling 800 numbers to experience the user-friendly engagement that a modern chatbot provides.