Sunday, November 19, 2017

GDG Devfest Hyderabad 2017 - Building a Google Assistant App

Growing up, we read stories in which people talk to robots to get things done. Robotics has advanced a fair amount in manufacturing and space exploration. However, it is still not common for someone to own a robot and direct it with spoken commands. Voice-based interaction and computing is a new and exciting area. With so much investment in Amazon's Alexa, Google's Assistant, Apple's Siri, Microsoft's Cortana and others, it is likely to become a serious business in the near future. Right now most of us are just having fun with knowledge-based questions and puzzles or riddles. These platforms also allow us to create our own apps and make them accessible to the whole world, to be used with nothing but the user's voice.

Lately, I have been trying out building such apps with Google Voice Actions and got a chance to demo this at GDG Devfest Hyderabad this year. Here are the slides, although the session was mostly a live demo.

I tried to create a Google Assistant app that responds with the stock price of any company listed on the NSE.

Creating an Agent and Intent

  1. Head over to dialogflow.com to create your own agent. Name it "Market-Guru". Note that every agent becomes an app once it is integrated with Google Assistant.
  2. In the Intents section, you can change the default and fallback intents to customize what the agent says when you enter the app (e.g. "Welcome to Market Guru") and when it does not recognize any intent (e.g. "I can't tell that, you can try asking stock price of companies.").
  3. Create a new intent called "Stock-Teller". This is the intent that should be invoked when you ask the app a certain question.
  4. You can enter this question and its variants in the "User says" section of the intent.
  5. In the "Text Response" section at the bottom, you can enter what the agent should answer when you ask the question.
  6. Once you save the intent, you will get a toast notification saying "Agent training started". This is when Dialogflow trains its model using the sample questions you provided for the intent and its own language knowledge.
  7. You can test your agent right away by typing questions in the top right section, e.g. "What is the price of Infosys limited?", and you should get the hardcoded response that you typed in earlier. However, if you ask something like "How are you?", which does not match the questions in the "Stock-Teller" intent or any other intent you defined, the agent falls back to the fallback intent and responds with the generic answer that you may have hardcoded there.
  8. You can also ask a variant of the question, like "What is the share price of Infosys Limited", and the agent should already be smart enough to match it to the "Stock-Teller" intent although this exact question was never entered.
  9. Now you may want to recognize the company name that was spoken in the question. Click on the phrase "Infosys Limited", create a parameter and name it company_name. The entity of the parameter should be @sys.any because the value could be any text; right now we don't have an exhaustive list of company names.
  10. You can change the response from the hardcoded string to "I don't know $company_name price yet." and test again. The response should now include the recognized company name.
  11. This is how keywords/parameters are extracted from questions.
  12. Try adding more variants of the question and tag the company name in each for better accuracy, e.g. "How much does Infosys cost?"
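
To make steps 9 and 10 concrete, here is a small illustration of what happens to the "$company_name" placeholder once the parameter has been extracted. It only mimics the visible behaviour; it is not how Dialogflow is implemented, and fillResponse is a hypothetical helper written for this post.

```javascript
// Illustration only: replaces "$name" placeholders in a text response with
// the values captured from the user's question. Unknown placeholders are
// left untouched. fillResponse is a hypothetical helper, not a Dialogflow API.
function fillResponse(template, parameters) {
  return template.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(parameters, name)
      ? String(parameters[name])
      : match
  );
}

// Parameters as the agent might extract them from
// "What is the share price of Infosys Limited?"
const parameters = { company_name: "Infosys Limited" };

console.log(fillResponse("I don't know $company_name price yet.", parameters));
// prints: I don't know Infosys Limited price yet.
```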

Fulfillment

So far we have just hardcoded the response. To fetch the required data dynamically, you can make the agent talk to a web service.
  1. Since Dialogflow calls the web service and expects the response in a specific format, you have to create an endpoint to plug in.
  2. You can create a cloud function in the Google Cloud Platform Console and use Node.js code that returns the stock price for a given company name. The code actually calls another web service to fetch the stock price and formats the result in a way that Dialogflow can understand.
  3. In the Fulfillment section, add the generated endpoint URL.
  4. After saving, go to the "Stock-Teller" intent and check the box at the bottom that says "Use webhook" under the Fulfillment section. This makes the intent call the webhook URL instead of responding with the hardcoded response.
  5. Now you can test again. If the URL call is successful, it should return the stock price, and you can also use the play button next to the response to hear it read aloud. If the URL call fails, the agent falls back to the default hardcoded response.
  6. If something goes wrong, check the JSON request/response at the bottom right and verify that the correct parameters were sent and that the webhook response code is HTTP 200. If the webhook execution fails, check the cloud function log for problems in the Node.js code.

Integration

Finally, we want to make it available in Google Assistant.
  1. In the Integrations tab, enable "Google Assistant" so that it calls the Dialogflow API to answer your questions. (Google Assistant converts speech to text and sends a query to Dialogflow with the question in text format.) Note that this screen has a lot more options, like Skype and Slack, for creating a chatbot; however, we need a voice conversation.
  2. Once you click the Google Assistant integration, you can click "Test" to make your app available for simulation and then click "View" to head over to the Google Voice Actions simulator. You can also select the agent that you have created at the top.
  3. In the simulator you can type, or speak using the mic button, "Talk to my test app" and it should pull up your recently created agent from Dialogflow.
  4. First it greets you with the response you set in the default intent.
  5. Since your app is already up for simulation, you can now try it live from any of your devices, such as your phone's Google Assistant, a Google Home or a Raspberry Pi, as long as it is signed in with the same Google account that you used to create your test app.
  6. You can also just import the whole agent, which I have exported, to set up everything at once instead of creating it from scratch in the console. You may have to adjust the webhook URL to make sure it points to a current one. I may take down the demo cloud function to avoid cost.
    https://github.com/neilghosh/devfesthyd2017/raw/master/Market-Guru.zip
  7. Everything that you do within the Dialogflow console can also be done programmatically using the Dialogflow APIs.
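
To give a feel for the text query that reaches the agent, the sketch below builds (but does not send) a detectIntent request. It assumes the V2 REST API of Dialogflow, which is newer than this demo, and an OAuth access token obtained separately. The project id, session id and token values are made-up placeholders.

```javascript
// Builds a detectIntent request for the Dialogflow V2 REST API without
// sending it. All identifiers passed in below are placeholder values.
function buildDetectIntentRequest({ projectId, sessionId, accessToken, text, languageCode = "en" }) {
  const session = `projects/${projectId}/agent/sessions/${encodeURIComponent(sessionId)}`;
  return {
    url: `https://dialogflow.googleapis.com/v2/${session}:detectIntent`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      // The question travels as plain text, as it would after speech-to-text.
      body: JSON.stringify({ queryInput: { text: { text, languageCode } } }),
    },
  };
}

const request = buildDetectIntentRequest({
  projectId: "my-project-id",
  sessionId: "demo-session-1",
  accessToken: "ACCESS_TOKEN",
  text: "What is the price of Infosys Limited?",
});

console.log(request.url);
// With Node.js 18 or later it could be sent with:
//   fetch(request.url, request.options)
```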

Training 

If at any time during testing the intents and parameters are not recognized properly, you can head over to the Training tab and correct the behaviour for any bad executions in the history. This re-trains the machine learning model, which helps with future conversations.

Other Possibilities

Here is some more that you can do to improve the user experience.
  1. In the "Action" section of the intent, mark parameters as required so that the agent asks the user for them when the question is incomplete.
  2. You can create a follow-up intent for any intent to ask for confirmation or get the user's choice before you act on what the user said.
  3. For follow-up questions, you can use the parameters from an existing intent's context, i.e. #context_name.param_name retrieves an earlier parameter, so the user does not have to repeat the whole question just because one piece of information was missing.
  4. You can set output contexts and make decisions based on the intent name programmatically from within the webhook.
  5. You can optionally send rich output to your phone's Google Assistant, which can include images, cards etc.
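
Point 4 could be sketched roughly as follows: the webhook picks a handler based on the intent name and attaches an output context to the reply. This is only an illustration under a few assumptions. It handles both the original request format (result.metadata.intentName, contextOut) and the later V2 format (queryResult.intent.displayName, outputContexts), and the context name last_company, its lifespan and the reply texts are made up for the example.

```javascript
// Sketch: choose behaviour by intent name and set an output context.
// The context name "last_company" and the reply texts are illustrative.

// Intent name from a V2 or a V1 request body.
function getIntentName(body) {
  if (body.queryResult) return body.queryResult.intent.displayName;
  return body.result.metadata.intentName;
}

// Reply (plus optional output context) in the request's own API format.
function buildResponse(body, text, ctx) {
  const isV2 = Boolean(body.queryResult);
  const response = isV2 ? { fulfillmentText: text } : { speech: text, displayText: text };
  if (ctx && isV2) {
    response.outputContexts = [{
      name: `${body.session}/contexts/${ctx.name}`,
      lifespanCount: ctx.lifespan,
      parameters: ctx.parameters,
    }];
  } else if (ctx) {
    response.contextOut = [{ name: ctx.name, lifespan: ctx.lifespan, parameters: ctx.parameters }];
  }
  return response;
}

// One handler per intent name; each returns the text and an optional context.
const handlers = {
  "Stock-Teller": (body) => {
    const params = (body.queryResult || body.result).parameters || {};
    return {
      text: `Looking up ${params.company_name}.`,
      // Later intents could read this as #last_company.company_name
      context: { name: "last_company", lifespan: 5, parameters: { company_name: params.company_name } },
    };
  },
};

function handleRequest(body) {
  const handler = handlers[getIntentName(body)];
  if (!handler) return buildResponse(body, "I can't help with that yet.", null);
  const { text, context } = handler(body);
  return buildResponse(body, text, context);
}

const sample = {
  result: { metadata: { intentName: "Stock-Teller" }, parameters: { company_name: "Infosys" } },
};
console.log(JSON.stringify(handleRequest(sample), null, 2));
```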