Connecting Voice Assistants and Sitecore via Cognigy Part 3: Putting it all Together

So you wanna have an Alexa skill use Sitecore as its content source?  Sure, you can make a direct call from the Alexa Skill to a Sitecore web API but this approach can leave your code in a tangled web of ‘ifs.’  In this blog series, I’ll explain how Cognigy.AI can be used to effectively bridge the connection–and handle most of the logic–between voice assistants (like Alexa and Google Home) and Sitecore content.

Pre-requisites: Part 1: Overview and Part 2: Connecting Alexa and Cognigy

Part 3:  Putting it all Together

In Part 1, I offered a brief overview of Cognigy.AI and explained how using Cognigy.AI can save time and code by handling the conversational logic when working with a digital channel (like a voice assistant, IoT device, or AR/VR) while pulling data from an external repository such as Sitecore.

In Part 2, I dove into the necessary components to create an Alexa skill that can be used as the conversation starter with Cognigy.AI. Part 2 left us with a handful of JSON handed to Cognigy.AI by Alexa.  What do we do with it?

json alexa intent example

Cognigy Input Store

When data is passed from an endpoint into Cognigy.AI, it is saved as JSON under the ‘data’ property of a store called the Cognigy Input store (or ci).  This data lasts for the lifetime of the current request.
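For the Alexa endpoint, that means the request shown above lands under ci.data. A trimmed sketch of what it might contain (values are illustrative):

  {
    "request": {
      "type": "IntentRequest",
      "intent": {
        "name": "WeatherIntent",
        "slots": {
          "City": { "name": "City", "value": "Seattle" }
        }
      }
    }
  }

So the intent name is reachable at ci.data.request.intent.name, and the slot value at something like ci.data.request.intent.slots.City.value.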

Target Flow for an Endpoint

We’ll use this JSON data in a Cognigy flow. A Cognigy.AI project is made up of several components, including flows and endpoints.  The endpoint defines where the conversation can start from.  An endpoint has a Target Flow, and it is within this flow that we can start to evaluate the JSON data.

Based on the number of intents that exist within your connected Alexa skill, you’ll use the appropriate conditional logic node (either if..then or switch) in your Target Flow.  If you have three intents, it might look like this (below) with the bottom node acting as a catch-all.

switch statement

The switch statement evaluates the property ci.data.request.intent.name from the Alexa data.  In this example, the Target Flow is used to route to other flows based on the intent name.
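Conceptually, the routing boils down to something like this (only WeatherIntent comes from the demo; the other intent and flow names are hypothetical placeholders):

  Switch on {{ci.data.request.intent.name}}
    "WeatherIntent"     -> Go To the Weather flow
    "HoroscopeIntent"   -> Go To the Horoscope flow
    "AMAZON.HelpIntent" -> Go To the Help flow
    default             -> catch-all Say node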

Http Request

Now that we’ve identified the intent, we can call a web API endpoint in Sitecore to gather the data for the intent.  This can be accomplished with the Http Request node.

In the example below, we are using a GET Http Request.

Within the Url field, I am passing the City slot value to the API WeatherController’s GetCurrent(string city) action. In the Headers area, the OAuth access token from Alexa’s Account Linking is passed.  The data returned from the API call is stored in the ContextStore, currentweather.
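Pieced together, the node configuration looks roughly like this (a sketch; the exact route and the access-token path are assumptions based on the controller named above and the standard Alexa request format):

  URL:           {{cc.apiBaseUrl}}api/weather/getcurrent?city={{ci.data.request.intent.slots.City.value}}
  Method:        GET
  Headers:       Authorization: Bearer {{ci.data.context.System.user.accessToken}}
  Context Store: currentweather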

http request node
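For reference, the Sitecore side of this call could be a Web API controller as simple as the following sketch (the route, the returned shape, and the weather values are my assumptions; the post only names the GetCurrent(string city) action):

  using System.Web.Http;

  public class WeatherController : ApiController
  {
      // Returns current conditions for the requested city.
      // A real implementation would read the forecast from Sitecore content
      // and validate the OAuth bearer token sent in the Authorization header.
      [HttpGet]
      public IHttpActionResult GetCurrent(string city)
      {
          var result = new
          {
              city = city,
              temperature = 62,      // illustrative values only
              conditions = "cloudy"
          };

          return Ok(result);
      }
  }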

Custom Modules

Cognigy.AI also offers the ability to set up Custom Modules within its Cognigy Integration Framework.  The Http Request node above could be replaced by a module for Sitecore that contains pre-canned API calls.  This allows for more flexibility when creating flows: with Custom Modules, a non-technical user can set up flows without knowing all the details necessary for making an API call such as the one illustrated above.  A GitHub sample of existing Custom Modules is available.

Cognigy Context Store

The Cognigy context store (or cc) is another data store that is available for the lifetime of the current session with Alexa (as opposed to the current request lifetime for ci).
Within the Http Request node properties, the ContextStore name is defined, and the JSON returned from the API call is stored in cc.[ContextStore].

In the image above, note the {{cc.apiBaseUrl}}. Cognigy.AI also allows you to store default data in the cc by defining the “Default Context” of each flow.  Flows can share the Default Context as well.
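After the Default Context is loaded and the Http Request node has run, the session’s context store might look something like this (a sketch; the base URL and the property names inside currentweather are placeholders matching the controller sketch above):

  {
    "apiBaseUrl": "https://your-sitecore-instance/",
    "currentweather": {
      "city": "Seattle",
      "temperature": 62,
      "conditions": "cloudy"
    }
  }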

default context

Say Nodes

So, now you have your data back from Sitecore and you want to send it to Alexa.  Alexa accepts data using the Speech Synthesis Markup Language (SSML) format (which is based on XML) to assist with context and intonation.  Take a look at the Alexa SSML documentation.

The SSML is added to the Say node’s “Alexa” channel via the SSML Editor, and it is sent to Alexa so she can convert the text to speech. Cognigy.AI also supports sending visual content (text only or text and image … video coming soon?) to devices with a display.
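A Say node for the weather answer might hold SSML along these lines (a sketch; the currentweather properties are the placeholders from the earlier context sketch):

  <speak>
    The current weather in {{cc.currentweather.city}} is
    {{cc.currentweather.temperature}} degrees and {{cc.currentweather.conditions}}.
    <break time="500ms"/>
    Would you like tomorrow's forecast as well?
  </speak>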

Note that the Say node is reused among other channels (see the channel icons at the top).  This allows you to have a single set of flows used by every channel while still providing the proper output for each channel.

say node

Finishing the Conversation

When a flow is complete, the Say node SSML is passed back to Alexa via the Alexa endpoint in Cognigy.

output json to alexa

Alexa then notes the outputSpeech in the response and speaks the SSML!
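That response follows the standard Alexa skill response format; a trimmed sketch (values illustrative):

  {
    "version": "1.0",
    "response": {
      "outputSpeech": {
        "type": "SSML",
        "ssml": "<speak>The current weather in Seattle is 62 degrees and cloudy.</speak>"
      },
      "shouldEndSession": false
    }
  }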

End Notes

I have only touched the tip of the iceberg of Cognigy.AI’s capabilities, but this blog series should give you a starting point for your exploration into voice technologies.  Stay tuned for an upcoming post on using Sitecore data within a Cognigy.AI chatbot!

 


cognigy logo
Special thanks to Andy and Derek of Cognigy for allowing use of their demo site!

  • Andy Van Oostrum – VP Sales North America : a.vanoostrum@cognigy.com
  • Derek Roberti – VP Technology North America: d.roberti@cognigy.com

See other posts in this blog series:


Connecting Voice Assistants and Sitecore via Cognigy Part 2: Connecting Alexa and Cognigy

So you wanna have an Alexa skill use Sitecore as its content source?  Sure, you can make a direct call from the Alexa Skill to a Sitecore web API but this approach can leave your code in a tangled web of ‘ifs.’  In this blog series, I’ll explain how Cognigy.AI can be used to effectively bridge the connection–and handle most of the logic–between voice assistants (like Alexa and Google Home) and Sitecore content.

Pre-requisite: Part 1: Overview

Blog Series Part 2:  Connecting Alexa and Cognigy.AI

In the previous post, I offered a brief overview of Cognigy.AI and explained how using Cognigy.AI can save time and code by taking on the conversational logic when working with a digital channel (like a voice assistant, IoT device, or AR/VR) and pulling data from an external repository such as Sitecore.  Let’s dive into the Alexa Skills side of things and also see how to hook it up via Cognigy.AI’s endpoint.

Amazon Alexa Skills Kit Developer Console

The Alexa Skills Kit Developer Console is already well-documented so I won’t go into too much detail.  Instead, I’ll focus on what you need to know to get started with an Alexa Skill.

Invocation

When you set up a new skill, it needs to be invoked with a phrase.  Example invocations:

  • Invocation name “daily horoscopes”: “Alexa, enable daily horoscopes”
  • Invocation name “current weather”: “Alexa, load current weather”

Endpoints

To connect Alexa and Cognigy.AI, you’ll need the Cognigy.AI-generated endpoint URL, which you place in the Alexa Skill’s Service Endpoint setting; alternatively, you can deploy the endpoint URL to the skill from within Cognigy.AI by selecting the skill.  For faster throughput, Alexa allows multiple endpoints based on region.

alexa service endpoint

Intents

An intent is an interpretation of a user request: it maps what the user asked for to an action that fulfills the request.

For example, let’s say that your system stores the daily forecast for every major city in a designated area. An Alexa Skill intent (“DailyWeatherIntent”) would be the starting point for fulfilling the request for the daily weather.  To match the Alexa user’s spoken words to an intent, the intent must have utterances.

Alexa also has some built-in intents that you can incorporate within your skill.

built in intents

Utterances

An utterance is a sample phrase assigned to an intent that represents an anticipated request from the user.  When a user request matches an utterance, it is fulfilled by the action provided by the intent.

Utterances for the “DailyWeatherIntent” example might be the following:

  • What is the current weather?
  • What’s going on outside for today?
  • Is it going to rain?
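In the skill’s interaction model, those sample utterances sit under the intent definition, roughly like this (a trimmed sketch of the interaction model JSON; only this intent is shown and punctuation is omitted from the samples):

  {
    "interactionModel": {
      "languageModel": {
        "invocationName": "current weather",
        "intents": [
          {
            "name": "DailyWeatherIntent",
            "samples": [
              "what is the current weather",
              "what's going on outside for today",
              "is it going to rain"
            ]
          }
        ]
      }
    }
  }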

While we are on the topic of utterances, we must talk about Natural Language Processors (NLPs).  Their primary responsibility is to interpret what the user said (once speech has been converted to text) and match it to an intent.  With machine learning, they are able to handle variations of an utterance and still provide a positive match to an intent.  This allows you to minimize the number of utterances that you have to list for an intent.  For example, the phrase “Tell me the weather” could be uttered and auto-matched to the utterance “What is the current weather?”

Each NLP has a different level of sophistication, so the number of utterances that you’ll need for an intent will vary based on the NLP used.  As you would expect, Alexa’s has been around for a while and is quite sophisticated.  Cognigy.AI has its own configurable NLP and is growing more intelligent with every release.

Slots

Aside from the NLP, utterances can be minimized with the use of slots.  Slots are words found in utterances that can be used in two ways:

  • as a synonym – to extend the NLP matching logic for an intent
  • as a variable – to pass on to the data repository (e.g. Sitecore)

A slot is represented within an utterance surrounded by curly braces {}. Let’s look at examples of both.

Synonym Slots

Utterance: What is the {weather}?

If the slot {weather} is defined with alternative phrases like ‘forecast’ or ‘chance of rain,’ the following utterances are implied along with the utterance above and therefore do not need to be defined within the intent:

What is the forecast?
What is the chance of rain?

Variable Slots

Utterance: What is the current weather in {city}?

Examples:
What is the current weather in Los Angeles?
What is the current weather in LA?

The slot can also represent a variable that is passed on to the data repository. The slot type can be a finite list of acceptable slot values stored in the skill.  Each slot value can also have synonyms (like “Los Angeles” and “LA”) and an optional ID, which can be used to better match a value in the data repository.  The Alexa Skills Kit comes with pre-defined slot types, or you can spin up your own.
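A custom slot type with values, synonyms, and IDs is defined in the interaction model along these lines (a sketch; the type name, IDs, and extra synonyms are illustrative):

  {
    "name": "CityType",
    "values": [
      {
        "id": "LAX",
        "name": { "value": "Los Angeles", "synonyms": ["LA"] }
      },
      {
        "id": "SEA",
        "name": { "value": "Seattle", "synonyms": ["Emerald City"] }
      }
    ]
  }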

A slot value could also be a true variable (without a list of possible values in the skill) and simply passed to the data repository for reconciliation.  Use the AMAZON.SearchQuery slot type.  Only one such slot type can be used per utterance.

Utterance: Switch to customer account {accountNumber}.

Example:
“Switch to customer account 12345.”

Required Slots

An intent may require that a slot is populated prior to its fulfillment.  In the example directly above, accountNumber is a slot whose value is needed in order to switch customer accounts.  The value could be gathered as shown in the utterance above or through a dialog between Alexa and the user.  Notice below that the utterance does not have a slot.

Utterance:  Switch the customer account

Example:

User: “Switch my customer account”
Alexa:  “To which account would you like to switch?”
User: “12345”
Alexa: “OK, one moment …”

slot filling

Account Linking

It is worth mentioning Account Linking.  Through OAuth and the Alexa App on the user’s phone, the user can log in one time to the data repository (like Sitecore) and establish an OAuth connection for the Alexa Skill.  This allows the user to access account information that would normally be exposed by logging in to their account on your web application.  Some great documentation is located on the Alexa Skills Kit site.

Account linking will enable Alexa to fulfill questions like:

Alexa, ask [your invocation] what is the next course that I should take?

Alexa, tell [your invocation] that I’d like to pay my invoice.

Alexa, ask [your invocation] for the balance on my account.

Alexa and Cognigy.AI JSON

And finally, let’s take a look at the JSON being sent from Alexa to Cognigy.AI.

Below we see that under context, the OAuth access token is sent (if account linking is set up).  This can be passed on to Sitecore from within Cognigy.AI.

Under request, we see the name of the intent and also any slots that have been fulfilled for the intent.  In the example below, we have a “WeatherIntent” with a slot of “City” fulfilled with the value of “Seattle.”

json alexa intent example
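A trimmed sketch of that request payload (values illustrative; many properties omitted):

  {
    "context": {
      "System": {
        "user": {
          "accessToken": "<OAuth access token from Account Linking>"
        }
      }
    },
    "request": {
      "type": "IntentRequest",
      "intent": {
        "name": "WeatherIntent",
        "slots": {
          "City": { "name": "City", "value": "Seattle" }
        }
      }
    }
  }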

In the next post of the series, we’ll see how to consume this data within Cognigy.AI, grab data from Sitecore, and send an answer back to Alexa!  Stay tuned!


cognigy logo
Contact:

  • Andy Van Oostrum – VP Sales North America : a.vanoostrum@cognigy.com
  • Derek Roberti – VP Technology North America: d.roberti@cognigy.com

See other posts in this blog series:

SUGCON 2017 – My Highlights

I had the most fortunate opportunity to fly over from the US to Amsterdam to attend SUGCON 2017 (Thank you Verndale!). Wow, do the Dutch and Danes put on a great conference!  Although I’m still curious about whole walnuts and the shot of juice? served in the late afternoon on the last day, I learned a boatload about Sitecore and upcoming hot topics.  Here’s a summary of some of the sessions that I attended:

Keynote

Lars Nielsen and Pieter Brinkman opened the conference with some inspiring words, asking us all to push the boundaries of Sitecore.

OData Content API

There was a talk on Sitecore Services Client and its OData Content API; the aim is to consolidate and reduce the number of technologies (and security risks) used to handle HTTP calls.

SXA overview and HoloLens and Robbie and Alexa

hololens

We were given a demo of how quickly SXA can be used to spin up a website; in this case, a website that visually shows all user groups throughout the world on a map.  But that’s not all … throw in some HoloLens integration and Sitecore Cloud, and the map becomes a globe that can be spun, with user groups that can be chosen and opened for more detail.  Next, add a connection (via a JSON Sitecore device) to a robot named Robbie and integrate it with Amazon’s Alexa … “Alexa, ask Sitecore to find all user group events.”

What else?  Well, Robbie can see things; Robbie can see your face … Robbie has learned how to read emotions via facial recognition.  In the future, Sitecore personalization could be driven by the user’s emotion.  Cool.

This spawned possible answers to questions like:

  • What factors are driving desirable behavior most strongly?
  • How can we prevent the loss of existing customers?
  • What audience behavioral segments are we unaware of? We could target them, or banish them.
  • Can we optimize the sequence of tactics in our lead nurturing program?
  • Can we automate up-selling and cross-selling?
  • Can we make our content authoring processes more efficient?

Sitecore Integration with Microsoft Cognitive Services

mcs

This is the session I was most excited about.  Back in March, for the Sitecore Hackathon, our group used Data Exchange Framework to tap into Microsoft Cognitive Services and pull in tags for images.  It wasn’t until after we were done that I learned that Mark Stiles had already been hard at work creating a layer to easily integrate Sitecore and these services … and he presented its possibilities at the conference.

Within his “Sitecore Cognitive Services,” there are three bits of functionality: the Bot Framework, Sitecore Cognitive Services Service, and the Sitecore Cognitive Services module.

The Bot Framework is machine learning at its best: program a chatbot to gradually learn how to properly respond to your customers’ questions.

The Service is a fast way to integrate your website with the plethora of features Microsoft Cognitive Services offers, like Computer Vision, Emotion Recognition, Facial Analysis, Video Analysis, and even Speech Analysis.  There’s so much to tap into.

The Module takes this a step further, integrating all of this into the Sitecore admin: faceted image searches, textual analysis of text within images, and facial analysis of people within Sitecore images.  Yes, you can machine-generate the alt text for your images, recognize if an image is too, ahem, adult, and identify which colors are used in an image.

The possibilities seem endless.  By the way, Mark hasn’t tapped into all of the possible Microsoft Cognitive Services functionality; if you’re interested in helping out, contact him.

xDB Contacts and Concurrency Control

Ohhhh … lots of yummy take-aways from this session as well.

For instance, did you know that there are four sets of APIs for accessing the xDB contact data? … which is in a number of locations/states, I may add …

  • Tracker.Current.Contact (in-memory)
  • ContactRepository (access the data in the Collection database)
  • SharedSessionStateManager (access the data in the Shared Session database)
  • ContactManager (utilize both the ContactRepository and SharedSessionStateManager)

Dmytro Shevchenko also discussed the need for a shared session (i.e. for being able to maintain a single contact from interactions coming from multiple private sessions).

A great illustration offered was the summary of Contact Lock stages.

contact lock states

In the initial state, the latest Contact data exists in the Collection database.

When the session starts, the Contact is locked in the Collection database and copied into the Shared Session database, where it is loaded and unlocked.

On Page Request, the Contact is still locked in the Collection database, and also locked in the Shared Session database but loaded into the Tracker API at this time.

When the Page Request ends, the Contact is removed from the Tracker, it is saved and unlocked in the Shared Session database, and still locked (and not updated) in the Collection database.

At the end of the session, the Contact is no longer available in the Shared Session (of course) and is saved and unlocked in the Collection database.

Well, that’s my summary of highlights from Day 1.  I’ll be posting Day 2 highlights after my extended trip to Paris.