Selecting the Right NLU Service for Chatbot Applications
Our Japeto chatbot platform works with many NLU (Natural Language Understanding) providers, including well-known AI products such as OpenAI GPT-3.5, IBM Watson, and Amazon Lex.
Behind the scenes, we use multiple NLU engines at the same time to enhance the accuracy of our chatbot, and to make sure we’re never dependent on any one provider. We’re also continually testing NLU engines to make sure we’re using the best one for each use-case.
When building the backend of Japeto, we evaluated each service on functionality, pricing, and ease of integration. For integration, we could either use prebuilt libraries or write our own code to interact with the NLU service APIs.
Understanding NLU Terminology
Intents: These represent the goals or actions that users want to achieve. For instance, if a user says, “I want to order a pizza,” the intent could be “OrderPizza.”
Slots: These are like variables that help fulfil an intent. In our example, slots might include pizza type, size, and quantity. We need slots to complete the intent.
Utterances: Utterances are phrases or sentences users say to trigger an intent. In our example, “I want to order a pizza” is an utterance.
Slot Types: These define the possible values a slot can have. A pizza-type slot might include values like “margherita” or “pepperoni” in our example.
Using these concepts, developers can build chatbots that are capable of understanding and conversing with users.
Most NLU services use these terms in their language models, although the terms and their definitions can differ between platforms.
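To make these terms concrete, here is a toy sketch (our own illustration, not any vendor’s API) of how an NLU result for the pizza example might be modelled in Java:

```java
import java.util.List;
import java.util.Map;

public class NluExample {
    // A toy NLU result: the matched intent plus the slot values extracted
    record NluResult(String intent, Map<String, String> slots) {}

    public static void main(String[] args) {
        // Utterance: what the user actually said
        String utterance = "I want to order a large margherita pizza";

        // Slot type: the possible values the "pizzaType" slot can take
        List<String> pizzaTypes = List.of("margherita", "pepperoni");

        // What a trained NLU engine might return for the utterance above
        NluResult result = new NluResult(
                "OrderPizza",
                Map.of("pizzaType", "margherita", "size", "large"));

        System.out.println(result);
    }
}
```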
Amazon Lex
Amazon Lex is the NLU engine that helps to power Amazon Alexa on Echo products. Lex allows users to create chatbots using its interface or via its API.
The interface that Amazon’s Lex provides is useful for smaller projects. You don’t need lots of programming knowledge to use it. It’s simple for designing, building, testing, and deploying conversational interfaces in applications, with a live test bot available in the interface.
For the API option, Lex follows the standard language model terminology we covered above. Below is an example of the JSON needed to interact with the Lex API; other required fields aren’t included here, to keep the example simple.
```json
POST /bots/botId/botAliases/botAliasId/botLocales/localeId/sessions/sessionId/text HTTP/1.1
Content-type: application/json

{
  "dialogAction": {
    "slotElicitationStyle": "string",
    "slotToElicit": "string",
    "subSlotToElicit": {
      "name": "string",
      "subSlotToElicit": "ElicitSubSlot"
    },
    "type": "string"
  },
  "intent": {
    "confirmationState": "None",
    "name": "OrderPizza",
    "slots": {
      "string": {
        "shape": "string",
        "subSlots": {
          "string": "Slot"
        },
        "value": {
          "interpretedValue": "string",
          "originalValue": "string",
          "resolvedValues": [ "string" ]
        },
        "values": [ "Slot" ]
      }
    },
    "state": "string"
  },
  "text": "Pepperoni"
}
```
Pricing
Lex costs $0.00075 per text request to the API, so 1 million requests = $750.
Usage with Java/Spring Boot
AWS provide a comprehensive Java library for use with their Lex API, which you can find in the Maven Central Repository. This library includes all you need to connect to AWS Lex.
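As an illustration, a minimal sketch of sending text to a Lex V2 bot with the AWS SDK for Java v2 might look like this; the bot ID, alias ID, and session ID are placeholders for your own bot’s values:

```java
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.lexruntimev2.LexRuntimeV2Client;
import software.amazon.awssdk.services.lexruntimev2.model.RecognizeTextRequest;
import software.amazon.awssdk.services.lexruntimev2.model.RecognizeTextResponse;

public class LexExample {
    public static void main(String[] args) {
        // Credentials are resolved from the default AWS provider chain
        LexRuntimeV2Client lex = LexRuntimeV2Client.builder()
                .region(Region.EU_WEST_2)
                .build();

        // Bot, alias, locale and session IDs are placeholders for your own bot
        RecognizeTextRequest request = RecognizeTextRequest.builder()
                .botId("YOUR_BOT_ID")
                .botAliasId("YOUR_BOT_ALIAS_ID")
                .localeId("en_GB")
                .sessionId("user-session-1")
                .text("I want to order a pizza")
                .build();

        RecognizeTextResponse response = lex.recognizeText(request);

        // Each interpretation carries a candidate intent and its confidence score
        response.interpretations().forEach(interpretation ->
                System.out.println(interpretation.intent().name()
                        + " -> " + interpretation.nluConfidence()));
    }
}
```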
Key Takeaways
To summarise, Amazon Lex is a powerful NLU engine that has some quirks. One example is that shorter utterances such as “hi” need extra training data to be recognised: for Lex to recognise “hi”, you need training data covering variants such as “hi”, “Hi” or “HI”. A workaround is to generate these variants using a Large Language Model such as ChatGPT. This is the only minor drawback in what is otherwise a very sophisticated NLU engine. AWS Lex features as one of the NLU engines in our Japeto platform.
Snips NLU
(https://snips-nlu.readthedocs.io/en/latest/)
Snips NLU is an open-source Python library. Developers use it to extract structured information from sentences written in natural language.
Language Model
Snips’ language model uses the same terms described earlier in this article, but there are two differences that could prove confusing to a first-time user: in Snips, ‘slot types’ are called ‘entities’, and ‘slot values’ are called ‘slot names’.
Pricing
A huge benefit of Snips NLU is that it is open source and so free to use, although it’s always worth checking that the development team are still updating the software.
However, you’ll need to install it on a server you own, which might have costs of its own.
Usage with Java/Spring Boot
Snips is mostly built using Python, with some parts of the code base written in Rust. They have a neat command line tool that is easy to use when paired with their thorough documentation.
To use Snips NLU with our Java app, we would have to come up with a creative solution. One option could be building a Django web app with an API that we can call from Java. Since Snips is fully functional on its own (no external APIs needed), this option would have a similar response time to calling a hosted platform such as Lex.
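The Java side of that design could look something like the sketch below; note that Snips itself has no HTTP API, so the wrapper service, URL, and route are our own hypothetical design:

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

public class SnipsClient {
    // Hypothetical endpoint exposed by our own Python wrapper around Snips NLU
    private static final String PARSE_URL = "http://localhost:8000/parse?text={text}";

    public static void main(String[] args) {
        RestTemplate rest = new RestTemplate();

        // Snips' parse result is JSON (intent name, probability, slots);
        // here we simply print the raw body
        ResponseEntity<String> response =
                rest.getForEntity(PARSE_URL, String.class, "I want to order a pizza");
        System.out.println(response.getBody());
    }
}
```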
Key Takeaways
We did notice it took a little more training data for Snips to accurately match the user’s intents compared to Lex.
IBM Watson
(https://www.ibm.com/cloud/watson-natural-language-understanding)
Next up, we have IBM Watson. Watson’s ‘Assistant’ offers similar functionality to AWS Lex and is a great way to build a chatbot from the ground up.
Language Model
IBM’s language model takes a similar approach to Snips in that it uses the term ‘entities’ rather than slots. Below is an example of a response from IBM Watson Assistant. Again, we have removed some fields for brevity.
```json
{
  "intents": [
    {
      "intent": "book_flight",
      "confidence": 0.9
    }
  ],
  "entities": [
    {
      "entity": "destination",
      "value": "London",
      "confidence": 0.95
    },
    {
      "entity": "departure_date",
      "value": "2023-10-01",
      "confidence": 0.9
    },
    {
      "entity": "return_date",
      "value": "2023-10-15",
      "confidence": 0.9
    }
  ],
  "input": {
    "text": "I want to book a flight to London for October 1st returning on the 15th."
  },
  "output": {
    "generic": [
      {
        "response_type": "text",
        "text": "Sure, I can help you book a flight. Would you like to proceed?"
      }
    ],
    "text": [
      "Sure, I can help you book a flight. Would you like to proceed?"
    ],
    "nodes_visited": [
      "book_flight"
    ]
  }
}
```
Pricing
There are two pricing plans for Watson Assistant – Lite and Plus.
Lite is a free plan that gives you 10,000 messages per month and up to 5 skills. A skill can be thought of as a version of a bot, which you can then assign to an assistant. The Plus plan costs £110 per instance per month, which gets you unlimited messages and a limit of 50 skills.
The Lite plan could be used to trial the software before making a paid commitment. IBM also offer discounts for qualifying start-ups.
Usage with Java/Spring Boot
Akin to Amazon Lex, IBM offers a Java SDK for easy integration into your apps. The SDK allows the programmer to easily build and send requests to IBM Watson. Once you have built a request and instantiated a Watson Assistant, you simply pass in your text along with the assistant ID and session ID associated with your bot.
You can also customise the setup of your chatbot with the online UI as well as directly from the Java code.
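For illustration, a minimal sketch of that request flow using the Watson Java SDK might look like this; the API key, service URL, version date, and assistant ID are placeholders for your own instance’s values:

```java
import com.ibm.cloud.sdk.core.security.IamAuthenticator;
import com.ibm.watson.assistant.v2.Assistant;
import com.ibm.watson.assistant.v2.model.CreateSessionOptions;
import com.ibm.watson.assistant.v2.model.MessageInput;
import com.ibm.watson.assistant.v2.model.MessageOptions;
import com.ibm.watson.assistant.v2.model.MessageResponse;
import com.ibm.watson.assistant.v2.model.SessionResponse;

public class WatsonExample {
    public static void main(String[] args) {
        String assistantId = "YOUR_ASSISTANT_ID"; // placeholder

        // Authenticate with an IAM API key and point at your service instance
        IamAuthenticator authenticator = new IamAuthenticator("YOUR_API_KEY");
        Assistant assistant = new Assistant("2021-06-14", authenticator);
        assistant.setServiceUrl("https://api.eu-gb.assistant.watson.cloud.ibm.com");

        // Open a session, then send the user's text to it
        SessionResponse session = assistant.createSession(
                new CreateSessionOptions.Builder(assistantId).build())
                .execute().getResult();

        MessageInput input = new MessageInput.Builder()
                .messageType("text")
                .text("I want to book a flight to London")
                .build();
        MessageResponse response = assistant.message(
                new MessageOptions.Builder(assistantId, session.getSessionId())
                        .input(input).build())
                .execute().getResult();

        // The output contains matched intents, entities and the bot's replies
        System.out.println(response.getOutput());
    }
}
```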
Key Takeaways
Watson offers the same flexibility and responsiveness as Lex in an easy-to-use package. In testing, we felt the intent matching was sometimes better. One drawback could be the price: if you run more than 50 skills, you need a second instance, doubling your overheads.
However, IBM has an excellent track record on information security and business support, so it can be an ideal choice for enterprise applications.
Google Dialogflow
(https://cloud.google.com/dialogflow)
Google’s Dialogflow is another easy-to-use platform for designing and integrating a conversational interface. Dialogflow provides two different virtual agent services, Dialogflow CX and Dialogflow ES, each of which has its own agent type, user interface, API, client libraries, and documentation.
Language Model
Dialogflow uses similar terminology to the other services (intents, entities), but in combination with what it calls ‘Flows’ and ‘State Handlers’. Below is an example Dialogflow CX response, again with some fields trimmed for brevity.
```json
{
  "responseId": "38e8f23d-eed2-445e-a3e7-149b242dd669",
  "queryResult": {
    "text": "I want to buy a shirt",
    "languageCode": "en",
    "responseMessages": [
      {
        "text": {
          "text": [
            "Ok, let's start a new order."
          ]
        }
      },
      {
        "text": {
          "text": [
            "I'd like to collect a bit more information from you."
          ]
        }
      },
      {
        "text": {
          "text": [
            "What color would you like?"
          ]
        }
      },
      {}
    ],
    "currentPage": {
      "name": "projects/PROJECT_ID/locations/us-central1/agents/133b0350-f2d2-4928-b0b3-5b332259d0f7/flows/00000000-0000-0000-0000-000000000000/pages/ce0b88c4-9292-455c-9c59-ec153dad94cc",
      "displayName": "New Order"
    },
    "intent": {
      "name": "projects/PROJECT_ID/locations/us-central1/agents/133b0350-f2d2-4928-b0b3-5b332259d0f7/intents/0adebb70-a727-4687-b8bc-fbbc2ac0b665",
      "displayName": "order.new"
    },
    "intentDetectionConfidence": 1,
    "diagnosticInfo": { ... },
    "match": {
      "intent": {
        "name": "projects/PROJECT_ID/locations/us-central1/agents/133b0350-f2d2-4928-b0b3-5b332259d0f7/intents/0adebb70-a727-4687-b8bc-fbbc2ac0b665",
        "displayName": "order.new"
      },
      "resolvedInput": "I want to buy a shirt",
      "matchType": "INTENT",
      "confidence": 1
    }
  }
}
```
Pricing
There are three big differences between CX and ES that might inform your choice.
– CX offers built-in testing and ML capabilities for training, whereas ES does not.
– ES is mostly text based, whereas CX has a visual drag-and-drop UI.
– ES is multi-lingual; CX is English-only.
Price-wise, CX would set you back $7,000 for 1 million requests, whereas ES would cost $2,000 for the same number of requests. Both are charged at a flat rate.
Usage with Java/Spring Boot
Like Lex and Watson, Google offers a Java SDK to interact with Dialogflow. The SDK provides many methods for building and managing agents, giving you a lot of control in your application.
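As a rough sketch, a minimal detectIntent call with the Dialogflow CX Java client might look like the following; the project, agent, and session IDs are placeholders, and we use the ‘global’ location so the client’s default endpoint applies (regional agents need a regional endpoint):

```java
import com.google.cloud.dialogflow.cx.v3.DetectIntentRequest;
import com.google.cloud.dialogflow.cx.v3.DetectIntentResponse;
import com.google.cloud.dialogflow.cx.v3.QueryInput;
import com.google.cloud.dialogflow.cx.v3.SessionName;
import com.google.cloud.dialogflow.cx.v3.SessionsClient;
import com.google.cloud.dialogflow.cx.v3.TextInput;

public class DialogflowExample {
    public static void main(String[] args) throws Exception {
        // Project, agent and session IDs are placeholders for your own agent
        SessionName session = SessionName.ofProjectLocationAgentSessionName(
                "YOUR_PROJECT_ID", "global", "YOUR_AGENT_ID", "session-1");

        try (SessionsClient sessions = SessionsClient.create()) {
            QueryInput input = QueryInput.newBuilder()
                    .setText(TextInput.newBuilder().setText("I want to buy a shirt"))
                    .setLanguageCode("en")
                    .build();

            DetectIntentResponse response = sessions.detectIntent(
                    DetectIntentRequest.newBuilder()
                            .setSession(session.toString())
                            .setQueryInput(input)
                            .build());

            // The match holds the detected intent and its confidence
            System.out.println(
                    response.getQueryResult().getMatch().getIntent().getDisplayName());
        }
    }
}
```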
Key Takeaways
Dialogflow provides similar matching capabilities as Lex, so may be a good fit if the rest of your infrastructure runs on Google’s Cloud. However, it’s worth bearing in mind that Dialogflow has more expensive per-message costs than Lex.
In general, it seems that Dialogflow is well-suited to smaller chatbots, i.e. those with a few intents which fulfil a specific purpose, like ordering products.
Rasa NLU
Rasa NLU prides itself on being the most popular open-source offering, with over 25 million downloads. Rasa NLU is a conversational AI platform that understands and holds conversations. Through a set of APIs, it can connect to messaging channels and third-party systems. It supplies the building blocks for creating virtual (digital) assistants or chatbots.
Language Model
Rasa uses intents and entities in its language model, but utterances are simply called ‘examples’. Here is a YAML file illustrating how you would train your bot with a simple story.
```yaml
stories:
- story: story with a response
  steps:
  - intent: greet
  - action: utter_greet
```
Pricing
Free, although with possible running costs as you would need to provide your own server.
Usage with Java/Spring Boot
Since Rasa is a Python library, we would have to use a similar process to Snips for the integration with Java.
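Conveniently, Rasa also ships with an HTTP server (started with `rasa run --enable-api`) that exposes a POST /model/parse endpoint, so a Spring Boot app can call it directly. A minimal sketch, assuming a local Rasa server on its default port 5005:

```java
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.web.client.RestTemplate;

public class RasaClient {
    public static void main(String[] args) {
        RestTemplate rest = new RestTemplate();

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);

        // POST /model/parse is part of Rasa's HTTP API; 5005 is the default port
        HttpEntity<String> request = new HttpEntity<>("{\"text\": \"hello\"}", headers);
        String body = rest.postForObject(
                "http://localhost:5005/model/parse", request, String.class);

        // The response contains the matched intent, its confidence and any entities
        System.out.println(body);
    }
}
```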
Key Takeaways
Rasa NLU looks excellent from a high level and could be a good alternative to Snips, given its cost, its usage with Java, and its similar way of defining intents. The documentation seems comprehensive, which is a huge bonus for any developer.
OpenAI GPT-3
We’ve saved the most talked-about service until last. OpenAI has made headline news with their ChatGPT chatbot, which is based on their GPT-3.5 (GPT-4 if you have a Plus subscription) large language model (LLM).
Language Model
Out of the box, GPT models don’t use the intents/slots language model described above – they are general-purpose text generators. To get around this, you can ‘fine-tune’ a base GPT-3 model to your own specification using carefully crafted training data. If cost is an issue, you can choose between four versions of GPT-3 which differ slightly in their capabilities – Davinci, Ada, Babbage and Curie.
To fine-tune, OpenAI provide a Python CLI app to which you can pass training data in JSONL format.
```jsonl
{"prompt": "", "completion": ""}
{"prompt": "", "completion": ""}
{"prompt": "", "completion": ""}
```
An example of a prompt/completion could be:
`{"prompt": "User says: Hello. \n\nClassify the intent: \n\n###\n\n", "completion": " Greeting\n\n###\n\n"}`
The hashes and newline characters are in keeping with OpenAI’s training data best practices (https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset).
Pricing
Prices are per 1,000 tokens. You can think of tokens as pieces of words, where 1,000 tokens are about 750 words. This paragraph is 35 tokens.
The overall cost depends on which model you choose. The prices per 1,000 tokens are as follows:
– Davinci (most powerful) = $0.02
– Curie = $0.002
– Babbage = $0.0005
– Ada (fastest) = $0.0004
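To put those numbers in context, classifying the 35-token paragraph above with Davinci would cost 35/1,000 × $0.02 ≈ $0.0007 per request.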
Usage with Java/Spring Boot
To use OpenAI’s API with Java, you need to write your own code that interacts with the fine-tune and completion endpoints. As mentioned earlier, the Python CLI is excellent, but for our use it doesn’t give us the flexibility to plug into our application easily.
Spring’s RestTemplate is a good starting point: it has most of the functionality needed to build your own client that communicates with OpenAI.
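As a rough sketch, a RestTemplate call to the completions endpoint against a fine-tuned model might look like this; the model name is a placeholder, and the prompt mirrors the “###” separator format from the training data above:

```java
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.web.client.RestTemplate;

public class OpenAiClient {
    public static void main(String[] args) {
        RestTemplate rest = new RestTemplate();

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        headers.setBearerAuth(System.getenv("OPENAI_API_KEY"));

        // The model name is a placeholder for your fine-tuned model
        String payload = """
                {
                  "model": "YOUR_FINE_TUNED_MODEL",
                  "prompt": "User says: Hello. \\n\\nClassify the intent: \\n\\n###\\n\\n",
                  "max_tokens": 5,
                  "temperature": 0
                }
                """;

        String completion = rest.postForObject(
                "https://api.openai.com/v1/completions",
                new HttpEntity<>(payload, headers),
                String.class);

        // The returned completion text holds the predicted intent label
        System.out.println(completion);
    }
}
```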
Key Takeaways
From our initial experiments, GPT-3.5 is at present the most reliable service for classifying a user’s intents, but we need to conduct more testing to confirm this.
We recommend selecting an NLU service based on your needs: consider your budget, integration capabilities, and customisation requirements when making your decision. We encourage thorough testing to ensure the chosen service aligns with the goals of your chatbot.
- Amazon Lex offers a powerful engine but may need extra training for short utterances.
- Snips NLU is a sound, cost-effective open-source choice, but may need creative integration with Java.
- IBM Watson offers excellent flexibility and responsiveness, but costs can increase with more bots.
- Google Dialogflow offers versatile options, but costs vary based on the version.
- Rasa NLU is a cost-effective open-source option with comprehensive documentation, but may involve a complex integration.
- OpenAI GPT-3.5 offers reliable intent classification, but requires custom code for integration and fine-tuning for control.