Chatbots are stupid

My mission is to apply cutting-edge AI technologies in a B2B context to reduce the friction of doing business, especially for small companies and the Chief of Everything (COE). Perhaps, like mine, your interest has been piqued by all the recent noise about ChatGPT, which is based on a large generative language model that produces human-like output. Although ChatGPT shows great promise, it is not yet ready for business.

In this article, I provide another short demonstration of why ChatGPT is nonetheless a huge step in the right direction.

Artificial Stupidity

Put succinctly: chatbots are stupid.

No offense to all the brilliant people who have contributed to this complex and difficult field. Passing the Turing Test is hard. (I had to retake it 3 times myself.) So many people are doing so many amazing things, and we are getting closer. But chatbots ain't it.

Chatbots are mainly for B2C use, but I use them as a representative example of AI in business because they are so well known. Traditional chatbots, of which I take Google Dialogflow to be representative, require logic, structure, planning, and lots of care. Producing a reasonably successful chatbot takes work by a trained expert. That is why developing a chatbot is still a somewhat costly proposition, though it continues to become more accessible. For now, chatbots provide value mainly where the volume of interactions is high, such as customer interactions in a B2C setting.

Until the cost of producing a chatbot drops much further, chatbots will only be useful in situations where the volume warrants the investment. In other words, AI will remain out of reach in many B2B contexts that are, by nature, not repetitive enough to warrant a large investment.

A brief introduction to Dialogflow

Google Dialogflow is a natural language processing (NLP) platform that allows developers to build conversational interfaces, such as chatbots and virtual assistants. The platform uses machine learning algorithms to understand and interpret natural language input from users, and can generate responses based on predefined rules and data.

Google provides two editions. Dialogflow CX provides an advanced agent type suitable for large or very complex agents. Dialogflow ES, for its part, provides the standard agent type suitable for small and simple agents.

In this demo, I used Dialogflow ES. I will explain what I did using as few technical terms as possible, but a few words of background are needed to understand the experiment.

One important concept of Dialogflow is what they call an agent. An agent is a natural language understanding module that interprets the nuances of human language, and handles conversations with the user.

Dialogflow also uses a concept called intents to partition a dialog into smaller chunks. When the user enters a prompt, it is parsed and fed into a machine learning (ML) model, which attempts to match it to an intent. An intent could be virtually any topic, provided that it is simple enough to deal with. What counts as "simple" is a design decision based on the experience of the chatbot programmer.

When a new agent is instantiated, Dialogflow automatically provides two intents out of the box: a Default Fallback Intent and a Default Welcome Intent. These will be sufficient for this very simple illustration.

The Default Welcome Intent handles the initial greeting from the user. For instance, when the user appears for the first time, the agent would parse the initial "Hello" message and match it to the welcome intent.

The Default Fallback Intent is matched when the agent fails to match the user's prompt with any other intent.

To simplify the test, I programmed the Default Welcome Intent to always respond with 👋 (emoji for a waving hand), and the Default Fallback Intent to always respond with 🤯 (emoji for an exploding head).
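
The welcome-and-fallback setup above can be imitated with a toy matcher. This is only a minimal sketch and not how Dialogflow works internally: Dialogflow matches intents with an ML model, whereas the hypothetical `match_intent` function below checks the first word against a hand-written set of greetings (`WELCOME_PHRASES`, an assumption made for illustration). It shows the control flow only: match a known intent, otherwise fall back.

```python
import string

# Hypothetical greeting phrases for the welcome intent (an assumption for
# illustration; a real Dialogflow agent uses its own phrases and an ML matcher).
WELCOME_PHRASES = {"hi", "hello", "hey"}

# Responses as configured in the experiment described above.
RESPONSES = {
    "Default Welcome Intent": "👋",
    "Default Fallback Intent": "🤯",
}


def match_intent(prompt: str) -> str:
    """Return the name of the intent that the prompt is matched to."""
    # Normalize: lowercase, strip ASCII punctuation, split into words.
    cleaned = prompt.lower().translate(str.maketrans("", "", string.punctuation))
    words = cleaned.split()
    # Match the welcome intent when the prompt opens with a known greeting.
    if words and words[0] in WELCOME_PHRASES:
        return "Default Welcome Intent"
    # Nothing else matched, so the fallback intent handles the prompt.
    return "Default Fallback Intent"


def reply(prompt: str) -> str:
    """Return the canned response for the matched intent."""
    return RESPONSES[match_intent(prompt)]
```

With this sketch, `reply("Hi!")` returns the waving hand, while any prompt that does not open with a listed greeting returns the exploding head.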

Battle of the bots

Using the Dialogflow configuration described above and the default chat example from OpenAI, I gave both Dialogflow and ChatGPT some very simple prompts.

First up, Dialogflow.

User:  Hi!

Agent: 👋

User:  Wassup?

Agent: 🤯

OK, it didn't take much to break it. Over to ChatGPT:

Human: Hi!

AI:    Hi there! How can I help you?

Human: Wassup?

AI:    Not too much, just trying to help you out! What can I do for you?

ChatGPT wins this round hands down. Let's try again.

User:  Hi chatbot, can you help me?

Agent: 👋

User:  That wasn't a very helpful reply. You don't seem too smart to me.

Agent: 🤯

Two completely useless replies.

Human: Hi chatbot, can you help me?
AI:    Absolutely! What can I do for you?
Human: That wasn't a very helpful reply. You don't seem too smart to me.
AI:    I apologize if my response didn't give you the help you need.
       I'm here to provide assistance, so please let me know what I can do for you.

Clearly, ChatGPT is the winner again. Let's try one last time.

User:  I need some help, please!

Agent: 🤯

Even this simple phrase doesn't work without instructing Dialogflow what to do.

Human: I need some help, please!
AI:    Sure, what can I do for you?
Human: What products do you offer?
AI:    We offer a wide variety of products and services,
       including software, hardware, and services.
       We specialize in customizing solutions to fit your
       needs. Do you have any particular product or service in mind?

ChatGPT provides very friendly, human-like output, but it immediately starts confabulating, describing products and services it was never told about.

From this very simple demonstration, there are a number of observations we can make:

Instructions

Dialogflow has to be told what to do, and it has to be told almost everything
ChatGPT does not need to be told what to do. It has an answer for pretty much everything (though not always the desired answer)

Details

Dialogflow requires a lot of details to work
ChatGPT seems to require very few details

Limitations

Dialogflow seems to hit its limitations very quickly
ChatGPT seems to have much more flexibility

If Dialogflow is representative of the current generation of chatbots, then we can assert that chatbots are cumbersome, inflexible, and costly. They are sometimes useful and are an impressive feat of technological advancement, but nonetheless extremely limited from a conversational point of view.

That said, because of ChatGPT's tendency to confabulate and hallucinate, there is currently not much choice in a business context: use a traditional chatbot, and invest lots of time and money to make it somewhat conversational.

My conclusion is that generative AI like ChatGPT provides much better, more intelligent, human-like output with considerably less effort. Once we overcome some of the current problems with ChatGPT, we will have intelligent chatbots that are accessible to all, rather than stupid chatbots that are difficult to implement.

Ways to make AI accessible for B2B

Generative models like ChatGPT will soon solve the problems that traditional chatbots suffer from.

Because they only require some prompting, and are based on a more natural way of thinking, they are much more accessible than traditional chatbots. I predict that very soon, this will drastically reduce the costs of design and implementation, making AI more accessible in a B2B context, especially for small companies.

In our experiments, we have gotten fair results so far by considering the following in our application designs:

Including precise context. By augmenting the prompt with the context of the business and its current processes, we get very good responses from the model.
Including guardrails. By including a number of guardrails, we are able to reduce the hallucinations and confabulations that occur when the model goes off script.
Never leaving the model unattended. By always including human supervision, and making it effortless for the supervisor to manage, we can ensure that hallucinations and confabulations don't get sent out. These can actually be used as opportunities to further train the model.
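
The three considerations above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not our actual implementation: `build_prompt`, `ReviewQueue`, and the prompt layout are hypothetical names and choices, and the call to the generative model is deliberately left out. The sketch shows only how context and guardrails can be folded into the prompt, and how drafts can be held until a human supervisor approves them.

```python
from dataclasses import dataclass, field
from typing import List


def build_prompt(business_context: str, guardrails: List[str], user_message: str) -> str:
    """Combine business context, guardrails, and the user's message into one prompt.

    The layout (Context / Rules / Customer / Assistant) is an assumption made
    for illustration, not a format required by any particular model.
    """
    rules = "\n".join(f"- {rule}" for rule in guardrails)
    return (
        f"Context:\n{business_context}\n\n"
        f"Rules:\n{rules}\n\n"
        f"Customer: {user_message}\n"
        "Assistant:"
    )


@dataclass
class ReviewQueue:
    """Holds model drafts until a human supervisor approves or rejects them."""

    pending: List[str] = field(default_factory=list)
    sent: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)

    def submit(self, draft: str) -> None:
        # Drafts are never sent directly; they wait for human review.
        self.pending.append(draft)

    def review(self, approve: bool) -> str:
        # The supervisor handles the oldest pending draft first.
        draft = self.pending.pop(0)
        (self.sent if approve else self.rejected).append(draft)
        return draft
```

A rejected draft stays in `rejected`, where it could later be collected as an example of what the model should not say, in line with the idea of using such cases to improve the model.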

Conclusion

Chatbots are stupid.

However, ChatGPT is a game changer, putting new possibilities on display. We predict that AI will be commonplace in the B2B setting very soon.