ChatGPT is not yet ready for business

For this 4th edition of GPT Chatter, I am going to switch gears and get more technical. I will explain my attempt to build a chatbot using only ChatGPT as the underlying technology, and show why this approach is bound to fail in an actual business setting. ChatGPT is not (yet) a practical “out-of-the-box” solution, but that will soon change!

At AI4B2B, our focus in on helping small businesses work together seamlessly. The promise of technologies like ChatGPT is enormous. As I wrote previously, few-shot prompting will dramatically lower the barrier to adoption, as will the shift from logical thinking to language.

The Internet is still abuzz with examples of all the ways that you can use ChatGPT for a number of tasks that were previously unimaginable. These experiments can be fun, but actually integrating ChatGPT into a serious business workflow is still out of reach.

In this article, I will show a practical attempt at integrating ChatGPT into a business workflow, in order to show its current limitations. Do not be discouraged, however. We and many others are working on ways to circumvent these limitations in order to immediately bring benefits in the business context.

An example business case

Chatbots have become a common means of interacting with customers. These tools can be useful, but have limitations. They cannot replace humans entirely, but they can be used for very specific tasks. Chatbots need to be scripted, trained, and tested, which can quickly become a serious investment. Generally, the higher the desired quality, the more scripting, training, and testing required, which obviously will drive up the costs of adoption.

For my experiment, I will attempt to build a chatbot using only web-based applications provided by OpenAI. I will take the perspective of a non-developer in order to test the process from the perspective of a small business owner attempting to DIY to avoid the costs and hassle of hiring a specialized firm. After all, isn't that the promise of AI?

To increase the chances of success, I will limit the scope of my chatbot to dialoguing with a customer to verify their contact information. If their contact information has changed, I want the chatbot to capture the new information so I can update my customer data in my internal systems.

The most relevant constraint in this experiment is that I trying to simulate use by a typical business person, so I will avoid any application development or programming.

First attempt – asking ChatGPT what to do

For my first attempt, I navigated to ChatGPT and gave it this prompt:

Write a list of ideas about how ChatGPT could be
used in the context of a business.

ChatGPT responded with a list of 10 ideas. The first item on the list was: "Customer service: ChatGPT can be used to provide 24/7 customer service by answering customer inquiries and resolving issues in real-time, freeing up staff to focus on more complex tasks." That sounds pretty close, so I'll go with that idea.

I next asked:

How would I implement a system to help with
item #1 (customer service)?

To which ChatGPT replied:

To implement a customer service system using
ChatGPT, you can follow these steps:

 1. Define the scope: Determine the types of customer
    inquiries that ChatGPT will handle, and identify
    any escalation paths for more complex issues.

 2. Train ChatGPT: Train ChatGPT on relevant customer
    data, such as FAQs, chat transcripts, and support
    tickets. This will help ChatGPT learn to understand
    and respond to customer inquiries.

 3. Integrate ChatGPT with existing customer service
    channels: Integrate ChatGPT with your existing
    customer service channels, such as your website chat,
    social media platforms, and messaging apps.

 4. Implement testing and quality control: Test ChatGPT
    to ensure that it's providing accurate and helpful
    responses to customer inquiries. Use quality control
    measures such as human review and feedback mechanisms
    to continuously improve the system.

 6. Monitor and optimize: Continuously monitor and optimize
    ChatGPT's performance. Track metrics such as response
    time, customer satisfaction, and ticket deflection
    rates to measure the impact of ChatGPT on your customer
    service operations. Optimize the system as needed to
    improve performance and accuracy.

 7. Provide human support when necessary: Despite its
    sophistication, ChatGPT is not perfect and there will
    be cases where it cannot handle customer inquiries.
    In these cases, ensure that the customer is seamlessly
    transferred to a human representative who can handle
    the issue and provide a satisfactory resolution.

Implementing ChatGPT for customer service can provide numerous
benefits, including 24/7 availability, faster response times,
and improved customer satisfaction.

The answer is not necessarily incorrect, but it sure isn't helpful. Most of the tasks above require a certain level of expertise. I asked ChatGPT for more details about point #2 (Train ChatGPT), and things just kept getting worse from there.

At this point, I gave up. Clearly this approach would not work for a typical business person.

Second attempt – Following the examples provided

For my second attempt, I turned to the example prompts provided by OpenAI. In particular, there is an example of a chat.

Following the example, using the OpenAI Playground I provided this prompt:

The following is a conversation with an AI assistant.
The assistant is helpful, creative, clever, and very friendly.

Human: Hello, who are you?

AI:    I am an AI created by OpenAI. How can I help you today?

Human: My contact information has changed.

The session continued as follows, with interesting results to say the least:

AI:    No problem. I can certainly help you update your contact
       information. What are the new details I should update it with?

Human: 123 Main Street

AI:    Got it, 123 Main Street. Anything else that needs
       to be updated?

Human: You tell me! What information do you need from me?

AI:    In addition to the address, I'll need your name, phone
       number, and email address. If you have any other relevant
       contact information, please let me know.

Human: Do you need my credit card number and social security
       number?

AI:    No, I don't need your credit card number and social
       security number. Those are sensitive pieces of information
       and it's better to keep them secure.

Human: So how will I pay for my subscription, then?

AI:    You can use a variety of payment methods depending on the
       service. Many services now accept credit cards and digital
       wallets, such as Apple Pay and PayPal.

Human: Right, but I mean your services. Can you remind me again
       of the services you provide?

AI:    Sure. Our services include providing AI assistants for
       customer service, analytics, natural language processing
       and more. We also have services for developing and deploying
       AI applications.

Human: Ok, I'll have 3, please.

AI:    Great! We offer monthly and yearly subscription packages.
       I can help you select the best package based on your needs.
       What features are you looking for in your AI assistant?

Human: Something cool. How much does it cost?

AI:    That really depends on the features you are looking for.
       Some of our packages offer basic AI assistants starting
       at $5/month, while more comprehensive packages range
       from $50-$200/month.

Human: Here is my credit card number: 1234 5678 9012

AI:    I'm sorry, but I cannot accept credit card numbers.
       For security purposes, we ask that you use our digital
       payment system for payment transactions.

Human: How can I find it? Can you please give me the URL?

AI:    Sure, you can find the digital payment system on our
       website, openai.io/payment. Follow the instructions
       on the page and you'll be able to make a secure payment.

Human: So are we all done?

AI:    Yes, it looks like we are all done. Is there anything
       else I can help you with today?

There are a number of problems, so many, in fact, that I'll only touch on a few of them.

First, the service generates bogus answers. This is called "hallucinating" the answers. It is really easy to lead the chatbot astray. It will happily babble out any answer that is written in proper English, but is more likely to be completely nonsensical in a real business context. In this case, the required information, service descriptions, pricing, and URLs are all completely bogus. By the way, each time you ask for a price, it provides a different answer.

Second, the answers can be contradictory. It is not difficult to trick the chatbot into making contradictory and confusing statements.

Third: the objective was not achieved, but the chatbot told the user that the task was successfully completed.

Conclusion: it is simply not possible to use ChatGPT "out of the box". It requires at the very least some kind of training.

If it's so bad, what's the point?

So if the results are so poor out of the box, and ChatGPT requires training, what's the point? How is this any different, or better, than a "regular" chatbot? What is the interest in ChatGPT and other large language models?

There are a number of promising traits that could potentially be exploited. I believe that we will be able to work around these limitations very soon.

Compared to current chatbots, the potential advantages of a newer generation of chatbots based on large language (LLMs) models are:

The ability to understand natural language. Unless it is based on a LLM like ChatGPT, chatbots will never achieve the same level of understanding of natural language that a LLM can provide.

Ease of use. Using a system like DialogFlow requires specific programming for each stage of a very limited dialogue. The programming requires some level of expertise, so is not immediately accessible to business people. Using a more friendly natural language interface like that of ChatGPT will instantly make the consturction of a chatbot much more accessible.

The ability to generalize very quickly. As written previously, using a technique called "few-shot prompting", LLMs can generalize from a small amount of data and learn from a tiny number of examples and generate astonishingly accurate and relevant responses.

The possibility of replying in multiple languages. LLMs are able to generate text in multiple languages, without any additional training. Many small businesses, with their limited budgets, simply ignore customers who are not able to communicate in their language of operation. An LLM like ChatGPT can open up a whole new world.

The ability to continuously learn and improve. One final trait of LLMs that can provide benefits to small businesses is their ability to continuously improve over time. LLMs can be fine-tuned based on new data and feedback, which can help them become more accurate and relevant as they learn. This can be particularly useful for small businesses that are constantly evolving and changing based on customer needs and feedback.

Where to go from here

Make no mistake. We have reached an inflection point.

ChatGPT may not represent a huge leap forward in terms of existing AI technologies, and it may not yet be ready for prime time in a business setting, but its release has opened a Pandora's box. The limitations shown in this post will be overcome soon enough, and ChatGPT and similar technologies will, without a doubt, be used more and more in a business context.

Stay tuned!