At a client hackathon in the Netherlands, CGI AI Consultant Koen van Kan and his conversational AI team saw in the data limitations of the free version of ChatGPT an opportunity to develop a robust chatbot tool that merges the reasoning of a large language model (LLM) with custom data to provide accurate, personalized answers.
What follows is a summary of his interview with Teus Molenaar for Computable on July 10, 2023, in which he discusses the approach to the design and build of the solution, as well as the technical and ethical considerations for new AI tools.
Compiling a complete, up-to-date knowledge base
An LLM’s factual knowledge is limited to the information included in its training dataset, and for the tool in question the most recent data dates only to 2021. After identifying this limitation, Van Kan’s team sought to create a solution that automates the search across multiple large data sources and adds custom data to the LLM’s powerful reasoning capabilities, so organizations can apply it to specific use cases.
Van Kan’s team started by populating a knowledge base with documents containing the requisite information for the chatbot to respond to user questions, using the vector database tool Pinecone.
In this process, the documents’ texts are first converted into numbers (embeddings) that represent the semantic meaning of the text. Texts with similar meanings receive similar embeddings and therefore sit close to each other in the vector space.
“To illustrate, embeddings of ‘man’ and ‘king’ are more similar than embeddings of ‘man’ and ‘chess,’” Van Kan tells Molenaar. “Converting textual data to numerical embeddings is done by a machine learning (ML) model.” In this project, the team used OpenAI’s embedding model text-embedding-ada-002.
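The ‘man’, ‘king’ and ‘chess’ comparison can be sketched in a few lines of Python. The example below is a minimal illustration, not the team’s code: the three-dimensional vectors are hand-picked toy values (real text-embedding-ada-002 embeddings have 1,536 dimensions and come from the OpenAI API), and cosine similarity is assumed as the similarity measure, which the interview does not specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean 'similar'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# ASSUMPTION: hypothetical toy embeddings, chosen by hand for illustration only.
toy_embeddings = {
    "man": [0.9, 0.8, 0.1],
    "king": [0.8, 0.9, 0.3],
    "chess": [0.1, 0.3, 0.9],
}

man_king = cosine_similarity(toy_embeddings["man"], toy_embeddings["king"])
man_chess = cosine_similarity(toy_embeddings["man"], toy_embeddings["chess"])

print(f"man vs king:  {man_king:.2f}")
print(f"man vs chess: {man_chess:.2f}")
```

With these toy values, the ‘man’ and ‘king’ vectors point in nearly the same direction, while ‘man’ and ‘chess’ do not, so the first score comes out higher.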
Using semantic search to create meaningful answers
Next, the team used Pinecone to perform a semantic search of the knowledge base, scanning for documents that contain content closest in meaning to the search query. Documents are then ranked from most similar to least similar to the user query.
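The ranking step can be pictured with a small in-memory stand-in for the vector database. This is a sketch under stated assumptions rather than Pinecone’s API: the function `semantic_search`, the document IDs and the toy vectors are all hypothetical, and cosine similarity is assumed as the distance measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_embedding, knowledge_base, top_k=3):
    """Return up to top_k (score, doc_id) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_embedding, embedding), doc_id)
        for doc_id, embedding in knowledge_base.items()
    ]
    scored.sort(reverse=True)  # highest similarity score first
    return scored[:top_k]


# ASSUMPTION: hypothetical document IDs with hand-picked toy embeddings.
knowledge_base = {
    "vacation-policy": [0.9, 0.1, 0.2],
    "expense-claims": [0.2, 0.9, 0.1],
    "office-parking": [0.1, 0.2, 0.9],
}

# Stands in for the embedding of a user question about taking leave.
query_embedding = [0.8, 0.2, 0.1]

results = semantic_search(query_embedding, knowledge_base, top_k=2)
for score, doc_id in results:
    print(f"{doc_id}: {score:.2f}")
```

A dedicated vector database performs the same kind of ranking, but with indexing techniques that keep the search fast across very large collections.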
As Van Kan explains, “Once you have the relevant information, we harness the power of LLMs to come up with a meaningful answer.” As an additional step to ensure meaningful answers for users, the CGI team taught the chatbot to report that ‘no answer is possible’ if the knowledge base does not contain sufficient matching information.
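The interview does not describe how the ‘no answer is possible’ behavior was implemented. One possible approach, sketched below, is to discard search results whose similarity score falls under a threshold and to skip the LLM call entirely when nothing remains. The threshold value of 0.75, the function names and the `fake_llm` stand-in are all assumptions made for illustration.

```python
NO_ANSWER = "No answer is possible."


def select_context(matches, min_score=0.75):
    """Keep only the passages whose similarity score reaches min_score."""
    return [text for score, text in matches if score >= min_score]


def answer_or_fallback(matches, generate_answer, min_score=0.75):
    """Call the LLM only when the knowledge base returned a good enough match."""
    context = select_context(matches, min_score)
    if not context:
        return NO_ANSWER
    return generate_answer(context)


def fake_llm(context):
    """ASSUMPTION: placeholder for a real LLM call; it only echoes its input."""
    return "Answer based on: " + " | ".join(context)


# Hypothetical search results as (similarity score, passage text) pairs.
good_matches = [(0.91, "Passage about leave requests."), (0.30, "Passage about parking.")]
weak_matches = [(0.42, "Passage about parking.")]

print(answer_or_fallback(good_matches, fake_llm))
print(answer_or_fallback(weak_matches, fake_llm))
```

A suitable threshold depends on the embedding model and the data, so in practice it would need to be tuned on real questions.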
Enhancing with prompt engineering to optimize performance
The last step in the team’s process was to use prompt engineering to refine the interaction with the AI tool and optimize the answer provided. As detailed in the interview, prompt engineering (also known as prompt design) is a technique used in natural language processing (NLP) that entails carefully crafting input instructions to be used by the LLM for generating its output.
Per Computable: “In this process, a machine-learning model is trained on large amounts of text data to predict the next word in the sentence. The model can then be used to generate new text by predicting the most likely next word, based on the previous words in the sentence.”
While prompt engineering can also help with translation, OpenAI’s LLM already handles multiple languages, so the language in which a document is written does not present a barrier. Questions and answers can thus be provided in any language, regardless of the language in which knowledge base documents are written. Additionally, you can personalize the answer to a user’s age demographic or background knowledge, as well as tailor the answer to a particular bot persona or tone of voice, aligned with a company’s brand guidelines.
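To make this concrete, a prompt that combines retrieved passages with persona, audience and language instructions could be assembled as follows. This is a minimal sketch: the wording of the instructions, the function `build_prompt` and its parameters are hypothetical and are not taken from the CGI solution.

```python
def build_prompt(question, context_passages,
                 persona="a helpful assistant",
                 audience="a general audience"):
    """Assemble one prompt string from instructions, context and the question."""
    context_block = "\n".join(f"- {passage}" for passage in context_passages)
    return (
        f"You are {persona}.\n"
        f"Write for {audience}, in the same language as the question.\n"
        "Use only the context below. If the context does not contain "
        "the answer, reply that no answer is possible.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical inputs for illustration only.
prompt = build_prompt(
    question="Hoe vraag ik verlof aan?",
    context_passages=["Leave requests are submitted through the HR portal."],
    persona="a friendly HR assistant",
    audience="new employees",
)
print(prompt)
```

Here the question is in Dutch and the context passage in English; the instruction to answer in the language of the question leaves the translation to the LLM.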
Exploring use cases while considering best practice
As Van Kan shares in the interview, linking an LLM to a knowledge base creates a powerful tool with many potential application areas. From automating answers to HR-related questions from employees to surfacing real-time pop-up knowledge articles for contact center employees during customer conversations, the goal is to make interactions more efficient and enjoyable.
However, Van Kan stresses that such a tool should only be used if there is a good knowledge base for which both the LLM and semantic search are feasible and desirable. If a knowledge base is not accurate or up-to-date, there is a risk of inaccurate or outdated data being retrieved, resulting in factually inaccurate answers.
He explains to Computable: “When building an embedding vector database, it is important to carefully decide what constitutes a single document. For example, a PDF, chapter, page or single paragraph? Weigh the need to capture enough information by choosing larger documents against the risk of introducing noise and the difficulty of finding specific details.”
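The trade-off Van Kan describes shows up as soon as a document is split into pieces for embedding. The sketch below contrasts two common options, one chunk per paragraph and fixed-size word windows with overlap. Both helper functions and their default sizes are assumptions for illustration; the interview does not say which granularity the team chose.

```python
def split_into_paragraphs(text):
    """One chunk per paragraph: small and specific, but may lack context."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def split_into_windows(text, max_words=50, overlap=10):
    """Fixed-size word windows; the overlap carries context across boundaries."""
    step = max_words - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


# Hypothetical sample text: twelve placeholder words in three paragraphs.
sample = "w0 w1 w2 w3\n\nw4 w5 w6 w7\n\nw8 w9 w10 w11"

paragraph_chunks = split_into_paragraphs(sample)
window_chunks = split_into_windows(sample, max_words=5, overlap=2)

print(paragraph_chunks)
print(window_chunks)
```

Larger chunks capture more surrounding information per embedding, while smaller chunks make specific details easier to retrieve, which mirrors the balance described in the quote.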
Adopting a responsible, ethical approach to AI for successful outcomes
As increasing numbers of AI and ML solutions and use cases emerge, more organizations are adopting AI to automate and create processes or outputs. However, there is still work to be done to ensure the source data and resulting products are accurate, effective and ethical.
In the new CGI chatbot architecture, for example, there are multiple instances in which data is sent to third-party software tools. This necessitates thorough evaluation of these organizations’ privacy policies and of the sensitivity of your data. Van Kan also recommends consideration of “data storage policies, as the location of the data center where these tools store their data may vary, along with the associated laws they adhere to.”
In addition to technical elements, ethical considerations such as fairness, limitation of bias and equal treatment of user groups are critical in the development of interactive technologies. Particular attention should be paid to ‘explainability,’ or ensuring that the reasoning behind decisions or behaviors of an AI or ML tool can be understood by humans.
Van Kan is aware of his responsibility in creating a tool that relies on AI and ML to provide actionable information to human end-users. His take? “You generate text with a language model, so you always have to check the outcome with your own expertise.”
Visit the original Computable article in Dutch