Building and Optimizing a Knowledge Base for AI Assistants

Background and Purpose

This SOP outlines best practices for creating and configuring a knowledge base to enable AI assistants to deliver accurate, domain-specific, and business-specific answers. The process ensures consistency and reliability in handling FAQs, pricing, objection handling, and other contextual information.

Concepts

Knowledge is provided as context on a per-interaction basis when engaging with leads or contacts. The system processes the contact's message and compares it to the information in your knowledge base to generate specific responses.
The most effective information for a knowledge base includes key-value pairs or cause-effect relationships, such as FAQs, objection handling strategies, and domain-specific or business-related questions.
The AI does not inherently "know" the knowledge; instead, it uses the information fed to it as context to assist the language model in generating suggested answers during conversations. The LLM does not recognize it as a knowledge base but understands it as contextual input.
The quality of the knowledge output is directly tied to the quality of the input. Poor, inaccurate, or unclear data in the knowledge base will lead to similarly poor responses.

Best Practices

When creating a knowledge base for a client or for production use, it is important to ensure high-quality information and outputs. Using the FAQ input section is recommended as it allows you to monitor and manage the relevant information being added to the knowledge base.
Raw text and FAQ inputs are considered the "best result" or preferred input types for knowledge. This is because document uploads and website scrapes may introduce conflicting or outdated information, which can lead to irrelevant or incorrect answers. While these input types are still valuable and recommended, raw text and FAQ inputs are typically more reliable, especially for enterprise deployments.
You can source raw text from your website, documentation, YouTube transcripts, community posts (ensuring the information is accurate), or any other relevant materials. These can then be used in a general AI tool like ChatGPT or Claude to generate FAQs in various styles and languages. This could include tailoring content for sales, support, or success use cases, or translating into different languages like English or Spanish. The objective input you provide ensures equally objective output.
For businesses that rely on precise information, consider setting the AI's temperature to 0–0.2. This helps ensure deterministic responses based on the provided context.
You can instruct the AI to prioritize using the Context section (injected on the backend) to deliver accurate answers to users.

How do I monitor my knowledge base output?

In the conversation logs, we show a couple of logs that can help you see what is being fed to the AI. In the conversation view, click on the bracket icon “{ }” to open the transparency logs:

Then you you will see logs labeled as “Embedding” and “Embed complete”. You want to open up “Embed complete” to see your knowledge base output for the contact’s message:

If the information is not exactly what you wanted it to be or you want to elaborate on that information in the future - go back to your knowledge base and build an FAQ type around the question that was asked so there is more context the next time this question is asked to give your ideal answer.

How does knowledge really work?

Where things go for reliable and consistent output:

Prompt: Defines personality (identity), response guidelines, style guardrails, important points, and the instruction set. Keep your prompt concise, detailed, and focused.

Knowledge Base: Includes domain-specific knowledge, business-related information, FAQs, objection handling, basic pricing details, and key-value pair outputs (e.g., FAQs).

Tools: Use tools to enhance and scale your prompt. These can support context injection, enable conditional instructions or logic, retrieve data (e.g., appointment booking), and perform specific actions. Tools also allow scaling by calling specific parameters as needed.

What about live data or very complex pricing/data?

Tool calls are highly effective for tasks like retrieving live data or handling complex pricing information. It is good to understand that, AI is very smart, but not intuitive," so issues often arise when attempting to manage quoting or complex operations. In such cases, using a tool call to fetch data or integrating with a third-party service to better structure your data can be a valuable solution. For more details, refer to the custom tool crash course.

Steps to Build and Optimize the Knowledge Base

Prepare the Knowledge Base Framework:
- Log into your AI configuration platform.
- Create a blank assistant with no pre-configured prompts or tools.
- Set the temperature to zero for deterministic and objective answers.
Gather Domain-Specific Content:
- Collect source materials such as website content, FAQ documents, or internal guides.
- Scrape website data or documentation selectively to avoid inaccuracies.
Generate FAQs:
- Paste relevant content into an AI tool (e.g., Claude or ChatGPT) and request:
  - "Generate FAQs to be vector-stored for my chatbot."
- Review and refine the generated questions and answers for accuracy.
Organize and Upload FAQs:
- Separate FAQs into categories:
  - E.g., “FAQ Homepage,” “FAQ Custom Tools,” “FAQ Pricing,” etc.
- Use clear naming conventions to maintain organization.
- Copy-paste each FAQ pair (question and answer) into the knowledge base.
Enhance with Elaborated Responses:
- Use the AI tool to elaborate on FAQs:
  - “Can you expand on these FAQs with a focus on sales-specific and implementation-related questions?”
- Upload the expanded responses under corresponding FAQ categories.
Refine and Validate Data:
- Audit the information in the knowledge base for:
  - Relevance: Ensure answers align with the current business model.
  - Accuracy: Avoid outdated or incorrect data.
- Replace or delete any misaligned entries as needed.
Test Knowledge Base Functionality:
- Ask the assistant sample questions, such as:
  - "How much are voice minutes?"
  - "What plans do you offer?"
- Verify that answers are drawn directly from the knowledge base.
Optimize Answers Through Refinement:
- Refine knowledge base responses based on feedback or observed inaccuracies.
- Add specific details for common user inquiries, e.g.:
  - "What plans do you have?" → Add pricing details and plan differences.
Integrate with Tools for Advanced Features (Optional):
- For live data or complex processes:
  - Integrate with tools like Airtable, Google Sheets, or API endpoints.
  - Use tool calls to retrieve and present live information.
Deploy and Monitor:
- Set response wait time:
  - Zero seconds during testing.
  - 15-20 seconds for production to mimic human-like interaction.
- Regularly review logs to assess embedding and context usage.

Definition of Done

Knowledge base fully populated with accurate, domain-specific data.
AI consistently delivers objective and relevant answers with no prompts.
Testing confirms robust and scalable outputs.
Logs validate that answers are being pulled correctly from the knowledge base.

FAQs

Why is temperature set to zero?
To ensure deterministic outputs, eliminating randomness in answers.
What types of content should go into the knowledge base?
- Pricing details.
- FAQs and objection handling guides.
- Sales and support scripts.
- Domain-specific documentation.
How can I test knowledge base accuracy?
- Ask targeted questions directly related to the uploaded content.
- Check logs to ensure responses are pulled from the correct sources.
What if my website data is outdated?
- Avoid direct scrapes.
- Manually review and refine content before uploading.
Can I update the knowledge base later?
Yes, frequently audit and replace outdated information for sustained accuracy.

Summary: This SOP provides a structured approach to building a robust knowledge base for AI assistants. By leveraging tools like Claude or ChatGPT, you can efficiently generate FAQs and elaborate answers. Organize content with clear naming conventions and validate it rigorously for accuracy. Set the AI assistant's temperature to zero for consistent outputs, and integrate tools for live data if required. Regular monitoring ensures scalability and reliability of the assistant's responses.