Developing a Chatbot using Internal Knowledge Bases

AI · NLP · Teaching

Research, design, and implement a RAG-based AI chatbot for efficient internal information retrieval.

Published: October 1, 2024

Photo: © Anne Gärtner

  • Master Thesis
  • Completed
  • Jonas Wilinski

Introduction

Employees in information-dense environments like utility companies often struggle to find precise information scattered across a variety of internal applications, documents, and databases. Traditional keyword search methods are frequently inefficient, leading to delays and reduced productivity. The recent advancements in Large Language Models (LLMs) and Artificial Intelligence (AI) present a significant opportunity to revolutionize these internal information retrieval processes. By implementing a Retrieval-Augmented Generation (RAG) system, companies can develop intelligent chatbots capable of understanding natural language queries and providing swift, accurate, and contextually relevant answers grounded in internal knowledge sources.

This master thesis aims to research, design, and implement a RAG-based AI chatbot for a company. The goal is to create a secure, scalable system that gives employees efficient access to company-specific data. The research develops a proof of concept using open-source frameworks and models, building a chatbot that can navigate large internal data collections, thereby enhancing workplace productivity and streamlining access to information.

Research Objectives

  • Design and Implement a Modular RAG System: Develop a modular and scalable RAG system using open-source frameworks and LLMs, ensuring that sensitive corporate data is handled securely on-premise without reliance on proprietary third-party services.
  • Create a Specialized Knowledge Base (KB): Construct a comprehensive Knowledge Base from a complex internal source, the Wiki. This involves creating a full data pipeline for acquiring, preprocessing, cleaning, and ingesting varied data formats (HTML, PDFs) into a vector database.
  • Integrate with Existing Enterprise Infrastructure: Integrate the RAG system into the company’s existing, well-adopted chatbot platform to ensure a seamless user experience and give all employees easy access through established authentication mechanisms.
  • Evaluate System Performance and User Acceptance: Assess the chatbot’s effectiveness through a two-fold evaluation process: 1) conducting a quantitative, metrics-based evaluation of the system’s retrieval accuracy and response quality, and 2) gathering qualitative feedback on usability and performance directly from employees through a user survey.
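
The KB-construction objective above ends with documents being split into chunks and embedded into a vector database. As an illustrative sketch (the function name and parameters are assumptions, and a production pipeline would typically use a framework's text splitter), a minimal fixed-size chunker with overlap might look like this:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks.

    Overlap between consecutive chunks helps preserve context that
    would otherwise be cut at a chunk boundary before embedding.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each resulting chunk would then be embedded and stored alongside its source metadata, so retrieved passages can be traced back to the originating wiki page.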

Methodology

  • Data Processing and Knowledge Base Creation: The project begins with the selection of the Wiki as the primary data source for the proof of concept. A data acquisition pipeline is built to crawl the wiki and download all relevant HTML and PDF documents. This is followed by a rigorous preprocessing phase, which includes cleaning irrelevant content, converting hyperlinks to Markdown format for better LLM interpretation, and filtering out overly complex or outdated files. The cleaned data is then split into manageable chunks, embedded into vectors, and loaded into a Weaviate vector database to create the final, searchable Knowledge Base.
  • System Design and Architecture: The system is designed as a microservice intended to integrate with the existing chatbot infrastructure. The architecture prioritizes security, privacy, and scalability by exclusively using self-hosted, open-source components. Key technologies selected include:
    • LangChain: An AI orchestration framework used to structure the RAG pipeline, from data loading to prompt construction.
    • Weaviate: A vector database chosen for its powerful multi-tenancy feature, which allows for the creation of multiple, isolated KBs within a single instance, enabling role-based access for different user groups.
    • Ollama: A tool for serving LLMs and embedding models locally on the company’s internal GPU servers, ensuring that no sensitive data leaves the network.
  • Evaluation Framework: A two-fold evaluation approach is used to assess the system.
    • Qualitative User Feedback: The chatbot is tested by members of the service center team, who then provide feedback through a detailed survey on aspects like user-friendliness, response quality, and overall system intelligence.
    • Quantitative Metrics Evaluation: A specialized evaluation dataset of question-answer pairs is created with help from domain experts. The system’s retrieval performance is measured by checking if the correct source document was returned. The RAGAs framework is used to calculate metrics on the generated answers, including faithfulness, answer correctness, and context relevance.
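
The retrieve-then-generate loop that the components above orchestrate can be sketched in a framework-agnostic way. In this minimal sketch, `retrieve` and `generate` are stand-ins for the Weaviate similarity search and an Ollama-served LLM; all names and the prompt wording are illustrative assumptions, not the thesis implementation:

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A retrieved passage together with its source identifier."""
    source: str
    text: str


def build_prompt(question: str, docs: list[Document]) -> str:
    """Assemble a grounded prompt from the retrieved context."""
    context = "\n\n".join(f"[{d.source}]\n{d.text}" for d in docs)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def rag_answer(question: str, retrieve, generate, top_k: int = 3):
    """Run one RAG turn: retrieve top-k passages, then generate an answer.

    Returns the answer plus the source documents used, so the UI can
    show provenance for every response.
    """
    docs = retrieve(question, top_k)
    answer = generate(build_prompt(question, docs))
    return answer, [d.source for d in docs]
```

Returning the sources alongside the answer is what makes responses verifiable by employees, and it is also what the retrieval evaluation below checks against.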

Expected Contributions

  • A functional and secure RAG chatbot demonstrating the viability of using open-source, self-hosted LLMs to securely and effectively access internal knowledge bases.
  • A replicable data-to-KB pipeline for processing semi-structured, heterogeneous data from an internal company wiki.
  • A scalable, multi-tenant architecture that integrates with existing enterprise software and supports different departments with isolated, secure access to their specific knowledge bases.
  • A comprehensive evaluation framework combining quantitative metrics with qualitative user feedback.
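
The quantitative half of this evaluation framework, checking whether the correct source document was retrieved, reduces to a top-k hit rate over the expert-curated question–answer set. A minimal sketch (function name and data shapes are assumptions; the answer-level metrics additionally come from the RAGAs framework):

```python
def retrieval_hit_rate(retrieved: dict[str, list[str]],
                       gold: dict[str, str],
                       k: int = 5) -> float:
    """Fraction of evaluation questions whose gold source document
    appears among the top-k retrieved sources (hit rate @ k)."""
    if not gold:
        return 0.0
    hits = sum(1 for question, source in gold.items()
               if source in retrieved.get(question, [])[:k])
    return hits / len(gold)
```

Sweeping `k` shows how much of the gold evidence the retriever surfaces early, which helps tune chunk size and the number of passages passed to the LLM.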

TUHH Institute of Entrepreneurship
Prof. Dr. Christoph Ihl
Am Irrgarten 3
21073 Hamburg
Contact

Email: startup.engineer@tuhh.de
Phone: +49 (0)40 42878-3226
LinkedIn
Directions