Design and Implementation of a Question Answering System Using Vector Indexing and Large Language Models

Natalia Alonso Bercebal. (2025). Design and Implementation of a Question Answering System Using Vector Indexing and Large Language Models. Bachelor's thesis (Trabajo Fin de Titulación, TFG). Universidad Politécnica de Madrid, ETSI Telecomunicación.

Abstract:
With the enormous amount of digital information available today, accessing the knowledge we need has become much easier. Platforms like Wikipedia have become essential tools for learning and finding information. However, the search engines we commonly use often struggle to understand the intent behind our questions, so the answers they return are not always the most relevant or accurate. This project focuses on the design and implementation of a question-answering system whose knowledge base is built from Wikipedia articles filtered by a specific category. The main goal is to allow any user to ask questions in natural language and receive clear, contextualized responses. To make this possible, the system combines Large Language Models (LLMs), semantic retrieval based on embeddings, and storage in a vector database, all integrated into an architecture known as Retrieval-Augmented Generation (RAG). The system retrieves the text fragments relevant to a question, analyzes them, and generates a response using the Llama 3 model, all through a user-friendly web interface built with Streamlit. Several additional features have been implemented, such as article management (adding or removing content) and a conversational memory that helps the system understand the context of each question, so that the user can interact with it in a simple and intuitive way. To test the system, the category of contemporary European history was selected, although the system is designed to be adaptable to other topics in the future. Results show that the system is capable of generating coherent and reliable responses, even when dealing with questions outside the original domain. Thanks to its modular design, it can be extended and updated without complex changes.
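
The retrieve-then-generate flow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the bag-of-words `embed` function stands in for a real embedding model, `ToyVectorIndex` stands in for the vector database, and `build_prompt` only assembles the text that would be sent to an LLM such as Llama 3 (no model is called). All names, the example fragments, and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """In-memory stand-in for a vector database (hypothetical helper)."""

    def __init__(self):
        self.items = []  # list of (fragment, vector) pairs

    def add(self, fragment):
        self.items.append((fragment, embed(fragment)))

    def search(self, query, k=2):
        """Return the k fragments most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [fragment for fragment, _ in ranked[:k]]


def build_prompt(question, fragments, history=()):
    """Assemble the retrieved context, prior turns, and the question into one prompt."""
    context = "\n".join(f"- {fragment}" for fragment in fragments)
    memory = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{memory}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Illustrative fragments, standing in for chunks of Wikipedia articles.
index = ToyVectorIndex()
index.add("The Berlin Wall fell on 9 November 1989.")
index.add("The Treaty of Rome was signed in 1957.")
index.add("Streamlit is a Python library for building web apps.")

question = "When did the Berlin Wall fall?"
fragments = index.search(question, k=2)
prompt = build_prompt(question, fragments)
print(prompt)  # in a full system, this prompt would be passed to the LLM
```

In a complete system the toy pieces would be replaced by a trained embedding model and a persistent vector store, and the `history` argument would carry earlier question-answer pairs to provide the conversational memory mentioned above.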