RAG-Based Chatbot Assistant System

Abstract

This research evaluates Retrieval-Augmented Generation (RAG) chatbots for student support in higher education. Our system uses multiple Large Language Models (LLMs), including GPT-3.5, Gemini, LLaMA, Mistral, and DeepSeek, to provide personalized academic assistance to students across disciplines. The project addresses the limitations of existing research through a comparative study of how effective different LLMs are in educational contexts, with a focus on developing a chatbot that helps students with course materials for subjects offered at NTNU.

In systematic experiments on three university courses, GPT-RAG achieves the highest answer correctness and answer relevancy on every course, while RAG-Gemini achieves the highest faithfulness. A pilot study with Bachelor students at NTNU Gjøvik supports the practical usefulness of the system in educational settings.

Research Questions

RQ1: To what extent does the use of a RAG-based chatbot improve the accuracy and relevance of responses compared to traditional intent-based chatbots in a higher education setting?

RQ2: In the context of student queries, how effectively can a RAG-based chatbot retrieve relevant information from course materials and generate contextually appropriate responses?

RQ3: Does the integration of RAG with generative AI models enhance the chatbot's ability to handle complex or ambiguous queries from students, offering more personalized and insightful answers?

System Architecture

RAG-based Chatbot Framework: Our system implements a RAG architecture that supports multiple file formats, vector storage, and multi-LLM integration.

Core Components:

Document Processing: Supports PDF, TXT, PPTX, CSV, XLSX, and DOCX formats
Vector Storage: Uses FAISS and Chroma for efficient similarity search
Multi-LLM Support: Integrates GPT-3.5, Gemini, LLaMA, Mistral, and DeepSeek models
Web Interface: Provides an intuitive web-based interface for student interactions
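
The retrieval step behind these components can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the project's implementation: the word-count `embed` function stands in for a real embedding model, the sorted list stands in for a FAISS or Chroma index, and the function names and sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase word counts (assumption: stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute-force search)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Assemble a prompt that restricts the answer to the retrieved context."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the course material below.\n"
        f"Course material:\n{context}\n"
        f"Question: {query}"
    )


# Made-up example chunks, not taken from the actual course materials
course_chunks = [
    "A primary key uniquely identifies each row in a relational table.",
    "Normalization reduces redundancy by splitting tables into smaller relations.",
    "A histogram shows the distribution of a numeric variable.",
]
question = "What does a primary key identify?"
top_chunks = retrieve(question, course_chunks, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real deployment the brute-force `retrieve` would be replaced by a vector index query, and the prompt would be passed to the selected LLM.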

Technology Stack: FastAPI, LangChain, OpenAI API, Google Generative AI, FAISS, Chroma, RAGAS, Python 3.8+
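
One way to picture the multi-LLM integration is a registry that sends the same prompt to every backend so the answers can be compared. The sketch below is an assumption about the design, not the project's code: the backends are stubs that only echo the prompt, and `make_stub_backend`, `BACKENDS`, and `ask_all` are hypothetical names. A real backend would wrap the provider's client library.

```python
from typing import Callable, Dict

Backend = Callable[[str], str]


def make_stub_backend(name: str) -> Backend:
    """Return a placeholder backend; a real one would call the provider's API."""
    def generate(prompt: str) -> str:
        return f"[{name}] answer to: {prompt}"
    return generate


# Model names from the system description; the callables are stubs (assumption)
BACKENDS: Dict[str, Backend] = {
    name: make_stub_backend(name)
    for name in ("GPT-3.5", "Gemini", "LLaMA", "Mistral", "DeepSeek")
}


def ask_all(prompt: str, backends: Dict[str, Backend]) -> Dict[str, str]:
    """Send one prompt to every backend, isolating failures per backend."""
    answers = {}
    for name, generate in backends.items():
        try:
            answers[name] = generate(prompt)
        except Exception as exc:  # one failing provider should not block the rest
            answers[name] = f"error: {exc}"
    return answers


answers = ask_all("Explain third normal form.", BACKENDS)
for name, text in answers.items():
    print(name, "->", text)
```

Keeping every backend behind the same call signature means the retrieval and evaluation code does not need to know which model produced an answer.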

Experimental Design

Course Selection & Evaluation:

Course Code	Course Name	Institution
IDATG2204	Data Modeling and Database Systems	NTNU Gjøvik, Norway (Computer Science Department)
1IK172	Introduction to Data Analytics	Linnaeus University (LNU), Sweden
CS102	Programming Fundamentals	Sukkur IBA University (SIBAU), Pakistan

Pilot Study Setup: Deployment to Bachelor students in Computer Science, NTNU Gjøvik

Generated Dataset Evaluation Results

Key Finding: GPT-RAG achieved the highest answer correctness and answer relevancy on all three courses. RAG-Gemini achieved the highest faithfulness on all three courses, while RAG-Mistral achieved the highest context precision on two of the three (Data Modeling & Database Systems and Programming Fundamentals). Context recall differed little between models.

Dataset	Model	Answer Correctness	Answer Relevancy	Faithfulness	Context Precision	Context Recall
Data Modeling & Database Systems	GPT-RAG	0.71	0.98	0.68	0.73	0.81
	RAG-Gemini	0.63	0.87	0.82	0.73	0.80
	RAG-Llama	0.50	0.90	0.61	0.73	0.80
	RAG-Mistral	0.53	0.94	0.66	0.88	0.79
	RAG-Deepseek	0.57	0.73	0.56	0.77	0.78
Data Analytics	GPT-RAG	0.66	0.95	0.73	0.60	0.72
	RAG-Gemini	0.53	0.78	0.84	0.60	0.70
	RAG-Llama	0.47	0.94	0.66	0.71	0.75
	RAG-Mistral	0.45	0.93	0.74	0.65	0.67
	RAG-Deepseek	0.53	0.75	0.69	0.66	0.74
Programming Fundamentals	GPT-RAG	0.70	0.88	0.61	0.67	0.69
	RAG-Gemini	0.65	0.78	0.72	0.64	0.68
	RAG-Llama	0.58	0.76	0.54	0.68	0.68
	RAG-Mistral	0.65	0.84	0.56	0.73	0.68
	RAG-Deepseek	0.63	0.64	0.56	0.67	0.71
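
To compare models across courses, the table can be summarized with a per-model average. The sketch below copies the scores from the table above and takes an unweighted mean over the three courses. The equal weighting is an assumption (the page does not state how many questions each course dataset contains), and `mean_scores` and `best_model` are hypothetical helper names.

```python
from statistics import mean

METRICS = ("answer_correctness", "answer_relevancy", "faithfulness",
           "context_precision", "context_recall")

# One tuple per course, in table order:
# Data Modeling & Database Systems, Data Analytics, Programming Fundamentals
SCORES = {
    "GPT-RAG": [(0.71, 0.98, 0.68, 0.73, 0.81),
                (0.66, 0.95, 0.73, 0.60, 0.72),
                (0.70, 0.88, 0.61, 0.67, 0.69)],
    "RAG-Gemini": [(0.63, 0.87, 0.82, 0.73, 0.80),
                   (0.53, 0.78, 0.84, 0.60, 0.70),
                   (0.65, 0.78, 0.72, 0.64, 0.68)],
    "RAG-Llama": [(0.50, 0.90, 0.61, 0.73, 0.80),
                  (0.47, 0.94, 0.66, 0.71, 0.75),
                  (0.58, 0.76, 0.54, 0.68, 0.68)],
    "RAG-Mistral": [(0.53, 0.94, 0.66, 0.88, 0.79),
                    (0.45, 0.93, 0.74, 0.65, 0.67),
                    (0.65, 0.84, 0.56, 0.73, 0.68)],
    "RAG-Deepseek": [(0.57, 0.73, 0.56, 0.77, 0.78),
                     (0.53, 0.75, 0.69, 0.66, 0.74),
                     (0.63, 0.64, 0.56, 0.67, 0.71)],
}


def mean_scores(scores):
    """Average each metric over courses (unweighted; an assumption)."""
    return {
        model: {m: mean(row[i] for row in rows) for i, m in enumerate(METRICS)}
        for model, rows in scores.items()
    }


def best_model(means, metric):
    """Return the model with the highest mean for the given metric."""
    return max(means, key=lambda model: means[model][metric])


means = mean_scores(SCORES)
for metric in METRICS:
    top = best_model(means, metric)
    print(f"{metric}: {top} ({means[top][metric]:.2f})")
```

Under this equal weighting, GPT-RAG ranks first on answer correctness (about 0.69) and answer relevancy (about 0.94), RAG-Gemini on faithfulness (about 0.79), and RAG-Mistral on context precision (about 0.75). Context recall is nearly tied, with all models between roughly 0.71 and 0.74.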

Pilot Study Results

Real-world Validation: In the pilot study with Bachelor students at NTNU Gjøvik, the system handled both content inquiry queries (93 records) and exam preparation queries (38 records). GPT-RAG achieved the highest answer relevancy for both query types, while Gemini-RAG achieved the highest faithfulness.

Query Type	Records	Model	Answer Relevancy	Faithfulness
Content Inquiry	93	GPT-RAG	0.90	0.62
		Gemini-RAG	0.78	0.79
		Mistral-RAG	0.87	0.63
		LLama-RAG	0.80	0.55
Exam Preparation	38	GPT-RAG	0.79	0.62
		Gemini-RAG	0.62	0.66
		Mistral-RAG	0.78	0.57
		LLama-RAG	0.61	0.60

Key Features of Pilot Study:

Dual Response System: General responses vs. Course-specific RAG responses
User Feedback System: Emotional rating icons (😠, 😢, 😐, 😊, 😃)
Bias Mitigation: Unlabeled response presentation to avoid LLM preference bias
Course Material Access: Students can download course materials from the sidebar
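
The dual-response and bias-mitigation features can be illustrated with a small sketch: the general and course-specific answers are shuffled and shown under neutral labels, and the hidden mapping is only used when feedback is stored. This is an assumed design, not the deployed code. The function names are hypothetical, and the 1 to 5 numeric values attached to the rating icons are an assumption.

```python
import random

# Assumed numeric scale for the five rating icons
RATING_SCALE = {"😠": 1, "😢": 2, "😐": 3, "😊": 4, "😃": 5}


def blind_pair(general_answer, rag_answer, rng):
    """Shuffle two answers; return what the student sees and the hidden key."""
    items = [("general", general_answer), ("rag", rag_answer)]
    rng.shuffle(items)
    labels = ["Response A", "Response B"]
    # Shown to the student: neutral label and text only, no source
    shown = [(label, text) for label, (_, text) in zip(labels, items)]
    # Kept server-side: which source is behind each neutral label
    key = {label: source for label, (source, _) in zip(labels, items)}
    return shown, key


def record_feedback(key, slot, emoji):
    """Resolve the rated slot to its hidden source and a numeric rating."""
    return {"source": key[slot], "rating": RATING_SCALE[emoji]}


rng = random.Random(0)  # seeded here only to make the example repeatable
shown, key = blind_pair("General answer text", "Course-specific answer text", rng)
feedback = record_feedback(key, "Response A", "😊")
print(shown)
print(feedback)
```

Because the student only sees "Response A" and "Response B", a rating cannot be influenced by knowing which system produced the answer.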

Key Contributions

🎯 Personalized Educational Support

Development of a RAG-based chatbot specifically designed to provide personalized educational support to students in higher education.

🔬 Comprehensive LLM Evaluation

Assessment of the effectiveness of RAG-based chatbots in comparison to traditional intent-based systems, focusing on accuracy, relevance, and context-awareness.

🚀 Novel Integration Methodology

Introduction of a novel methodology for integrating course materials into a RAG-based chatbot, demonstrating potential for personalized student support.

📊 Cross-institutional Validation

Evaluation on courses from three universities in different countries and academic systems, which supports the generalizability of the findings.

Implementation & Live Demo

Try the System: Our RAG-based educational chatbot is available for demonstration. The system features a modern web interface with real-time chat functionality and supports multiple LLM backends.

System Features:

📄 Multi-format Support

PDF, TXT, PPTX, CSV, XLSX, DOCX

🤖 Multiple LLMs

GPT-3.5, Gemini, LLaMA, Mistral, DeepSeek

🔍 Vector Search

FAISS and Chroma integration

📊 RAGAS Evaluation

Comprehensive metrics framework

Future Directions

This research establishes a foundation for advanced educational AI systems that bridge the gap between course content and personalized student support. Future work will focus on:

Multi-modal Support: Integration of image and video content for richer educational experiences
Advanced Personalization: User behavior learning and adaptive response generation
Real-time Collaboration: Multi-user study sessions and peer learning support
Mobile Application: Native mobile app development for enhanced accessibility
Advanced Analytics: Detailed learning analytics dashboard for educators
Cross-institutional Deployment: Scaling across multiple universities and educational systems

Contact & Acknowledgments

Research Team

Lead Researcher: Ali Shariq Imran (NTNU)

Researchers: Abdul Manaf (NTNU), Nimra Mughal (Sukkur IBA), Zenun Kastrati (Linnaeus), Sher Muhammad Daudpota (Sukkur IBA)

Institutions: NTNU Gjøvik (Norway), Sukkur IBA University (Pakistan), Linnaeus University (Sweden)

Project Information

Project Website: NORPART Connect

Live Demo: Student AI Navigator

Code Repository: GitHub

Status: Under Review

This research is part of ongoing work in educational technology and AI-assisted learning at NTNU Gjøvik, Norway.

🎓 Enhancing Student Support with a RAG-Based Chatbot Assistant System