Recently, NLP has made remarkable advancements with models like GPT and BERT. However, these models are trained on static data, so many of their answers are outdated or overly generic. That makes them unfit for domains demanding precise, up-to-date information. This is where Retrieval-Augmented Generation (RAG) comes in.
It is not uncommon for users to ask for very specific information. While building chatbots, I have seen people look for precise data from technical documents. I have used RAG to solve this problem, and it can transform these chatbots: by directly accessing and understanding content from those documents, they generate dynamic, contextual, up-to-date, and more relevant responses.
Since it has proven so beneficial to me, I wanted to share it with other developers. In this article, I will discuss how to implement a RAG pipeline using Spring AI, the challenges involved, and some practical tips.
What is a RAG pipeline?
A RAG pipeline increases the accuracy and quality of AI model responses. It does this by merging two processes: retrieval and generation.
To find the most relevant information for the user's query, the system first searches a large data source, such as the Internet, a collection of articles, or a custom knowledge base.
After that, it generates the final answer using a language model (such as GPT or T5). This time, however, the model incorporates the retrieved information, making the answer more accurate and informative.
How to implement RAG pipeline using Spring AI
The Retrieval-Augmented Generation (RAG) pipeline combines external knowledge retrieval with large language models (LLMs) to enhance their responses.
It’s made up of two key modules: the ETL (Extract, Transform, Load) module and the RAG (Retrieval-Augmented Generation) module. Together, these modules allow for efficient document processing and enhanced AI-driven responses. Let’s break them down:
The ETL module: Extract, transform, load
The ETL module is the first step in preparing documents for the RAG pipeline. It involves three key phases: Extract, Transform, and Load.
Document reader (Extract)
The first phase is the Extract step, where documents are read and parsed. Whether they come from PDFs, Word documents, PowerPoint presentations (PPT), or web pages, the content is extracted and converted into a Document object, which is essentially a structured format that holds the document’s content along with metadata.
Each document is tagged with basic information, like the source (e.g., the file name), and can also include custom metadata, such as a file version or document author. For example, with a PDF, each paragraph or page can be treated as a separate document object, parsed with tools like ParagraphPdfDocumentReader or PagePdfDocumentReader.
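For illustration, here is a minimal sketch of the Extract step using Spring AI's PagePdfDocumentReader. The file path and the version metadata key are assumptions for the example, and the spring-ai-pdf-document-reader module must be on the classpath:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;

class ExtractStep {

    // Read a PDF page by page; each page becomes one Document
    // carrying the page text plus source metadata.
    List<Document> extract() {
        PagePdfDocumentReader reader =
                new PagePdfDocumentReader("classpath:/docs/user-manual.pdf");
        List<Document> documents = reader.get();

        // Attach custom metadata, e.g. a document version (illustrative key).
        documents.forEach(doc -> doc.getMetadata().put("version", "1.0"));
        return documents;
    }
}
```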
Transformers (Transform)
Next, the Transform step kicks in. Here, documents are broken down into manageable chunks. Since most AI models have a fixed context window (i.e., a limit to how much text they can process at once), it's important to split documents to fit within this window. This is where tools like the TextSplitter and TokenTextSplitter come into play. They divide a document into smaller sections based on context size or token count.
Additionally, tools like KeywordMetadataEnricher and SummaryMetadataEnricher can add rich metadata, such as keywords or summaries of each chunk, to further enhance the document’s value during the RAG process.
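A sketch of the Transform step might look like this. The splitter defaults and the keyword count of 5 are illustrative, and the enricher takes a ChatModel because it uses the LLM itself to extract keywords (exact package names and constructor signatures have shifted between Spring AI milestones):

```java
import java.util.List;

import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.KeywordMetadataEnricher;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;

class TransformStep {

    // Split documents into token-bounded chunks, then tag each chunk
    // with keywords extracted by the chat model (count of 5 is illustrative).
    List<Document> transform(List<Document> documents, ChatModel chatModel) {
        List<Document> chunks = new TokenTextSplitter().apply(documents);
        return new KeywordMetadataEnricher(chatModel, 5).apply(chunks);
    }
}
```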
Loading data into a vector database (Load)
Finally, in the Load phase, the transformed document chunks are stored in a vector database. Unlike traditional databases, which store data as text or records, vector databases store documents as embedding vectors—numerical representations of the document’s content.
Instead of relying on exact text matches, a similarity search is conducted within the vector database. When a query is posed, the vector database compares the user’s input to the stored vectors and retrieves the most similar documents based on these embeddings.
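In Spring AI, the Load phase boils down to a single call on the VectorStore abstraction, which computes embeddings through its configured embedding model. A minimal sketch, assuming the concrete store (PGVector, Chroma, etc.) is wired up elsewhere via configuration:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;

class LoadStep {

    // Persist chunks as embedding vectors; the store embeds each
    // chunk with its configured EmbeddingModel before writing.
    void load(VectorStore vectorStore, List<Document> chunks) {
        vectorStore.add(chunks);
    }
}
```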
The RAG module: Retrieval and generation
Now, let’s look at how the RAG module kicks in. This part of the process involves retrieving relevant information from the vector database and using it to generate a highly accurate response.
Step 1: Loading data into the vector database
As mentioned earlier, the first step in the RAG process is to load documents into the vector database. Once documents are embedded and stored as vectors, they become accessible for similarity searches. This is the foundation upon which the RAG module operates.
Step 2: Retrieving relevant documents
When a user asks a question, the system doesn’t rely solely on the model’s internal knowledge. Instead, the query is converted into an embedding vector, which is then compared against the vectors stored in the vector database. The most relevant documents based on similarity are retrieved and used as context for answering the user’s question.
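A retrieval sketch using Spring AI's SearchRequest. The fluent builder below follows the 1.0 API (earlier milestones used SearchRequest.query(...).withTopK(...)), and the topK value of 4 is an illustrative choice:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

class RetrievalStep {

    // Embed the user question and fetch the closest chunks from the store.
    List<Document> retrieve(VectorStore vectorStore, String question) {
        return vectorStore.similaritySearch(SearchRequest.builder()
                .query(question)
                .topK(4)
                .build());
    }
}
```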
Step 3: Passing context to the AI model
Once the relevant documents are retrieved, they are sent to the AI model alongside the user’s query. The AI model uses this context—real, up-to-date information from the documents—to generate a more precise, relevant response. This ensures that the answer is grounded in external data, offering more accurate and informed results than if the AI had only relied on its pre-existing knowledge.
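Here is a minimal sketch of this step, stuffing the retrieved chunks into the system prompt before calling the model. Document.getText() is the 1.0 accessor (older milestones call it getContent()), and the prompt wording is illustrative:

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;

class GenerationStep {

    // Concatenate the retrieved chunks into a context block, then ask
    // the model to answer the question grounded in that context.
    String answer(ChatClient chatClient, String question, List<Document> context) {
        String contextText = context.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n---\n"));

        return chatClient.prompt()
                .system("Answer using only the following context:\n" + contextText)
                .user(question)
                .call()
                .content();
    }
}
```

Spring AI also ships a QuestionAnswerAdvisor that performs this retrieve-and-stuff sequence for you; the manual version above just makes the mechanics explicit.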
Reference GitHub repository: https://github.com/Talentica/RAG-SpringAI.git
Key tips for using RAG
Tip 1: Ensure all information within the documents is correct and up to date.
Tip 2: When a document is modified, re-generate its embeddings and update its representation in the vector database, as sketched below.
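A minimal sketch of tip 2, assuming your application tracks which chunk IDs were produced from each source file (that bookkeeping is up to you):

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;

class ReindexStep {

    // When a source document changes, drop its stale chunks and add the
    // freshly split ones; the store re-computes embeddings on add.
    void reindex(VectorStore vectorStore, List<String> staleChunkIds,
                 List<Document> freshChunks) {
        vectorStore.delete(staleChunkIds);
        vectorStore.add(freshChunks);
    }
}
```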
Challenges I faced
One of the biggest challenges I encountered was hallucination, where the language model generated incorrect or meaningless answers due to irrelevant or incomplete context. To fix this, I set a similarity threshold of 0.6 for document retrieval. If no documents met the threshold, the system displayed a default message instead of passing weak context to the language model.
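Here is a sketch of that guard, assuming the 1.0 SearchRequest builder; the fallback wording is illustrative:

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

class HallucinationGuard {

    private static final String FALLBACK =
            "Sorry, I couldn't find relevant information for that question.";

    // Retrieve with a 0.6 similarity threshold; if nothing qualifies,
    // short-circuit with a default message instead of sending weak
    // context to the language model.
    String contextOrFallback(VectorStore vectorStore, String question) {
        List<Document> docs = vectorStore.similaritySearch(SearchRequest.builder()
                .query(question)
                .similarityThreshold(0.6)
                .build());

        return docs.isEmpty()
                ? FALLBACK
                : docs.stream().map(Document::getText)
                        .collect(Collectors.joining("\n"));
    }
}
```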
Another challenge was ensuring data privacy while working with sensitive documents. To address it, I replaced sensitive details with placeholders during the transformation step, preventing potential data leaks.
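One way to do that redaction is a custom transformer run before the splitter. The sketch below masks email addresses as an example; the regex and placeholder are illustrative, and it is written as a plain Function, which mirrors the shape of Spring AI's DocumentTransformer:

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

import org.springframework.ai.document.Document;

class RedactingTransformer implements Function<List<Document>, List<Document>> {

    // Illustrative pattern: mask email addresses before embedding.
    private static final Pattern EMAIL =
            Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");

    @Override
    public List<Document> apply(List<Document> documents) {
        return documents.stream()
                .map(doc -> new Document(
                        EMAIL.matcher(doc.getText()).replaceAll("[EMAIL]"),
                        doc.getMetadata()))
                .toList();
    }
}
```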
Conclusion
RAG is a powerful technique that combines information retrieval with language generation to improve AI models. Using Spring AI, I implemented a RAG pipeline and thereby improved the accuracy, relevance, and contextuality of chatbot responses.
This approach can potentially transform not just chatbots but any AI system that needs dynamic, real-time access to external knowledge. If you need to bridge static knowledge with dynamic user needs, a RAG pipeline might just be the answer you are looking for.