RAG
Retrieval-Augmented Generation (RAG) is a technique where you give the LLM extra context from external data sources (like PDFs, websites, or text files) so it can answer questions better — especially when the info is not present in the model’s training data.
Informally, imagine the LLM is like a student answering questions.
Traditional LLM: Answers from memory
RAG LLM: First opens a textbook, goes to the right chapter, then answers.
We will use OpenAI and langchain to demonstrate RAG.
!pip install langchain-openai
[pip install output trimmed; successfully installed langchain-openai-0.3.30]
from langchain_openai import ChatOpenAI
import os
os.environ["OPENAI_API_KEY"] = openai_api_key
llm = ChatOpenAI(model="gpt-3.5-turbo", #or gpt-4
temperature=0.7, #optional to pass
# openai_api_key=openai_api_key #could also be passed here if you do not want to set the environemnt variable
)
# Now you can use it in a chain, or call it directly as below
response = llm.invoke("When was Acme Inc. founded?")
print(response.content)
There have been several companies with the name Acme Inc. founded throughout history, so it depends on which specific company you are referring to. Can you please provide more context or details?
Sample outputs obtained from the above:
- “It is unclear which specific company named Acme Inc. you are referring to, as there are many companies with similar names. Can you please provide more information or context so I can accurately answer your question?”
- “It is not possible to determine the specific founding date of Acme Inc. as it is a fictional company commonly used in cartoons, comic strips, and other forms of media. The name ‘Acme’ is often used as a generic placeholder for a company in popular culture.”
Other questions that may be asked:
- What new product line was launched in 2024 in Acme Inc?
- How frequently does the HR team of Acme Inc meet and what do they assess?
For a very specific or new company, the LLM may not be able to answer correctly. Hence, we will use RAG.
RAG steps
- Read the data for retrieval
- Split the text
- Produce Embeddings for splits
- Store the embeddings in a vectorDB
- Create a retriever to fetch relevant documents from the VectorDB
- Combine the LLM and the retriever, and produce results.
1. Read the data for retrieval
Before doing RAG, we need data. There are different ways to read the data:
1.1 Using a simple text
1.2 Using a text file
1.3 Using a pdf file
1.1 Using a simple text
Document class
LangChain wraps all content in Document objects, which hold both text and optional metadata.
Each doc is a Document object with attributes: page_content and metadata
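As a minimal sketch, you can construct Document objects by hand (the import path assumes a recent langchain-core; the sample text mirrors the file used below):

from langchain_core.documents import Document

docs0 = [
    Document(
        page_content="Acme Inc. was founded in 1987 in Helsinki, Finland.",
        metadata={"source": "inline-text"},  # metadata is optional
    )
]
print(docs0[0].page_content)
print(docs0[0].metadata)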
1.2 Using a txt file
document_loaders
Loaders help you load documents from .txt, .pdf, .csv, URLs, etc.
!pip install langchain langchain-community
[pip install output trimmed; successfully installed langchain-community-0.3.27 and its dependencies]
from langchain_community.document_loaders import TextLoader

docs1 = TextLoader("RAG_file.txt").load()  # docs1 is a list of Document objects
for doc in docs1:
    print(doc.page_content)  # The text
    print(doc.metadata)      # File name, etc.
Acme Inc. was founded in 1987 in Helsinki, Finland. It specializes in anti-gravity footwear and rocket-powered pogo sticks.
In 2024, Acme released a new product line: "Jet Sneakers", designed for low-orbit recreational use.
As per internal policy, Acme's HR team meets every 2 weeks to assess wellness metrics of staff based on holographic surveys.
{'source': 'RAG_file.txt'}
Even when loading a single .txt file, TextLoader.load() returns a list of one Document object — for consistency across all loaders in LangChain.
LangChain is designed to treat everything as a list of documents, whether you load one .txt file or multiple files at once.
1.3 Using a pdf file
!pip install pypdf
[pip install output trimmed; successfully installed pypdf-6.0.0]
from langchain_community.document_loaders import PyPDFLoader

docs2 = PyPDFLoader("RAG_file.pdf").load()  # again, docs2 is a list of Document objects
for doc in docs2:
    print(doc.page_content)  # The text
    print(doc.metadata)      # File name, etc.
Acme Inc. was founded in 1987 in Helsinki, Finland. It specializes in anti-gravity footwear and
rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers",
designed for low-orbit recreational use. As per internal policy, Acme's HR team meets every 2
weeks to assess wellness metrics of staff based on holographic surveys.
{'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
Note:
- We can also read csv files using CSVLoader from langchain.document_loaders or langchain_community.document_loaders.
- We can also read from html files using UnstructuredHTMLLoader from langchain.document_loaders or langchain_community.document_loaders.
- We can also read from online PDF files using OnlinePDFLoader from langchain.document_loaders or langchain_community.document_loaders.
1.4 Mixing Different File Types
You can load different formats separately and then combine them
# Assuming file1, file2, file3, file4 are available
# from langchain.document_loaders import TextLoader, PyPDFLoader, CSVLoader, UnstructuredHTMLLoader
# # Loaders for different file types
# txt_docs = TextLoader("file1.txt").load()
# pdf_docs = PyPDFLoader("file2.pdf").load()
# csv_docs = CSVLoader("file3.csv").load()
# html_docs = UnstructuredHTMLLoader("file4.html").load()
# # Merge all into one list
# all_docs = txt_docs + pdf_docs + csv_docs + html_docs

all_docs is just a list of Document objects → ready for splitting, embedding, and vector storage.
2. Split the text
LLMs have token limits, so long files need to be split.
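The splitter call that produced the output below is not shown in this excerpt; here is a minimal sketch with RecursiveCharacterTextSplitter (chunk_size=100 and chunk_overlap=0 are assumptions chosen to roughly match the chunk lengths shown):

from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)
split_docs = splitter.split_documents(docs2)  # each chunk keeps its source metadata
split_docs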
[Document(metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='Acme Inc. was founded in 1987 in Helsinki, Finland. It specializes in anti-gravity footwear and'),
Document(metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers",'),
Document(metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content="designed for low-orbit recreational use. As per internal policy, Acme's HR team meets every 2"),
Document(metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='weeks to assess wellness metrics of staff based on holographic surveys.')]
Like split_documents(), there is also a function split_text() which splits raw text directly. Let me use it below and show the effect of the parameters chunk_size and chunk_overlap.
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks1 = splitter.split_text("Very long document text here...")
chunks1
['Very long document text here...']
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=3, chunk_overlap=1)
chunks2 = splitter.split_text("Very long document text here...")
chunks2
['Ver',
'ry',
'lo',
'ong',
'do',
'ocu',
'ume',
'ent',
'te',
'ext',
'he',
'ere',
'e..',
'..']
3. Produce Embeddings for splits
For each chunk, we generate embeddings: numerical vectors that capture semantic meaning, such that:
- embeddings of similar texts are close together, and
- embeddings of dissimilar texts are far apart.
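The embedding model used in this notebook is not shown in the excerpt; a minimal sketch assuming OpenAI embeddings via langchain-openai:

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()  # reads OPENAI_API_KEY from the environment

# Embed one string to inspect the vector
vec = embeddings.embed_query("anti-gravity footwear")
print(len(vec))  # dimensionality of the embedding vector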
4. Store the embeddings in a vectorDB
The embeddings (and chunks) are stored in a vector database (FAISS, Pinecone, Weaviate, Chroma, etc.).
This lets us search semantically, not just by keywords.
(Later, when a user asks a query, the query itself is embedded. DB finds the nearest chunk embeddings and retrieves relevant chunks.)
In this demo, we will be using FAISS VectorDB.
FAISS stands for Facebook AI Similarity Search
Open-source library from Meta (Facebook AI).
It is optimized for fast similarity search in high-dimensional vectors (like embeddings).
It is used for:
- Nearest neighbor search
- Clustering
- Efficient retrieval in RAG pipelines
!pip install faiss-cpu
[pip install output trimmed; successfully installed faiss-cpu-1.12.0]
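The cell that builds the index is not included above; a minimal sketch, assuming the split_docs and embeddings objects from the earlier sketches:

from langchain_community.vectorstores import FAISS

# Embeds every chunk and stores the vectors (plus the chunks) in a FAISS index
vectorstore = FAISS.from_documents(split_docs, embeddings)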
Access stored documents (Optional)
# Get all documents back (but embeddings are inside the FAISS index)
all_docs = vectorstore.docstore._dict
for doc_id, doc in all_docs.items():
    print("ID:", doc_id)
    print("Content:", doc.page_content)
    print("Metadata:", doc.metadata)
    print("-" * 40)
ID: 3f6cd2c9-42c5-4e95-a56e-5e5c5fa1da1e
Content: Acme Inc. was founded in 1987 in Helsinki, Finland. It specializes in anti-gravity footwear and
Metadata: {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
----------------------------------------
ID: 3e61c826-0d6c-4c82-a4f3-e42efde3bf43
Content: rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers",
Metadata: {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
----------------------------------------
ID: 499a4a3b-9497-4a44-acf0-78b2a09eea04
Content: designed for low-orbit recreational use. As per internal policy, Acme's HR team meets every 2
Metadata: {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
----------------------------------------
ID: 0770a2d5-f717-424a-a917-b5c63bcd0dfc
Content: weeks to assess wellness metrics of staff based on holographic surveys.
Metadata: {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
----------------------------------------
Save & reload FAISS (Optional)
After creating a vectorstore, we can do similarity search using similarity_search() which uses the following process:
- The query is converted into an embedding vector using the same embedding model you used for your documents.
- FAISS computes similarity between the query embedding and all stored embeddings.
- It returns the top-k most similar chunks (Document objects).
- Output = a list of Documents (List[Document]).
# extra
retrieved_docs = vectorstore.similarity_search('What was the launch date?', k=2)  # k is the number of documents to retrieve
retrieved_docs
[Document(id='3e61c826-0d6c-4c82-a4f3-e42efde3bf43', metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers",'),
Document(id='0770a2d5-f717-424a-a917-b5c63bcd0dfc', metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='weeks to assess wellness metrics of staff based on holographic surveys.')]
How is similarity measured?
FAISS uses vector distance metrics (like cosine similarity, L2 distance).
Embeddings of the query and chunks are compared.
Smaller distance = higher similarity.
Another variant:
similarity_search_with_score: also returns a similarity score along with each document.
results = vectorstore.similarity_search_with_score("What was the launch date?", k=2)
for doc, score in results:
    print(doc.page_content, score)
rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers", 0.44364873
weeks to assess wellness metrics of staff based on holographic surveys. 0.52889407
The previous four steps were to prepare the data for RAG. Now we can do the retrieval.
5. Create a retriever to fetch relevant documents from the VectorDB
A retriever is a wrapper around the vector store that defines how to fetch documents given a query.
(Optional) Retriever is a standardized interface that implements get_relevant_documents(query)
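The cell creating the retriever is not shown; a minimal sketch (search_kwargs={"k": 3} is an assumption matching the three chunks printed below):

retriever = vectorstore.as_retriever(search_kwargs={"k": 3})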
docs = retriever.get_relevant_documents("What was the launch date?")
for d in docs:
    print(d.page_content, d.metadata)
    print()
rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers", {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
weeks to assess wellness metrics of staff based on holographic surveys. {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
designed for low-orbit recreational use. As per internal policy, Acme's HR team meets every 2 {'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}
6. Combine the LLM and the retriever, and produce results
Finally, the LLM and the retriever are combined into a RetrievalQA chain.
RetrievalQA
It is a LangChain chain designed specifically for retrieval-augmented generation (RAG).
It combines the following:
- Retriever: pulls back the most relevant documents from your vector database.
- LLM: takes the retrieved documents + user’s query, and generates an answer.
So instead of the LLM hallucinating, it grounds its answers on your documents.
.invoke({"query": query})
Runs the chain.
Input: a dictionary where the key is “query”.
Output: dictionary with keys like:
- “result” : final LLM-generated answer.
- “source_documents” (if enabled): list of docs retrieved.
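The chain construction itself does not appear above; here is a minimal sketch using RetrievalQA.from_chain_type, assuming the llm and retriever objects from earlier (chain_type="stuff" simply concatenates the retrieved chunks into the prompt):

from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",            # stuff all retrieved chunks into one prompt
    retriever=retriever,
    return_source_documents=True,  # expose the retrieved docs in the output
)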
query = "When and where was Acme Inc. founded?"
response = qa_chain.invoke({"query": query})
print(response["result"])
Acme Inc. was founded in 1987 in Helsinki, Finland.
The retrieved source documents (available in response["source_documents"]):
[Document(id='3f6cd2c9-42c5-4e95-a56e-5e5c5fa1da1e', metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='Acme Inc. was founded in 1987 in Helsinki, Finland. It specializes in anti-gravity footwear and'), Document(id='3e61c826-0d6c-4c82-a4f3-e42efde3bf43', metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content='rocket-powered pogo sticks. In 2024, Acme released a new product line: "Jet Sneakers",'), Document(id='499a4a3b-9497-4a44-acf0-78b2a09eea04', metadata={'producer': 'PDFium', 'creator': 'PDFium', 'creationdate': 'D:20250730135006', 'source': 'RAG_file.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}, page_content="designed for low-orbit recreational use. As per internal policy, Acme's HR team meets every 2")]