Vector RAG Engine

Retrieval-Augmented Generation (RAG)

This application implements a custom RAG engine without relying on high-level frameworks like LangChain. By building it from scratch using Python, OpenAI Embeddings, and Supabase's pgvector, we maintain full control over the retrieval math, context window injection, and guardrails.

1. Data Ingestion & Chunking

Macroeconomic PDFs (like the BII Global Outlook and OECD Report) are parsed using Python's pypdf.

Because an LLM cannot ingest a 50-page PDF in a single prompt, the text is split into sequential chunks of 1,500 characters with a 200-character overlap. The overlap ensures that a sentence straddling a chunk boundary survives intact in at least one chunk.

Raw PDF Text → Chunk 1 → Chunk 2 → …
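The sliding-window chunker described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the engine's actual implementation; the function name is ours, and the input is assumed to be text already extracted (e.g. via pypdf's `PdfReader`):

```python
def chunk_text(text: str, size: int = 1500, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; each chunk re-reads the last
    `overlap` characters of the previous one, so content that straddles
    a boundary appears whole in at least one chunk."""
    chunks = []
    step = size - overlap  # advance 1300 characters per chunk
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks
```

With these defaults, consecutive chunks share exactly 200 characters, which keeps the chunk list strictly sequential while preserving local context for the embedding step.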

2. Vector Search (Cosine Similarity)

The Math

Every chunk is embedded as a 1536-dimensional vector using OpenAI's text-embedding-3-small model. When a user asks a question, the question is embedded into the same vector space, so "how related are these two texts?" becomes a numeric comparison between two vectors.
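The comparison itself is plain cosine similarity: the dot product of the two vectors divided by the product of their magnitudes. A minimal sketch (real embeddings have 1536 dimensions; toy 3-dimensional vectors here):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """cos(theta) = (a . b) / (|a| * |b|).
    1.0 means identical direction (same meaning), 0.0 means orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Because the score depends only on direction, not magnitude, a short question and a long paragraph about the same topic can still score close to 1.0.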

pgvector Execution

We use a custom stored procedure, exposed as an RPC named match_documents, in Supabase. It uses pgvector's <=> operator to compute the cosine distance between the query vector and every stored chunk, returning the top 3 most semantically similar chunks.
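What the query does can be mimicked in memory: pgvector's <=> operator computes cosine distance (1 minus cosine similarity, so smaller means more similar), and the RPC orders by it ascending and takes the first 3 rows. A toy Python version, purely illustrative since the real ranking runs inside Postgres:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    # pgvector's <=> operator: 1 - cosine similarity (smaller = more similar)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norms

def match_documents(query_vec, rows, match_count=3):
    # rows: (chunk_text, embedding) pairs, as stored in the documents table
    ranked = sorted(rows, key=lambda r: cosine_distance(query_vec, r[1]))
    return [text for text, _ in ranked[:match_count]]
```

The returned chunks are then injected into the LLM's context window as grounding material for the answer.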