I. Static Knowledge vs. Dynamic Grounding
Large Language Models suffer from a temporal cutoff: their internal knowledge is frozen at training time. Lewis et al. (2020) introduced Retrieval-Augmented Generation (RAG) to address this by giving the model a "contextual window" onto the external world.
In this paradigm, the LLM is treated not as a database but as a reasoning engine. We retrieve relevant documents from a private or live dataset and inject them into the prompt, which reduces hallucinations by grounding answers in retrieved evidence and enables the use of proprietary data the model never saw during training.
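A minimal sketch of this retrieve-then-read loop in plain Python is below. The embed() function is a toy bag-of-words stand-in for a real embedding model, and llm-side completion is left out; both the corpus and the helper names are illustrative assumptions, not any particular library's API.

```python
# Retrieve-then-read: rank documents against the query, then inject the
# top hits into the prompt. embed() is a toy stand-in for a dense model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use a dense encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n\n".join(docs)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

corpus = [
    "Our Q3 refund policy allows returns within 30 days.",
    "The API rate limit was raised to 500 requests/minute in June.",
    "Employee onboarding takes place every other Monday.",
]
question = "What is the current API rate limit?"
prompt = build_prompt(question, retrieve(question, corpus))
print(prompt)  # In production, this prompt is sent to the LLM.
```

Swapping the toy embed() for a dense embedding model and the linear scan for a vector index (see the HNSW and vector-database references below) turns this sketch into a production retriever.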
Data Freshness
Linking models to real-time APIs and document stores without re-training.
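As a sketch of that pattern, the snippet below fetches fresh data at query time and places it in the prompt, leaving the model's weights untouched. fetch_live_inventory() and its payload are hypothetical stand-ins for a real API or document store.

```python
# Grounding a prompt in live data rather than in model weights.
import json

def fetch_live_inventory() -> dict:
    # Hypothetical stand-in: a production system would call a live
    # endpoint here, with auth, timeouts, and error handling.
    return {"as_of": "2024-06-01T12:00:00Z",
            "low_stock": ["widget-a", "gasket-9"]}

def grounded_prompt(question: str, live_data: dict) -> str:
    # The model itself is unchanged; only the prompt carries fresh facts.
    return (f"Use only the data below, current as of this request.\n\n"
            f"Data: {json.dumps(live_data)}\n\n"
            f"Question: {question}\nAnswer:")

print(grounded_prompt("Which items are low on stock?", fetch_live_inventory()))
```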
Primary Sources & Further Reading
- Lewis et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.
- Gao et al. (2022). Precise Zero-Shot Dense Retrieval without Relevance Labels (HyDE).
- Pinecone Engineering. Vector Databases: A Beginner's Guide.
- Malkov & Yashunin (2018). Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs (HNSW).
- Nomic AI (2024). Matryoshka Embeddings: Dynamic Dimensions.
- Hugging Face (2024). LangChain and LlamaIndex Architecture Guides.