Most UK SMEs and charities already have enough knowledge to power useful AI copilots — it’s just buried across SharePoint sites, network drives, email attachments and PDFs. Retrieval‑Augmented Generation (RAG) makes AI answers cite your own documents, not just the public web. This article gives a non‑technical, 30‑day plan to get “RAG‑ready” without buying heavyweight platforms you may not need yet.
We’ll cover what RAG is in plain English, the minimum viable data hygiene to make it work, low‑risk ways to test quality, procurement questions to keep you portable, and simple KPIs your board can track.
RAG in plain English (and why it fails without foundations)
- RAG connects an AI model to your documents. It searches for relevant passages first, then asks the model to answer using those passages. That makes answers fresher and grounded in your organisation’s “source of truth”. cloud.google.com
- Hybrid search usually works best: a mix of meaning‑based matching (semantic vectors) and keyword matching (BM25/sparse). Systems merge both lists using techniques like reciprocal rank fusion to lift the best of each (a short sketch of that merge follows this list). docs.cloud.google.com
- Re‑ranking improves precision by scoring the top results again with a specialist ranker before anything is sent to the model, reducing tokens and cost. docs.cloud.google.com
- Embeddings are just numbers that represent the meaning of your text so “similar things are close together”. They power semantic search in many tools. platform.openai.com
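For the technically curious, here is a minimal sketch of reciprocal rank fusion in Python. It simply merges a keyword ranking and a semantic ranking into one list; the document names are made up for illustration, and most platforms do this step (and the re‑ranking that follows) for you.

```python
# Minimal reciprocal rank fusion (RRF) sketch.
# Inputs: two ranked lists of document IDs, one from keyword (BM25) search,
# one from semantic (vector) search. The IDs below are illustrative only.

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; k=60 is a commonly used constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; a re-ranker would then re-score this short list.
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["HR_Policy_ParentalLeave", "HR_FAQ_Leave", "Finance_Expenses"]
semantic_hits = ["HR_FAQ_Leave", "HR_Policy_ParentalLeave", "HR_Policy_Sickness"]

print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
# Documents that rank well in both lists rise to the top.
```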
RAG disappoints when files are messy (scanned images, duplicates, version soup), when chunks are cut mid‑sentence, or when you send too much text to the model. Modern guidance stresses better chunking and retrieval over “more tokens”. learn.microsoft.com
Your 30‑day RAG‑readiness plan
Week 1 — Pick a narrow, valuable slice
- Choose one business area with clear FAQs and documents you own (for example: HR policies, finance procedures, safeguarding guidance, product specs).
- Write 25 “golden questions”: real questions staff ask today. Include a one‑line “ideal answer” and where it currently lives.
- List data sources for this area: the top 5 folders, pages or drives. Keep scope tight enough to scan and clean in a week.
- Name an owner for the corpus — the person who approves what’s “source of truth”.
Week 2 — Clean, convert, and label
Do these once and you’ll improve search and human readability even before any AI is added.
- De‑duplicate and archive “v1‑v12 FINAL” sprawl. Keep a single canonical copy.
- Prefer machine‑readable formats: use DOCX, tagged PDF or PDF/A rather than scanned images. The UK National Archives maintains practical guidance on preservable formats — a useful north star for SMEs too. nationalarchives.gov.uk
- Fix filenames so people and machines can understand them. Pattern: Area_DocType_Title_Version_Date (e.g., HR_Policy_ParentalLeave_v7_2025‑09).
- Add lightweight metadata in a spreadsheet: title, owner, date, status (approved/draft), confidentiality (open/internal/restricted), and a 1‑sentence summary (a starter script is sketched after this list).
- Redact before you index. UK government guidance is crystal clear: do not put sensitive or personal data into generative AI tools; treat outputs with caution and check them. Apply the same bar to your RAG pipeline. gov.uk
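If someone on your team is comfortable with a little scripting, a sketch like the one below can flag filenames that don’t follow the suggested pattern and start the metadata sheet as a CSV. The folder path, regular expression and column names are assumptions to adapt, not a requirement of any tool.

```python
# Sketch: flag filenames that don't match Area_DocType_Title_Version_Date
# and seed a metadata CSV. Paths, pattern and columns are illustrative assumptions.
import csv
import re
from pathlib import Path

PATTERN = re.compile(r"^[A-Za-z]+_[A-Za-z]+_[A-Za-z0-9-]+_v\d+_\d{4}-\d{2}(\.\w+)?$")
SOURCE_FOLDER = Path("./approved_documents")  # hypothetical folder

with open("metadata_sheet.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["title", "owner", "date", "status", "confidentiality", "summary", "filename"])
    for path in SOURCE_FOLDER.glob("**/*"):
        if not path.is_file():
            continue
        if not PATTERN.match(path.name):
            print(f"Rename needed: {path.name}")
        # Owner, status, confidentiality and the summary are filled in by people, not code.
        writer.writerow([path.stem, "", "", "draft", "internal", "", str(path)])
```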
Week 3 — Slice documents the smart way
RAG tools “chunk” documents into small, searchable passages. Naive “every 1,000 tokens” chunking often splits ideas mid‑paragraph, hurts relevance and increases cost. Use semantic boundaries: headings, sections, tables, and lists — each chunk should stand alone. Recent guidance from enterprise teams shows large accuracy gains and lower token use when chunks follow concepts, not fixed sizes. techcommunity.microsoft.com
- Start with a simple rule: one chunk per subsection (H2/H3), up to about a page of text; include the section title as context (a rough chunking sketch follows this list).
- Keep citations by storing source file path, page/section, and last modified date alongside each chunk, so your copilot can show its working.
- Plan for hybrid retrieval: keep both the clean text (for vectors) and a keyword index (for acronyms, product codes and legal terms). docs.cloud.google.com
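As a rough illustration of heading‑based chunking that keeps citations alongside each passage, here is a sketch for Markdown‑style text. Real pipelines also handle DOCX, PDF tables and page numbers; treat the field names as assumptions rather than a standard.

```python
# Sketch: split Markdown-style text on H2/H3 headings and keep citation metadata
# (source path, section title, last-modified date) with every chunk.
import datetime
import re
from pathlib import Path

def chunk_by_headings(path: Path):
    text = path.read_text(encoding="utf-8")
    modified = datetime.datetime.fromtimestamp(path.stat().st_mtime).date().isoformat()
    chunks, body_lines = [], []
    section_title = "Front matter"  # text before the first heading

    def flush():
        if body_lines:
            chunks.append({
                "source": str(path),
                "section": section_title,
                "last_modified": modified,
                # Prepend the section title so the chunk stands alone.
                "text": section_title + "\n" + "\n".join(body_lines).strip(),
            })

    for line in text.splitlines():
        heading = re.match(r"^(##|###)\s+(.*)", line)
        if heading:
            flush()                      # close the previous section
            section_title = heading.group(2)
            body_lines = []
        else:
            body_lines.append(line)
    flush()                              # close the final section
    return chunks
```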
Week 4 — Prove quality with a tiny pilot
You don’t need engineers to test whether your corpus is “good enough”. Many platforms let you upload a small set, run a RAG chat, and export results for review. In Azure and Google ecosystems, you can trial RAG capabilities (search, hybrid retrieval, re‑ranking) with built‑in tools before committing. learn.microsoft.com
Run your 25 golden questions. For each answer, record the following (a simple scoring sketch follows this list):
- Hit rate@5: Did the right document appear in the top 5 retrieved chunks?
- Groundedness: Is the answer strictly supported by the cited chunks?
- Citation quality: Does it include a file name and section/page?
- Cost proxy: How many words were retrieved per question?
- Time‑to‑update: If you change a source file, how quickly do answers reflect it?
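If the pilot tool lets you export each question’s top retrieved sources to a spreadsheet, a few lines of scripting can compute the hit‑rate figure; groundedness and citation quality still need a human reviewer. The column names below are assumptions to match to your own export.

```python
# Sketch: compute hit rate@5 from a CSV of golden-question results.
# Assumed columns: question, expected_source, retrieved_sources
# (semicolon-separated, best match first). Adjust names to your export.
import csv

def hit_rate_at_5(csv_path="golden_questions_results.csv"):
    hits, total = 0, 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            total += 1
            top5 = [s.strip() for s in row["retrieved_sources"].split(";")[:5]]
            if row["expected_source"].strip() in top5:
                hits += 1
    return hits / total if total else 0.0

print(f"Hit rate@5: {hit_rate_at_5():.0%}")
```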
Decide: do we need a vector database now?
Often, the answer is “not yet”. Many SMEs can pilot with built‑in search connectors, then scale later. Consider a dedicated vector DB when you have one or more of:
- Large volume: 100k+ chunks or millions of rows where latency matters.
- Complex filters: e.g., role‑based access plus business facets across multiple tenants.
- Advanced ranking: you need hybrid retrieval plus custom re‑ranking strategies. docs.cloud.google.com
- Strict SLAs and observability: search quality dashboards, recall/precision alerts, detailed logs.
If you do explore cloud‑native RAG services, check how they handle hybrid search, re‑ranking, and security boundaries in your region. learn.microsoft.com
Cost guardrails you can set on day one
| Driver | Why it matters | Guardrail |
|---|---|---|
| Tokens retrieved per question | More text sent to the model = higher spend and slower responses. | Limit to the top 3–5 chunks; use a re‑ranker before generation to trim further. docs.cloud.google.com |
| Embedding volume | Every chunk you index generates an embedding vector. | Index only approved sources; exclude images and drafts. Use modern, efficient embeddings. platform.openai.com |
| Update frequency | Continuous background indexing can churn costs. | Batch weekly for low‑change areas; event‑triggered re‑index for policies and price lists. |
| Search quality | Weak retrieval forces you to send more context to make answers usable. | Adopt hybrid retrieval and tune ranking before scaling user access. docs.cloud.google.com |
Simple KPI: £ per grounded answer. As retrieval improves, the cost per helpful answer falls even if model prices stay the same.
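As a worked example with made‑up numbers: if the pilot costs £120 in a month (model usage plus embedding and indexing) and reviewers judge 300 answers to be grounded, the unit cost is £0.40 per grounded answer. A back‑of‑envelope calculation is all you need:

```python
# Illustrative figures only; replace with your own monthly spend and review counts.
monthly_spend_gbp = 120.00   # model usage + embedding/indexing for the month
grounded_answers = 300       # answers reviewers judged grounded and correctly cited
print(f"£ per grounded answer: £{monthly_spend_gbp / grounded_answers:.2f}")  # £0.40
```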
Risks and mitigations (board‑friendly)
| Risk | Impact | Mitigation you can evidence |
|---|---|---|
| Sensitive data leaks into the index | Regulatory, reputational and staff harm | Redact before indexing; use “internal only” scope; follow GOV.UK guidance to avoid putting sensitive data into generative tools. gov.uk |
| Supplier shifts or lock‑in | Unexpected costs, rework | Insist on export of your embeddings and metadata; keep a simple TSV/CSV of chunks and citations; see our guide on avoiding lock‑in. |
| Model or API behaviour changes | Quality drifts, prompts break | Introduce a re‑ranking step you control; version prompts; define a “go‑live gate” and periodic review (see our go‑live gate guide). |
| Poor security hygiene around AI | Data exposure, integrity issues | Adopt the UK’s AI Cyber Security Code of Practice principles (access control, logging, testing, supply chain). gov.uk |
Procurement questions to ask before you buy
Use these to compare like‑for‑like in a light, two‑demo bake‑off.
- Data scope: Which connectors are native (SharePoint, OneDrive, Google Drive, Confluence)? How do you filter confidential folders?
- Retrieval: Do you support hybrid search out of the box? Which re‑ranking options are available and how are they priced and logged? docs.cloud.google.com
- Chunking: Can we define chunk boundaries by headings and tables? Can we store section titles and page numbers for citations? techcommunity.microsoft.com
- Observability: Can we see query logs, the retrieved chunks, and relevance scores for each answer?
- Portability: How do we export embeddings, chunk text and metadata in a single file (one possible export format is sketched after this list)? What’s the plan if we change cloud?
- Security: Where is data stored and processed? How do you enforce access controls? Do you align with the UK AI Cyber Security Code of Practice? gov.uk
- Change control: What notices do we get if ranking models or APIs change? How do you help us re‑test quality?
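On the portability question above, it helps to agree up front what “export in a single file” should look like. One reasonable shape is a JSONL file with one record per chunk; the field names and values below are an assumption for discussion, not a standard any supplier is obliged to follow.

```python
# Sketch: one portable JSONL record per chunk. Field names and values are illustrative.
import json

record = {
    "chunk_id": "HR_Policy_ParentalLeave_v7_2025-09#section-3",
    "text": "Eligibility. Employees with 26 weeks' service may request...",
    "source": "HR/Policies/HR_Policy_ParentalLeave_v7_2025-09.docx",
    "section": "3. Eligibility",
    "last_modified": "2025-09-14",
    "confidentiality": "internal",
    "embedding": [0.0123, -0.0456, 0.0789],  # truncated example vector
}

with open("export.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```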
If you want a structured approach to supplier shortlisting, our AI vendor due‑diligence pack and two‑week vendor bake‑off are good starting points.
KPIs your exec team can track
- Coverage: % of documents in the chosen area that are cleaned, approved, and indexed.
- Quality: Hit rate@5 ≥ 80% on the 25 golden questions after re‑ranking. docs.cloud.google.com
- Groundedness: ≥ 90% of answers include citations to the correct section/page.
- Freshness: Average time from source edit to updated answer under 24 hours.
- Unit cost: £ per grounded answer falling month‑on‑month.
Operational playbook: who does what
Operations manager
- Own the 30‑day plan; set scope and success criteria.
- Nominate document owners; block time for Week 2 clean‑up.
- Run a pilot with 10–20 users in shadow mode before wider rollout (see our shadow‑mode guide).
Legal/DPO
- Define redaction rules and “internal only” sources.
- Approve data retention and deletion for the index and logs.
- Review supplier responses on security controls and change notifications. gov.uk
IT
- Enable read‑only service accounts for source repositories.
- Set batch indexing windows and access control mappings.
- Monitor latency, error rates and re‑index failures.
Knowledge owners
- Approve canonical documents and archive duplicates.
- Maintain the simple metadata sheet.
- Help write and review the 25 golden questions each quarter.
What good looks like after 30 days
- One business area with clean, approved documents and light metadata.
- A pilot retrieval set with hybrid search and re‑ranking enabled; logs show top chunks per question. docs.cloud.google.com
- A brief report with hit‑rate, groundedness, freshness, and cost per answer trends.
- A go/no‑go decision: expand to a second area, or fix gaps first.
Getting RAG‑ready is mainly information management, not model magic. If you invest a fortnight in tidy, well‑labelled, preservable documents and a week in testing retrieval quality, your copilot will feel useful without changing your entire stack — and you’ll be in a strong position to scale later. nationalarchives.gov.uk