[Image: Two people reviewing an AI vendor contract with pricing, SLAs and data terms highlighted]
Vendor selection & procurement

Nine AI procurement traps UK SMEs can avoid in three weeks

Buying AI platforms and copilots now feels less like choosing software and more like signing up to a utility: variable consumption, fast-moving model line‑ups, and terms that quietly shift each quarter. For busy UK SME and charity leaders, the risk isn’t just overpaying—it’s locking your team into a deal that’s hard to exit when your real‑world usage turns out different.

This guide is a practical buyer’s checklist. We highlight nine traps we see in AI contracts, with fast fixes you can apply in your next procurement, plus a three‑week down‑select plan you can run without hiring a squad of consultants.

If you want deeper evaluation methods after you shortlist, pair this with our quality playbooks: the 10 tests that predict AI quality and the 5‑day evaluation sprint for model selection. Both are designed for non‑technical teams and map neatly onto the process below.

Why AI procurement feels harder than SaaS

  • Pricing changes with usage. Most platforms bill per “token” or similar unit—small pieces of text—so bills fluctuate with demand and prompt length. A simple rule of thumb is that 100 tokens ≈ 75 words, helpful for estimating content workloads. help.openai.com
  • Terms evolve fast. New models arrive, old ones are deprecated, and “included” features move into paid tiers mid‑contract.
  • Assurance is not standardised. The UK government is building out an AI assurance ecosystem; procurement should ask vendors how they align to recognised assurance techniques. gov.uk
  • Lock‑in risk. Integration kits, proprietary file formats, and closed embeddings can make migration expensive unless you insist on open standards up front. gov.uk

The three‑week down‑select plan (works for SMEs and charities)

Week 1: Frame the decision

  • Define one or two “money moments” you want the AI to improve (for example, reducing time to respond to customer emails; summarising tender packs).
  • Fix the acceptance bar: the minimum quality, latency, and cost per task you’ll accept. Borrow from our 10 tests.
  • Draft the risks you won’t accept: training on your prompts, no export path, or no uptime commitment beyond “best efforts”.

Week 2: Ask short, pointed questions

  • Send a 2‑page request with your acceptance bar and a standard question set (see “Procurement questions” below). Ask for written answers and a 45‑minute demo focussed only on your money moments.
  • Request a simple cost simulation using your volumes.
  • Ask vendors to map their controls to recognised UK approaches such as the NCSC Cloud Security Principles and supplier assurance questions. gov.uk

Week 3: Run a bake‑off and negotiate

  • Run a one‑day bake‑off using the same inputs for all vendors; score against your acceptance bar (a simple scoring sketch follows this list). For examples of tasks and scoring sheets, see our 5‑day evaluation sprint.
  • Negotiate the 5 levers that de‑risk costs: price hold, downgrade rights, minimum‑commitment caps, token caps, and exit‑assist (more below).
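
To keep the bake‑off comparable, it helps to write the acceptance bar down as numbers and score every vendor against the same thresholds. The sketch below is a minimal illustration in Python; the thresholds, vendor names and scores are hypothetical placeholders, not benchmarks.

```python
# Minimal bake-off scoring sketch: pass/fail each vendor against the same
# acceptance bar. All figures are hypothetical placeholders.

ACCEPTANCE_BAR = {
    "min_quality_pass_rate": 0.80,   # share of gold-set tasks answered acceptably
    "max_latency_seconds": 5.0,      # slowest acceptable response at peak
    "max_cost_per_task_gbp": 0.05,   # all-in cost per completed task
}

vendor_results = {
    "Vendor A": {"quality_pass_rate": 0.86, "latency_seconds": 3.2, "cost_per_task_gbp": 0.04},
    "Vendor B": {"quality_pass_rate": 0.78, "latency_seconds": 2.1, "cost_per_task_gbp": 0.03},
}

def meets_bar(result: dict) -> dict:
    """Return a pass/fail verdict per criterion for one vendor."""
    return {
        "quality": result["quality_pass_rate"] >= ACCEPTANCE_BAR["min_quality_pass_rate"],
        "latency": result["latency_seconds"] <= ACCEPTANCE_BAR["max_latency_seconds"],
        "cost": result["cost_per_task_gbp"] <= ACCEPTANCE_BAR["max_cost_per_task_gbp"],
    }

for vendor, result in vendor_results.items():
    verdict = meets_bar(result)
    print(f"{vendor}: {'PASS' if all(verdict.values()) else 'FAIL'} {verdict}")
```

Anything that fails the bar drops out before you negotiate; price only matters among vendors that clear it.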

The nine traps (and fast fixes)

1) Double‑charging for access and usage

Trap: Paying per seat and per token for the same workflow (e.g., a per‑user “copilot” fee plus metered usage). Fast fix: pick one basis. If you must accept both, cap the variable part monthly and insist on automatic downgrade if seats are unused for 60 days.

2) Silent model swaps and SKU drift

Trap: The vendor reserves the right to replace the model powering your use case, changing quality or cost. Fast fix: lock named models and maximum per‑1,000‑token rates into the order form. If a model is withdrawn, you choose the replacement or can terminate that module without penalty.

3) Minimum commitments that don’t fit seasonal demand

Trap: Twelve‑month minimums based on optimistic forecasts. Fast fix: insist on quarterly true‑ups and the ability to bank unused credits for 12 months. If they won’t move, reduce the minimum by 40% and add token caps in the admin console.

4) SLA credits that don’t cover your losses

Trap: Uptime looks strong on paper, but a 99.9% “three nines” SLA still permits roughly 44 minutes of downtime per month—and the remedy is usually a service credit, not a cash refund. Fast fix: convert service credits to cash on termination and add incident reporting time limits. acalculator.co.uk

5) Data training and retention surprises

Trap: Consumer plans may use your chats to train models by default, with longer retention; enterprise plans often exclude training by default. Fast fix: get it in writing—no use of your prompts or outputs for training; define retention; and ensure you can opt out at tenant level. Microsoft’s “commercial data protection” for Entra ID users is an example of explicit non‑training terms; some vendors recently shifted consumer defaults to opt‑out. microsoft.com

6) Hidden lock‑in through formats and “free” SDKs

Trap: Proprietary vector stores, prompt graphs or export‑hostile formats that make switching costly. Fast fix: require open standards for APIs and data formats in your requirements and contract. UK government guidance highlights using open standards to reduce vendor lock‑in—good practice for SMEs too. gov.uk

7) Support, SSO and audit logs treated as add‑ons

Trap: Enterprise essentials (SSO, SCIM, audit logging) cost extra and only appear in top tiers. Fast fix: include them in the base price and define response times for P1 incidents.

8) Weak assurance story

Trap: Vague claims of “secure by design”, no third‑party assurance, or certifications “in progress”. Fast fix: ask for their roadmap to recognised standards and assurance. For example, the UK is piloting accreditation for ISO/IEC 42001 (AI management systems); ask whether they’re pursuing it and who audits them today. ukas.com

9) No exit plan

Trap: You can’t export prompts, logs and embeddings in a usable format or within a reasonable timeframe. Fast fix: bake in a 30‑day exit‑assist clause, bulk export formats, and a named migration contact. Ask for a costed exit plan up front.

Your standard procurement questions (copy/paste)

  1. Pricing. What are the unit prices we’ll actually pay for our shortlisted workflows (per 1,000 tokens or per task)? What overage protections exist?
  2. Rate limits. What are default and burst limits? Can we reserve capacity for peak hours?
  3. Model stability. Which named models will power our use cases? What notice and choice do we have if they change?
  4. Quality guarantees. What KPIs will you sign for accuracy or containment of off‑topic answers for our specific tasks?
  5. Data usage. Confirm in the order form: no training on our prompts or outputs; retention and deletion timelines; who can access data and why. Note: some vendors differentiate between consumer and enterprise defaults—be explicit. microsoft.com
  6. Security. Map your controls to the NCSC Cloud Security Principles and answer supplier assurance questions or equivalent. Provide recent pen‑test and SOC 2/ISO 27001 reports. gov.uk
  7. Open standards. Which open standards do your APIs and export formats follow, and how will you support migration? gov.uk
  8. SLA and support. What’s the monthly uptime commitment and remedy? We prefer credits convertible to cash on termination. Confirm P1 response/restore times. aws.amazon.com
  9. Assurance roadmap. Are you pursuing ISO/IEC 42001 or equivalent, and with which certifier? ukas.com
  10. Admin controls. Can we set tenant‑wide content filters, token caps, and data‑loss prevention rules?
  11. Audit. Do we get per‑user audit logs and API logs retained for at least 180 days?
  12. Sub‑processors. List all core sub‑processors and regions used for our workloads; notify us 30 days before changes.
  13. Exit. Confirm bulk export of prompts, outputs, and embeddings in open formats; 30‑day exit‑assist on request.
  14. Training & enablement. Provide a success plan with named adoption milestones for our teams.
  15. Value proof. Run a one‑week pilot at your cost using our tasks; we’ll measure saved minutes and error rates.

Costing quickly: three simple scenarios

Use the token rule of thumb to size costs for your shortlisted tasks. As context, 100 tokens ≈ 75 words. Drafts, summaries and email replies often land at 150–400 tokens each way; longer analyses can run to a few thousand (a quick sizing sketch follows the scenarios below). help.openai.com

For each shortlisted scenario, estimate the volume and rough token footprint, then put the right questions to the vendor:

  • Customer email replies: 1,000 replies/month, roughly 250–600 tokens per reply round‑trip. Ask about price per 1,000 tokens, caching/"memory" pricing, monthly caps, and latency at peak hours.
  • Tender summarisation: 40 documents/month, 5,000–20,000 tokens per document. Ask about model context limits, batch pricing, and the ability to swap to cheaper models for bulk summarisation.
  • Knowledge search (RAG): 20 users, daily use, 500–2,000 tokens per query. Ask about hybrid search options, vector store egress fees, the export format for embeddings, and per‑query caching.
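
To turn a scenario into a rough monthly number, multiply volume by tokens per task and by the per‑1,000‑token rate on the order form. The sketch below shows the arithmetic only; the £0.002 per 1,000 tokens rate and the volumes are hypothetical placeholders to replace with figures from the vendor's cost simulation.

```python
# Rough monthly cost sizing using the rule of thumb 100 tokens ≈ 75 words.
# The unit price and task volumes below are hypothetical placeholders.

PRICE_PER_1K_TOKENS_GBP = 0.002  # replace with the rate quoted on the order form

def words_to_tokens(words: int) -> int:
    """Estimate tokens from a word count (100 tokens ≈ 75 words)."""
    return round(words * 100 / 75)

def monthly_cost_gbp(tasks_per_month: int, tokens_per_task: int,
                     price_per_1k: float = PRICE_PER_1K_TOKENS_GBP) -> float:
    """Tasks per month × tokens per task × unit price per 1,000 tokens."""
    return tasks_per_month * tokens_per_task / 1000 * price_per_1k

scenarios = {
    "Customer email replies": (1000, 600),   # 1,000 replies/month, upper-bound round-trip tokens
    "Tender summarisation": (40, 20000),     # 40 documents/month, upper-bound tokens per document
    "Knowledge search (RAG)": (440, 2000),   # assumes 20 users x ~22 working days, one query each
}

print(f"A 300-word reply is roughly {words_to_tokens(300)} tokens.")
for name, (volume, tokens) in scenarios.items():
    print(f"{name}: ~£{monthly_cost_gbp(volume, tokens):.2f}/month at the placeholder rate")
```

Re‑run the numbers with each vendor's quoted rates, and again with prompt lengths 50% higher than you expect; that margin is what monthly caps and overage protections need to absorb.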

For SLAs, sanity‑check the minutes behind the percentages. “Three nines” permits around 44 minutes of downtime monthly; decide if that’s tolerable during your busiest trading windows, and agree maintenance windows and incident comms. acalculator.co.uk
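
If you want to check the downtime arithmetic yourself, it is just the unavailable fraction multiplied by the minutes in a month. A minimal sketch:

```python
# Downtime a monthly uptime commitment still permits,
# using an average month of 30.44 days (365.25 / 12).

MINUTES_PER_MONTH = 365.25 / 12 * 24 * 60  # ~43,830 minutes

def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per month permitted by the stated availability."""
    return MINUTES_PER_MONTH * (1 - availability_percent / 100)

for sla in (99.0, 99.5, 99.9, 99.95, 99.99):
    print(f"{sla}% uptime allows ~{allowed_downtime_minutes(sla):.0f} minutes of downtime per month")
```

Run your busiest trading window through the same sum: 44 minutes spread across a quiet month is very different from 44 minutes during peak donations or order processing.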

Governance that won’t slow you down

KPIs to review monthly

  • Cost per task and per team (trend and forecast to year‑end; see the run‑rate sketch after this list).
  • Adoption (active users, repeat use).
  • Quality (pass rate on your acceptance tests; accuracy on a 10‑item gold set).
  • Incidents (P1/P2 count; time to resolve; SLA breaches and credits earned). aws.amazon.com
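
For the cost KPI, a simple run‑rate forecast is usually enough to spot a minimum commitment you are about to overshoot or underuse. A minimal sketch, with hypothetical spend and task figures:

```python
# Run-rate forecast: cost per task to date and projected annual spend.
# The monthly spend and task counts are hypothetical placeholders.

monthly_spend_gbp = [310, 420, 465]   # spend in each completed month of the contract year
monthly_tasks = [5200, 6900, 7400]    # AI-assisted tasks completed in the same months

cost_per_task = sum(monthly_spend_gbp) / sum(monthly_tasks)
run_rate_annual = sum(monthly_spend_gbp) / len(monthly_spend_gbp) * 12

print(f"Cost per task to date: £{cost_per_task:.3f}")
print(f"Year-end forecast at current run rate: £{run_rate_annual:,.0f}")
```

Compare the forecast against your minimum commitment and the downgrade triggers you negotiated; two months of drift is your cue to act.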

Lightweight controls

  • Tenant‑wide token caps and rate limits in admin settings.
  • Named approver for any model change or feature enabling public data connections.
  • Quarterly vendor review covering assurance posture. If a vendor claims alignment to recognised UK security principles, ask for the mapping and any independent assurance. gov.uk

Negotiation levers most SMEs can win

  • Price hold: freeze unit pricing for 12 months; floor on any reductions.
  • Downgrade rights: if usage is 25% below forecast for two months, you can reduce the tier immediately.
  • Minimum cap: limit minimum commitments to 60–70% of forecast; quarterly true‑ups.
  • SLA remedy: convert service credits to cash on termination and add credits for response‑time breaches, not only availability. aws.amazon.com
  • Exit‑assist: 30 days of migration support plus bulk export in open formats. gov.uk
  • Assurance path: commitment to a timeline for third‑party assurance (e.g., ISO/IEC 42001) and to share audit summaries annually. ukas.com

Who does what (so it actually ships)

  • Executive sponsor: signs the acceptance bar and the money moments.
  • Operations lead: runs the three‑week plan, coordinates the bake‑off and scores vendors against the same tasks.
  • Legal/Procurement: inserts the fixes above into order forms and schedules, not just the master agreement. For a deeper contract checklist and clause ideas, see The UK SME Buyer’s Playbook for AI Contracts.
  • DPO/Security lead: reviews data usage, retention and export; asks vendors to map to NCSC cloud principles and supplier assurance questions; reviews any sub‑processor changes. gov.uk

Useful references when vendors push back

  • Open standards reduce lock‑in: UK Government guidance. gov.uk
  • NCSC Cloud Security Principles and supplier assurance as a baseline for cloud/SaaS due diligence. gov.uk
  • AI assurance is growing: use it to structure proportionate checks in procurement. gov.uk
  • ISO/IEC 42001 accreditation in the UK is progressing via a UKAS pilot—ask where the vendor is on that journey. ukas.com

With these anchors—and a simple, time‑boxed process—you can get to a sensible, safe decision in three weeks, with costs you can predict and an exit plan you control.