Why this guide, and why now
If you plan to renew or buy AI tooling in Q1, the biggest financial risks are not model accuracy or features: they’re contract terms that lock you into costs, throughput caps and data handling you can’t change later. Most AI platforms now bill by usage (typically the “tokens” your prompts and outputs consume), and they frequently update prices and tiers. Your goal is to buy outcomes with flexibility, not lines of compute you can’t use. openai.com
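To see how usage-based billing adds up, here is a minimal back-of-envelope cost model in Python. The per-million-token prices are placeholders, not any vendor’s actual rates; substitute the figures from your vendor’s current price page.

```python
# Back-of-envelope monthly cost model for token-based billing.
# Prices below are placeholders, NOT vendor quotes; use your vendor's
# current price page, and note that output tokens often cost more.

PRICE_IN_PER_M = 3.00    # assumed $ per 1M input tokens
PRICE_OUT_PER_M = 15.00  # assumed $ per 1M output tokens

def monthly_cost(calls_per_day: int, in_tokens: int, out_tokens: int,
                 days: int = 30) -> float:
    """Estimated monthly spend for one workflow."""
    total_in = calls_per_day * in_tokens * days
    total_out = calls_per_day * out_tokens * days
    return (total_in / 1e6) * PRICE_IN_PER_M + (total_out / 1e6) * PRICE_OUT_PER_M

# e.g. 2,000 support replies a day, ~1,200 input and ~400 output tokens each:
print(f"Estimated monthly spend: {monthly_cost(2000, 1200, 400):,.2f}")
```

Rerunning the model with output tokens doubled is a quick way to see why caps on output pricing matter.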
Capacity can bite, too. Vendors publish rate limits and tokens‑per‑minute quotas. These underpin response times at peak. If you don’t tie contract value to minimum capacity and clear service credits, your pilot may impress but production will stall. Microsoft’s Azure OpenAI documentation explicitly calls out tokens‑per‑minute (TPM) and requests‑per‑minute (RPM) quotas, with SLAs noted separately. Treat these as negotiable service levels, not a footnote. learn.microsoft.com
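Before accepting a quoted quota, sanity-check it against your own peak traffic. A minimal sketch; every figure is an assumption to replace with your contract numbers and pilot measurements:

```python
# Rough capacity check: does the contracted quota cover peak demand?
# All figures are illustrative; use your own traffic data and the
# TPM/RPM quotas from the vendor's rate-limit documentation.

CONTRACTED_TPM = 100_000        # tokens per minute in the quota (assumed)
CONTRACTED_RPM = 120            # requests per minute in the quota (assumed)

peak_requests_per_min = 90      # measured at your busiest hour
avg_tokens_per_request = 1_600  # input + output tokens, from pilot logs

needed_tpm = peak_requests_per_min * avg_tokens_per_request
tpm_headroom = CONTRACTED_TPM / needed_tpm
rpm_headroom = CONTRACTED_RPM / peak_requests_per_min

print(f"Need {needed_tpm:,} TPM at peak; "
      f"headroom x{tpm_headroom:.2f} (TPM), x{rpm_headroom:.2f} (RPM)")
if min(tpm_headroom, rpm_headroom) < 1.5:
    print("Under 1.5x headroom: negotiate a higher quota or burst allowance.")
```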
Finally, data control varies by product. Business‑grade offerings often default to “no training on your data,” while consumer apps can have different defaults. Understand the exact product tier you’re buying, and capture data retention and training terms in the order form, not just the website. openai.com
The Big 8 commercial levers to negotiate
1) Model portability and swap rights
- What it is: The right to change the underlying model (e.g., from Vendor A’s flagship model to a cheaper or faster alternative) without penalty.
- Why it matters: Models improve monthly and prices change. Multi‑model platforms such as Amazon Bedrock aggregate 100+ models from multiple providers under one API, so contractual portability is realistic. aws.amazon.com
- Ask for: A named “model swap” clause covering no re‑implementation fees, a 30‑day change window, and a commitment to preserve existing integrations.
2) Price protection and benchmarking
- What it is: Caps and downward‑only adjustments tied to public price pages.
- Why it matters: Token pricing differs for input vs output and shifts by model family; you want automatic reductions if list prices fall. openai.com
- Ask for: “Meet or beat” clause against the vendor’s own published pricing; optional right to move to batch/async tiers if introduced.
3) Capacity guarantees (TPM/RPM) and service credits
- What it is: Minimum tokens‑per‑minute and requests‑per‑minute in your region, with credits if breached.
- Why it matters: Quotas and rate limits determine actual throughput; treat them like an SLA line item. learn.microsoft.com
- Ask for: Named metrics per deployment/region, burst headroom for launches, and step‑ups when monthly volume is hit.
4) Data isolation and private networking
- What it is: Private connectivity (for example, AWS PrivateLink to Bedrock) and clear statements that your prompts and outputs are not used for training. docs.aws.amazon.com
- Why it matters: Reduces leakage and narrows incident blast radius.
- Ask for: Mandatory private egress, no internet‑routed calls, zero‑data‑retention options where supported, and explicit training opt‑out in the order form (a quick verification sketch follows).
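If your platform is Amazon Bedrock, a check like the sketch below can confirm a private endpoint actually exists in your account, rather than taking architecture slides on trust. It assumes boto3 is configured with credentials for the relevant account and that the PrivateLink service name follows AWS’s published `com.amazonaws.<region>.bedrock-runtime` pattern; adjust for your region and platform.

```python
# Verify a private path for model traffic: look for an interface VPC
# endpoint for the Bedrock runtime service. Region and service-name
# pattern are assumptions; check them against AWS's PrivateLink docs.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-2")
resp = ec2.describe_vpc_endpoints(
    Filters=[{
        "Name": "service-name",
        "Values": [f"com.amazonaws.{ec2.meta.region_name}.bedrock-runtime"],
    }]
)

endpoints = resp["VpcEndpoints"]
if endpoints:
    for ep in endpoints:
        print(ep["VpcEndpointId"], ep["State"], ep["VpcId"])
else:
    print("No private endpoint found: model calls may be using the public internet.")
```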
5) IP indemnity for outputs and training data
- What it is: Vendor defends you if generated output or underlying training data triggers third‑party IP claims.
- Why it matters: Major cloud vendors now offer forms of indemnity for generated output and training data; secure this in your contract and confirm pre‑conditions (e.g., leaving safety filters on). cloud.google.com
- Ask for: Indemnity to cover claims related to both output and training data; no carve‑outs that swallow the protection.
6) Exit, transition and data return
- What it is: A practical exit plan that includes data export, prompt libraries, evaluation artefacts and UI copy for handover.
- Why it matters: UK public sector playbooks highlight exit management to avoid legacy lock‑in — the principle applies equally to SMEs. Bake exit into the initial spec. gov.uk
- Ask for: Call‑off style schedule listing deliverables, assisted transition hours, and capped fees for export/wipe.
7) Transparency of retention and human review
- What it is: Which logs are retained, for how long, who can view them, and whether humans review prompts/outputs.
- Why it matters: Defaults differ between enterprise and consumer products; if you use a consumer app at work, training and retention may apply unless you actively opt out. openai.com
- Ask for: Log retention under 30 days, the training opt‑out confirmed in writing, and deletion SLAs post‑termination.
8) Governance alignment (lightweight)
- What it is: A modest commitment to align with recognised AI management practices (e.g., ISO/IEC 42001) without turning this into a compliance project. bsigroup.com
- Why it matters: Gives you a handle for audits and change control while staying proportionate.
- Ask for: Annual attestations and a named contact for risk reviews.
What good looks like: quick reference table
| Clause | What good looks like | Watch‑outs |
|---|---|---|
| Model swap | Right to change to any generally available model on the platform with no re‑implementation fees. | “Mutual agreement” language only; re‑certification charges on each swap. |
| Price protection | Downward‑only alignment with public pricing; batch or cache discounts if launched mid‑term. | Fixed rates above public list; output tokens priced far higher than input with no cap. openai.com |
| Capacity | Named TPM/RPM per region; credits for breaches; launch bursts allowed. | “Best efforts” throughput; shared pool with no priority. learn.microsoft.com |
| Networking | Private connectivity (e.g., AWS PrivateLink) mandated; no public internet path. docs.aws.amazon.com | “Private by design” marketing slides; no binding wording. |
| Data usage | No training on your business data by default; opt‑out recorded in order form and console. openai.com | Consumer app terms used for work; human review permitted by default. theverge.com |
| IP indemnity | Coverage for both generated output and training data, subject to using safety filters. cloud.google.com | Output‑only indemnity or broad exclusions that void cover. cloud.google.com |
| Exit | Export of data, prompts, evals and configs; assisted transition; certified data wipe. | “Delete on request” with no timeframe; exit support at time‑and‑materials only. gov.uk |
KPIs that keep you honest post‑signing
Track these monthly and set a red/amber line for each; a minimal calculation sketch follows the list:
- Cost per assisted outcome: e.g., cost per resolved ticket, per qualified lead, per approved grant letter.
- Output token ratio: output tokens as a % of input tokens; sudden spikes often signal prompt drift.
- Throughput achieved vs contracted: actual TPM/RPM at peak compared to the minimum in the contract. learn.microsoft.com
- Cache/batch hit rate: % of calls benefiting from vendor caching or batch discounts, if available. platform.openai.com
- Incident minutes: user‑visible degradation or timeouts per month; tie to credits where possible. learn.microsoft.com
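A minimal sketch of how the token-based KPIs above can be rolled up from per-call usage records. The field names (`input_tokens`, `output_tokens`, `cost`, `outcome`) are assumptions; map them onto whatever your vendor’s usage export or your own logging provides.

```python
# Monthly KPI rollup from per-call usage records. Field names are
# assumptions; adapt to your vendor's usage export or your own logs.

calls = [
    {"input_tokens": 1200, "output_tokens": 380, "cost": 0.012, "outcome": "resolved"},
    {"input_tokens": 1100, "output_tokens": 950, "cost": 0.021, "outcome": "escalated"},
    # ... one record per call for the month
]

total_in = sum(c["input_tokens"] for c in calls)
total_out = sum(c["output_tokens"] for c in calls)
total_cost = sum(c["cost"] for c in calls)
resolved = sum(1 for c in calls if c["outcome"] == "resolved")

output_ratio = total_out / total_in               # sudden spikes suggest prompt drift
cost_per_outcome = total_cost / max(resolved, 1)  # e.g. cost per resolved ticket

print(f"Output/input token ratio: {output_ratio:.0%}")
print(f"Cost per resolved outcome: {cost_per_outcome:.4f}")
```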
If your KPIs worsen two months in a row, use your benchmarking and swap clauses. For a structured way to wire these into operations, see our 30‑day AI observability sprint and AI unit economics 30‑60‑90 plan.
Your 30‑day pilot plan with exit built in
Days 1–5: Define scope and accept‑or‑reject criteria
- Pick one high‑volume flow (e.g., top 10 customer questions) and set success criteria: response quality, handle time, and cost per outcome.
- List your integration points and data sources; mandate private networking where supported. docs.aws.amazon.com
Days 6–15: Build, measure, and cap cost
- Use a capped budget and alerting on token use; plan for both input and output tokens (a guardrail sketch follows this list). openai.com
- Test at peak traffic; record the real TPM/RPM achieved so you can negotiate capacity credibly. learn.microsoft.com
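A guardrail sketch that ties those two bullets together: it tracks spend against a cap and measures achieved tokens-per-minute over a rolling window, so the pilot produces the exact numbers you need for the capacity schedule. The budget and alert threshold are illustrative.

```python
# Pilot guardrail: cap spend and record achieved throughput so the
# figures can go straight into contract schedules. Values are illustrative.
import time
from collections import deque

BUDGET = 500.00     # pilot spend cap (assumed)
ALERT_AT = 0.8      # warn at 80% of budget
spend = 0.0
window = deque()    # (timestamp, tokens) pairs for a rolling 60 seconds

def record_call(cost: float, tokens: int) -> None:
    """Call once per model response with its cost and total token count."""
    global spend
    spend += cost
    now = time.time()
    window.append((now, tokens))
    while window and now - window[0][0] > 60:   # drop entries older than 60s
        window.popleft()
    achieved_tpm = sum(t for _, t in window)
    print(f"Achieved TPM (rolling 60s): {achieved_tpm:,}")
    if spend >= BUDGET * ALERT_AT:
        print(f"ALERT: {spend:.2f} of {BUDGET:.2f} pilot budget used")
```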
Days 16–30: Decide go/no‑go and secure terms
- If quality and unit cost meet your thresholds, convert to a contract that carries over pilot prompts, evaluation data and capacity figures as schedules.
- If not, exercise exit: export data, wipe logs, and apply learning to an alternative model under your swap rights.
For a fast way to compare platforms, see our companion piece, The 2026 AI Vendor Scorecard.
Procurement questions to copy‑paste into your RFI
- Pricing and capacity
  - Confirm input vs output pricing and any batch or cache discounts. Provide a calculator or worked examples. openai.com
  - State guaranteed TPM and RPM per region for our anticipated volumes; attach standard rate‑limit tiers and SLA. learn.microsoft.com
- Data and networking
  - Confirm whether our prompts/outputs will ever be used for model training at our chosen tier; include the policy URL and opt‑out method. openai.com
  - Describe private connectivity options (e.g., VPC endpoints/PrivateLink) and what traffic, if any, traverses the public internet. docs.aws.amazon.com
- IP and compliance
  - Provide generated‑output and training‑data indemnity wording, including any pre‑conditions (e.g., safety filters). cloud.google.com
  - Outline any alignment to recognised management standards such as ISO/IEC 42001 and what evidence you can share. bsigroup.com
- Exit and portability
  - List the artefacts we receive on exit (prompt libraries, eval datasets, configuration files) and the timeline and cost caps.
  - Explain support for running on alternative models or providers without refactoring. Reference any marketplace or model‑garden capabilities. aws.amazon.com
Common pitfalls we still see
- Buying the wrong product tier: teams pilot an enterprise‑grade API but sign up for a consumer app where training defaults differ. Write the exact SKU in the order form. openai.com
- Throughput aspirations without numbers: “Supports peak” is meaningless; insist on TPM/RPM commitments and credits. learn.microsoft.com
- Output‑only indemnity: ensure training data is covered too; some vendors split the two. cloud.google.com
- No exit schedule: without a checklist and hours set aside for transition, exit will be slow and expensive. Public sector guidance on avoiding lock‑in is a good template even for SMEs. gov.uk
When things do go wrong, run a short, blameless review and capture contractual learnings. Use our 60‑minute AI incident review to make that painless.
Who should own what
Directors and trustees
- Approve business‑level KPIs and red lines (price protection, exit, IP cover).
- Agree the minimal governance stance (e.g., align to ISO/IEC 42001 “lite”). bsigroup.com
Operations and service owners
- Define peak demand profiles and measure achieved TPM/RPM in pilots. learn.microsoft.com
- Maintain the model‑swap backlog: when would a cheaper/faster model actually save money without hurting outcomes?
Final word: flexibility beats features
Most AI vendors now publish token‑based pricing, evolving tiers, and clear capacity documentation. Combine that transparency with a contract that protects price, performance and portability, and you will avoid the three killers of SME AI projects: bill shock, throttling at peak, and hard lock‑in. The practical steps in this guide — model swap rights, capacity SLAs, private networking, robust IP indemnity and an exit schedule — will carry you through 2026, whatever the model leaderboard looks like in six months’ time. openai.com