AI Transform API
Preview feature. AI Transform is in active preview — the API surface, rule semantics, rate limits, and per-token pricing may change before general availability. Suitable for evaluation and non-critical workloads. Production deployments should pin a specific SDK version, monitor token spend in the dashboard, and avoid putting AI-rewritten responses on user-blocking critical paths until GA. See the Cachely status page for any active limitations.
AI Transforms let you rewrite, translate, shorten, or otherwise adapt CMS JSON responses at the Cachely proxy layer. The transformed result is what your frontend receives — the source CMS entry is never modified.
How it works
- You define one or more rules for a project. Each rule targets specific JSON fields and describes a transformation (translate, shorten, rewrite tone, or a custom instruction).
- Rules are grouped by an optional profile label (e.g. "de", "editorial").
- At request time, adding ?transform=<profile> to an API proxy request activates all active rules with that profile.
- The proxy executes the transformation server-side and returns the adapted JSON to the client.
- Transformed responses are cached at the edge — repeat requests with the same profile serve from cache without re-running the LLM.
AI Transforms only operate on application/json upstream responses. Asset responses, HTML, and non-JSON payloads are never transformed.
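As a minimal sketch of the request-time step, the helper below adds the transform query parameter to a proxy URL while keeping any other query parameters. The host name and the helper name with_transform are hypothetical; only the ?transform=<profile> parameter comes from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_transform(url: str, profile: str) -> str:
    """Return url with ?transform=<profile> set, preserving other query params."""
    parts = urlsplit(url)
    # Drop any existing transform parameter so the profile is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "transform"
    ]
    query.append(("transform", profile))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical proxy host, used for illustration only.
print(with_transform("https://example.cachely.test/~api/entries?limit=10", "de"))
```

Because profile matching is case-sensitive, pass the profile string exactly as it is stored on the rule.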
Rule configuration
Each rule requires:
| Field | Required | Description |
|---|---|---|
| fieldSelector | Yes | Dot-notation path to one or more fields (e.g. fields.title, data.items[*].description) |
| presetId | Yes | The transformation type; see the presets below |
| routePattern | No | Path prefix filter; only requests matching this prefix trigger the rule |
| profile | No | A string label used to group rules and activate them at runtime via ?transform=<profile> |
Preset modes
| Preset | What it does | Additional options |
|---|---|---|
| rewrite_tone | Rewrites content in a different tone | tone (e.g. formal, casual, playful) |
| shorten | Condenses the field to fewer words | none |
| translate | Translates the field to another language | targetLanguage (e.g. "de", "fr") |
| custom | Applies a free-form GPT instruction | customInstruction (text); optionally constraints and an example input/output pair |
Array fields use [*] notation — for example, items[*].title targets the title field of every item in an array.
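To make the selector notation concrete, here is an illustrative sketch of how a dot-notation selector with [*] could be expanded into concrete field paths. It is not Cachely's implementation; the function name resolve_selector and the choice to return nothing for a non-matching branch are assumptions made for the example.

```python
from typing import Any, List, Tuple


def resolve_selector(doc: Any, selector: str) -> List[Tuple[str, Any]]:
    """Return (concrete_path, value) pairs matched by a dot-notation selector.

    A segment ending in [*] fans out over every element of the array
    stored under that key.
    """
    matches: List[Tuple[str, Any]] = [("", doc)]
    for segment in selector.split("."):
        fan_out = segment.endswith("[*]")
        key = segment[:-3] if fan_out else segment
        next_matches: List[Tuple[str, Any]] = []
        for path, node in matches:
            if not isinstance(node, dict) or key not in node:
                continue  # selector does not apply to this branch
            child = node[key]
            child_path = f"{path}.{key}" if path else key
            if fan_out:
                if isinstance(child, list):
                    for index, item in enumerate(child):
                        next_matches.append((f"{child_path}[{index}]", item))
            else:
                next_matches.append((child_path, child))
        matches = next_matches
    return matches


entry = {"items": [{"title": "Hello"}, {"title": "World"}]}
print(resolve_selector(entry, "items[*].title"))  # one match per array element
print(resolve_selector(entry, "items.title"))     # no matches without [*]
```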
Rule statuses
| Status | Preview | Runtime |
|---|---|---|
| active | Yes | Yes |
| draft | Yes | No (never runs on live traffic) |
| paused | No | No |
draft is the default status for newly created rules. Activate a rule explicitly before it affects production traffic.
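The status table can be read as a simple lookup. The sketch below filters a list of rules down to those evaluated in a given mode; the rule dictionaries and their id and status keys are a hypothetical shape chosen for illustration.

```python
# Which rule statuses are evaluated in each mode, copied from the table above.
EVALUATED_STATUSES = {
    "preview": {"active", "draft"},
    "runtime": {"active"},
}


def rules_evaluated(rules, mode):
    """Return the rules whose status allows them to run in the given mode."""
    allowed = EVALUATED_STATUSES[mode]
    return [rule for rule in rules if rule["status"] in allowed]


rules = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "draft"},
    {"id": 3, "status": "paused"},
]
print([rule["id"] for rule in rules_evaluated(rules, "preview")])
print([rule["id"] for rule in rules_evaluated(rules, "runtime")])
```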
Preview vs. runtime
Preview
Preview lets you test a transformation against sample content before activating a rule.
- Endpoint: POST /api/tenants/{slug}/ai-transform/preview
- Runs immediately; no production traffic is affected
- Fail-closed: if token limits are exceeded, the request returns HTTP 429. The error is shown in the admin panel.
- Processes up to 5 fields per request
- Recorded in the run log with mode: "preview"
Runtime
Runtime transforms run on live API proxy traffic when a request includes ?transform=<profile>.
GET /~api/entries?transform=editorial
GET /~api/entries?transform=de
The profile value must match the profile field on at least one active rule. If no active rule matches, the request passes through untransformed.
- Fail-open: if token limits are exceeded or the LLM call fails, the original (untransformed) JSON is returned with status: "passthrough". No error is returned to the end client.
- Processes up to 20 fields per rule (subject to the tenant-level maxFieldsPerResponse limit)
- Recorded in the run log with mode: "runtime"
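The difference between fail-closed preview and fail-open runtime can be sketched as a small wrapper around a transform call. Everything here is illustrative: TokenLimitExceeded, run_transform, and the error body are hypothetical names, and the run status recorded for a rejected preview is not specified on this page, so it is left as None.

```python
class TokenLimitExceeded(Exception):
    """Hypothetical error raised when a token cap is hit."""


def run_transform(mode, original, transform):
    """Apply transform(original) using the failure policy of the given mode.

    Returns a tuple of (http_status, body, run_status).
    """
    try:
        return 200, transform(original), "applied"
    except Exception as exc:
        if mode == "runtime":
            # Fail-open: serve the original JSON unchanged, no client error.
            return 200, original, "passthrough"
        if isinstance(exc, TokenLimitExceeded):
            # Fail-closed: preview surfaces the limit as HTTP 429.
            return 429, {"error": str(exc)}, None
        raise  # other preview failures are not described here


def over_limit(_payload):
    raise TokenLimitExceeded("daily token cap reached")


print(run_transform("runtime", {"title": "Hi"}, over_limit))
print(run_transform("preview", {"title": "Hi"}, over_limit))
```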
Usage and limits
| Limit | Default | Notes |
|---|---|---|
| Monthly token cap | none | Total input + output tokens per calendar month (UTC) |
| Daily token cap | none | Total input + output tokens per UTC day |
| Max fields per response | 20 (runtime), 5 (preview) | Global across all rules in one request |
| Max characters per field | 8,000 | Configurable down, not up |
When a runtime transform is skipped because a limit is exceeded, the run log records status: "passthrough" with a reason string. Check the run detail to diagnose limit-related skips.
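A client-side pre-flight check against the default limits can help spot likely skips before traffic hits the proxy. This is a sketch under stated assumptions: preflight is a hypothetical helper, it only knows the defaults from the table above (not your tenant's configured values), and it does not predict how the proxy handles an oversized field.

```python
DEFAULT_MAX_FIELDS = {"runtime": 20, "preview": 5}
DEFAULT_MAX_CHARS_PER_FIELD = 8000


def preflight(fields, mode="runtime"):
    """Return warnings for targeted fields that exceed the default limits.

    fields maps a concrete field path to its string value.
    """
    warnings = []
    max_fields = DEFAULT_MAX_FIELDS[mode]
    if len(fields) > max_fields:
        warnings.append(
            f"{len(fields)} fields targeted; {mode} default is {max_fields}"
        )
    for path, value in fields.items():
        if len(value) > DEFAULT_MAX_CHARS_PER_FIELD:
            warnings.append(
                f"{path} has {len(value)} characters; "
                f"default maximum is {DEFAULT_MAX_CHARS_PER_FIELD}"
            )
    return warnings


print(preflight({"fields.body": "x" * 8001}))
```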
Run status values
| Status | Meaning |
|---|---|
| applied | All targeted fields were transformed successfully |
| partial | Some fields transformed; others failed (e.g. per-field LLM error) |
| passthrough | Original content returned unchanged: no matching active rules, non-JSON upstream response, or limits exceeded |
| error | Execution failed entirely (LLM error, timeout, etc.) |
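One way to read the table is as a summary of per-field outcomes. The sketch below is an interpretation, not the documented algorithm: it assumes that an empty set of targeted fields maps to passthrough and that a run where every field fails maps to error.

```python
def run_status(field_results):
    """Summarise per-field outcomes (True = transformed) into a run status."""
    if not field_results:
        return "passthrough"  # assumption: nothing was targeted
    if all(field_results):
        return "applied"
    if any(field_results):
        return "partial"
    return "error"  # assumption: every targeted field failed


print(run_status([True, True]))
print(run_status([True, False]))
```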
Observability
The Recent Runs panel in the admin dashboard shows:
- Timestamp, mode (preview/runtime), profile, status, fields attempted, rules evaluated, latency, token usage
- Filters by mode, status, and profile
- Expanding a row shows rule summary, per-field error details, and limit reason where applicable
What AI Transforms do not do
- They do not modify images, video, or any non-text content
- They do not transform asset proxy responses
- They do not process HTML, only application/json upstream responses
- They do not run on non-GET requests
- They do not guarantee deterministic output; LLM responses vary between runs
- draft rules never run in production, regardless of profile name
Common issues
Transforms not running in production: check that the rule status is active (not draft) and that the ?transform=<profile> value matches the rule's profile field exactly — case-sensitive.
Wrong fields being transformed: array fields require [*] notation (e.g. items[*].title, not items.title).
Runtime returning passthrough unexpectedly: check the run detail for the reason — exceeded token limits are the most common cause.
Preview works but runtime does not: preview evaluates draft + active rules; runtime evaluates only active rules.
API reference
All endpoints require a Clerk session. Role requirements are noted per endpoint.
GET /api/tenants/{slug}/ai-transform/rules
List all AI transform rules for a tenant.
- Auth: Clerk session + tenant access (viewer+)
- Response: { rules[] }
- Errors: 404 tenant not found
POST /api/tenants/{slug}/ai-transform/rules
Create a new AI transform rule.
- Auth: Clerk session + tenant access (editor+)
- Body: { presetId, fieldSelector, routePattern?, profile?, targetLanguage?, tone?, customInstruction? }
- Response: Created rule object
- Errors: 400 validation, 403 insufficient role, 404 tenant not found
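As a sketch of a request body for this endpoint, the helper below assembles the documented keys and rejects anything else before the request is sent. build_rule_body is a hypothetical helper; the server's own validation rules may differ, and sending the request (including the Clerk session) is left out.

```python
import json

PRESETS = {"rewrite_tone", "shorten", "translate", "custom"}
OPTIONAL_KEYS = {"routePattern", "profile", "targetLanguage", "tone", "customInstruction"}


def build_rule_body(preset_id, field_selector, **options):
    """Build the JSON body for POST /api/tenants/{slug}/ai-transform/rules."""
    if preset_id not in PRESETS:
        raise ValueError(f"unknown presetId: {preset_id!r}")
    if not field_selector:
        raise ValueError("fieldSelector is required")
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")
    return {"presetId": preset_id, "fieldSelector": field_selector, **options}


body = build_rule_body("translate", "fields.title", targetLanguage="de", profile="de")
print(json.dumps(body, indent=2))
```

New rules start in draft status, so a rule created this way does not affect live traffic until it is activated.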
PUT /api/tenants/{slug}/ai-transform/rules/{id}
Update an existing rule.
- Auth: Clerk session + tenant access (editor+)
- Body: Changed fields (partial update)
- Response: Updated rule object
- Errors: 400 validation, 403, 404 rule not found
DELETE /api/tenants/{slug}/ai-transform/rules/{id}
Delete a rule.
- Auth: Clerk session + tenant access (editor+)
- Response: { ok }
- Errors: 403, 404 rule not found
POST /api/tenants/{slug}/ai-transform/rules/{id}/activate
Toggle the active/draft status of a rule.
- Auth: Clerk session + tenant access (editor+)
- Response: Updated rule with new status
- Errors: 403, 404 rule not found
POST /api/tenants/{slug}/ai-transform/preview
Preview a transformation without saving or affecting production traffic.
- Auth: Clerk session + tenant access (editor+)
- Body: Rule config + sample content
- Response: { original, transformed }
- Errors: 400 validation, 403, 404, 429 token limit exceeded
POST /api/tenants/{slug}/ai-transform/regenerate
Force-refresh the edge cache for a single (profile, apiPath) pair. The handler fetches the upstream CMS response, applies the same URL rewrites the Worker would, runs the active rules through the runtime AI pipeline, and bumps a per-(tenant, profile) cache version so the next public ?transform=<profile> request misses and stores the regenerated body.
Use this when an active rule has been edited and you want the next visitor to see the new AI output immediately, or when the LLM produced a one-off bad answer that you want to overwrite without changing any rule. Cache invalidation is scoped to the (tenant, profile) pair — no global cache wipe.
- Auth: Clerk session + tenant access (editor+)
- Body: { profile: string, apiPath: string }. apiPath must start with /; an optional leading /~api prefix is stripped
- Response: { profile, apiPath, regenVersion, status, skipReason, errorCode, stats, body }. body is the regenerated JSON, regenVersion is the new per-profile counter, and status mirrors the run status values above
- Errors: 400 validation or non-JSON upstream, 403 insufficient role, 404 tenant not found / no active rules / no rules match the path, 502 upstream CMS error
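The apiPath rule in the body description can be sketched as follows. normalize_api_path is a hypothetical helper, and stripping the /~api prefix only at a path-segment boundary is an assumption; the server may treat edge cases differently.

```python
def normalize_api_path(api_path: str) -> str:
    """Validate apiPath and strip an optional leading /~api prefix."""
    if not api_path.startswith("/"):
        raise ValueError("apiPath must start with /")
    prefix = "/~api"
    # Assumption: only strip the prefix when it is a whole path segment.
    if api_path == prefix or api_path.startswith(prefix + "/"):
        api_path = api_path[len(prefix):] or "/"
    return api_path


print(normalize_api_path("/~api/entries"))
print(normalize_api_path("/entries"))
```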
GET /api/tenants/{slug}/ai-transform/runs
List recent transform runs with optional filters.
- Auth: Clerk session + tenant access (viewer+)
- Query params: limit, offset, mode (preview|runtime), status, profile
- Response: { runs[], total }
- Errors: 404 tenant not found
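A filtered run-log query can be assembled like this. runs_path is a hypothetical helper that only builds the path and query string from the documented parameters; authentication and the actual HTTP call are omitted.

```python
from urllib.parse import urlencode

RUN_FILTERS = ("limit", "offset", "mode", "status", "profile")


def runs_path(slug, **filters):
    """Build the path and query string for the runs listing endpoint."""
    unknown = set(filters) - set(RUN_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    if filters.get("mode") not in (None, "preview", "runtime"):
        raise ValueError("mode must be 'preview' or 'runtime'")
    # Keep a stable parameter order and skip filters that were not supplied.
    query = urlencode(
        [(key, filters[key]) for key in RUN_FILTERS if filters.get(key) is not None]
    )
    path = f"/api/tenants/{slug}/ai-transform/runs"
    return f"{path}?{query}" if query else path


# Example: find runtime requests that fell back to the original content.
print(runs_path("acme", mode="runtime", status="passthrough", limit=20))
```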
GET /api/tenants/{slug}/ai-transform/usage
Return current token usage and configured limits for the tenant.
- Auth: Clerk session + tenant access (viewer+)
- Response: { monthly: TokenUsage, daily: TokenUsage, limits }
- Errors: 404 tenant not found
GET /api/tenants/{slug}/ai-transform/profiles
Return the list of distinct profile slugs that have at least one active rule for the tenant. Used by the AI Experiments variant editor to populate the profile dropdown — activation validation in the experiments API enforces the same constraint server-side.
- Auth: Clerk session + tenant access (viewer+)
- Response: { profiles: string[] }
- Errors: 404 tenant not found
POST /api/ai-transform/execute
Internal execution endpoint called by the edge Worker for live transform profiles. Not intended for direct use.
- Auth: X-AI-Transform-Key header (shared secret between the proxy Worker and Nitro)
- Body: Transform profile, request URL, and JSON payload
- Response: Transformed JSON payload
- Errors: 401 invalid key, 400 missing fields, 500 transform failure
AI Experiments API
AI Experiments let you A/B test content variants at the proxy layer. An experiment defines a route pattern and a set of named variants; the Worker selects a variant per request and returns the matching content.
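This page does not describe how the Worker chooses a variant. As a rough mental model only, weighted selection over the variants' weight values could look like the sketch below; pick_variant, the default weight of 1, and the roll argument (a number in the range 0 to 1, supplied by the caller) are all assumptions.

```python
def pick_variant(variants, roll):
    """Pick a variant key by weight. roll is a float in [0, 1)."""
    # Assumption: a variant without an explicit weight counts as weight 1.
    weights = [variant.get("weight", 1) for variant in variants]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    threshold = roll * total
    cumulative = 0
    for variant, weight in zip(variants, weights):
        cumulative += weight
        if threshold < cumulative:
            return variant["key"]
    return variants[-1]["key"]  # guards against floating-point edge cases


variants = [{"key": "control", "weight": 3}, {"key": "de", "weight": 1}]
print(pick_variant(variants, 0.10))
print(pick_variant(variants, 0.90))
```

Passing the roll in explicitly keeps the function deterministic, which makes the split easy to test; a caller would typically derive it from a random number or a hash of a stable request attribute.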
GET /api/tenants/{slug}/ai-experiments
List all non-archived AI experiments and their variants for the tenant.
- Auth: Clerk session + tenant access (viewer+)
- Response: { experiments[] }
- Errors: 404 tenant not found
POST /api/tenants/{slug}/ai-experiments
Create a new AI experiment in draft status. Pass an optional variants array to set the initial variant set in the same call.
- Auth: Clerk session + tenant access (editor+)
- Body: { name, routePattern?, routeMatch?, variants?: [{ key, weight?, config? }] }
- Response: Created experiment object with variants
- Errors: 400 validation, 403 insufficient role, 404 tenant not found
GET /api/tenants/{slug}/ai-experiments/{id}
Return a single experiment and its current variants.
- Auth: Clerk session + tenant access (viewer+)
- Response: Experiment object with variants
- Errors: 400 invalid id, 404 experiment or tenant not found
PATCH /api/tenants/{slug}/ai-experiments/{id}
Update an experiment's name, route pattern, route match, or status. Transitioning status to active runs activation validation (variant count, profile existence, route overlap). On success the proxy KV config is re-synced so the Worker picks up the change within its cache TTL.
- Auth: Clerk session + tenant access (editor+)
- Body: Partial update with any combination of { name?, routePattern?, routeMatch?, status? }
- Response: Updated experiment object
- Errors: 400 validation or activation validation failure, 403, 404 experiment not found
DELETE /api/tenants/{slug}/ai-experiments/{id}
Soft-archive an experiment (sets status to archived). The row and variants are preserved so historical request log entries remain joinable.
- Auth: Clerk session + tenant access (editor+)
- Response: Archived experiment object
- Errors: 400 invalid id, 403, 404 experiment not found
PUT /api/tenants/{slug}/ai-experiments/{id}/variants
Transactionally replace the full variant set for an experiment. All existing variants are deleted and replaced with the supplied list in a single operation. On success the proxy KV config is re-synced.
- Auth: Clerk session + tenant access (editor+)
- Body: { variants: [{ key, weight?, config? }] }
- Response: Updated experiment object with new variants
- Errors: 400 validation, 403, 404 experiment not found
GET /api/tenants/{slug}/ai-experiments/{id}/exposures
Return aggregate exposure metrics for one experiment, sourced from request logs. Reports how many requests were routed to each variant. Phase 2A: all-time totals, no time-window filtering.
- Auth: Clerk session + tenant access (viewer+)
- Response: Exposure summary with per-variant counts
- Errors: 400 invalid id, 404 experiment or tenant not found