AI Transform API

Preview feature. AI Transform is in active preview — the API surface, rule semantics, rate limits, and per-token pricing may change before general availability. Suitable for evaluation and non-critical workloads. Production deployments should pin a specific SDK version, monitor token spend in the dashboard, and avoid putting AI-rewritten responses on user-blocking critical paths until GA. See the Cachely status page for any active limitations.

AI Transforms let you rewrite, translate, shorten, or otherwise adapt CMS JSON responses at the Cachely proxy layer. The transformed result is what your frontend receives — the source CMS entry is never modified.

How it works

You define one or more rules for a project. Each rule targets specific JSON fields and describes a transformation (translate, shorten, rewrite tone, or a custom instruction).
Rules are grouped by an optional profile label (e.g. "de", "editorial").
At request time, adding ?transform=<profile> to an API proxy request applies every active rule with that profile.
The proxy executes the transformation server-side and returns the adapted JSON to the client.
Transformed responses are cached at the edge — repeat requests with the same profile serve from cache without re-running the LLM.
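
As a minimal client-side sketch of the activation step, the helper below adds the transform parameter to a proxy URL using only the Python standard library. The host name and the helper name `with_transform` are illustrative assumptions, not part of the Cachely API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_transform(url: str, profile: str) -> str:
    """Return url with transform=<profile> set, keeping other query params."""
    parts = urlsplit(url)
    # Drop any existing transform parameter so the profile is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "transform"
    ]
    query.append(("transform", profile))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical host; only the /~api path and the transform parameter come from the docs.
url = with_transform("https://example.com/~api/entries?limit=10", "de")
```

Because the profile is part of the query string, each profile produces a distinct URL, which fits the per-profile edge caching described above.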

AI Transforms only operate on application/json upstream responses. Asset responses, HTML, and non-JSON payloads are never transformed.

Rule configuration

Each rule accepts the following fields:

Field	Required	Description
`fieldSelector`	Yes	Dot-notation path to one or more fields (e.g. `fields.title`, `data.items[*].description`)
`presetId`	Yes	The transformation type — see presets below
`routePattern`	No	Path prefix filter; only requests matching this prefix trigger the rule
`profile`	No	A string label used to group rules and activate them at runtime via `?transform=<profile>`

Preset modes

Preset	What it does	Additional options
`rewrite_tone`	Rewrites content in a different tone	`tone` (e.g. `formal`, `casual`, `playful`)
`shorten`	Condenses the field to fewer words	—
`translate`	Translates the field to another language	`targetLanguage` (e.g. `"de"`, `"fr"`)
`custom`	Applies a free-form LLM instruction	`customInstruction` (text); optionally `constraints` and an `example` input/output pair

Array fields use [*] notation — for example, items[*].title targets the title field of every item in an array.
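
To make the selector notation concrete, here is a small sketch of how a dot-notation path with `[*]` could be resolved against a JSON document. It illustrates how the notation reads and is not Cachely's actual resolver; the function name `select` and the sample document are assumptions.

```python
from typing import Any, List


def select(doc: Any, selector: str) -> List[Any]:
    """Return every value addressed by a dot-notation selector with [*] wildcards."""
    nodes = [doc]
    for segment in selector.split("."):
        wildcard = segment.endswith("[*]")
        key = segment[:-3] if wildcard else segment
        matched: List[Any] = []
        for node in nodes:
            if not isinstance(node, dict) or key not in node:
                continue  # the path does not exist on this node
            value = node[key]
            if wildcard:
                # [*] fans out over every element of an array.
                if isinstance(value, list):
                    matched.extend(value)
            else:
                matched.append(value)
        nodes = matched
    return nodes


sample = {
    "fields": {"title": "Hello"},
    "data": {"items": [{"description": "first"}, {"description": "second"}]},
}
titles = select(sample, "fields.title")
descriptions = select(sample, "data.items[*].description")
```

In this sketch a selector that omits `[*]` on an array (for example `data.items.description`) matches nothing, which is one way to picture why the wildcard is required for array fields.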

Rule statuses

Status	Runs in preview	Runs at runtime
`active`	Yes	Yes
`draft`	Yes	No — never runs on live traffic
`paused`	No	No

draft is the default status for newly created rules. Activate a rule explicitly before it affects production traffic.

Preview vs. runtime

Preview

Preview lets you test a transformation against sample content before activating a rule.

Endpoint: POST /api/tenants/{slug}/ai-transform/preview
Runs immediately; no production traffic is affected
Fail-closed: if token limits are exceeded, the request returns HTTP 429. The error is shown in the admin panel.
Processes up to 5 fields per request
Recorded in the run log with mode: "preview"

Runtime

Runtime transforms run on live API proxy traffic when a request includes ?transform=<profile>.

GET /~api/entries?transform=editorial
GET /~api/entries?transform=de

The profile value must match the profile field on at least one active rule. If no active rule matches, the request passes through untransformed.

Fail-open: if token limits are exceeded or the LLM call fails, the original (untransformed) JSON is returned with status: "passthrough". No error is returned to the end client.
Processes up to 20 fields per response, counted across all matching rules (subject to the tenant-level maxFieldsPerResponse limit)
Recorded in the run log with mode: "runtime"
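
The runtime matching described above can be sketched as a simple filter: only active rules, an exact profile match, and an optional path-prefix check. This is an illustration under stated assumptions, not the proxy's implementation; the `Rule` class and `runtime_rules` function are hypothetical, and whether the route prefix is compared with or without the /~api segment is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    field_selector: str
    preset_id: str
    status: str = "draft"  # draft is the default for new rules
    profile: Optional[str] = None
    route_pattern: Optional[str] = None


def runtime_rules(rules: List[Rule], profile: str, path: str) -> List[Rule]:
    """Return the rules that would apply to a runtime request."""
    return [
        rule
        for rule in rules
        if rule.status == "active"  # draft and paused rules never run at runtime
        and rule.profile == profile  # exact, case-sensitive comparison
        and (rule.route_pattern is None or path.startswith(rule.route_pattern))
    ]


rules = [
    Rule("fields.title", "translate", status="active", profile="de"),
    Rule("fields.body", "translate", status="draft", profile="de"),
    Rule("fields.title", "shorten", status="active", profile="DE"),
    Rule("fields.summary", "shorten", status="active", profile="de", route_pattern="/blog"),
]
selected = runtime_rules(rules, "de", "/entries")
```

An empty result corresponds to the documented case where the request passes through untransformed.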

Usage and limits

Limit	Default	Notes
Monthly token cap	none	Total input + output tokens per calendar month (UTC)
Daily token cap	none	Total input + output tokens per UTC day
Max fields per response	20 (runtime), 5 (preview)	Global across all rules in one request
Max characters per field	8,000	Can be lowered but not raised

When a runtime transform is skipped because a limit is exceeded, the run log records status: "passthrough" with a reason string. Check the run detail to diagnose limit-related skips.
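
The difference between fail-closed preview and fail-open runtime can be sketched as follows. This is a hedged illustration: the function name, the idea of checking an estimated token count before the LLM call, and the exact reason text are assumptions; only the 429 response for preview and the passthrough status for runtime come from this page.

```python
from typing import Any, Dict, Optional


def apply_token_cap(
    mode: str, used: int, estimate: int, cap: Optional[int], original: Any
) -> Dict[str, Any]:
    """Decide what happens when a run would push usage past a token cap."""
    # A cap of None models the default of "none" (no cap configured).
    if cap is None or used + estimate <= cap:
        return {"run": True}
    if mode == "preview":
        # Fail-closed: the preview request is rejected with HTTP 429.
        return {"run": False, "http_status": 429}
    # Fail-open: the client still receives the original, untransformed JSON.
    return {
        "run": False,
        "status": "passthrough",
        "reason": "token limit exceeded",  # hypothetical reason text
        "body": original,
    }


entry = {"fields": {"title": "Hello"}}
preview_result = apply_token_cap("preview", used=990, estimate=50, cap=1000, original=entry)
runtime_result = apply_token_cap("runtime", used=990, estimate=50, cap=1000, original=entry)
```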

Run status values

Status	Meaning
`applied`	All targeted fields were transformed successfully
`partial`	Some fields transformed; others failed (e.g. per-field LLM error)
`passthrough`	Original content returned unchanged — no matching active rules, non-JSON upstream response, or limits exceeded
`error`	Execution failed entirely (LLM error, timeout, etc.)
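
As a sketch of how these status values might be used when reviewing run logs, the function below counts runtime runs by status and collects those that did not fully apply. The run dictionaries and their `mode`, `status`, and `reason` keys are assumed shapes for illustration; the actual run log schema is not specified on this page.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_runtime_runs(runs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count runtime runs per status and list the ones worth inspecting."""
    runtime = [run for run in runs if run.get("mode") == "runtime"]
    counts = Counter(run.get("status") for run in runtime)
    # Anything other than "applied" means at least one field kept its original text.
    needs_review = [run for run in runtime if run.get("status") != "applied"]
    return {"counts": dict(counts), "needs_review": needs_review}


sample_runs = [
    {"mode": "runtime", "status": "applied"},
    {"mode": "runtime", "status": "passthrough", "reason": "token limit exceeded"},
    {"mode": "preview", "status": "error"},
    {"mode": "runtime", "status": "partial"},
]
summary = summarize_runtime_runs(sample_runs)
```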

Observability

The Recent Runs panel in the admin dashboard shows:

Timestamp, mode (preview / runtime), profile, status, fields attempted, rules evaluated, latency, token usage
Filters by mode, status, and profile
Expanding a row shows rule summary, per-field error details, and limit reason where applicable

What AI Transforms do not do

They do not modify images, video, or any non-text content
They do not transform asset proxy responses
They do not process HTML — only application/json upstream responses
They do not run on non-GET requests
They do not guarantee deterministic output — LLM responses vary between runs
draft rules never run in production, regardless of profile name
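
The exclusions above can be condensed into a rough eligibility check. This is a sketch only: the function name is hypothetical, and treating a content type with parameters (such as a charset) as JSON is an assumption about how the proxy reads the header.

```python
from urllib.parse import parse_qs


def is_transform_candidate(method: str, upstream_content_type: str, query_string: str) -> bool:
    """Rough check for whether a proxied request could be transformed at all."""
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = upstream_content_type.split(";", 1)[0].strip().lower()
    # parse_qs drops blank values, so "transform=" counts as no profile.
    profiles = parse_qs(query_string).get("transform", [])
    return method.upper() == "GET" and media_type == "application/json" and bool(profiles)


eligible = is_transform_candidate("GET", "application/json; charset=utf-8", "transform=de")
```

Passing this check does not mean a transform runs: an active rule with the requested profile must still match.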

Common issues

Transforms not running in production: check that the rule status is active (not draft) and that the ?transform=<profile> value matches the rule's profile field exactly. The match is case-sensitive.

Wrong fields being transformed: array fields require [*] notation (e.g. items[*].title, not items.title).

Runtime returning passthrough unexpectedly: check the run detail for the reason — exceeded token limits are the most common cause.

Preview works but runtime does not: preview evaluates draft + active rules; runtime evaluates only active rules.

API reference

All endpoints require a Clerk session. Role requirements are noted per endpoint.

`GET /api/tenants/{slug}/ai-transform/rules`

List all AI transform rules for a tenant.

Auth: Clerk session + tenant access (viewer+)
Response: { rules[] }
Errors: 404 tenant not found

`POST /api/tenants/{slug}/ai-transform/rules`

Create a new AI transform rule.

Auth: Clerk session + tenant access (editor+)
Body: { presetId, fieldSelector, routePattern?, profile?, targetLanguage?, tone?, customInstruction? }
Response: Created rule object
Errors: 400 validation, 403 insufficient role, 404 tenant not found
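
A request body for this endpoint could be assembled as in the sketch below. The key names come from the Body line above; the helper name `build_rule_body` and the client-side check that each preset carries its option are assumptions, since the server's exact validation rules are not listed here.

```python
from typing import Any, Dict

# Option each preset is expected to carry, per the preset table (None means no option).
PRESET_OPTION = {
    "rewrite_tone": "tone",
    "shorten": None,
    "translate": "targetLanguage",
    "custom": "customInstruction",
}
OPTIONAL_KEYS = {"routePattern", "profile", "targetLanguage", "tone", "customInstruction"}


def build_rule_body(preset_id: str, field_selector: str, **options: Any) -> Dict[str, Any]:
    """Build the JSON body for creating a rule, with basic client-side checks."""
    if preset_id not in PRESET_OPTION:
        raise ValueError(f"unknown presetId: {preset_id}")
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")
    needed = PRESET_OPTION[preset_id]
    if needed is not None and needed not in options:
        raise ValueError(f"{preset_id} expects {needed}")
    return {"presetId": preset_id, "fieldSelector": field_selector, **options}


body = build_rule_body("translate", "fields.title", targetLanguage="de", profile="de")
```

A rule created this way starts in draft status, so it affects previews only until it is activated.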

`PUT /api/tenants/{slug}/ai-transform/rules/{id}`

Update an existing rule.

Auth: Clerk session + tenant access (editor+)
Body: Changed fields (partial update)
Response: Updated rule object
Errors: 400 validation, 403 insufficient role, 404 rule not found

`DELETE /api/tenants/{slug}/ai-transform/rules/{id}`

Delete a rule.

Auth: Clerk session + tenant access (editor+)
Response: { ok }
Errors: 403 insufficient role, 404 rule not found

`POST /api/tenants/{slug}/ai-transform/rules/{id}/activate`

Toggle the active/draft status of a rule.

Auth: Clerk session + tenant access (editor+)
Response: Updated rule with new status
Errors: 403 insufficient role, 404 rule not found

`POST /api/tenants/{slug}/ai-transform/preview`

Preview a transformation without saving or affecting production traffic.

Auth: Clerk session + tenant access (editor+)
Body: Rule config + sample content
Response: { original, transformed }
Errors: 400 validation, 403 insufficient role, 404, 429 token limit exceeded

`POST /api/tenants/{slug}/ai-transform/regenerate`

Force-refresh the edge cache for a single (profile, apiPath) pair. The handler fetches the upstream CMS response, applies the same URL rewrites the Worker would, and runs the active rules through the runtime AI pipeline. It then bumps a per-(tenant, profile) cache version, so the next public ?transform=<profile> request misses the cache and stores the regenerated body.

Use this when an active rule has been edited and you want the next visitor to see the new AI output immediately, or when the LLM produced a one-off bad answer that you want to overwrite without changing any rule. Cache invalidation is scoped to the (tenant, profile) pair — no global cache wipe.

Auth: Clerk session + tenant access (editor+)
Body: { profile: string, apiPath: string } — apiPath must start with /; an optional leading /~api prefix is stripped
Response: { profile, apiPath, regenVersion, status, skipReason, errorCode, stats, body } — body is the regenerated JSON, regenVersion is the new per-profile counter, status mirrors the run status values above
Errors: 400 validation or non-JSON upstream, 403 insufficient role, 404 tenant not found / no active rules / no rules match the path, 502 upstream CMS error
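
The apiPath rule in the Body line can be mirrored on the client before sending the request, as in this sketch. The server performs the prefix stripping itself, so this is optional; the helper names are hypothetical.

```python
import json


def normalize_api_path(api_path: str) -> str:
    """Apply the documented apiPath rules: leading slash required, /~api prefix stripped."""
    if not api_path.startswith("/"):
        raise ValueError("apiPath must start with /")
    prefix = "/~api"
    # Strip the prefix only when it is a whole path segment.
    if api_path == prefix or api_path.startswith(prefix + "/"):
        return api_path[len(prefix):] or "/"
    return api_path


def regenerate_body(profile: str, api_path: str) -> str:
    """Serialize the { profile, apiPath } body for the regenerate endpoint."""
    return json.dumps({"profile": profile, "apiPath": normalize_api_path(api_path)})


payload = regenerate_body("de", "/~api/entries")
```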

`GET /api/tenants/{slug}/ai-transform/runs`

List recent transform runs with optional filters.

Auth: Clerk session + tenant access (viewer+)
Query params: limit, offset, mode (preview | runtime), status, profile
Response: { runs[], total }
Errors: 404 tenant not found
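
A small sketch of building the request path with the documented query parameters follows. The helper name `runs_url` and the tenant slug are illustrative; authentication (the Clerk session) is left out.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def runs_url(
    slug: str,
    *,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    mode: Optional[str] = None,
    status: Optional[str] = None,
    profile: Optional[str] = None,
) -> str:
    """Build the path and query string for listing transform runs."""
    candidates = {
        "limit": limit,
        "offset": offset,
        "mode": mode,
        "status": status,
        "profile": profile,
    }
    # Only send the filters that were actually provided.
    params = {key: value for key, value in candidates.items() if value is not None}
    path = f"/api/tenants/{quote(slug, safe='')}/ai-transform/runs"
    return f"{path}?{urlencode(params)}" if params else path


# Hypothetical tenant slug "acme": runtime runs that fell back to passthrough.
skipped_runs = runs_url("acme", limit=20, mode="runtime", status="passthrough")
```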

`GET /api/tenants/{slug}/ai-transform/usage`

Return current token usage and configured limits for the tenant.

Auth: Clerk session + tenant access (viewer+)
Response: { monthly: TokenUsage, daily: TokenUsage, limits }
Errors: 404 tenant not found

`GET /api/tenants/{slug}/ai-transform/profiles`

Return the list of distinct profile slugs that have at least one active rule for the tenant. Used by the AI Experiments variant editor to populate the profile dropdown — activation validation in the experiments API enforces the same constraint server-side.

Auth: Clerk session + tenant access (viewer+)
Response: { profiles: string[] }
Errors: 404 tenant not found

`POST /api/ai-transform/execute`

Internal execution endpoint called by the edge Worker for live transform profiles. Not intended for direct use.

Auth: X-AI-Transform-Key header — shared secret between the proxy Worker and Nitro
Body: Transform profile, request URL, and JSON payload
Response: Transformed JSON payload
Errors: 401 invalid key, 400 missing fields, 500 transform failure

AI Experiments API

AI Experiments let you A/B test content variants at the proxy layer. An experiment defines a route pattern and a set of named variants; the Worker selects a variant per request and returns the matching content.
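
This page does not describe how the Worker chooses a variant. Purely as an illustration of weight-based selection, the sketch below picks a variant key at random in proportion to its weight. The default weight of 1, the function name, and the use of random (rather than sticky or hashed) assignment are all assumptions.

```python
import random
from typing import Dict, Optional, Sequence


def pick_variant(variants: Sequence[Dict], rng: Optional[random.Random] = None) -> str:
    """Pick one variant key at random, proportionally to its weight."""
    rng = rng or random.Random()
    keys = [variant["key"] for variant in variants]
    # Assumed default: a variant without an explicit weight counts as 1.
    weights = [variant.get("weight", 1) for variant in variants]
    return rng.choices(keys, weights=weights, k=1)[0]


variants = [{"key": "control", "weight": 3}, {"key": "editorial", "weight": 1}]
chosen = pick_variant(variants, random.Random(0))
```

With the weights above, "control" would be chosen for roughly three out of four requests over a large sample.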

`GET /api/tenants/{slug}/ai-experiments`

List all non-archived AI experiments and their variants for the tenant.

Auth: Clerk session + tenant access (viewer+)
Response: { experiments[] }
Errors: 404 tenant not found

`POST /api/tenants/{slug}/ai-experiments`

Create a new AI experiment in draft status. Pass an optional variants array to set the initial variant set in the same call.

Auth: Clerk session + tenant access (editor+)
Body: { name, routePattern?, routeMatch?, variants?: [{ key, weight?, config? }] }
Response: Created experiment object with variants
Errors: 400 validation, 403 insufficient role, 404 tenant not found

`GET /api/tenants/{slug}/ai-experiments/{id}`

Return a single experiment and its current variants.

Auth: Clerk session + tenant access (viewer+)
Response: Experiment object with variants
Errors: 400 invalid id, 404 experiment or tenant not found

`PATCH /api/tenants/{slug}/ai-experiments/{id}`

Update an experiment's name, route pattern, route match, or status. Transitioning status to active runs activation validation (variant count, profile existence, route overlap). On success the proxy KV config is re-synced so the Worker picks up the change within its cache TTL.

Auth: Clerk session + tenant access (editor+)
Body: Partial update — any combination of { name?, routePattern?, routeMatch?, status? }
Response: Updated experiment object
Errors: 400 validation or activation validation failure, 403 insufficient role, 404 experiment not found

`DELETE /api/tenants/{slug}/ai-experiments/{id}`

Soft-archive an experiment (sets status to archived). The row and variants are preserved so historical request log entries remain joinable.

Auth: Clerk session + tenant access (editor+)
Response: Archived experiment object
Errors: 400 invalid id, 403 insufficient role, 404 experiment not found

`PUT /api/tenants/{slug}/ai-experiments/{id}/variants`

Transactionally replace the full variant set for an experiment. All existing variants are deleted and replaced with the supplied list in a single operation. On success the proxy KV config is re-synced.

Auth: Clerk session + tenant access (editor+)
Body: { variants: [{ key, weight?, config? }] }
Response: Updated experiment object with new variants
Errors: 400 validation, 403 insufficient role, 404 experiment not found

`GET /api/tenants/{slug}/ai-experiments/{id}/exposures`

Return aggregate exposure metrics for one experiment, sourced from request logs. Reports how many requests were routed to each variant. As of Phase 2A, only all-time totals are available; time-window filtering is not supported.

Auth: Clerk session + tenant access (viewer+)
Response: Exposure summary with per-variant counts
Errors: 400 invalid id, 404 experiment or tenant not found