AI Transform API

Preview feature. AI Transform is in active preview — the API surface, rule semantics, rate limits, and per-token pricing may change before general availability. Suitable for evaluation and non-critical workloads. Production deployments should pin a specific SDK version, monitor token spend in the dashboard, and avoid putting AI-rewritten responses on user-blocking critical paths until GA. See the Cachely status page for any active limitations.

AI Transforms let you rewrite, translate, shorten, or otherwise adapt CMS JSON responses at the Cachely proxy layer. The transformed result is what your frontend receives — the source CMS entry is never modified.


How it works

  1. You define one or more rules for a project. Each rule targets specific JSON fields and describes a transformation (translate, shorten, rewrite tone, or a custom instruction).
  2. Rules are grouped by an optional profile label (e.g. "de", "editorial").
  3. At request time, adding ?transform=<profile> to an API proxy request activates all active rules with that profile.
  4. The proxy executes the transformation server-side and returns the adapted JSON to the client.
  5. Transformed responses are cached at the edge — repeat requests with the same profile serve from cache without re-running the LLM.

AI Transforms only operate on application/json upstream responses. Asset responses, HTML, and non-JSON payloads are never transformed.
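
As a minimal sketch of step 3, the helper below appends the transform parameter to a proxy URL. The host example.cachely.dev and the helper name with_transform are assumptions for illustration; only the /~api path and the transform query parameter come from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_transform(url: str, profile: str) -> str:
    """Return url with ?transform=<profile> set, keeping other query parameters."""
    parts = urlsplit(url)
    # Drop any existing transform parameter so the profile is set exactly once.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "transform"]
    query.append(("transform", profile))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical host; replace it with your project's proxy domain.
print(with_transform("https://example.cachely.dev/~api/entries?limit=10", "de"))
```

Because the profile is part of the URL, each profile yields a distinct request URL, which fits the per-profile edge caching described in step 5.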


Rule configuration

Each rule has the following fields:

Field | Required | Description
fieldSelector | Yes | Dot-notation path to one or more fields (e.g. fields.title, data.items[*].description)
presetId | Yes | The transformation type; see the presets below
routePattern | No | Path prefix filter; only requests matching this prefix trigger the rule
profile | No | A string label used to group rules and activate them at runtime via ?transform=<profile>

Preset modes

Preset | What it does | Additional options
rewrite_tone | Rewrites content in a different tone | tone (e.g. formal, casual, playful)
shorten | Condenses the field to fewer words | -
translate | Translates the field to another language | targetLanguage (e.g. "de", "fr")
custom | Applies a free-form GPT instruction | customInstruction (text); optionally constraints and an example input/output pair

Array fields use [*] notation — for example, items[*].title targets the title field of every item in an array.
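
To make the selector syntax concrete, here is an illustrative resolver for dot-notation paths with [*] segments. It is a sketch of the notation only, not Cachely's actual implementation; the function name select_values and the sample entry are assumptions.

```python
from typing import Any, List


def select_values(doc: Any, selector: str) -> List[Any]:
    """Return every value matched by a dot-notation selector with optional [*] segments."""
    current = [doc]
    for segment in selector.split("."):
        expand = segment.endswith("[*]")
        key = segment[:-3] if expand else segment
        next_level = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue  # the path does not exist on this branch
            value = node[key]
            if expand:
                # [*] fans out over every element of the array.
                if isinstance(value, list):
                    next_level.extend(value)
            else:
                next_level.append(value)
        current = next_level
    return current


entry = {
    "fields": {"title": "Hello"},
    "data": {"items": [{"description": "First"}, {"description": "Second"}]},
}
print(select_values(entry, "fields.title"))               # ['Hello']
print(select_values(entry, "data.items[*].description"))  # ['First', 'Second']
print(select_values(entry, "data.items.description"))     # [] (missing [*])
```

The last call shows why a selector without [*] matches nothing inside an array: the path stops at the list instead of descending into its items.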


Rule statuses

Status | Runs in preview | Runs at runtime
active | Yes | Yes
draft | Yes | No (never runs on live traffic)
paused | No | No

draft is the default status for newly created rules. Activate a rule explicitly before it affects production traffic.


Preview vs. runtime

Preview

Preview lets you test a transformation against sample content before activating a rule.

  • Endpoint: POST /api/tenants/{slug}/ai-transform/preview
  • Runs immediately; no production traffic is affected
  • Fail-closed: if token limits are exceeded, the request returns HTTP 429. The error is shown in the admin panel.
  • Processes up to 5 fields per request
  • Recorded in the run log with mode: "preview"

Runtime

Runtime transforms run on live API proxy traffic when a request includes ?transform=<profile>.

GET /~api/entries?transform=editorial
GET /~api/entries?transform=de

The profile value must match the profile field on at least one active rule. If no active rule matches, the request passes through untransformed.

  • Fail-open: if token limits are exceeded or the LLM call fails, the original (untransformed) JSON is returned with status: "passthrough". No error is returned to the end client.
  • Processes up to 20 fields per rule (subject to the tenant-level maxFieldsPerResponse limit)
  • Recorded in the run log with mode: "runtime"
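
The matching rules above can be sketched as a small filter. This is a simplified model under stated assumptions: rules are plain dictionaries, routePattern is compared against the API path without the /~api prefix, and a rule with no profile is never selected by a profile request. None of these details are confirmed by this page.

```python
from typing import Dict, List, Optional


def select_runtime_rules(rules: List[Dict], path: str, profile: Optional[str]) -> List[Dict]:
    """Return the rules that would run for a request, per the documented matching."""
    if not profile:
        return []  # no ?transform= parameter: nothing runs
    matched = []
    for rule in rules:
        if rule.get("status") != "active":
            continue  # draft and paused rules never run on live traffic
        if rule.get("profile") != profile:
            continue  # exact, case-sensitive comparison
        pattern = rule.get("routePattern")
        if pattern and not path.startswith(pattern):
            continue  # routePattern is a path prefix filter
        matched.append(rule)
    return matched


rules = [
    {"id": 1, "status": "active", "profile": "de", "routePattern": "/entries"},
    {"id": 2, "status": "draft", "profile": "de"},
    {"id": 3, "status": "active", "profile": "DE"},
    {"id": 4, "status": "active", "profile": "de", "routePattern": "/pages"},
    {"id": 5, "status": "active", "profile": "de"},
]
print([r["id"] for r in select_runtime_rules(rules, "/entries/42", "de")])  # [1, 5]
```

An empty result corresponds to the documented pass-through case: the request is served untransformed.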

Usage and limits

Limit | Default | Notes
Monthly token cap | none | Total input + output tokens per calendar month (UTC)
Daily token cap | none | Total input + output tokens per UTC day
Max fields per response | 20 (runtime), 5 (preview) | Global across all rules in one request
Max characters per field | 8,000 | Configurable down, not up

When a runtime transform is skipped because a limit is exceeded, the run log records status: "passthrough" with a reason string. Check the run detail to diagnose limit-related skips.
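
The difference between the two failure modes can be modelled in a few lines. This is a sketch only: the reason strings, the dictionary shapes, and the choice to treat usage equal to the cap as exceeded are all assumptions, not documented values.

```python
from typing import Any, Dict, Optional


def exceeded_cap(used_today: int, used_month: int,
                 daily_cap: Optional[int] = None,
                 monthly_cap: Optional[int] = None) -> Optional[str]:
    """Return a reason string if a cap is reached, else None. A cap of None means no cap."""
    if daily_cap is not None and used_today >= daily_cap:
        return "daily_token_cap_exceeded"    # hypothetical reason string
    if monthly_cap is not None and used_month >= monthly_cap:
        return "monthly_token_cap_exceeded"  # hypothetical reason string
    return None


def handle_limit(mode: str, original: Dict[str, Any], reason: str) -> Dict[str, Any]:
    """Model the documented behaviour once a limit has been hit."""
    if mode == "preview":
        # Fail-closed: the preview request is rejected with HTTP 429.
        return {"http_status": 429, "error": reason}
    # Fail-open: runtime returns the original JSON and logs a passthrough run.
    return {"body": original, "run_status": "passthrough", "reason": reason}


reason = exceeded_cap(used_today=500, used_month=900, daily_cap=500)
if reason:
    print(handle_limit("preview", {"title": "Hello"}, reason))
    print(handle_limit("runtime", {"title": "Hello"}, reason))
```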


Run status values

Status | Meaning
applied | All targeted fields were transformed successfully
partial | Some fields transformed; others failed (e.g. per-field LLM error)
passthrough | Original content returned unchanged: no matching active rules, non-JSON upstream response, or limits exceeded
error | Execution failed entirely (LLM error, timeout, etc.)
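
One way to read these values is as a function of per-field outcomes. The sketch below is an interpretation of the status descriptions, not the service's code; run_status and its parameters are hypothetical names, and the caller is assumed to set skipped whenever no transformation was attempted.

```python
from typing import List


def run_status(field_ok: List[bool], skipped: bool = False) -> str:
    """Map per-field outcomes (True = transformed) to a run status value."""
    if skipped:
        # No matching active rules, non-JSON upstream, or limits exceeded.
        return "passthrough"
    if field_ok and all(field_ok):
        return "applied"
    if any(field_ok):
        return "partial"
    return "error"


print(run_status([True, True]))    # applied
print(run_status([True, False]))   # partial
print(run_status([False, False]))  # error
print(run_status([], skipped=True))  # passthrough
```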

Observability

The Recent Runs panel in the admin dashboard shows:

  • Timestamp, mode (preview / runtime), profile, status, fields attempted, rules evaluated, latency, token usage
  • Filters by mode, status, and profile
  • Expanding a row shows rule summary, per-field error details, and limit reason where applicable

What AI Transforms do not do

  • They do not modify images, video, or any non-text content
  • They do not transform asset proxy responses
  • They do not process HTML — only application/json upstream responses
  • They do not run on non-GET requests
  • They do not guarantee deterministic output — LLM responses vary between runs
  • draft rules never run in production, regardless of profile name

Common issues

Transforms not running in production: check that the rule status is active (not draft) and that the ?transform=<profile> value matches the rule's profile field exactly — case-sensitive.

Wrong fields being transformed: array fields require [*] notation (e.g. items[*].title, not items.title).

Runtime returning passthrough unexpectedly: check the run detail for the reason — exceeded token limits are the most common cause.

Preview works but runtime does not: preview evaluates draft + active rules; runtime evaluates only active rules.


API reference

All endpoints require a Clerk session. Role requirements are noted per endpoint.

GET /api/tenants/{slug}/ai-transform/rules

List all AI transform rules for a tenant.

  • Auth: Clerk session + tenant access (viewer+)
  • Response: { rules[] }
  • Errors: 404 tenant not found

POST /api/tenants/{slug}/ai-transform/rules

Create a new AI transform rule.

  • Auth: Clerk session + tenant access (editor+)
  • Body: { presetId, fieldSelector, routePattern?, profile?, targetLanguage?, tone?, customInstruction? }
  • Response: Created rule object
  • Errors: 400 validation, 403 insufficient role, 404 tenant not found
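
A request body for this endpoint could be assembled as follows. The field names come from the Body line above; the client-side check that each preset carries its matching option is an assumption about what the server validates, and build_rule_body is a hypothetical helper.

```python
import json
from typing import Any, Dict, Optional

# Option each preset is expected to carry, per the preset list on this page.
PRESET_OPTION = {
    "rewrite_tone": "tone",
    "shorten": None,
    "translate": "targetLanguage",
    "custom": "customInstruction",
}


def build_rule_body(preset_id: str, field_selector: str, *,
                    route_pattern: Optional[str] = None,
                    profile: Optional[str] = None,
                    **options: str) -> Dict[str, Any]:
    """Build the JSON body for creating a rule, failing early on obvious mistakes."""
    if preset_id not in PRESET_OPTION:
        raise ValueError(f"unknown presetId: {preset_id}")
    if not field_selector:
        raise ValueError("fieldSelector is required")
    body: Dict[str, Any] = {"presetId": preset_id, "fieldSelector": field_selector}
    if route_pattern is not None:
        body["routePattern"] = route_pattern
    if profile is not None:
        body["profile"] = profile
    option = PRESET_OPTION[preset_id]
    if option is not None:
        if not options.get(option):
            raise ValueError(f"{preset_id} expects {option}")
        body[option] = options[option]
    return body


body = build_rule_body("translate", "fields.title", profile="de", targetLanguage="de")
print(json.dumps(body))
```

New rules start in draft status, so a rule created this way does not affect live traffic until it is activated.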

PUT /api/tenants/{slug}/ai-transform/rules/{id}

Update an existing rule.

  • Auth: Clerk session + tenant access (editor+)
  • Body: Changed fields (partial update)
  • Response: Updated rule object
  • Errors: 400 validation, 403, 404 rule not found

DELETE /api/tenants/{slug}/ai-transform/rules/{id}

Delete a rule.

  • Auth: Clerk session + tenant access (editor+)
  • Response: { ok }
  • Errors: 403, 404 rule not found

POST /api/tenants/{slug}/ai-transform/rules/{id}/activate

Toggle the active/draft status of a rule.

  • Auth: Clerk session + tenant access (editor+)
  • Response: Updated rule with new status
  • Errors: 403, 404 rule not found

POST /api/tenants/{slug}/ai-transform/preview

Preview a transformation without saving or affecting production traffic.

  • Auth: Clerk session + tenant access (editor+)
  • Body: Rule config + sample content
  • Response: { original, transformed }
  • Errors: 400 validation, 403, 404, 429 token limit exceeded

POST /api/tenants/{slug}/ai-transform/regenerate

Force-refresh the edge cache for a single (profile, apiPath) pair. The handler fetches the upstream CMS response, applies the same URL rewrites the Worker would, runs the active rules through the runtime AI pipeline, and bumps a per-(tenant, profile) cache version so the next public ?transform=<profile> request misses and stores the regenerated body.

Use this when an active rule has been edited and you want the next visitor to see the new AI output immediately, or when the LLM produced a one-off bad answer that you want to overwrite without changing any rule. Cache invalidation is scoped to the (tenant, profile) pair — no global cache wipe.

  • Auth: Clerk session + tenant access (editor+)
  • Body: { profile: string, apiPath: string }. apiPath must start with /; an optional leading /~api prefix is stripped
  • Response: { profile, apiPath, regenVersion, status, skipReason, errorCode, stats, body }. body is the regenerated JSON, regenVersion is the new per-profile counter, and status mirrors the run status values above
  • Errors: 400 validation or non-JSON upstream, 403 insufficient role, 404 tenant not found / no active rules / no rules match the path, 502 upstream CMS error
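
The apiPath rule can be mirrored on the client before sending the request. This is a minimal sketch: the server performs its own normalization, and the helper names are hypothetical.

```python
from typing import Dict


def normalize_api_path(api_path: str) -> str:
    """Apply the documented apiPath rule: must start with /, leading /~api is stripped."""
    if not api_path.startswith("/"):
        raise ValueError("apiPath must start with /")
    prefix = "/~api"
    # Only strip the prefix when it is a whole path segment.
    if api_path.startswith(prefix + "/"):
        return api_path[len(prefix):]
    return api_path


def regenerate_body(profile: str, api_path: str) -> Dict[str, str]:
    """Build the request body for the regenerate endpoint."""
    return {"profile": profile, "apiPath": normalize_api_path(api_path)}


print(regenerate_body("de", "/~api/entries"))  # {'profile': 'de', 'apiPath': '/entries'}
print(regenerate_body("de", "/entries"))       # {'profile': 'de', 'apiPath': '/entries'}
```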

GET /api/tenants/{slug}/ai-transform/runs

List recent transform runs with optional filters.

  • Auth: Clerk session + tenant access (viewer+)
  • Query params: limit, offset, mode (preview | runtime), status, profile
  • Response: { runs[], total }
  • Errors: 404 tenant not found
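
For example, a filtered query for recent runtime pass-throughs could be built like this. The tenant slug acme and the helper name runs_url are placeholders; the parameter names are the ones listed above.

```python
from typing import Optional
from urllib.parse import urlencode


def runs_url(slug: str, *, limit: Optional[int] = None, offset: Optional[int] = None,
             mode: Optional[str] = None, status: Optional[str] = None,
             profile: Optional[str] = None) -> str:
    """Build the path and query string for listing transform runs."""
    if mode is not None and mode not in ("preview", "runtime"):
        raise ValueError("mode must be 'preview' or 'runtime'")
    candidates = (("limit", limit), ("offset", offset), ("mode", mode),
                  ("status", status), ("profile", profile))
    # Only send the filters that were actually provided.
    params = {key: value for key, value in candidates if value is not None}
    path = f"/api/tenants/{slug}/ai-transform/runs"
    return f"{path}?{urlencode(params)}" if params else path


print(runs_url("acme", limit=20, mode="runtime", status="passthrough"))
```

This kind of query is a quick way to find limit-related skips; the reason for each skip is in the run detail.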

GET /api/tenants/{slug}/ai-transform/usage

Return current token usage and configured limits for the tenant.

  • Auth: Clerk session + tenant access (viewer+)
  • Response: { monthly: TokenUsage, daily: TokenUsage, limits }
  • Errors: 404 tenant not found

GET /api/tenants/{slug}/ai-transform/profiles

Return the list of distinct profile slugs that have at least one active rule for the tenant. Used by the AI Experiments variant editor to populate the profile dropdown — activation validation in the experiments API enforces the same constraint server-side.

  • Auth: Clerk session + tenant access (viewer+)
  • Response: { profiles: string[] }
  • Errors: 404 tenant not found

POST /api/ai-transform/execute

Internal execution endpoint called by the edge Worker for live transform profiles. Not intended for direct use.

  • Auth: X-AI-Transform-Key header — shared secret between the proxy Worker and Nitro
  • Body: Transform profile, request URL, and JSON payload
  • Response: Transformed JSON payload
  • Errors: 401 invalid key, 400 missing fields, 500 transform failure

AI Experiments API

AI Experiments let you A/B test content variants at the proxy layer. An experiment defines a route pattern and a set of named variants; the Worker selects a variant per request and returns the matching content.
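
This page does not specify how the Worker picks a variant. As an illustration of what weighted selection generally looks like, the sketch below picks a variant key from a list of { key, weight? } objects; the default weight of 1 and the use of a random draw are assumptions.

```python
import random
from typing import Dict, List, Optional


def pick_variant(variants: List[Dict], draw: Optional[float] = None) -> str:
    """Pick a variant key with probability proportional to its weight.

    draw is a number in [0, 1); pass it explicitly for reproducible results.
    """
    weights = [v.get("weight", 1) for v in variants]  # assumed default weight
    total = sum(weights)
    if not variants or total <= 0:
        raise ValueError("at least one variant with positive weight is required")
    point = (random.random() if draw is None else draw) * total
    cumulative = 0.0
    for variant, weight in zip(variants, weights):
        cumulative += weight
        if point < cumulative:
            return variant["key"]
    return variants[-1]["key"]  # guard against floating-point edge cases


variants = [{"key": "control", "weight": 3}, {"key": "de", "weight": 1}]
print(pick_variant(variants, draw=0.10))  # control
print(pick_variant(variants, draw=0.80))  # de
```

With these weights, roughly three in four requests would receive the control variant.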

GET /api/tenants/{slug}/ai-experiments

List all non-archived AI experiments and their variants for the tenant.

  • Auth: Clerk session + tenant access (viewer+)
  • Response: { experiments[] }
  • Errors: 404 tenant not found

POST /api/tenants/{slug}/ai-experiments

Create a new AI experiment in draft status. Pass an optional variants array to set the initial variant set in the same call.

  • Auth: Clerk session + tenant access (editor+)
  • Body: { name, routePattern?, routeMatch?, variants?: [{ key, weight?, config? }] }
  • Response: Created experiment object with variants
  • Errors: 400 validation, 403 insufficient role, 404 tenant not found

GET /api/tenants/{slug}/ai-experiments/{id}

Return a single experiment and its current variants.

  • Auth: Clerk session + tenant access (viewer+)
  • Response: Experiment object with variants
  • Errors: 400 invalid id, 404 experiment or tenant not found

PATCH /api/tenants/{slug}/ai-experiments/{id}

Update an experiment's name, route pattern, route match, or status. Transitioning status to active runs activation validation (variant count, profile existence, route overlap). On success the proxy KV config is re-synced so the Worker picks up the change within its cache TTL.

  • Auth: Clerk session + tenant access (editor+)
  • Body: Partial update — any combination of { name?, routePattern?, routeMatch?, status? }
  • Response: Updated experiment object
  • Errors: 400 validation or activation validation failure, 403, 404 experiment not found

DELETE /api/tenants/{slug}/ai-experiments/{id}

Soft-archive an experiment (sets status to archived). The row and variants are preserved so historical request log entries remain joinable.

  • Auth: Clerk session + tenant access (editor+)
  • Response: Archived experiment object
  • Errors: 400 invalid id, 403, 404 experiment not found

PUT /api/tenants/{slug}/ai-experiments/{id}/variants

Transactionally replace the full variant set for an experiment. All existing variants are deleted and replaced with the supplied list in a single operation. On success the proxy KV config is re-synced.

  • Auth: Clerk session + tenant access (editor+)
  • Body: { variants: [{ key, weight?, config? }] }
  • Response: Updated experiment object with new variants
  • Errors: 400 validation, 403, 404 experiment not found

GET /api/tenants/{slug}/ai-experiments/{id}/exposures

Return aggregate exposure metrics for one experiment, sourced from request logs. Reports how many requests were routed to each variant. In the current phase (Phase 2A) only all-time totals are returned; time-window filtering is not supported.

  • Auth: Clerk session + tenant access (viewer+)
  • Response: Exposure summary with per-variant counts
  • Errors: 400 invalid id, 404 experiment or tenant not found
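
Conceptually, the per-variant counts are a group-by over request log entries. The sketch below shows that aggregation on in-memory data; the log field names experimentId and variantKey are hypothetical, and the real response shape is not documented here.

```python
from collections import Counter
from typing import Dict, Iterable


def exposure_counts(log_entries: Iterable[Dict], experiment_id: str) -> Dict[str, int]:
    """Count how many logged requests were routed to each variant of one experiment."""
    counts = Counter(
        entry["variantKey"]
        for entry in log_entries
        if entry.get("experimentId") == experiment_id and "variantKey" in entry
    )
    return dict(counts)


logs = [
    {"experimentId": "exp_1", "variantKey": "control"},
    {"experimentId": "exp_1", "variantKey": "de"},
    {"experimentId": "exp_2", "variantKey": "control"},
    {"experimentId": "exp_1", "variantKey": "control"},
]
print(exposure_counts(logs, "exp_1"))  # {'control': 2, 'de': 1}
```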