Routing & Resilience

Smart retries

Most retryable upstream failures never reach your client. Meridian Blue retries them inside the gateway with bounded budgets and jittered exponential backoff.

What gets retried automatically

The router classifies every upstream error as retryable or not, then either retries internally or surfaces the failure depending on the chain configuration.

Retryable signals:

  • HTTP 408 Request Timeout
  • HTTP 429 Too Many Requests, unless the upstream error code marks it as non-retryable (e.g. OpenAI insufficient_quota)
  • HTTP 500, 502, 503, 504
  • Network errors: ECONNRESET, ECONNABORTED, ETIMEDOUT
  • Anthropic-specific: overloaded_error, api_error, rate_limit_error
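
The classification above can be approximated with a small predicate. This is only a sketch of the listed signals, not the router's actual code: the function name is_retryable and its three parameters are hypothetical, and insufficient_quota is the only non-retryable override included because it is the only one named here.

```python
from typing import Optional

# Signals listed in this section as retryable.
RETRYABLE_STATUSES = {408, 429, 500, 502, 503, 504}
RETRYABLE_NETWORK_ERRORS = {"ECONNRESET", "ECONNABORTED", "ETIMEDOUT"}
RETRYABLE_ANTHROPIC_TYPES = {"overloaded_error", "api_error", "rate_limit_error"}

# Assumption: upstream codes that override an otherwise retryable status.
# Only insufficient_quota is named in the text; the real list may be longer.
NON_RETRYABLE_UPSTREAM_CODES = {"insufficient_quota"}


def is_retryable(
    status: Optional[int] = None,
    upstream_code: Optional[str] = None,
    network_error: Optional[str] = None,
) -> bool:
    """Return True if the failure matches one of the retryable signals."""
    if upstream_code in NON_RETRYABLE_UPSTREAM_CODES:
        return False  # e.g. a 429 caused by an exhausted quota
    if network_error in RETRYABLE_NETWORK_ERRORS:
        return True
    if upstream_code in RETRYABLE_ANTHROPIC_TYPES:
        return True
    return status in RETRYABLE_STATUSES
```

Under these assumptions, is_retryable(status=429) is True, while is_retryable(status=429, upstream_code="insufficient_quota") is False.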

Retry policy

Setting                   Default
Max retries per provider  2 (so up to 3 attempts total per chain entry)
Backoff base              500 ms
Backoff cap               4 000 ms
Backoff curve             exponential, delay = min(base * 2^attempt, cap) + jitter
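
To make the backoff curve concrete, here is a minimal sketch of the delay formula with the default base and cap. Two things are assumptions, since this page does not specify them: attempt is counted from 0 for the first retry, and jitter is drawn uniformly from 0 up to max_jitter_ms. The function name backoff_delay_ms is hypothetical.

```python
import random
from typing import Callable

BASE_MS = 500   # default backoff base
CAP_MS = 4000   # default backoff cap


def backoff_delay_ms(
    attempt: int,
    base_ms: int = BASE_MS,
    cap_ms: int = CAP_MS,
    max_jitter_ms: float = 0.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """delay = min(base * 2^attempt, cap) + jitter, in milliseconds.

    Assumption: attempt starts at 0 and jitter is uniform in
    [0, max_jitter_ms); the real jitter distribution is not documented here.
    """
    capped = min(base_ms * 2 ** attempt, cap_ms)
    return capped + rng() * max_jitter_ms
```

With jitter switched off, attempts 0 through 4 give 500, 1000, 2000, 4000 and 4000 ms. With the default of 2 retries per provider and zero-based counting, only the first two values would be used, so the cap matters only if the retry limit is raised.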

Once the retry budget for an entry is spent, the router advances to the next entry in the chain (if any). When the entire chain is exhausted, the response is 502 all_providers_failed with the full provider_attempts array.
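
The fall-through behaviour can be sketched as a nested loop over chain entries and attempts. Everything named here is illustrative rather than the gateway's implementation: RetryableUpstreamError, run_chain, the call callback, and the shape of the returned dictionaries are hypothetical. The sketch also assumes no delay when advancing to the next entry, omits jitter, and leaves non-retryable errors unhandled.

```python
import time
from typing import Any, Callable, Dict, List


class RetryableUpstreamError(Exception):
    """Hypothetical marker for an upstream failure classified as retryable."""


def run_chain(
    chain: List[str],
    call: Callable[[str], Any],
    max_retries: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Try each chain entry in order, retrying within a per-entry budget."""
    # Single-shot mode: a one-entry chain gets exactly one attempt.
    retries = 0 if len(chain) == 1 else max_retries
    provider_attempts: List[Dict[str, Any]] = []

    for model in chain:
        for attempt in range(retries + 1):
            try:
                body = call(model)
            except RetryableUpstreamError as exc:
                provider_attempts.append(
                    {"model": model, "attempt": attempt + 1, "error": str(exc)}
                )
                if attempt < retries:
                    # Exponential backoff without jitter, converted to seconds.
                    sleep(min(500 * 2 ** attempt, 4000) / 1000)
                continue
            return {
                "status": 200,
                "model": model,
                "body": body,
                "provider_attempts": provider_attempts,
            }

    # Every entry spent its budget: surface the exhausted-chain failure.
    return {
        "status": 502,
        "error": "all_providers_failed",
        "provider_attempts": provider_attempts,
    }
```

In this sketch a two-entry chain whose providers always fail records six attempts (three per entry) before returning the 502.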

Single-shot mode

When the chain has exactly one model (the caller passed model: "..." or models: ["..."] with one entry), retries are disabled. The single attempt either succeeds or fails. This honours the principle that a one-model request is a deliberate "no fallback" choice — Meridian Blue won't second-guess it.

If you want the router to retry a single provider, supply a chain that lists the same model twice — it will be tried twice with the full backoff curve in between.
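
As an illustration of the two request shapes, the sketch below resolves a payload to its chain and checks whether it would run in single-shot mode. The helper names resolve_chain and is_single_shot and the model name "example-model" are hypothetical, and giving models precedence over model when both are present is an assumption.

```python
from typing import Any, Dict, List


def resolve_chain(payload: Dict[str, Any]) -> List[str]:
    """Return the model chain implied by a request payload.

    Assumption: "models" wins if both fields are present.
    """
    if "models" in payload:
        return list(payload["models"])
    return [payload["model"]]


def is_single_shot(payload: Dict[str, Any]) -> bool:
    """A chain with exactly one model means retries are disabled."""
    return len(resolve_chain(payload)) == 1


# One model: single-shot, no retries.
single = {"model": "example-model"}

# Same model listed twice: a two-entry chain, so retries apply.
retry_same_provider = {"models": ["example-model", "example-model"]}
```

Here is_single_shot(single) is True and is_single_shot(retry_same_provider) is False.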

Client-side retries

When a 4xx or an exhausted-chain 502 reaches your client, follow the rule documented under Errors → Retry strategy: use error.retryable as the source of truth, never the status code alone.
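
On the client, that rule might look like the following minimal sketch. It assumes the error body is a JSON object with a nested error.retryable boolean and treats a missing flag as "do not retry"; both the function name should_retry and that default are assumptions, so check Errors → Retry strategy for the authoritative behaviour.

```python
from typing import Any, Dict


def should_retry(status: int, body: Dict[str, Any]) -> bool:
    """Decide whether the client should retry, based on error.retryable.

    The status code is accepted for logging or metrics but deliberately
    not used for the decision.
    """
    error = body.get("error") or {}
    # Assumption: anything other than an explicit True means "do not retry".
    return error.get("retryable") is True
```

For example, a 502 whose body carries retryable: false is not retried, while a 429 with retryable: true is.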