AICore — Multi-Provider LLM Gateway
AICore—Multi-ProviderLLMGateway
A single SDK and edge proxy that puts OpenAI, Anthropic, Gemini, and Groq behind one contract — every call comes back normalized, with its cost, tokens, and latency attached.
The Problem
Shipping AI features means re-learning four different SDKs, having no visibility into what any single call costs, and rebuilding the same plumbing — retries, error handling, logging — for every provider. AICore collapses that into one contract: one SDK call returns a normalized response with per-call cost, token count, and latency, regardless of which provider served it.
Architecture Thinking
The whole system hangs off one idea: if every provider returns the same shape, then routing, error handling, cost accounting, and telemetry can each be written once. So the center of the codebase isn't a provider — it's a contract. A pnpm + Turborepo monorepo with a shared-types spine (every package depends on it; it depends on nothing) makes that contract enforceable at build time. A thin SDK calls a Cloudflare Workers edge proxy that runs a composed middleware chain and routes to one of four provider adapters.
The Provider Contract
Every adapter returns the same discriminated union and the same usage shape, so the proxy builds the response envelope and the telemetry row exactly once:
export interface ProviderCallUsage {
provider: string;
model: string;
inputTokens: number;
outputTokens: number;
costCents: number;
latencyMs: number;
}
// Discriminated union returned by every provider adapter.
export type ProviderCallResult =
| { ok: true; data: unknown; usage: ProviderCallUsage }
| { ok: false; error: AICoreError };
Adding a fifth provider is a closed change: implement the interface, register it, done.
Composed Middleware
Each concern — logging, auth, validation, circuit-breaker — is an isolated layer, run as an onion where every layer calls the next:
export function compose(...middlewares: Middleware[]) {
return async (request, env, ctx) => {
const context = { request, env, ctx, state: {}, startTime: Date.now() };
let index = -1;
async function dispatch(i: number): Promise<Response> {
if (i <= index) throw new Error("next() called multiple times");
index = i;
const fn = middlewares[i];
if (i === middlewares.length) return new Response("Not Found", { status: 404 });
return fn(context, () => dispatch(i + 1)); // hand off to the next layer
}
return dispatch(0);
};
}
Hot-Path Safety
Telemetry, logging, and spend tracking run off the request path via ctx.waitUntil — best-effort, fire-and-forget — so observability can never throw into a user's AI call. Each provider's wildly different error shapes are normalized to one AICoreError, guarded by a contract test suite all four adapters must pass.
Telemetry as a Contract
Usage rows land in a monthly-partitioned, row-level-secured Supabase Postgres table. Because partitioned tables are painful to migrate, the schema reserves its full future shape — agent id, workflow id, cache hit, quality score — from day one; later features fill columns instead of running risky migrations.
Constraints
It sits on the hot path of someone else's product, so nothing it does can add latency or a failure mode to their AI call — hence edge deployment and fire-and-forget telemetry. It spans independent deploy targets (worker, dashboard, SDK, CLI) sharing one contract, so the type spine and eight ADRs exist to keep them honest. And it's cost-reporting software, so the cost math has to be exact — a unit bug there is the whole value proposition gone.
What It Demonstrates
- Contract-first design — one discriminated-union result type lets four providers share routing, error handling, and telemetry, and makes a fifth a closed change.
- Hot-path safety —
ctx.waitUntilkeeps all observability off the user's request, so logging can never break a call. - Error normalization — four providers' incompatible error shapes collapse into one
AICoreError, enforced by a shared contract test suite. - Schema foresight — a partitioned telemetry table that reserves future columns up front, avoiding risky migrations later.
- Monorepo discipline — a shared-types spine plus eight written ADRs keep independently-deployed services in sync.
Outcome
Four working provider adapters, a unified SDK, accurate sub-cent cost reporting, eight architecture-decision records, and a blocking CI gate of strict type-checking plus green tests across every package. I built it as a production-grade reference for how a multi-provider AI layer should be structured — contracts first, observability built in, decisions written down.