One AI gateway, both directions
Barbacane is the open-source, Rust-native AI gateway that handles AI traffic on the way out and on the way in. Route your application's LLM calls to OpenAI, Anthropic, or Ollama with provider fallback and policy routing. Expose your existing APIs to AI agents as typed MCP tools. Same gateway, same middleware, same spec.
AI traffic has two directions
Most AI gateways only handle one. Barbacane handles both, from the same spec, with the same middleware chain.
Outbound: your app calls LLMs
The ai-proxy dispatcher routes requests to
OpenAI, Anthropic, Ollama, or any OpenAI-compatible endpoint. Unified
OpenAI-compatible API surface, pinned provider contracts, provider fallback,
named targets for policy routing.
Inbound: agents call your APIs
Your OpenAPI spec compiles into an MCP tool server. Agents discover tools via JSON-RPC 2.0; tool calls pass through your existing auth and rate-limit middleware as regular HTTP requests. Opt out per operation.
Route your LLM calls, with fallback and policy
Clients send OpenAI-format requests. Barbacane's ai-proxy
dispatcher translates per provider, handles streaming where the provider
supports it, pins the upstream API version, and falls back to a secondary
provider when the primary fails.
paths:
  /v1/chat/completions:
    post:
      operationId: chatCompletion
      summary: Route LLM chat completion requests
      x-barbacane-dispatch:
        name: ai-proxy
        config:
          provider: openai
          model: gpt-4o
          api_key: "${OPENAI_API_KEY}"
          timeout: 120
          # Tried in order on 5xx / timeout
          fallback:
            - provider: anthropic
              model: claude-sonnet-4-20250514
              api_key: "${ANTHROPIC_API_KEY}"
            - provider: ollama
              model: llama3
              base_url: http://ollama:11434
          # Named targets for policy-driven routing
          targets:
            local:
              provider: ollama
              model: mistral
              base_url: http://ollama:11434
            premium:
              provider: anthropic
              model: claude-opus-4-6
              api_key: "${ANTHROPIC_API_KEY}"
          default_target: local
Providers
OpenAI, Anthropic, and Ollama out of the box. Any OpenAI-compatible server
(vLLM, TGI, LocalAI, Azure OpenAI) works via provider: openai
plus a custom base_url.
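For example, pointing the dispatcher at a self-hosted vLLM server might look like this (the URL, model name, and env var are illustrative):

x-barbacane-dispatch:
  name: ai-proxy
  config:
    # Any OpenAI-compatible server: keep provider: openai
    # and override base_url (values here are illustrative)
    provider: openai
    model: mistral-7b-instruct
    base_url: http://vllm.internal:8000/v1
    api_key: "${VLLM_API_KEY}"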
Pinned contracts
Provider adapters are built against a pinned API version. Upstream breaking changes are a conscious, tested plugin upgrade, not a silent production surprise.
Policy-driven routing
Named targets combine with the cel
middleware: stack CEL rules (consumer tier, scopes, headers) that set
ai.target in context, and the dispatcher
picks the matching target. Credentials stay in dispatcher config, never
in context.
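A sketch of what such a rule might look like, assuming the cel middleware takes a list of condition/set pairs (the x-barbacane-middleware block and the rules, if, and set keys are illustrative, not confirmed syntax):

x-barbacane-middleware:
  - name: cel
    config:
      rules:
        # Premium-tier consumers get the premium target;
        # everyone else falls through to default_target (local).
        # The consumer.tier variable is an assumed CEL binding.
        - if: 'consumer.tier == "premium"'
          set:
            ai.target: premium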
Observability
Prometheus metrics for requests per provider, latency, token usage, and fallback events. OpenTelemetry spans linked to the upstream call.
Expose your APIs to agents as MCP tools
Enable MCP once at the spec root and every operation becomes an agent-callable tool. Authentication, rate limits, validation, and audit logging all apply to tool calls, because they are just HTTP requests in disguise.
# Enable MCP for the whole API
x-barbacane-mcp:
  enabled: true
  server_name: Orders API
  server_version: 1.0.0
paths:
  /orders/{id}:
    get:
      operationId: getOrder
      summary: Fetch an order by id
      security:
        - bearerAuth: []
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Order
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Order'
  /admin/orders/purge:
    delete:
      operationId: purgeOrders
      summary: Purge archived orders
      # Hide this one from agents
      x-barbacane-mcp:
        enabled: false
Agents discover your tools
Barbacane exposes a JSON-RPC 2.0 MCP endpoint at
/__barbacane/mcp. Each operation becomes
a typed tool, with the name taken from operationId,
the description from summary, and input
and output schemas merged from your parameters, body, and responses.
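For the spec above, a discovery exchange might look like this (abridged; field names follow the standard MCP tools/list schema, and purgeOrders is absent because it opted out):

POST /__barbacane/mcp
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "getOrder",
        "description": "Fetch an order by id",
        "inputSchema": {
          "type": "object",
          "properties": { "id": { "type": "string" } },
          "required": ["id"]
        }
      }
    ]
  }
}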
Calls run through your middleware
Tool calls are HTTP requests. They pass through authentication, authorization, rate limits, validation, transformations, and observability like any other request. No shadow stack, no drift.
Opt out what agents should not see
By default every operation is a tool. Add
x-barbacane-mcp: { enabled: false }
to any operation you want to hide, such as admin endpoints or destructive
actions.
Shift-left lint
The vacuum ruleset
runs in your editor, pre-commit, or CI. Missing
operationIds and descriptions fail at
lint time, not at call time.
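A minimal CI step might look like this (the ruleset filename is an assumption; check the Barbacane docs for the published ruleset):

# GitHub Actions step (illustrative)
- name: Lint OpenAPI spec
  run: vacuum lint -r barbacane-ruleset.yaml openapi.yaml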
AI governance, both directions
Four middlewares compose around the dispatcher, regardless of whether traffic is outbound to an LLM or inbound from an agent. Named profiles, CEL expressions, fail-closed on misconfiguration.
ai-prompt-guard
PII redaction, regex allow and deny lists, message count and length limits, managed system-prompt templates with variable substitution. Regex is compiled at lint time.
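A hypothetical profile, assuming keys along these lines (names are illustrative, not the documented schema):

- name: ai-prompt-guard
  config:
    redact_pii: true
    max_messages: 50
    max_message_length: 8000
    deny_patterns:
      - '(?i)ignore previous instructions'   # compiled at lint time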
ai-token-limit
Token-based sliding-window rate limiting per consumer and per model. Stack instances for separate per-minute and per-hour caps, with the same partition keys on on_request and on_response.
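Stacked, that might look like this (the window and limit keys are illustrative, not the documented schema):

- name: ai-token-limit
  config:
    window: 60        # seconds: per-minute cap
    limit: 20000      # tokens per consumer, per model
- name: ai-token-limit
  config:
    window: 3600      # per-hour cap, same partition keys
    limit: 500000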
ai-cost-tracker
Per-request USD metric derived from a configurable price table. Emits the
cost_dollars Prometheus counter labelled
by provider, model, and consumer.
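A sketch of a price table, assuming per-1K-token USD rates keyed by provider/model (key names and prices are illustrative):

- name: ai-cost-tracker
  config:
    prices:
      openai/gpt-4o:
        input: 0.0025     # USD per 1K input tokens
        output: 0.01      # USD per 1K output tokens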
ai-response-guard
PII regex redaction on responses, with blocked-pattern matches rewritten to a 502. Schema re-validation before the response leaves the gateway. Fail-closed on misconfiguration or invalid regex.
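Composed, the four might wrap the dispatcher like this (per-middleware config elided; the x-barbacane-middleware key is illustrative):

x-barbacane-middleware:
  - name: ai-prompt-guard     # scrub the request
  - name: ai-token-limit      # enforce token caps
  - name: ai-cost-tracker     # attribute cost
  - name: ai-response-guard   # scrub and re-validate the response
x-barbacane-dispatch:
  name: ai-proxy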
One gateway, not three
Most AI-gateway products specialize in one direction. The outbound specialists (Portkey, LiteLLM, Cloudflare AI Gateway) are good at LLM routing and cost control, and nothing else. The inbound specialists (the small but growing MCP gateway category) handle agent tool calls and ignore outbound traffic. The API gateways you already run (Kong, Tyk, Apigee) handle neither.
Running three separate products in the request path is a valid architecture. It is also three config files, three observability stacks, three blast radii, and three sets of middleware that look similar but are not composable with each other. Shadow stacks grow in that gap.
Barbacane collapses the three into one by treating AI gateway as composition: the same dispatchers, the same middleware plugins, the same spec-first config surface cover API gateway fundamentals, outbound AI proxy, and inbound MCP. You can still adopt one direction at a time. You do not have to adopt three separate products to do it.
Built for these teams
AI product builders
Ship agents that call your production APIs safely and call LLMs cost-effectively. One gateway for both sides of the agent loop.
- Provider fallback on the outbound side
- Typed MCP tools from OpenAPI on the inbound side
- Single cost dashboard across both
Platform teams
Give every product team policy-compliant AI infrastructure: outbound for their app's LLM calls, inbound for their APIs' agent exposure. One control plane, one audit trail.
- Central policy via CEL, OPA, or consumer ACLs
- Per-tenant cost attribution end-to-end
- No per-team MCP server sprawl
Regulated-industry CIOs
AI traffic your security, audit, and compliance teams can approve. Single gateway, clear blast radius, fail-closed defaults, FIPS 140-3 option.
- AGPLv3, no vendor lock-in
- Self-hostable on-prem or air-gapped
- Artifact provenance and drift detection
Frequently asked
How is this different from Portkey or LiteLLM?
They are outbound-only AI gateways. Barbacane handles both directions plus API gateway fundamentals, from one spec, with one middleware chain. Full comparison in this post.
Which LLM providers are supported on the outbound side?
OpenAI, Anthropic, and Ollama out of the box, with pinned upstream API
versions. Any OpenAI-compatible server (vLLM, TGI, LocalAI, Azure OpenAI,
self-hosted inference) works via provider: openai
plus a custom base_url.
Which MCP transport is supported on the inbound side?
JSON-RPC 2.0 over HTTP POST on /__barbacane/mcp.
Session termination via DELETE on the same path.
Can I adopt one direction without the other?
Yes. The two directions are independent. You can run Barbacane as an
outbound-only AI proxy, an inbound-only MCP gateway, or both, depending on
which x-barbacane-* extensions your spec
uses.
Do the AI governance middlewares apply to both directions?
Yes. ai-prompt-guard,
ai-token-limit,
ai-cost-tracker, and
ai-response-guard compose around any
dispatcher. The same four middlewares govern outbound LLM calls and inbound
MCP tool calls, so your cost attribution, token caps, and scrubbing policies
live in one place.
Do I have to rewrite my existing APIs?
No. Add x-barbacane-* extensions to your
existing OpenAPI spec. Barbacane compiles it into an artifact that speaks
both regular HTTP and MCP to the same upstream services.
Does authentication pass through?
On the inbound side, the Authorization
header from the MCP request is forwarded into the internal dispatch; your
jwt-auth,
apikey-auth, or OAuth2 middleware
validates it like any other request. On the outbound side, provider
credentials stay in dispatcher config and never leave it.
Ready to make AI one of your gateway's concerns, not a parallel stack?
Read the docs, star the repo, or bring Barbacane into an internal platform pilot.