AI gateway · MCP native · AGPLv3

One AI gateway, both directions

Barbacane is the open-source, Rust-native AI gateway that handles AI traffic on the way out and on the way in. Route your application's LLM calls to OpenAI, Anthropic, or Ollama with provider fallback and policy routing. Expose your existing APIs to AI agents as typed MCP tools. Same gateway, same middleware, same spec.

Outbound

Route your LLM calls, with fallback and policy

Clients send OpenAI-format requests. Barbacane's ai-proxy dispatcher translates per provider, handles streaming where the provider supports it, pins the upstream API version, and falls back to a secondary provider when the primary fails.

paths:
  /v1/chat/completions:
    post:
      operationId: chatCompletion
      summary: Route LLM chat completion requests
      x-barbacane-dispatch:
        name: ai-proxy
        config:
          provider: openai
          model: gpt-4o
          api_key: "${OPENAI_API_KEY}"
          timeout: 120

          # Tried in order on 5xx / timeout
          fallback:
            - provider: anthropic
              model: claude-sonnet-4-20250514
              api_key: "${ANTHROPIC_API_KEY}"
            - provider: ollama
              model: llama3
              base_url: http://ollama:11434

          # Named targets for policy-driven routing
          targets:
            local:
              provider: ollama
              model: mistral
              base_url: http://ollama:11434
            premium:
              provider: anthropic
              model: claude-opus-4-6
              api_key: "${ANTHROPIC_API_KEY}"
          default_target: local

Providers

OpenAI, Anthropic, and Ollama out of the box. Any OpenAI-compatible server (vLLM, TGI, LocalAI, Azure OpenAI) works via provider: openai plus a custom base_url.
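
For example, a dispatcher config pointing at a self-hosted vLLM server could look like the sketch below; the host, port, and model name are placeholders, and the keys mirror the outbound example above.

x-barbacane-dispatch:
  name: ai-proxy
  config:
    provider: openai                # speak the OpenAI wire format
    base_url: http://vllm:8000/v1   # placeholder: any OpenAI-compatible server
    model: qwen2.5-7b-instruct      # placeholder: whatever the server hosts
    api_key: "${VLLM_API_KEY}"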

Pinned contracts

Provider adapters are built against a pinned API version. Upstream breaking changes are a conscious, tested plugin upgrade, not a silent production surprise.

Policy-driven routing

Named targets combine with the cel middleware: stack CEL rules (consumer tier, scopes, headers) that set ai.target in context, and the dispatcher picks the matching target. Credentials stay in dispatcher config, never in context.
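
A sketch of the shape, with the x-barbacane-middleware key and the when/set rule syntax assumed rather than quoted from the docs; ai.target and the target names match the dispatcher config above.

# Illustrative cel middleware config; rule keys are assumptions
x-barbacane-middleware:
  - name: cel
    config:
      rules:
        - when: 'consumer.tier == "premium"'   # CEL expression
          set:
            ai.target: premium                 # dispatcher picks this target
        # No rule matches: the dispatcher uses default_target (local)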

Observability

Prometheus metrics for requests per provider, latency, token usage, and fallback events. OpenTelemetry spans linked to the upstream call.

Inbound

Expose your APIs to agents as MCP tools

Enable MCP once at the spec root and every operation becomes an agent-callable tool. Authentication, rate limits, validation, and audit logging all apply to tool calls, because they are just HTTP requests in disguise.

# Enable MCP for the whole API
x-barbacane-mcp:
  enabled: true
  server_name: Orders API
  server_version: 1.0.0

paths:
  /orders/{id}:
    get:
      operationId: getOrder
      summary: Fetch an order by id
      security:
        - bearerAuth: []
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Order
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Order'

  /admin/orders/purge:
    delete:
      operationId: purgeOrders
      summary: Purge archived orders
      # Hide this one from agents
      x-barbacane-mcp:
        enabled: false

Agents discover your tools

Barbacane exposes a JSON-RPC 2.0 MCP endpoint at /__barbacane/mcp. Each operation becomes a typed tool, with the name taken from operationId, the description from summary, and input and output schemas merged from your parameters, body, and responses.
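
Against the Orders API above, discovery looks roughly like this; the envelope follows MCP's standard JSON-RPC methods, and the merged schema is abridged for illustration.

Request:

{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

Response (abridged):

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "getOrder",
        "description": "Fetch an order by id",
        "inputSchema": {
          "type": "object",
          "properties": { "id": { "type": "string" } },
          "required": ["id"]
        }
      }
    ]
  }
}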

Calls run through your middleware

Tool calls are HTTP requests. They pass through authentication, authorization, rate limits, validation, transformations, and observability like any other request. No shadow stack, no drift.

Opt out what agents should not see

By default every operation is a tool. Add x-barbacane-mcp: { enabled: false } to any operation you want to hide, such as admin endpoints or destructive actions.

Shift-left lint

The vacuum ruleset runs in your editor, pre-commit, or CI. Missing operationIds and descriptions fail at lint time, not at call time.
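
The shipped ruleset is not reproduced here, but vacuum accepts Spectral-style rulesets, so a rule enforcing the operationId-becomes-tool-name contract would look roughly like this (the rule name and JSONPath are illustrative):

rules:
  operation-must-have-operation-id:
    description: Every operation needs an operationId to become an MCP tool name
    severity: error
    given: $.paths.*[get,put,post,delete,options,head,patch,trace]
    then:
      field: operationId
      function: truthy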

AI governance, both directions

Four middlewares compose around the dispatcher, regardless of whether traffic is outbound to an LLM or inbound from an agent. Named profiles, CEL expressions, fail-closed on misconfiguration.

ai-prompt-guard

PII redaction, regex allow and deny lists, message count and length limits, managed system-prompt templates with variable substitution. Regex is compiled at lint time.

ai-token-limit

Token-based sliding-window rate limiting per consumer, per model, per window. Stack instances for minute-and-hour caps, as in the composed sketch below. Same partition keys on on_request and on_response.

ai-cost-tracker

Per-request USD metric derived from a configurable price table. Emits the cost_dollars Prometheus counter labelled by provider, model, and consumer.

ai-response-guard

PII regex redaction and blocked-pattern 502 rewrite on responses. Schema re-validation before the response leaves the gateway. Fail-closed on misconfiguration or invalid regex.
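
Composed on a route, the four might stack like the sketch below. Only the middleware names and the minute-and-hour stacking come from the descriptions above; the x-barbacane-middleware key and every config field are assumptions made to show the shape.

x-barbacane-middleware:
  - name: ai-prompt-guard
    config: { profile: default-pii }           # named profile (field assumed)
  - name: ai-token-limit
    config: { window: 60, tokens: 5000 }       # per-minute cap (fields assumed)
  - name: ai-token-limit
    config: { window: 3600, tokens: 100000 }   # stacked per-hour cap
  - name: ai-cost-tracker
    config: { price_table: prices.yaml }       # price table path (field assumed)
  - name: ai-response-guard
    config: { profile: default-pii }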

One gateway, not three

Most AI-gateway products specialize in one direction. The outbound specialists (Portkey, LiteLLM, Cloudflare AI Gateway) are good at LLM routing and cost control, and nothing else. The inbound specialists (the small but growing MCP gateway category) handle agent tool calls and ignore outbound traffic. The API gateways you already run (Kong, Tyk, Apigee) handle neither.

Running three separate products in the request path is a valid architecture. It is also three config files, three observability stacks, three blast radii, and three sets of middleware that look similar but are not composable with each other. Shadow stacks grow in that gap.

Barbacane collapses the three into one by treating the AI gateway as composition: the same dispatchers, the same middleware plugins, and the same spec-first config surface cover API gateway fundamentals, outbound AI proxy, and inbound MCP. You can still adopt one direction at a time. You do not have to adopt three separate products to do it.

Built for these teams

AI product builders

Ship agents that call your production APIs safely and call LLMs cost-effectively. One gateway for both sides of the agent loop.

  • Provider fallback on the outbound side
  • Typed MCP tools from OpenAPI on the inbound side
  • Single cost dashboard across both

Platform teams

Give every product team policy-compliant AI infrastructure: outbound for their app's LLM calls, inbound for their APIs' agent exposure. One control plane, one audit trail.

  • Central policy via CEL, OPA, or consumer ACLs
  • Per-tenant cost attribution end-to-end
  • No per-team MCP server sprawl

Regulated-industry CIOs

AI traffic your security, audit, and compliance teams can approve. Single gateway, clear blast radius, fail-closed defaults, FIPS 140-3 option.

  • AGPLv3, no vendor lock-in
  • Self-hostable on-prem or air-gapped
  • Artifact provenance and drift detection

Frequently asked

How is this different from Portkey or LiteLLM?

They are outbound-only AI gateways. Barbacane handles both directions plus API gateway fundamentals, from one spec, with one middleware chain. Full comparison in this post.

Which LLM providers are supported on the outbound side?

OpenAI, Anthropic, and Ollama out of the box, with pinned upstream API versions. Any OpenAI-compatible server (vLLM, TGI, LocalAI, Azure OpenAI, self-hosted inference) works via provider: openai plus a custom base_url.

Which MCP transport is supported on the inbound side?

JSON-RPC 2.0 over HTTP POST on /__barbacane/mcp. Session termination via DELETE on the same path.
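
A tool call is a single POST to that path; for the getOrder tool above, the body follows MCP's tools/call method (the argument value is a placeholder):

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "getOrder",
    "arguments": { "id": "ord_123" }
  }
}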

Can I adopt one direction without the other?

Yes. The two directions are independent. You can run Barbacane as an outbound-only AI proxy, an inbound-only MCP gateway, or both, depending on which x-barbacane-* extensions your spec uses.

Do the AI governance middlewares apply to both directions?

Yes. ai-prompt-guard, ai-token-limit, ai-cost-tracker, and ai-response-guard compose around any dispatcher. The same four middlewares govern outbound LLM calls and inbound MCP tool calls, so your cost attribution, token caps, and scrubbing policies live in one place.

Do I have to rewrite my existing APIs?

No. Add x-barbacane-* extensions to your existing OpenAPI spec. Barbacane compiles it into an artifact that speaks both regular HTTP and MCP to the same upstream services.

Does authentication pass through?

On the inbound side, the Authorization header from the MCP request is forwarded into the internal dispatch; your jwt-auth, apikey-auth, or OAuth2 middleware validates it like any other request. On the outbound side, provider credentials stay in dispatcher config and never leave it.
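
The bearerAuth scheme referenced in the Orders example is declared with standard OpenAPI components; nothing Barbacane-specific is needed for this part (attaching the jwt-auth middleware itself is not shown here):

components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT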

Ready to make AI one of your gateway's concerns, not a parallel stack?

Read the docs, star the repo, or bring Barbacane into an internal platform pilot.