
Anthropic API Compatibility

Use the Anthropic-compatible endpoint to access any model in the osmAPI catalog through the familiar Anthropic API format.

osmAPI provides a native Anthropic-compatible endpoint at /v1/messages that allows you to use any model in our catalog while maintaining the familiar Anthropic API format. This is especially useful for applications designed for Claude that you want to extend to use other models.

Overview

The Anthropic endpoint transforms requests from Anthropic's message format to the OpenAI-compatible format used by osmAPI, then transforms the responses back to Anthropic's format. This means you can:

  • Use any model available in osmAPI with Anthropic's API format
  • Maintain existing code that uses Anthropic's SDK or API format
  • Access models from OpenAI, Google, Cohere, and other providers through the Anthropic interface
  • Leverage osmAPI's routing, caching, and cost optimization features
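
To make the translation concrete, here is an illustrative sketch of the request-side mapping (not osmAPI's actual implementation): the Anthropic format carries the system prompt and max_tokens as top-level fields, while the OpenAI-compatible format expects the system prompt as the first message.

```python
# Illustrative sketch of the Anthropic -> OpenAI request translation;
# this is NOT osmAPI's implementation, just the shape of the mapping.

def anthropic_to_openai(req: dict) -> dict:
    messages = []
    # Anthropic: top-level "system" field; OpenAI: a leading system message.
    if "system" in req:
        system = req["system"]
        if isinstance(system, list):  # array of text blocks
            system = "".join(block["text"] for block in system)
        messages.append({"role": "system", "content": system})
    messages.extend(req["messages"])

    out = {
        "model": req["model"],
        "messages": messages,
        "max_tokens": req["max_tokens"],
    }
    # A few fields carry over directly; stop_sequences is renamed to stop.
    for key in ("temperature", "top_p", "stream"):
        if key in req:
            out[key] = req[key]
    if "stop_sequences" in req:
        out["stop"] = req["stop_sequences"]
    return out

example = {
    "model": "gpt-5",
    "system": "Be concise.",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hello"}],
}
print(anthropic_to_openai(example)["messages"][0]["role"])  # system
```

The response is translated back the same way, so your application only ever sees Anthropic-format messages.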

Basic Usage

Configuration for Claude Code

This endpoint is perfect for configuring Claude Code to use any model available in osmAPI:

export ANTHROPIC_BASE_URL=https://api.osmapi.com
export ANTHROPIC_AUTH_TOKEN=osm_your_api_key_here
# optional: specify a model, otherwise it uses the default Claude model
export ANTHROPIC_MODEL=gpt-5  # or any model from our catalog

# now run claude!
claude

Need a detailed walkthrough? See the Claude Code + osmAPI Setup Guide for full installation instructions on macOS, Windows & Linux — including Node.js setup, permissions, and troubleshooting.

Choosing Models

You can use any model from the models page. Popular options for Claude Code include:

# Use OpenAI's latest model
export ANTHROPIC_MODEL=gpt-5

# Use a cost-effective alternative
export ANTHROPIC_MODEL=gpt-5-mini

# Use Google's Gemini
export ANTHROPIC_MODEL=google-ai-studio/gemini-2.5-pro

# Use Anthropic's actual Claude models
export ANTHROPIC_MODEL=anthropic/claude-sonnet-4-20250514

Environment Variables

When configuring Claude Code or other Anthropic-compatible applications, you can use these environment variables:

ANTHROPIC_MODEL

Specifies the main model to use for primary requests.

  • Default: claude-sonnet-4-20250514
  • Example: export ANTHROPIC_MODEL=gpt-5

ANTHROPIC_SMALL_FAST_MODEL

Specifies a smaller, faster model used for background functionality and internal operations.

  • Default: claude-haiku-4-5
  • Example: export ANTHROPIC_SMALL_FAST_MODEL=gpt-5-nano

# Example configuration
export ANTHROPIC_BASE_URL=https://api.osmapi.com
export ANTHROPIC_AUTH_TOKEN=osm_your_api_key_here
export ANTHROPIC_MODEL=gpt-5
export ANTHROPIC_SMALL_FAST_MODEL=gpt-5-nano

Authentication

The endpoint supports two authentication methods:

# Method 1: Authorization Bearer header (standard)
curl -X POST "https://api.osmapi.com/v1/messages" \
  -H "Authorization: Bearer $OSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '...'

# Method 2: x-api-key header (Anthropic SDK native)
curl -X POST "https://api.osmapi.com/v1/messages" \
  -H "x-api-key: $OSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '...'

Both methods work identically. Claude Code, Kilo Code, Cline, OpenCode, and other Anthropic SDK clients use x-api-key automatically — no extra configuration needed.
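
If you are wiring up a client by hand, the two methods differ only in which header carries the key. A minimal sketch (the helper function is hypothetical; only the header names come from the docs above):

```python
# Hypothetical helper illustrating the two equivalent auth header styles.

def auth_headers(api_key: str, style: str = "bearer") -> dict:
    if style == "bearer":      # Method 1: standard Authorization header
        return {"Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json"}
    if style == "x-api-key":   # Method 2: Anthropic SDK native
        return {"x-api-key": api_key,
                "Content-Type": "application/json"}
    raise ValueError(f"unknown auth style: {style!r}")

print(auth_headers("osm_example_key", "x-api-key"))
```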

Supported Parameters

All standard Anthropic Messages API parameters are supported:

Parameter         Type             Description
model             string           Required. The model to use
messages          array            Required. Input messages
max_tokens        number           Required. Maximum tokens to generate
system            string or array  System prompt (string or array of text blocks with cache_control)
stream            boolean          Enable SSE streaming
temperature       number           Sampling temperature (0–1)
top_p             number           Nucleus sampling parameter
top_k             number           Top-k sampling parameter
stop_sequences    array            Custom stop sequences
tools             array            Tool definitions with name, description, input_schema
tool_choice       object           Control tool use: auto, any, none, or specific tool
thinking          object           Extended thinking: enabled, disabled, or adaptive with optional budget_tokens
metadata          object           Request metadata (e.g. user_id)

The service_tier parameter is an explicit schema field and is passed through to the upstream provider. Other unknown parameters (e.g. container) are accepted and silently ignored, ensuring full compatibility with the latest Anthropic SDK versions.
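
The accept-and-ignore behavior can be pictured as a simple allow-list filter. This sketch is illustrative only, not osmAPI's implementation; the key names are taken from the parameter table above:

```python
# Illustrative sketch of "unknown parameters are accepted and ignored".
KNOWN = {"model", "messages", "max_tokens", "system", "stream",
         "temperature", "top_p", "top_k", "stop_sequences",
         "tools", "tool_choice", "thinking", "metadata", "service_tier"}

def accept(body: dict) -> dict:
    # Unknown keys (e.g. "container") are accepted but silently dropped,
    # so newer SDK versions never cause a validation error.
    return {k: v for k, v in body.items() if k in KNOWN}

print(accept({"model": "gpt-5", "container": {"id": "c1"}}))
```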

Advanced Features

Making a manual request

curl -X POST "https://api.osmapi.com/v1/messages" \
  -H "x-api-key: $OSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "max_tokens": 100
  }'

Tool Use

curl -X POST "https://api.osmapi.com/v1/messages" \
  -H "x-api-key: $OSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "max_tokens": 500,
    "tools": [{
      "name": "get_weather",
      "description": "Get the current weather",
      "input_schema": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }],
    "tool_choice": {"type": "auto"},
    "messages": [
      {"role": "user", "content": "What is the weather in London?"}
    ]
  }'

Thinking / Reasoning

For reasoning-capable models, use the thinking parameter to enable extended thinking. When thinking is enabled without specifying budget_tokens, the reasoning effort defaults to medium.

curl -X POST "https://api.osmapi.com/v1/messages" \
  -H "x-api-key: $OSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-397b-a17b",
    "max_tokens": 1000,
    "thinking": {"type": "enabled", "budget_tokens": 5000},
    "messages": [
      {"role": "user", "content": "Solve: what is the integral of x^2 * e^x?"}
    ]
  }'

The response includes a thinking content block before the text:

{
	"content": [
		{
			"type": "thinking",
			"thinking": "I need to use integration by parts twice..."
		},
		{
			"type": "text",
			"text": "The integral of x²eˣ is (x² - 2x + 2)eˣ + C"
		}
	]
}
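
Since content is an array of typed blocks, clients should separate the thinking blocks from the visible answer rather than assume a single text entry. A minimal sketch, using the example response above:

```python
# Split a response's content blocks into reasoning and visible text.
response = {
    "content": [
        {"type": "thinking",
         "thinking": "I need to use integration by parts twice..."},
        {"type": "text",
         "text": "The integral of x²eˣ is (x² - 2x + 2)eˣ + C"},
    ]
}

thinking = [b["thinking"] for b in response["content"]
            if b["type"] == "thinking"]
answer = "".join(b["text"] for b in response["content"]
                 if b["type"] == "text")
print(answer)
```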

Response Format

The endpoint returns responses in Anthropic's message format:

{
	"id": "msg_abc123",
	"type": "message",
	"role": "assistant",
	"model": "gpt-5",
	"content": [
		{
			"type": "text",
			"text": "Hello! I'm doing well, thank you for asking. How can I help you today?"
		}
	],
	"stop_reason": "end_turn",
	"stop_sequence": null,
	"usage": {
		"input_tokens": 13,
		"output_tokens": 20
	}
}
