Uno LLM Gateway supports OpenAI’s Responses API for text generation, images, tool calling, and reasoning. You can use any OpenAI SDK by simply pointing it to Uno’s gateway endpoint. This allows you to leverage Uno’s features like virtual keys, provider management, and observability while using your existing OpenAI code.

Overview

The Uno LLM Gateway provides an endpoint at /api/gateway/openai/responses that implements OpenAI’s Responses API specification. This means you can use any OpenAI SDK (Python, JavaScript, Go, etc.) without modifying your code: just change the base URL.

Usage Examples

main.py
from openai import OpenAI

# Point the client to Uno's gateway endpoint
client = OpenAI(
    base_url="http://localhost:6060/api/gateway/openai",
    api_key="sk-amg-your-virtual-key-here",  # or your OpenAI API key
)

# Use the Responses API
response = client.responses.create(
    model="gpt-4.1-mini",
    instructions="You are a helpful assistant.",
    input="Hello, how are you?",
)

print(response.output_text)

Accessing Other Provider Models

The gateway also lets you access models from other providers (Gemini, Anthropic, and others) through the same OpenAI SDK. Simply prefix the model name with the provider name followed by a slash:
  • Gemini/gemini-3-flash-preview - Access Google Gemini models
  • Anthropic/claude-haiku-4-5 - Access Anthropic Claude models
multi_provider.py
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:6060/api/gateway/openai",
    api_key="sk-amg-your-virtual-key-here",
)

# Use Gemini model
response = client.responses.create(
    model="Gemini/gemini-3-flash-preview",
    instructions="You are a helpful assistant.",
    input="Hello, how are you?",
)

print(response.output_text)

# Use Anthropic Claude model
response = client.responses.create(
    model="Anthropic/claude-haiku-4-5",
    instructions="You are a helpful assistant.",
    input="What is the capital of France?",
)

print(response.output_text)

Streaming Support

The gateway supports streaming responses via Server-Sent Events (SSE). Use your SDK’s streaming methods as you normally would:
streaming.py
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:6060/api/gateway/openai",
    api_key="sk-amg-your-virtual-key-here",
)

stream = client.responses.create(
    model="gpt-4.1-mini",
    instructions="You are a helpful assistant.",
    input="Tell me a story.",
    stream=True,
)

for event in stream:
    # Streaming yields typed events; text arrives incrementally as
    # output_text delta events
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)

Supported Features

The gateway currently supports the Responses API with:
  • Text generation - Standard text completions
  • Images - Image generation and processing
  • Tool calling - Function calling capabilities (see the sketch below)
  • Reasoning - Advanced reasoning models
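
Tool calling, for instance, goes through the standard Responses API tools parameter. The sketch below is illustrative only; the get_weather function and its schema are hypothetical, not part of the gateway:
tool_calling.py
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:6060/api/gateway/openai",
    api_key="sk-amg-your-virtual-key-here",
)

# A hypothetical function tool, declared in the Responses API format
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

response = client.responses.create(
    model="gpt-4.1-mini",
    input="What's the weather in Paris?",
    tools=tools,
)

# If the model decides to call the tool, the call appears as a
# function_call item in the response output
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)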

Authentication

The gateway accepts authentication via the Authorization header with a Bearer token:
  • Virtual Key: Use a virtual key (starts with sk-amg-) for managed access control
  • Direct API Key: Use your OpenAI API key directly
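
With the OpenAI SDK, whatever you pass as api_key is sent automatically as a Bearer token. If you call the gateway over raw HTTP instead, set the header yourself. A minimal sketch using the requests library, with a request body mirroring the earlier examples:
auth.py
import requests

# Equivalent of the SDK examples above as a raw HTTP request;
# the gateway reads the key from the Authorization header
resp = requests.post(
    "http://localhost:6060/api/gateway/openai/responses",
    headers={"Authorization": "Bearer sk-amg-your-virtual-key-here"},
    json={
        "model": "gpt-4.1-mini",
        "input": "Hello, how are you?",
    },
)

print(resp.json())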