Inference API

Inference Gates

OpenAI-compatible API, credit-billed. Point your existing SDK at our endpoint, query models, and pay per token from your razorBridge balance.

Getting Started

Three Steps to Inference

1

Create Key

rb gate keys create "Lab key"

Generate an API key from the CLI or dashboard.

2

Point SDK

from openai import OpenAI
client = OpenAI(
    base_url="https://razorbridge.eu/api/v1/rb/gate",
    api_key="rb-gate-...",
)

Use the standard OpenAI SDK with our endpoint.

3

Query Models

response = client.chat.completions.create(
    model="llama-3",
    messages=[...]
)

Credits are deducted per token automatically.
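Since billing is per token, it can help to log how many tokens each call used. A minimal sketch, assuming the gateway returns the standard OpenAI-style `usage` object in chat completion responses (this page does not confirm that). The helper name `summarize_usage` and the sample payload are hypothetical.

```python
from typing import Any, Mapping


def summarize_usage(completion: Mapping[str, Any]) -> dict[str, int]:
    """Extract token counts from an OpenAI-style chat completion payload.

    Assumes the response carries a `usage` object with `prompt_tokens`,
    `completion_tokens`, and `total_tokens`; missing fields count as 0.
    """
    usage = completion.get("usage") or {}
    prompt = int(usage.get("prompt_tokens", 0))
    generated = int(usage.get("completion_tokens", 0))
    # Fall back to the sum if the gateway omits total_tokens.
    total = int(usage.get("total_tokens", prompt + generated))
    return {
        "prompt_tokens": prompt,
        "completion_tokens": generated,
        "total_tokens": total,
    }


# Sample payload shaped like an OpenAI chat completion (values are made up).
sample = {
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}
print(summarize_usage(sample))
```

With the OpenAI Python SDK (v1 or later), the response object can be converted to a dict first, for example `summarize_usage(response.model_dump())`.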

Integration

Works With Your Stack

Drop-in replacement for any OpenAI-compatible client.

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://razorbridge.eu/api/v1/rb/gate",
    api_key="rb-gate-YOUR_KEY",
)
response = client.chat.completions.create(
    model="llama-3",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)

curl

curl https://razorbridge.eu/api/v1/rb/gate/chat/completions \
  -H "Authorization: Bearer rb-gate-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

JavaScript

const response = await fetch(
  "https://razorbridge.eu/api/v1/rb/gate/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer rb-gate-YOUR_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "llama-3",
      messages: [
        { role: "user", content: "Hello!" }
      ],
    }),
  }
);
const data = await response.json();
console.log(data.choices[0].message.content);

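The examples above hard-code the key for brevity. A minimal sketch of loading it from the environment instead, so it stays out of source control. The variable name `RB_GATE_API_KEY` and the helper `gate_client_kwargs` are hypothetical and not part of razorBridge or the OpenAI SDK.

```python
import os
from typing import Mapping, Optional

# Endpoint taken from the examples on this page.
GATE_BASE_URL = "https://razorbridge.eu/api/v1/rb/gate"


def gate_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Build the keyword arguments for an OpenAI-compatible client.

    Reads the key from RB_GATE_API_KEY (a hypothetical variable name) and
    fails early if it is missing, rather than at the first request.
    """
    env = os.environ if env is None else env
    key = env.get("RB_GATE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("RB_GATE_API_KEY is not set")
    return {"base_url": GATE_BASE_URL, "api_key": key}


# Demo with an explicit mapping so the sketch runs without touching os.environ.
print(gate_client_kwargs({"RB_GATE_API_KEY": "rb-gate-YOUR_KEY"}))
```

The result can be passed straight to the SDK constructor: `client = OpenAI(**gate_client_kwargs())`.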
Models

Available Models

Models route through Local (Outpost) or Cloud (OpenRouter) depending on availability.
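The routing rule can be pictured as a simple fallback. This is an illustration only: the page does not describe the actual routing logic, and the function `pick_backend`, the availability map, and the alias names below are hypothetical.

```python
from typing import Mapping


def pick_backend(alias: str, local_available: Mapping[str, bool]) -> str:
    """Return "local" if the alias is currently served locally, else "cloud".

    Illustrative fallback rule, not the gateway's real implementation:
    an alias that is unknown or marked unavailable locally goes to the cloud.
    """
    return "local" if local_available.get(alias, False) else "cloud"


# Hypothetical availability snapshot.
availability = {"llama-3": True, "other-model": False}

print(pick_backend("llama-3", availability))      # served locally
print(pick_backend("other-model", availability))  # falls back to cloud
```

Either way the client sends the same request; only the alias in the `model` field matters to the caller.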

Complete Workflow

Train and Serve From One Account

One credit balance covers both GPU Blades and Inference Gates.

Train

SSH into a GPU Blade with PyTorch, CUDA, and your datasets. Fine-tune or train from scratch.

rb blade ssh
nvidia-smi
python train.py --model llama-3

Serve

Create an API key and query any model via the OpenAI SDK. Credit-billed per token.

rb gate keys create "Lab key"

from openai import OpenAI
client = OpenAI(
    base_url="https://razorbridge.eu/api/v1/rb/gate",
    api_key="rb-gate-YOUR_KEY",
)

Start querying models in minutes

Create an API key and use the OpenAI SDK you already know.