Inference API
Inference Gates
OpenAI-compatible API, credit-billed. Point your existing SDK at our endpoint, query models, and pay per token from your razorBridge balance.
Getting Started
Three Steps to Inference
1
Create Key
rb gate keys create "Lab key"
Generate an API key from the CLI or dashboard.
2
Point SDK
from openai import OpenAI
client = OpenAI(
    base_url="https://razorbridge.eu/api/v1/rb/gate",
    api_key="rb-gate-..."
)
Use the standard OpenAI SDK with our endpoint.
3
Query Models
response = client.chat.completions.create(
    model="llama-3",
    messages=[...]
)
Credits are deducted per token, automatically.
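To see how many tokens a request consumed, OpenAI-compatible responses normally carry a `usage` object with token counts. The sketch below assumes the gate returns that standard field, which this page does not confirm. `summarize_usage` is a hypothetical helper, and the stand-in response exists only so the example runs offline. How token counts convert to credits is not specified here, so the sketch reports tokens only.

```python
from types import SimpleNamespace


def summarize_usage(response):
    """Return token counts from an OpenAI-style chat completion response.

    Returns None when the server did not include a `usage` object.
    """
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Stand-in for a real response (assumption: same shape as the OpenAI SDK object).
fake_response = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=9, completion_tokens=12, total_tokens=21)
)
print(summarize_usage(fake_response))
```

With a real client, pass the object returned by `client.chat.completions.create(...)` instead of `fake_response`.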
Integration
Works With Your Stack
Drop-in replacement for any OpenAI-compatible client.
from openai import OpenAI

client = OpenAI(
    base_url="https://razorbridge.eu/api/v1/rb/gate",
    api_key="rb-gate-YOUR_KEY",
)

response = client.chat.completions.create(
    model="llama-3",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
curl https://razorbridge.eu/api/v1/rb/gate/chat/completions \
  -H "Authorization: Bearer rb-gate-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
const response = await fetch(
  "https://razorbridge.eu/api/v1/rb/gate/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer rb-gate-YOUR_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "llama-3",
      messages: [
        { role: "user", content: "Hello!" }
      ],
    }),
  }
);
const data = await response.json();
console.log(data.choices[0].message.content);
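The examples above inline the key for brevity. One possible way to keep it out of source code is to read it from the environment and pass the result to the SDK constructor. This is a minimal sketch: the variable name `RB_GATE_API_KEY` and the helper `gate_client_kwargs` are hypothetical, not part of the razorBridge tooling.

```python
import os

# Base URL as shown in the examples on this page.
GATE_BASE_URL = "https://razorbridge.eu/api/v1/rb/gate"


def gate_client_kwargs(env=os.environ):
    """Build keyword arguments for an OpenAI-compatible client.

    Assumption: the key is stored in RB_GATE_API_KEY (a hypothetical name).
    """
    key = env.get("RB_GATE_API_KEY")
    if not key:
        raise RuntimeError("RB_GATE_API_KEY is not set")
    return {"base_url": GATE_BASE_URL, "api_key": key}


# Usage (requires the `openai` package):
# from openai import OpenAI
# client = OpenAI(**gate_client_kwargs())
```

Passing `env` as a parameter keeps the helper easy to exercise with a plain dictionary instead of the real process environment.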
Models
Available Models
Models route through Local (Outpost) or Cloud (OpenRouter) depending on availability.
| Alias | Local (Outpost) | Cloud (OpenRouter) |
|---|---|---|
| Model information is currently loading. | | |
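Because the table is populated dynamically, it can help to ask the API directly which aliases are available. The sketch below uses the OpenAI SDK's standard `client.models.list()` call. Whether the gate implements the corresponding model-listing endpoint is an assumption, and `list_model_ids` is a hypothetical helper. A stand-in client with a placeholder ID lets the example run offline.

```python
from types import SimpleNamespace


def list_model_ids(client):
    """Return sorted model IDs from an OpenAI-compatible client's model listing."""
    return sorted(model.id for model in client.models.list())


# Stand-in client so the sketch runs without network access.
# The ID below is the alias used elsewhere on this page, not real API output.
fake_client = SimpleNamespace(
    models=SimpleNamespace(list=lambda: [SimpleNamespace(id="llama-3")])
)
print(list_model_ids(fake_client))  # prints ['llama-3']
```

With a real client configured as in the examples above, call `list_model_ids(client)` instead.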
Complete Workflow
Train and Serve From One Account
One credit balance covers both GPU Blades and Inference Gates.
Train
SSH into a GPU Blade with PyTorch, CUDA, and your datasets. Fine-tune or train from scratch.
rb blade ssh
nvidia-smi
python train.py --model llama-3
Serve
Create an API key and query any model via the OpenAI SDK. Credit-billed per token.
rb gate keys create "Lab key"
client = OpenAI(
    base_url="https://razorbridge.eu/api/v1/rb/gate"
)
Start querying models in minutes
Create an API key and use the OpenAI SDK you already know.