Quickstart
This guide gets a local Fairvisor Edge instance running in Docker with a simple policy.
Prerequisites
- Docker 24+ (or Docker Desktop)
- curl for testing
Step 1 - Create policy
mkdir fairvisor-demo && cd fairvisor-demo
Create policy.json:
{
"bundle_version": 1,
"issued_at": "2026-01-01T00:00:00Z",
"policies": [
{
"id": "demo-rate-limit",
"spec": {
"selector": {
"pathPrefix": "/",
"methods": ["GET", "POST"]
},
"mode": "enforce",
"rules": [
{
"name": "global-rps",
"limit_keys": ["ip:address"],
"algorithm": "token_bucket",
"algorithm_config": {
"tokens_per_second": 5,
"burst": 10
}
}
]
}
}
],
"kill_switches": []
}
This policy gives each client IP address (the ip:address limit key) 5 requests per second with a burst of up to 10; requests over the limit receive a 429 response. It applies to all paths and to both GET and POST methods.
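The token-bucket arithmetic behind this policy can be sketched in a few lines of Python. This is an illustration of the algorithm with the same parameters (rate 5, burst 10), not Fairvisor's internal implementation:

```python
class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/second, capped at `burst`."""

    def __init__(self, rate: float, burst: float, now: float = 0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # a fresh bucket starts full
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, never exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=5, burst=10)  # tokens_per_second=5, burst=10
# 15 back-to-back requests at t=0: the 10-token burst is admitted, the rest are not.
print(sum(bucket.allow(0.0) for _ in range(15)))  # -> 10
# One second later, 5 tokens have refilled.
print(sum(bucket.allow(1.0) for _ in range(15)))  # -> 5
```

The burst parameter is what lets a briefly idle client send a short spike without being throttled, while the refill rate bounds sustained throughput.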
Step 2 - Run container
docker run -d \
--name fairvisor \
-p 8080:8080 \
-v "$(pwd)/policy.json:/etc/fairvisor/policy.json:ro" \
-e FAIRVISOR_CONFIG_FILE=/etc/fairvisor/policy.json \
-e FAIRVISOR_MODE=decision_service \
ghcr.io/fairvisor/fairvisor-edge:v0.1.0
Step 3 - Verify
curl -sf http://localhost:8080/readyz
curl -s -w "\nHTTP %{http_code}\n" \
-H "X-Original-Method: GET" \
-H "X-Original-URI: /api/data" \
-H "X-Forwarded-For: 10.0.0.1" \
http://localhost:8080/v1/decision
Expected output for /readyz: {"status":"ok"}
A 200 response means the decision service is ready and a policy bundle is loaded.
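The X-Original-Method and X-Original-URI headers follow the convention used by nginx's auth_request subrequests, so one plausible way to put Fairvisor in front of a real upstream is a config along these lines. This is a hypothetical sketch (my_upstream is a placeholder), not an official integration:

```nginx
location / {
    auth_request /_fairvisor;              # consult the decision service first
    proxy_pass http://my_upstream;         # placeholder backend
}

location = /_fairvisor {
    internal;
    proxy_pass http://127.0.0.1:8080/v1/decision;
    proxy_pass_request_body off;           # decision needs headers only
    proxy_set_header Content-Length "";
    proxy_set_header X-Original-Method $request_method;
    proxy_set_header X-Original-URI $request_uri;
    proxy_set_header X-Forwarded-For $remote_addr;
}
```

Note that stock auth_request only passes 401/403 through to the client and maps other failures to 500, so surfacing the 429 as-is may require extra configuration; check the Fairvisor documentation for its recommended proxy integration.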
Alternative: LLM token budget policy
Replace the policy.json content with an LLM-focused policy:
{
"bundle_version": 1,
"issued_at": "2026-01-01T00:00:00Z",
"policies": [
{
"id": "llm-budget",
"spec": {
"selector": { "pathPrefix": "/v1/chat" },
"mode": "enforce",
"rules": [
{
"name": "per-org-tpm",
"limit_keys": ["jwt:org_id"],
"algorithm": "token_bucket_llm",
"algorithm_config": {
"tokens_per_minute": 60000,
"tokens_per_day": 1200000,
"default_max_completion": 800
}
}
]
}
}
],
"kill_switches": []
}
This policy limits each organization (identified by the org_id claim in the JWT) to 60,000 LLM tokens per minute and 1,200,000 per day, and caps individual completions at 800 tokens. Requests that would exceed the budget receive a 429 response with an OpenAI-compatible error body (error.type: "rate_limit_error"), so existing clients handle it without any changes.
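The dual-window budget can be pictured as two token buckets that must both have room before a request is admitted. The sketch below is an illustration under that assumption, not the shipped algorithm (in particular, how Fairvisor estimates a request's token cost is not shown here):

```python
class TokenBudget:
    """Two refill windows, per-minute and per-day; a charge must fit in both."""

    def __init__(self, per_minute: int, per_day: int, now: float = 0.0):
        self.windows = [
            # [capacity, refill rate in tokens/second, current tokens]
            [per_minute, per_minute / 60.0, float(per_minute)],
            [per_day, per_day / 86400.0, float(per_day)],
        ]
        self.last = now

    def charge(self, tokens: int, now: float) -> bool:
        elapsed = now - self.last
        self.last = now
        for w in self.windows:
            w[2] = min(w[0], w[2] + elapsed * w[1])  # refill, capped at capacity
        if all(w[2] >= tokens for w in self.windows):
            for w in self.windows:
                w[2] -= tokens
            return True
        return False  # would exceed a window -> 429


budget = TokenBudget(per_minute=60000, per_day=1200000)
print(budget.charge(800, now=0.0))     # True: well within both windows
print(budget.charge(60000, now=0.0))   # False: the minute window is already partly spent
```

The per-day window is what stops a client from simply sustaining the full per-minute rate around the clock: at 60,000 tokens/minute the daily cap would otherwise be exhausted in 20 minutes.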