Metrics

Fairvisor Edge exposes Prometheus metrics at GET /metrics. All metrics are prefixed with fairvisor_.

Scrape configuration

# prometheus.yml
scrape_configs:
  - job_name: fairvisor-edge
    static_configs:
      - targets: ['fairvisor-edge:8080']
    metrics_path: /metrics
    scrape_interval: 15s
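
If the target also exposes series you do not want to store, the shared `fairvisor_` prefix makes it straightforward to filter at scrape time. The following is a sketch of the same job with a `metric_relabel_configs` block added; the target address is carried over from the example above, and whether to drop other series at all is a deployment choice, not a Fairvisor requirement.

```yaml
# prometheus.yml (sketch; same job as above with a keep-filter added)
scrape_configs:
  - job_name: fairvisor-edge
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ['fairvisor-edge:8080']
    metric_relabel_configs:
      # Keep only series whose metric name starts with fairvisor_.
      # Synthetic series such as `up` are not affected by metric relabeling.
      - source_labels: [__name__]
        regex: 'fairvisor_.*'
        action: keep
```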

Canonical metrics set

Decision and latency

Metric	Type	Labels	Description
`fairvisor_decisions_total`	Counter	`action`, `reason`, `policy`, `route`	Total decisions. `action` is `allow`, `reject`, or `throttle`.
`fairvisor_decision_duration_seconds`	Gauge	`route`, `mode`	Decision evaluation latency in seconds.
`fairvisor_handler_invocations_total`	Counter	—	Total calls to the access handler.

Rate limiting and token usage

Metric	Type	Labels	Description
`fairvisor_ratelimit_remaining`	Gauge	`window`, `policy`, `route`	Remaining rate-limit budget for emitted decisions.
`fairvisor_tokens_consumed_total`	Counter	`type`, `policy`, `route`	Tokens consumed. `type` is `reserved` or `actual`.
`fairvisor_tokens_remaining`	Gauge	`window`, `policy`, `route`	Remaining token budget by window (`tpm`, `tpd`).
`fairvisor_token_estimation_accuracy_ratio`	Gauge	`route`, `estimator`	Actual/estimated token ratio after reconciliation.
`fairvisor_token_reservation_unused_total`	Counter	`route`	Refunded over-reserved tokens.
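
The token counters above can be combined to track how much of each reservation ends up refunded. The recording rules below are a minimal sketch: the group name and rule names are hypothetical, and the ratio assumes `fairvisor_token_reservation_unused_total` and `fairvisor_tokens_consumed_total{type="reserved"}` count tokens in the same unit.

```yaml
# fairvisor-token-rules.yml (hypothetical file name)
groups:
  - name: fairvisor-token-usage
    rules:
      # Per-route rate of actually consumed tokens over 5 minutes.
      - record: route:fairvisor_tokens_actual:rate5m
        expr: sum by (route) (rate(fairvisor_tokens_consumed_total{type="actual"}[5m]))
      # Fraction of reserved tokens that was later refunded as unused.
      - record: route:fairvisor_token_reservation_unused:ratio_rate5m
        expr: |
          sum by (route) (rate(fairvisor_token_reservation_unused_total[5m]))
          /
          sum by (route) (rate(fairvisor_tokens_consumed_total{type="reserved"}[5m]))
```

A persistently high ratio may indicate that the estimator over-reserves for that route; `fairvisor_token_estimation_accuracy_ratio` gives a complementary view.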

Protection and control-plane state

Metric	Type	Labels	Description
`fairvisor_loop_detected_total`	Counter	`route`	Loop-detection trigger count.
`fairvisor_circuit_state`	Gauge	`target`	Circuit state (`0` closed, `1` open, `0.5` half-open where emitted).
`fairvisor_kill_switch_active`	Gauge	`scope`	Kill-switch active flag for decision flow.
`fairvisor_shadow_mode_active`	Gauge	`scope`	Shadow-mode active flag for decision flow.
`fairvisor_global_shadow_active`	Gauge	—	`1` while the runtime top-level global shadow override is active, `0` otherwise.
`fairvisor_kill_switch_override_active`	Gauge	—	`1` while the runtime top-level kill-switch override is active, `0` otherwise.
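
The state gauges above lend themselves to simple threshold alerts. The rule file below is a sketch only: alert names, `for` durations, and severity labels are assumptions, and it treats a value of `1` on `fairvisor_kill_switch_active` as "active", which the table implies but does not state explicitly.

```yaml
# fairvisor-protection-alerts.yml (hypothetical file name)
groups:
  - name: fairvisor-protection
    rules:
      # Fires when a circuit has been fully open (value 1) for 2 minutes.
      - alert: FairvisorCircuitOpen
        expr: fairvisor_circuit_state == 1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Circuit open for target {{ $labels.target }}"
      # Fires as soon as any scope reports an active kill switch.
      - alert: FairvisorKillSwitchActive
        expr: max by (scope) (fairvisor_kill_switch_active) == 1
        labels:
          severity: critical
        annotations:
          summary: "Kill switch active for scope {{ $labels.scope }}"
```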

SaaS and lifecycle

Metric	Type	Labels	Description
`fairvisor_saas_reachable`	Gauge	—	`1` if SaaS path is reachable, `0` otherwise.
`fairvisor_saas_calls_total`	Counter	`operation`, `status`	SaaS API calls by operation/status.
`fairvisor_events_sent_total`	Counter	`status`	Event flush outcomes (`success`, `error`).
`fairvisor_config_info`	Gauge	`version`, `hash`	Info metric for the active config; the series carrying the current `version` and `hash` labels has value `1`.
`fairvisor_build_info`	Gauge	`version`	Info metric for the build/runtime version; value is `1`.
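
Info metrics such as `fairvisor_config_info` are typically queried by counting distinct label values. The sketch below assumes several Fairvisor Edge instances are scraped together and that they are expected to converge on one config `hash`; the alert names, thresholds, and durations are illustrative.

```yaml
# fairvisor-lifecycle-alerts.yml (hypothetical file name)
groups:
  - name: fairvisor-lifecycle
    rules:
      # More than one distinct config hash across instances for 10 minutes.
      - alert: FairvisorConfigDrift
        expr: count(count by (hash) (fairvisor_config_info)) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Fairvisor Edge instances report different config hashes"
      # More than 10% of event flushes failing (threshold is an assumption).
      - alert: FairvisorEventFlushErrors
        expr: |
          sum(rate(fairvisor_events_sent_total{status="error"}[5m]))
          /
          sum(rate(fairvisor_events_sent_total[5m])) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High event flush error ratio"
```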

Additional operational metrics (still exported)

The runtime also emits module-level operational metrics used for troubleshooting, including:

fairvisor_limiter_result_total
fairvisor_route_matches_total
fairvisor_policy_evaluations_total
fairvisor_policy_lookup_miss_total
fairvisor_descriptor_missing_total
fairvisor_retry_after_bucket_total
fairvisor_evaluate_errors_total
fairvisor_global_shadow_decisions_total
fairvisor_kill_switch_override_skips_total

Use these for deep diagnostics and incident analysis.

Example output

fairvisor_decisions_total{action="allow",reason="all_rules_passed",policy="policy-a",route="/api/v1"} 45321
fairvisor_decision_duration_seconds{route="/api/v1",mode="enforce"} 0.000412
fairvisor_ratelimit_remaining{window="request",policy="policy-a",route="/api/v1"} 847
fairvisor_tokens_remaining{window="tpm",policy="policy-llm",route="/v1/chat/completions"} 91234
fairvisor_build_info{version="0.1.0"} 1

Key dashboards

Reject rate by reason

sum by (reason) (rate(fairvisor_decisions_total{action="reject"}[5m]))

Decision latency by route

max by (route) (fairvisor_decision_duration_seconds)

Descriptor miss rate

rate(fairvisor_descriptor_missing_total[5m])

SaaS reachability

fairvisor_saas_reachable

Runtime override active

max_over_time(fairvisor_global_shadow_active[1m]) == 1
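
The dashboard queries above can also back alerting rules. The following is a sketch under stated assumptions: the 5% reject threshold, the `for` durations, the severities, and the alert names are placeholders to tune per deployment, not values recommended by Fairvisor.

```yaml
# fairvisor-dashboard-alerts.yml (hypothetical file name)
groups:
  - name: fairvisor-decisions
    rules:
      # Share of rejected decisions across all routes exceeds 5%.
      - alert: FairvisorHighRejectRatio
        expr: |
          sum(rate(fairvisor_decisions_total{action="reject"}[5m]))
          /
          sum(rate(fairvisor_decisions_total[5m])) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "More than 5% of decisions are rejects"
      # SaaS path reported unreachable for 5 minutes.
      - alert: FairvisorSaasUnreachable
        expr: fairvisor_saas_reachable == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Fairvisor SaaS path unreachable"
      # Global shadow override was active at any point in the last minute.
      - alert: FairvisorGlobalShadowActive
        expr: max_over_time(fairvisor_global_shadow_active[1m]) == 1
        labels:
          severity: info
        annotations:
          summary: "Global shadow override is active"
```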