Metrics

Fairvisor Edge exposes Prometheus metrics at GET /metrics. All metrics are prefixed with fairvisor_.

Scrape configuration

# prometheus.yml
scrape_configs:
  - job_name: fairvisor-edge
    static_configs:
      - targets: ['fairvisor-edge:8080']
    metrics_path: /metrics
    scrape_interval: 15s

Canonical metrics set

Decision and latency

| Metric | Type | Labels | Description |
|---|---|---|---|
| fairvisor_decisions_total | Counter | action, reason, policy, route | Total decisions. action is allow, reject, or throttle. |
| fairvisor_decision_duration_seconds | Gauge | route, mode | Decision evaluation latency in seconds. |
| fairvisor_handler_invocations_total | Counter | (none) | Total calls to the access handler. |
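
Because fairvisor_decisions_total carries the action label, the share of rejected decisions per route can be precomputed with a Prometheus recording rule. The following is a minimal sketch, not a rule shipped with Fairvisor Edge: the file name, group name, and recorded series name are assumptions chosen for illustration.

```yaml
# fairvisor-rules.yml (hypothetical file name)
groups:
  - name: fairvisor-edge-decisions   # assumed group name
    rules:
      # Fraction of decisions per route that were rejected over the last 5 minutes.
      # The recorded series name is an assumption following the level:metric:operations convention.
      - record: route:fairvisor_decisions_reject:ratio_rate5m
        expr: |
          sum by (route) (rate(fairvisor_decisions_total{action="reject"}[5m]))
          /
          sum by (route) (rate(fairvisor_decisions_total[5m]))
```

Routes with no decisions in the window yield no usable ratio, so dashboards built on this series may show gaps for idle routes.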

Rate limiting and token usage

| Metric | Type | Labels | Description |
|---|---|---|---|
| fairvisor_ratelimit_remaining | Gauge | window, policy, route | Remaining rate-limit budget for emitted decisions. |
| fairvisor_tokens_consumed_total | Counter | type, policy, route | Token usage counters (reserved, actual). |
| fairvisor_tokens_remaining | Gauge | window, policy, route | Remaining token budget by window (tpm, tpd). |
| fairvisor_token_estimation_accuracy_ratio | Gauge | route, estimator | Actual/estimated token ratio after reconciliation. |
| fairvisor_token_reservation_unused_total | Counter | route | Refunded over-reserved tokens. |
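
The token gauges lend themselves to simple alerting rules. The sketch below assumes a per-minute budget floor of 1000 tokens and a tolerated estimation drift of 25%; both thresholds, the alert names, the `for` durations, and the severity labels are illustrative assumptions to be tuned per deployment.

```yaml
groups:
  - name: fairvisor-edge-tokens   # assumed group name
    rules:
      # Fires when the per-minute token budget stays low for a policy/route pair.
      # The 1000-token floor is an assumed threshold.
      - alert: FairvisorTokenBudgetLow
        expr: fairvisor_tokens_remaining{window="tpm"} < 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low tpm token budget for {{ $labels.policy }} on {{ $labels.route }}"

      # Fires when the actual/estimated token ratio drifts more than 25% from 1.
      # The 0.25 tolerance is an assumed threshold.
      - alert: FairvisorTokenEstimationDrift
        expr: abs(fairvisor_token_estimation_accuracy_ratio - 1) > 0.25
        for: 15m
        labels:
          severity: info
        annotations:
          summary: "Token estimator {{ $labels.estimator }} is drifting on {{ $labels.route }}"
```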

Protection and control-plane state

| Metric | Type | Labels | Description |
|---|---|---|---|
| fairvisor_loop_detected_total | Counter | route | Loop-detection trigger count. |
| fairvisor_circuit_state | Gauge | target | Circuit state (0 closed, 1 open, 0.5 half-open where emitted). |
| fairvisor_kill_switch_active | Gauge | scope | Kill-switch active flag for decision flow. |
| fairvisor_shadow_mode_active | Gauge | scope | Shadow-mode active flag for decision flow. |
| fairvisor_global_shadow_active | Gauge | (none) | Runtime top-level global shadow override active flag (0/1). |
| fairvisor_kill_switch_override_active | Gauge | (none) | Runtime top-level kill-switch override active flag (0/1). |
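
The state gauges can drive alerts directly. The sketch below uses the documented circuit encoding (1 means open) and assumes that fairvisor_kill_switch_active reports a value greater than 0 when active, which this page does not state explicitly. Alert names, durations, and severities are illustrative assumptions.

```yaml
groups:
  - name: fairvisor-edge-protection   # assumed group name
    rules:
      # Circuit has been fully open (value 1) for at least 2 minutes.
      - alert: FairvisorCircuitOpen
        expr: fairvisor_circuit_state == 1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Circuit open for target {{ $labels.target }}"

      # Kill switch is active for some scope.
      # Assumption: any value above 0 means active.
      - alert: FairvisorKillSwitchActive
        expr: fairvisor_kill_switch_active > 0
        labels:
          severity: critical
        annotations:
          summary: "Kill switch active for scope {{ $labels.scope }}"
```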

SaaS and lifecycle

| Metric | Type | Labels | Description |
|---|---|---|---|
| fairvisor_saas_reachable | Gauge | (none) | 1 if SaaS path is reachable, 0 otherwise. |
| fairvisor_saas_calls_total | Counter | operation, status | SaaS API calls by operation/status. |
| fairvisor_events_sent_total | Counter | status | Event flush outcomes (success, error). |
| fairvisor_config_info | Gauge | version, hash | Active config info metric (1 for current labels). |
| fairvisor_build_info | Gauge | version | Build/runtime version info metric (1). |
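
SaaS connectivity problems show up both in the reachability gauge and in event flush outcomes. The following sketch alerts on each, using the documented status value error. The alert names, `for` durations, and severity labels are assumptions rather than recommended defaults.

```yaml
groups:
  - name: fairvisor-edge-saas   # assumed group name
    rules:
      # SaaS path has been unreachable for 5 minutes.
      - alert: FairvisorSaasUnreachable
        expr: fairvisor_saas_reachable == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Fairvisor Edge cannot reach the SaaS path"

      # Event flushes have been failing continuously for 10 minutes.
      - alert: FairvisorEventFlushErrors
        expr: sum(rate(fairvisor_events_sent_total{status="error"}[5m])) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Fairvisor Edge event flushes are failing"
```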

Additional operational metrics (still exported)

The runtime also emits module-level operational metrics used for troubleshooting, including:

  • fairvisor_limiter_result_total
  • fairvisor_route_matches_total
  • fairvisor_policy_evaluations_total
  • fairvisor_policy_lookup_miss_total
  • fairvisor_descriptor_missing_total
  • fairvisor_retry_after_bucket_total
  • fairvisor_evaluate_errors_total
  • fairvisor_global_shadow_decisions_total
  • fairvisor_kill_switch_override_skips_total

Use these for deep diagnostics and incident analysis.
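
As one example of using these during an incident, evaluation errors can be tracked as an aggregate rate. This page does not document the labels on the operational metrics, so the sketch below sums across all of them. The group and recorded series names are assumptions.

```yaml
groups:
  - name: fairvisor-edge-diagnostics   # assumed group name
    rules:
      # Overall evaluation error rate over 5 minutes, aggregated across all labels
      # because the label set of this metric is not documented here.
      - record: fairvisor:evaluate_errors:rate5m
        expr: sum(rate(fairvisor_evaluate_errors_total[5m]))

      # Descriptor misses per second over 5 minutes, aggregated the same way.
      - record: fairvisor:descriptor_missing:rate5m
        expr: sum(rate(fairvisor_descriptor_missing_total[5m]))
```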

Example output

fairvisor_decisions_total{action="allow",reason="all_rules_passed",policy="policy-a",route="/api/v1"} 45321
fairvisor_decision_duration_seconds{route="/api/v1",mode="enforce"} 0.000412
fairvisor_ratelimit_remaining{window="request",policy="policy-a",route="/api/v1"} 847
fairvisor_tokens_remaining{window="tpm",policy="policy-llm",route="/v1/chat/completions"} 91234
fairvisor_build_info{version="0.1.0"} 1

Key dashboards

Reject rate by reason

sum by (reason) (rate(fairvisor_decisions_total{action="reject"}[5m]))

Decision latency by route

max by (route) (fairvisor_decision_duration_seconds)

Descriptor miss rate

rate(fairvisor_descriptor_missing_total[5m])

SaaS reachability

fairvisor_saas_reachable

Runtime override active

max_over_time(fairvisor_global_shadow_active[1m]) == 1
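
The runtime override query can also be turned into an alert so that a forgotten override does not go unnoticed. The sketch below reuses the expression above and adds the analogous check for fairvisor_kill_switch_override_active. Alert names, the 30-minute hold time, and severities are assumptions.

```yaml
groups:
  - name: fairvisor-edge-overrides   # assumed group name
    rules:
      # Global shadow override has been active for 30 minutes (assumed hold time).
      - alert: FairvisorGlobalShadowOverrideActive
        expr: max_over_time(fairvisor_global_shadow_active[1m]) == 1
        for: 30m
        labels:
          severity: info
        annotations:
          summary: "Global shadow override has been active for 30 minutes"

      # Kill-switch override has been active for 30 minutes (assumed hold time).
      - alert: FairvisorKillSwitchOverrideActive
        expr: max_over_time(fairvisor_kill_switch_override_active[1m]) == 1
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Kill-switch override has been active for 30 minutes"
```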