Loop Detection

Loop detection protects against agentic systems that repeatedly issue identical requests — a common failure mode when an LLM agent retries the same tool call in an infinite loop.

Configuration

Loop detection is defined per-policy inside spec.loop_detection:

{
  "id": "llm-policy",
  "spec": {
    "selector": { "pathExact": "/v1/chat/completions" },
    "loop_detection": {
      "enabled": true,
      "window_seconds": 60,
      "threshold_identical_requests": 5,
      "action": "reject",
      "similarity": "exact"
    },
    "rules": [ ... ]
  }
}

Fields

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | boolean | yes | - | Must be true to activate. |
| window_seconds | integer | yes if enabled | - | Sliding window duration in seconds. Must be a positive integer. |
| threshold_identical_requests | integer | yes if enabled | - | Number of identical requests within the window before action is taken. Must be ≥ 2. |
| action | string | no | "reject" | "reject", "throttle", or "warn". |
| similarity | string | no | "exact" | Currently only "exact" is supported. |
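
The field constraints above can be checked mechanically. The following is a minimal Python sketch of such a check, not the gateway's actual validator. The function name `validate_loop_detection` and the error messages are hypothetical, and skipping the remaining checks when `enabled` is false is an assumption.

```python
VALID_ACTIONS = {"reject", "throttle", "warn"}
VALID_SIMILARITY = {"exact"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_loop_detection(cfg):
    """Return a list of error strings; an empty list means the block is valid."""
    errors = []
    enabled = cfg.get("enabled")
    if not isinstance(enabled, bool):
        errors.append("enabled is required and must be a boolean")
        return errors
    if not enabled:
        # Assumption: the other fields are only checked when enabled is true.
        return errors

    window = cfg.get("window_seconds")
    if not _is_int(window) or window <= 0:
        errors.append("window_seconds must be a positive integer")

    threshold = cfg.get("threshold_identical_requests")
    if not _is_int(threshold) or threshold < 2:
        errors.append("threshold_identical_requests must be an integer >= 2")

    # Optional fields fall back to their documented defaults.
    if cfg.get("action", "reject") not in VALID_ACTIONS:
        errors.append("action must be one of: reject, throttle, warn")
    if cfg.get("similarity", "exact") not in VALID_SIMILARITY:
        errors.append('similarity must be "exact"')

    return errors
```

Returning a list of errors rather than raising on the first one lets a caller report every problem in a policy at once.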

How it works

Fingerprinting

For each request, a fingerprint is computed from:

  1. HTTP method (e.g., POST)
  2. Request path (e.g., /v1/chat/completions)
  3. Query parameters — sorted key=value pairs
  4. Body hash — CRC32 of the request body
  5. Descriptor values for the matched policy’s limit_keys — sorted key=value pairs

The components are joined with | and passed through ngx.crc32_short(). For a request with no query parameters, the joined string before hashing looks like this:

POST|/v1/chat/completions||<body_crc32>|org_id=org-abc|user_id=user-1

This means two requests are treated as identical when all five components match. Because the fingerprint is a 32-bit CRC, distinct requests can collide in rare cases.
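
The fingerprint construction described above can be sketched in Python. This is an illustration under stated assumptions, not the actual implementation: `zlib.crc32` stands in for `ngx.crc32_short()`, the helper names `sorted_pairs` and `fingerprint` are hypothetical, and both the decimal rendering of the body hash and the `&` separator between query pairs are guesses, since the text does not specify them.

```python
import zlib


def sorted_pairs(mapping):
    """Render a mapping as key=value strings, sorted by key."""
    return [f"{k}={v}" for k, v in sorted(mapping.items())]


def fingerprint(method, path, query, body, descriptors):
    """Return (joined_string, crc32_of_joined_string) for one request."""
    # Assumption: the body hash is rendered as a decimal string.
    body_hash = str(zlib.crc32(body))
    # Assumption: query pairs are joined with "&" inside their own slot.
    parts = [method, path, "&".join(sorted_pairs(query)), body_hash]
    # Descriptor pairs are appended as separate |-delimited components.
    parts.extend(sorted_pairs(descriptors))
    joined = "|".join(parts)
    # zlib.crc32 is used here in place of ngx.crc32_short().
    return joined, zlib.crc32(joined.encode("utf-8"))


joined, fp = fingerprint(
    "POST",
    "/v1/chat/completions",
    {},
    b'{"model": "example"}',
    {"user_id": "user-1", "org_id": "org-abc"},
)
print(joined)
print(fp)
```

Sorting both the query parameters and the descriptor values makes the fingerprint independent of the order in which they arrive.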

Sliding window counter

The fingerprint is used as a key in the shared dict with TTL = window_seconds:

loop:<fingerprint>  →  count (incremented atomically, TTL reset on each write)

When the count reaches threshold_identical_requests, the configured action fires.
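
The counter behaviour can be modelled with a small in-memory stand-in for the shared dict. This is a sketch only: the class name `LoopCounter` and its `hit` method are hypothetical, the dictionary update is not atomic the way a shared dict increment is, and counting every request (including ones that trip the threshold) is an assumption.

```python
import time


class LoopCounter:
    """In-memory model of the shared dict: key -> (count, expires_at)."""

    def __init__(self, window_seconds, threshold, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.clock = clock  # injectable so the window can be tested
        self.entries = {}

    def hit(self, fingerprint):
        """Record one request; return (count, tripped)."""
        now = self.clock()
        key = f"loop:{fingerprint}"
        count, expires_at = self.entries.get(key, (0, 0.0))
        if now >= expires_at:
            count = 0  # entry expired, so a fresh window starts
        count += 1
        # The TTL is reset on every write.
        self.entries[key] = (count, now + self.window_seconds)
        return count, count >= self.threshold


counter = LoopCounter(window_seconds=60, threshold=5)
for _ in range(5):
    count, tripped = counter.hit(1234567890)
print(count, tripped)
```

Because each write resets the TTL, an entry in this model only expires after `window_seconds` pass with no identical request.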

Actions

| Action | Behaviour |
|---|---|
| reject | Returns HTTP 429 with Retry-After = window_seconds |
| throttle | Delays the request by count × 100 ms (escalating delay with each duplicate) |
| warn | Allows the request but sets X-Fairvisor-Warning: loop_warn |
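
The three actions can be summarised as a small dispatch function. This is a hedged sketch: `apply_action` and the shape of the returned dictionary are hypothetical, and a `status` of `None` is used here to mean that the request is forwarded rather than answered directly.

```python
def apply_action(action, count, window_seconds):
    """Describe what happens to a request once the loop check has tripped."""
    if action == "reject":
        # HTTP 429 with Retry-After equal to the window length.
        return {
            "status": 429,
            "headers": {"Retry-After": str(window_seconds)},
            "delay_ms": 0,
        }
    if action == "throttle":
        # The delay grows by 100 ms with each duplicate.
        return {"status": None, "headers": {}, "delay_ms": count * 100}
    if action == "warn":
        # The request is allowed, with a warning header attached.
        return {
            "status": None,
            "headers": {"X-Fairvisor-Warning": "loop_warn"},
            "delay_ms": 0,
        }
    raise ValueError(f"unknown action: {action}")


print(apply_action("throttle", count=7, window_seconds=60)["delay_ms"])
```

With `throttle`, the seventh duplicate in this sketch is delayed by 7 × 100 ms = 700 ms.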

Shadow mode

In shadow mode, loop detection uses a separate key namespace (shadow:<fingerprint>), so shadow counters never affect production state.
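
The namespace separation amounts to choosing a different key prefix. A minimal sketch, assuming a hypothetical helper named `counter_key`:

```python
def counter_key(fingerprint, shadow=False):
    """Build the shared-dict key for a fingerprint in the given mode."""
    prefix = "shadow" if shadow else "loop"
    return f"{prefix}:{fingerprint}"


# The same fingerprint maps to two distinct keys, so the counters never mix.
print(counter_key(1234567890))
print(counter_key(1234567890, shadow=True))
```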

Example: Agentic loop protection

Consider an LLM agent that calls a tool endpoint 50 times per minute. The following configuration stops it early:

{
  "loop_detection": {
    "enabled": true,
    "window_seconds": 30,
    "threshold_identical_requests": 3,
    "action": "reject"
  }
}

Once the count of identical requests reaches 3 within the 30-second window, identical requests are rejected with HTTP 429. The Retry-After: 30 header tells the agent how long to wait for the window to reset.

Combining with token limits

Loop detection and token rate limiting work independently and both can be active on the same policy. Loop detection fires before rule evaluation, so it can block a loop before any token budget is consumed.
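
The ordering can be illustrated with a short sketch. The function `handle_request` and the `evaluate_rules` callback are hypothetical stand-ins for the gateway's internals; the point is only that a rejected loop returns before rule evaluation runs.

```python
def handle_request(loop_tripped, action, evaluate_rules):
    """Run the loop check first; evaluate rules only if not rejected."""
    if loop_tripped and action == "reject":
        # Rule evaluation is skipped, so no token budget is consumed.
        return "rejected_by_loop_detection"
    return evaluate_rules()


def example_rules():
    # Placeholder for token rate limiting and other rule evaluation.
    return "allowed"


print(handle_request(True, "reject", example_rules))
print(handle_request(False, "reject", example_rules))
```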

Tuning recommendations

| Scenario | window_seconds | threshold_identical_requests | action |
|---|---|---|---|
| Interactive user (acceptable to retry once) | 5 | 3 | throttle |
| Agentic tool call (no retry expected) | 60 | 2 | reject |
| Webhook receiver (allow burst but not loop) | 10 | 5 | warn |
| Canary rollout (observe without blocking) | 60 | 3 | warn |