Loop Detection

Loop detection protects against agentic systems that repeatedly issue identical requests — a common failure mode when an LLM agent retries the same tool call in an infinite loop.

Configuration

Loop detection is defined per-policy inside spec.loop_detection:

{
  "id": "llm-policy",
  "spec": {
    "selector": { "pathExact": "/v1/chat/completions" },
    "loop_detection": {
      "enabled": true,
      "window_seconds": 60,
      "threshold_identical_requests": 5,
      "action": "reject",
      "similarity": "exact"
    },
    "rules": [ ... ]
  }
}

Fields

Field	Type	Required	Default	Description
`enabled`	boolean	yes	—	Must be `true` to activate.
`window_seconds`	integer	yes if enabled	—	Sliding window duration in seconds. Must be a positive integer.
`threshold_identical_requests`	integer	yes if enabled	—	Number of identical requests within the window before action is taken. Must be ≥ 2.
`action`	string	no	`"reject"`	`"reject"`, `"throttle"`, or `"warn"`.
`similarity`	string	no	`"exact"`	Currently only `"exact"` is supported.
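
The constraints in the table can be expressed as a small validator. This is a minimal sketch in Python, not the product's actual validation code; the function name `validate_loop_detection` and the error messages are hypothetical, and it assumes `action` and `similarity` are checked even when `enabled` is `false`.

```python
VALID_ACTIONS = {"reject", "throttle", "warn"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_loop_detection(cfg):
    """Return a normalized copy of cfg with defaults applied, or raise ValueError."""
    if not isinstance(cfg.get("enabled"), bool):
        raise ValueError("enabled is required and must be a boolean")

    out = {
        "enabled": cfg["enabled"],
        "action": cfg.get("action", "reject"),          # default from the table
        "similarity": cfg.get("similarity", "exact"),   # default from the table
    }
    if out["action"] not in VALID_ACTIONS:
        raise ValueError("action must be reject, throttle, or warn")
    if out["similarity"] != "exact":
        raise ValueError("only similarity 'exact' is supported")

    if out["enabled"]:
        window = cfg.get("window_seconds")
        threshold = cfg.get("threshold_identical_requests")
        if not _is_int(window) or window <= 0:
            raise ValueError("window_seconds must be a positive integer")
        if not _is_int(threshold) or threshold < 2:
            raise ValueError("threshold_identical_requests must be >= 2")
        out["window_seconds"] = window
        out["threshold_identical_requests"] = threshold
    return out
```

Passing the `loop_detection` object from the configuration above returns it unchanged, while omitting `action` and `similarity` fills in `"reject"` and `"exact"`.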

How it works

Fingerprinting

For each request, a fingerprint is computed from:

HTTP method (e.g., POST)
Request path (e.g., /v1/chat/completions)
Query parameters — sorted key=value pairs
Body hash — CRC32 of the request body
Descriptor values for the matched policy’s limit_keys — sorted key=value pairs

The components are joined with | and the resulting string is passed through ngx.crc32_short(). An example of the joined string before hashing:

POST|/v1/chat/completions||<body_crc32>|org_id=org-abc|user_id=user-1

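This means two requests are considered identical when all five components match. Because the body hash and the final fingerprint are both 32-bit CRC values, two different requests can occasionally collide and share a counter.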
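
The fingerprint steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual implementation: `zlib.crc32` stands in for `ngx.crc32_short()`, query pairs are assumed to be joined with `&`, and the body CRC32 is assumed to be rendered as a decimal string. The function names are hypothetical.

```python
import zlib


def canonical_string(method, path, query, body, descriptors):
    """Build the pre-hash string from the five fingerprint components."""
    # Assumption: query pairs are joined with "&"; the text only says "sorted".
    query_part = "&".join(f"{k}={v}" for k, v in sorted(query.items()))
    # Assumption: the body CRC32 is rendered as a decimal string.
    body_part = str(zlib.crc32(body))
    # Descriptor pairs are sorted and joined with "|", matching the example above.
    descriptor_part = "|".join(f"{k}={v}" for k, v in sorted(descriptors.items()))
    return "|".join([method, path, query_part, body_part, descriptor_part])


def fingerprint(method, path, query, body, descriptors):
    """Hash the canonical string down to a 32-bit fingerprint."""
    canonical = canonical_string(method, path, query, body, descriptors)
    # zlib.crc32 stands in for ngx.crc32_short() in this sketch.
    return zlib.crc32(canonical.encode("utf-8"))
```

Because query parameters and descriptors are sorted before joining, two requests that differ only in the order of those pairs produce the same fingerprint.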

Sliding window counter

The fingerprint is used as a key in the shared dict with TTL = window_seconds:

loop:<fingerprint>  →  count (incremented atomically, TTL reset on each write)

When the count reaches threshold_identical_requests, the configured action fires.
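
A minimal in-memory sketch of this counter, in Python, may help make the TTL behaviour concrete. The `LoopCounter` class and its `record` method are hypothetical; a plain dictionary stands in for the shared dict, so the sketch is not atomic across workers. It assumes the action fires as soon as the count equals the threshold.

```python
import time


class LoopCounter:
    """In-memory stand-in for the shared dict: key -> (count, expires_at)."""

    def __init__(self, window_seconds, threshold, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.clock = clock  # injectable so the expiry logic can be tested
        self.entries = {}

    def record(self, fingerprint):
        """Count one request; return (count, tripped)."""
        key = f"loop:{fingerprint}"
        now = self.clock()
        count, expires_at = self.entries.get(key, (0, 0.0))
        if now >= expires_at:
            count = 0  # entry expired (or never existed): start a fresh count
        count += 1
        # The TTL is reset on each write, as described above.
        self.entries[key] = (count, now + self.window_seconds)
        # Assumption: the action fires once count reaches the threshold.
        return count, count >= self.threshold
```

Since every write resets the TTL, a fingerprint's counter only expires after `window_seconds` have passed without that fingerprint being seen.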

Actions

Action	Behaviour
`reject`	Returns HTTP 429 with a `Retry-After` header set to `window_seconds`
`throttle`	Delays the request by `count × 100 ms` (escalating delay with each duplicate)
`warn`	Allows the request but sets `X-Fairvisor-Warning: loop_warn`
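
The three actions can be summarised as a single dispatch function. This is a hedged sketch in Python: `apply_action` and the shape of the returned decision dictionary are hypothetical, chosen only to illustrate the table.

```python
def apply_action(action, count, window_seconds):
    """Map a tripped loop check to a response decision."""
    decision = {"allow": True, "status": None, "headers": {}, "delay_ms": 0}
    if action == "reject":
        # Block the request and tell the client when to retry.
        decision["allow"] = False
        decision["status"] = 429
        decision["headers"]["Retry-After"] = str(window_seconds)
    elif action == "throttle":
        # Escalating delay: 100 ms per duplicate counted so far.
        decision["delay_ms"] = count * 100
    elif action == "warn":
        # Let the request through, but flag it.
        decision["headers"]["X-Fairvisor-Warning"] = "loop_warn"
    else:
        raise ValueError(f"unknown action: {action}")
    return decision
```

With `throttle`, the fifth duplicate is delayed by 500 ms and the tenth by a full second, so a runaway loop slows itself down without being cut off.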

Shadow mode

In shadow mode, loop detection uses a separate key namespace (shadow:<fingerprint>), so shadow counters never affect production state.
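
The namespace separation can be illustrated with a short Python sketch. The helper `counter_key` is hypothetical, and a plain dictionary stands in for the shared dict.

```python
def counter_key(fingerprint, shadow=False):
    """Pick the key namespace for a fingerprint's counter."""
    prefix = "shadow" if shadow else "loop"
    return f"{prefix}:{fingerprint}"


# Two shadow hits and one production hit for the same fingerprint.
counters = {}
for shadow in (True, True, False):
    key = counter_key(1234567890, shadow=shadow)
    counters[key] = counters.get(key, 0) + 1
```

After the loop, the shadow key holds a count of 2 and the production key a count of 1; neither write touched the other's entry.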

Example: Agentic loop protection

Suppose an LLM agent calls a tool endpoint with the same request 50 times per minute. The following configuration stops the loop:

{
  "loop_detection": {
    "enabled": true,
    "window_seconds": 30,
    "threshold_identical_requests": 3,
    "action": "reject"
  }
}

After 3 identical requests within 30 seconds, subsequent identical requests are rejected. The Retry-After: 30 header signals the agent to wait for the window to reset.

Combining with token limits

Loop detection and token rate limiting work independently and both can be active on the same policy. Loop detection fires before rule evaluation, so it can block a loop before any token budget is consumed.
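
The ordering matters because a rejected loop never reaches the token rules. The Python sketch below illustrates this with a hypothetical `handle_request` function; it assumes the loop action is `reject` and uses a deliberately simplified token rule (a single remaining budget), which is not how real rule evaluation is configured.

```python
def handle_request(loop_tripped, token_cost, budget):
    """Return (status, remaining_budget) for one request."""
    # Loop detection runs first; a rejected request never reaches the rules.
    if loop_tripped:
        return 429, budget
    # Hypothetical token rule: reject when the budget cannot cover the request.
    if token_cost > budget:
        return 429, budget
    return 200, budget - token_cost
```

A looping request that would have cost 500 tokens leaves a budget of 1000 untouched, whereas the same request outside a loop reduces it to 500.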

Tuning recommendations

Scenario	`window_seconds`	`threshold_identical_requests`	`action`
Interactive user (acceptable to retry once)	5	3	`throttle`
Agentic tool call (no retry expected)	60	2	`reject`
Webhook receiver (allow burst but not loop)	10	5	`warn`
Canary rollout (observe without blocking)	60	3	`warn`