Loop Detection
Loop detection protects against agentic systems that repeatedly issue identical requests — a common failure mode when an LLM agent retries the same tool call in an infinite loop.
Configuration
Loop detection is defined per-policy inside spec.loop_detection:
{
  "id": "llm-policy",
  "spec": {
    "selector": { "pathExact": "/v1/chat/completions" },
    "loop_detection": {
      "enabled": true,
      "window_seconds": 60,
      "threshold_identical_requests": 5,
      "action": "reject",
      "similarity": "exact"
    },
    "rules": [ ... ]
  }
}
Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | boolean | yes | none | Must be true to activate. |
| window_seconds | integer | yes if enabled | none | Sliding window duration in seconds. Must be a positive integer. |
| threshold_identical_requests | integer | yes if enabled | none | Number of identical requests within the window before action is taken. Must be ≥ 2. |
| action | string | no | "reject" | "reject", "throttle", or "warn". |
| similarity | string | no | "exact" | Currently only "exact" is supported. |
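The constraints in the table can be checked before a policy is loaded. The following is a minimal sketch in Python, not the project's own validator: the function name validate_loop_detection and its error messages are hypothetical, and only the rules stated in the table are enforced.

```python
from typing import Any

ALLOWED_ACTIONS = {"reject", "throttle", "warn"}
ALLOWED_SIMILARITY = {"exact"}


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_loop_detection(cfg: dict) -> dict:
    """Return a copy of cfg with defaults applied, or raise ValueError.

    Hypothetical helper that mirrors the Fields table.
    """
    if not isinstance(cfg.get("enabled"), bool):
        raise ValueError("enabled is required and must be a boolean")

    out = dict(cfg)
    out.setdefault("action", "reject")      # default from the table
    out.setdefault("similarity", "exact")   # default from the table

    if out["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {out['action']!r}")
    if out["similarity"] not in ALLOWED_SIMILARITY:
        raise ValueError(f"unsupported similarity: {out['similarity']!r}")

    # window_seconds and threshold are only required when enabled is true.
    if out["enabled"]:
        window = out.get("window_seconds")
        threshold = out.get("threshold_identical_requests")
        if not _is_int(window) or window <= 0:
            raise ValueError("window_seconds must be a positive integer")
        if not _is_int(threshold) or threshold < 2:
            raise ValueError("threshold_identical_requests must be an integer >= 2")
    return out
```

Calling it with only the three required fields returns a configuration whose action is "reject" and whose similarity is "exact".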
How it works
Fingerprinting
For each request, a fingerprint is computed from:
- HTTP method (e.g., POST)
- Request path (e.g., /v1/chat/completions)
- Query parameters: sorted key=value pairs
- Body hash: CRC32 of the request body
- Descriptor values for the matched policy’s limit_keys: sorted key=value pairs
The components are joined with | and passed through ngx.crc32_short():
POST|/v1/chat/completions||<body_crc32>|org_id=org-abc|user_id=user-1
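This means two requests are treated as identical when all five components match. Because the fingerprint is a 32-bit CRC, two different requests can occasionally collide and share a counter.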
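The fingerprint construction can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual implementation: zlib.crc32 stands in for ngx.crc32_short(), the body CRC is rendered as a decimal string, and query pairs are joined with "&" (the source only shows descriptor pairs joined with "|"). The function names are hypothetical.

```python
import zlib


def sorted_pairs(pairs: dict, sep: str) -> str:
    """Render a mapping as key=value pairs, sorted by key and joined with sep."""
    return sep.join(f"{key}={value}" for key, value in sorted(pairs.items()))


def canonical_string(method: str, path: str, query: dict,
                     body: bytes, descriptors: dict) -> str:
    """Join the five fingerprint components with '|'."""
    body_crc32 = zlib.crc32(body)  # unsigned 32-bit integer in Python 3
    return "|".join([
        method,
        path,
        sorted_pairs(query, "&"),        # assumption: '&' between query pairs
        str(body_crc32),                 # assumption: decimal rendering
        sorted_pairs(descriptors, "|"),  # '|' as in the example line above
    ])


def fingerprint(method: str, path: str, query: dict,
                body: bytes, descriptors: dict) -> int:
    """CRC32 of the canonical string (stand-in for ngx.crc32_short)."""
    canonical = canonical_string(method, path, query, body, descriptors)
    return zlib.crc32(canonical.encode("utf-8"))
```

Sorting the pairs makes the fingerprint independent of the order in which query parameters or descriptors arrive, so reordered but otherwise identical requests map to the same counter.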
Sliding window counter
The fingerprint is used as a key in the shared dict with TTL = window_seconds:
loop:<fingerprint> → count (incremented atomically, TTL reset on each write)
When the count reaches threshold_identical_requests, the configured action fires.
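The counter behaviour described above can be modelled with a small in-memory class. This is a sketch only: a plain Python dict stands in for the shared dict (so it is not atomic across workers), the clock is passed in explicitly to keep the example deterministic, and the class name LoopCounter is hypothetical.

```python
class LoopCounter:
    """In-memory model of the loop:<fingerprint> counter with a TTL."""

    def __init__(self, window_seconds: int, threshold: int) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        # key -> (count, expires_at)
        self._entries = {}

    def hit(self, fingerprint: int, now: float):
        """Record one request; return (count, action_fires)."""
        key = f"loop:{fingerprint}"
        count, expires_at = self._entries.get(key, (0, 0.0))
        if now >= expires_at:
            count = 0  # the entry has expired, so start a fresh count
        count += 1
        # TTL is reset on each write, as described above.
        self._entries[key] = (count, now + self.window_seconds)
        return count, count >= self.threshold


counter = LoopCounter(window_seconds=60, threshold=3)
results = [counter.hit(12345, now=t) for t in (0.0, 1.0, 2.0, 100.0)]
# results == [(1, False), (2, False), (3, True), (1, False)]
```

The last hit arrives after the entry has expired (the TTL was last reset at t = 2.0, so it lapses at t = 62.0), which is why the count starts again at 1.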
Actions
| Action | Behaviour |
|---|---|
| reject | Returns HTTP 429 with Retry-After = window_seconds |
| throttle | Delays the request by count × 100 ms (escalating delay with each duplicate) |
| warn | Allows the request but sets X-Fairvisor-Warning: loop_warn |
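The three actions can be expressed as a single decision function. The sketch below is illustrative: the Decision type and apply_action name are hypothetical, and it assumes a throttled request is still forwarded once the delay has elapsed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Decision:
    allow: bool                      # whether the request is forwarded
    status: Optional[int] = None     # HTTP status when the request is blocked
    delay_ms: int = 0                # delay applied before forwarding
    headers: Dict[str, str] = field(default_factory=dict)


def apply_action(action: str, count: int, window_seconds: int) -> Decision:
    """Map a triggered loop-detection action to its effect on the request."""
    if action == "reject":
        return Decision(allow=False, status=429,
                        headers={"Retry-After": str(window_seconds)})
    if action == "throttle":
        # Delay grows by 100 ms with each duplicate counted so far.
        return Decision(allow=True, delay_ms=count * 100)
    if action == "warn":
        return Decision(allow=True,
                        headers={"X-Fairvisor-Warning": "loop_warn"})
    raise ValueError(f"unknown action: {action!r}")
```

For example, apply_action("throttle", count=4, window_seconds=60) yields a 400 ms delay, and the fifth duplicate would be delayed by 500 ms.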
Shadow mode
In shadow mode, loop detection uses a separate key namespace (shadow:<fingerprint>), so shadow counters never affect production state.
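The namespace separation can be illustrated with a short sketch. The key prefixes come from the text above; the helper names and the use of a Python Counter in place of the shared dict are assumptions made for illustration.

```python
from collections import Counter


def counter_key(fingerprint: int, shadow: bool) -> str:
    """Build the counter key in the shadow or the production namespace."""
    namespace = "shadow" if shadow else "loop"
    return f"{namespace}:{fingerprint}"


def record(counters: Counter, fingerprint: int, shadow: bool) -> int:
    """Increment the counter for this fingerprint and return the new count."""
    key = counter_key(fingerprint, shadow)
    counters[key] += 1
    return counters[key]
```

Because the two keys differ, any number of shadow hits for a fingerprint leaves the production counter for the same fingerprint untouched.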
Example: Agentic loop protection
Consider an LLM agent that sends the same request to a tool endpoint 50 times per minute:
{
  "loop_detection": {
    "enabled": true,
    "window_seconds": 30,
    "threshold_identical_requests": 3,
    "action": "reject"
  }
}
The third identical request within 30 seconds reaches the threshold, so it and every identical request that follows are rejected. The Retry-After: 30 header signals the agent to wait for the window to reset.
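To make the timeline concrete, the sketch below simulates this configuration for a single fingerprint. It follows the reading in "Sliding window counter", where the action fires once the count reaches the threshold, and it assumes that rejected requests still increment the counter and reset its TTL. Times are in integer milliseconds to avoid floating-point rounding; the function name simulate is hypothetical.

```python
def simulate(times_ms, window_ms, threshold):
    """Return 'allow' or 'reject' for each identical request at times_ms."""
    count = 0
    expires_at = 0
    decisions = []
    for now in times_ms:
        if now >= expires_at:
            count = 0  # counter expired, start over
        count += 1
        expires_at = now + window_ms  # TTL reset on each write
        decisions.append("reject" if count >= threshold else "allow")
    return decisions


# 50 identical requests per minute: one every 1200 ms, with no backoff.
steady = simulate([i * 1200 for i in range(50)], window_ms=30_000, threshold=3)

# An agent that pauses for more than 30 s after its first rejection.
backoff = simulate([0, 1200, 2400, 33_600], window_ms=30_000, threshold=3)
```

Under these assumptions, steady contains 2 allowed requests followed by 48 rejections, because each retry keeps the counter alive. In backoff, the fourth request is allowed again because the counter has expired by the time it arrives.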
Combining with token limits
Loop detection and token rate limiting work independently and both can be active on the same policy. Loop detection fires before rule evaluation, so it can block a loop before any token budget is consumed.
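The ordering can be sketched as follows. This is a simplified illustration, not the actual evaluation pipeline: the function name, the decision strings, and the single integer token budget are all hypothetical.

```python
def handle(request_tokens: int, loop_detected: bool, budget: int):
    """Return (decision, remaining_budget) for one request."""
    # Loop detection runs first, so a rejected loop consumes no tokens.
    if loop_detected:
        return "rejected_by_loop_detection", budget
    # Token rules are evaluated only for requests that pass loop detection.
    if request_tokens > budget:
        return "rejected_by_token_limit", budget
    return "allowed", budget - request_tokens
```

With a budget of 1000 tokens, a looping request of 200 tokens leaves the budget at 1000, while the same request outside a loop reduces it to 800.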
Tuning recommendations
| Scenario | window_seconds | threshold | action |
|---|---|---|---|
| Interactive user (acceptable to retry once) | 5 | 3 | throttle |
| Agentic tool call (no retry expected) | 60 | 2 | reject |
| Webhook receiver (allow burst but not loop) | 10 | 5 | warn |
| Canary rollout (observe without blocking) | 60 | 3 | warn |