# Loop Detector
The loop detector blocks (or slows) repeated identical requests inside a short time window. Each request is fingerprinted, and duplicates are counted against a threshold. It is intended to stop agent/tool loops in which a client keeps sending the same call without making progress.
See Loop Detection configuration for schema. This page focuses on algorithm behavior.
## How it works
Detection is based on an exact fingerprint of request identity. Fingerprint inputs:
- HTTP method
- request path
- sorted query parameters
- body hash (if available)
- sorted descriptor values used by policy evaluation
The runtime builds a canonical string and hashes it with `ngx.crc32_short(...)`:

```text
fingerprint = crc32(
  method + "|" + path + "|" +
  sorted(query k=v) + "|" +
  body_hash + "|" +
  sorted(descriptor k=v)
)
```
Implementation details:
- key-value segments are sorted for deterministic output
- module-level scratch buffers are reused to reduce allocations
- `similarity` is currently only `exact`
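The canonicalization above can be sketched in Python. This is illustrative only: the runtime hashes with `ngx.crc32_short` over reused scratch buffers, while here `zlib.crc32` stands in and the `fingerprint` helper is hypothetical.

```python
import zlib

def fingerprint(method, path, query, body_hash, descriptors):
    """Build the canonical string and CRC32 it.

    query and descriptors are dicts; their segments are sorted so that
    parameter order never changes the fingerprint.
    """
    parts = [
        method,
        path,
        "&".join(f"{k}={v}" for k, v in sorted(query.items())),
        body_hash or "",  # contribution is empty when no body hash is available
        "&".join(f"{k}={v}" for k, v in sorted(descriptors.items())),
    ]
    return zlib.crc32("|".join(parts).encode())

# Identical requests with reordered query params collapse to one fingerprint.
a = fingerprint("POST", "/v1/tools/search", {"q": "x", "page": "1"}, "abc", {"org": "acme"})
b = fingerprint("POST", "/v1/tools/search", {"page": "1", "q": "x"}, "abc", {"org": "acme"})
assert a == b
```

Sorting is what makes the fingerprint deterministic: without it, `?q=x&page=1` and `?page=1&q=x` would count as two different requests.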
On each request, the counter for the fingerprint is atomically incremented with a TTL of window_seconds:
```lua
dict:incr(loop_key, 1, 0, window_seconds)
```
This behaves like a sliding window because TTL is refreshed on every increment. Repeated identical traffic keeps the key alive.
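Outside of OpenResty, the counter semantics described above (TTL refreshed on every increment, count reset after `window_seconds` of silence) can be simulated with a small in-memory sketch. `LoopCounter` and its explicit `now` parameter are illustrative, not part of the runtime:

```python
import time

class LoopCounter:
    """In-memory stand-in for the shared-dict counter with refresh-on-increment TTL."""

    def __init__(self):
        self.store = {}  # key -> (count, expires_at)

    def incr(self, key, window_seconds, now=None):
        now = time.monotonic() if now is None else now
        count, expires_at = self.store.get(key, (0, 0.0))
        if now >= expires_at:  # key expired: start over from zero
            count = 0
        count += 1
        self.store[key] = (count, now + window_seconds)  # TTL refreshed every time
        return count

c = LoopCounter()
assert c.incr("ld:r:abc", 60, now=0) == 1
assert c.incr("ld:r:abc", 60, now=30) == 2    # within window: counter grows
assert c.incr("ld:r:abc", 60, now=200) == 1   # idle > window: key expired, reset
```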
## Decision behavior
If `count < threshold_identical_requests`:

- no detection
- request proceeds normally

If `count >= threshold_identical_requests`, the action is chosen by config:

- `reject`: immediate reject with reason `loop_detected` and `Retry-After = window_seconds`
- `throttle`: allow with an increasing delay of `count * 100ms`
- `warn`: allow, flagged as a loop event
Runtime note:
- any throttle delay above 30s is capped to 30s by the decision handler
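Putting the threshold and the three actions together, the decision step might look like the following sketch (the `decide` function and the returned dict shape are illustrative; the 30s cap follows the runtime note above):

```python
def decide(count, threshold, action, window_seconds):
    """Map a fingerprint count to a decision, per the rules above (illustrative)."""
    if count < threshold:
        return {"detected": False}
    if action == "reject":
        return {"detected": True, "reject": True,
                "reason": "loop_detected", "retry_after": window_seconds}
    if action == "throttle":
        delay_ms = min(count * 100, 30_000)  # delays are capped at 30s
        return {"detected": True, "reject": False, "delay_ms": delay_ms}
    return {"detected": True, "reject": False, "warn": True}  # action == "warn"

assert decide(3, 4, "reject", 60) == {"detected": False}
assert decide(4, 4, "reject", 60)["retry_after"] == 60
assert decide(5, 4, "throttle", 60)["delay_ms"] == 500
assert decide(1000, 4, "throttle", 60)["delay_ms"] == 30_000
```

Note that throttle delay grows with the count, so a runaway loop degrades progressively rather than being cut off at the threshold.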
## State
State is stored in `ngx.shared.fairvisor_counters`:

```text
Key:   ld:{rule_name}:{fingerprint}
Value: integer counter
TTL:   window_seconds (refreshed on every increment)
```
A key expires automatically when no matching request arrives within `window_seconds`.
## Configuration
```json
"loop_detection": {
  "enabled": true,
  "window_seconds": 60,
  "threshold_identical_requests": 4,
  "action": "reject",
  "similarity": "exact"
}
```
| Field | Required when enabled | Default | Validation |
|---|---|---|---|
| `enabled` | yes | - | boolean |
| `window_seconds` | yes | - | positive integer |
| `threshold_identical_requests` | yes | - | positive integer, >= 2 |
| `action` | no | `reject` | `reject`, `throttle`, `warn` |
| `similarity` | no | `exact` | currently must be `exact` |
## Response headers
On throttle (request allowed with delay):

```text
X-Fairvisor-Loop-Delay: <delay_ms>
```

On rejection:

```text
HTTP 429 Too Many Requests
Retry-After: <window_seconds>
X-Fairvisor-Reason: loop_detected
```
## Failure behavior
If the shared-dict increment fails (memory pressure, dict issue):
- detector returns non-detected
- request is allowed (fail-open)
- error is logged
This prevents accidental global blocking when storage is unhealthy.
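A minimal sketch of that fail-open wrapper (names are hypothetical; the real code wraps the shared-dict `incr` call in Lua):

```python
import logging

def safe_count(dict_incr, key, window_seconds):
    """Fail-open wrapper: storage errors are logged and treated as 'no loop'."""
    try:
        return dict_incr(key, 1, 0, window_seconds)
    except Exception as exc:  # e.g. shared-dict memory pressure
        logging.error("loop detector incr failed: %s", exc)
        return None  # caller treats None as not detected

def boom(*args):
    raise RuntimeError("no memory")

assert safe_count(boom, "ld:r:1", 60) is None  # failure: allowed through
```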
## Shadow mode
When policy runs in shadow mode, loop state is namespaced separately:
```text
shadow:loop:{fingerprint}
```
So shadow experiments do not affect enforce-mode counters.
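Key construction for both modes can be sketched as follows (`loop_key` is a hypothetical helper; the formats come from the State and Shadow mode sections above, and note that the shadow key, as documented, does not include the rule name):

```python
def loop_key(rule_name, fp, shadow=False):
    # enforce-mode keys and shadow-mode keys live in separate namespaces,
    # so shadow experiments never touch enforce counters
    if shadow:
        return f"shadow:loop:{fp}"
    return f"ld:{rule_name}:{fp}"

assert loop_key("base-rps", 12345) == "ld:base-rps:12345"
assert loop_key("base-rps", 12345, shadow=True) == "shadow:loop:12345"
```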
## Accuracy and caveats
- CRC32 collisions are possible but rare; acceptable for loop-defense use
- if body hash is unavailable, body contribution may be empty
- detector is exact-match only; near-duplicate prompts are not grouped
## Tuning
- Start with `window_seconds`: 30-60
- Start with `threshold_identical_requests`: 3-5
- Use `warn` in a shadow rollout first
- Move to `throttle` if you want graceful degradation
- Use `reject` for hard protection against runaway automation
Rule of thumb:
```text
effective duplicate rate limit ~= threshold_identical_requests / window_seconds
```
Examples:
| threshold | window | trigger rate |
|---|---|---|
| 3 | 30s | 0.10 req/s |
| 4 | 60s | 0.067 req/s |
| 5 | 60s | 0.083 req/s |
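The rule of thumb is simple arithmetic; as a quick check of the table above:

```python
def trigger_rate(threshold, window_seconds):
    """Sustained duplicate rate (req/s) at which the detector starts firing."""
    return threshold / window_seconds

assert round(trigger_rate(3, 30), 2) == 0.10
assert round(trigger_rate(4, 60), 3) == 0.067
assert round(trigger_rate(5, 60), 3) == 0.083
```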
## Example
```json
{
  "id": "agent-tools",
  "spec": {
    "selector": { "pathPrefix": "/v1/tools/" },
    "loop_detection": {
      "enabled": true,
      "window_seconds": 45,
      "threshold_identical_requests": 4,
      "action": "throttle"
    },
    "rules": [
      {
        "name": "base-rps",
        "limit_keys": ["jwt:org_id", "header:x-tool-name"],
        "algorithm": "token_bucket",
        "algorithm_config": { "rps": 20, "burst": 40 }
      }
    ]
  }
}
```
In this setup, repeated identical tool calls are slowed progressively while normal traffic still uses standard rate-limiting.