Loop Detector

The loop detector fingerprints each request and blocks, slows, or flags identical requests that repeat inside a short time window. It is intended to stop agent/tool loops where a client keeps sending the same call without making progress.

See Loop Detection configuration for schema. This page focuses on algorithm behavior.

How it works

Detection is based on an exact fingerprint of request identity. Fingerprint inputs:

HTTP method
request path
sorted query parameters
body hash (if available)
sorted descriptor values used by policy evaluation

The runtime builds a canonical string and hashes it with ngx.crc32_short(...):

fingerprint = crc32(
  method + "|" + path + "|" +
  sorted(query k=v) + "|" +
  body_hash + "|" +
  sorted(descriptor k=v)
)

Implementation details:

key-value segments are sorted for deterministic output
module-level scratch buffers are reused to reduce allocations
similarity is currently only exact
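
The fingerprint construction above can be sketched in Python as a behavioral model. This is not the runtime's Lua code: zlib.crc32 stands in for ngx.crc32_short, and the function name, the "&" joiner between key=value pairs, and the use of SHA-256 for the body hash are assumptions made for illustration.

```python
import hashlib
import zlib


def fingerprint(method, path, query, body, descriptors):
    """Return a CRC32 fingerprint for one request (illustrative model only)."""
    # Sort key=value pairs so that parameter order does not change the result.
    query_part = "&".join(f"{k}={v}" for k, v in sorted(query.items()))
    descriptor_part = "&".join(f"{k}={v}" for k, v in sorted(descriptors.items()))
    # Assumption: an unavailable body contributes an empty segment.
    body_hash = hashlib.sha256(body).hexdigest() if body is not None else ""
    canonical = "|".join([method, path, query_part, body_hash, descriptor_part])
    return zlib.crc32(canonical.encode("utf-8"))


# Two requests that differ only in query parameter order share a fingerprint.
a = fingerprint("POST", "/v1/tools/a", {"b": "2", "a": "1"}, b"{}", {"org": "acme"})
b = fingerprint("POST", "/v1/tools/a", {"a": "1", "b": "2"}, b"{}", {"org": "acme"})
print(a == b)
```

Sorting both key-value segments is what makes the fingerprint insensitive to parameter order, which matches the exact-match behavior described here.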

On each request, the counter for the fingerprint is atomically incremented with a TTL of window_seconds:

dict:incr(loop_key, 1, 0, window_seconds)

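Because the TTL is refreshed on every increment, the window slides with the traffic: the counter keeps growing while identical requests continue to arrive within window_seconds of each other, and it resets only after a gap of at least window_seconds with no matching request.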
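
The counter semantics described on this page can be modelled with a small in-memory class. This is a sketch of the described behavior, not the shared dict itself: the class name, the explicit now argument, and the tuple layout are assumptions chosen to keep the example deterministic.

```python
class TtlCounter:
    """In-memory stand-in for a counter whose TTL is refreshed on each increment."""

    def __init__(self):
        self._entries = {}  # key -> (count, expires_at)

    def incr(self, key, window_seconds, now):
        count, expires_at = self._entries.get(key, (0, 0.0))
        if now >= expires_at:
            count = 0  # key expired or never existed: start a new count
        count += 1
        # Refresh the expiry on every increment, as this page describes.
        self._entries[key] = (count, now + window_seconds)
        return count


counter = TtlCounter()
print(counter.incr("ld:base-rps:123", 60, now=0))    # 1
print(counter.incr("ld:base-rps:123", 60, now=59))   # 2
print(counter.incr("ld:base-rps:123", 60, now=118))  # 3
print(counter.incr("ld:base-rps:123", 60, now=200))  # 1
```

The last call returns 1 because more than 60 seconds passed since the previous matching request, so the key had already expired.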

Decision behavior

If count < threshold_identical_requests:

no detection
request proceeds normally

If count >= threshold_identical_requests, action is chosen by config:

reject: immediate reject with reason loop_detected, Retry-After = window_seconds
throttle: allow with a delay of count * 100ms, which grows with each repeat
warn: allow, flagged as loop event

Runtime note:

the decision handler caps any throttle delay at 30s
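
The decision rules above can be summarized in a short Python sketch. The function name and the shape of the returned dictionary are hypothetical; the threshold comparison, the 100ms step, the 30s cap, and the loop_detected reason come from this page.

```python
MAX_DELAY_MS = 30_000  # cap applied by the decision handler (30s)
DELAY_STEP_MS = 100    # per-count delay used by the throttle action


def decide(count, threshold, action, window_seconds):
    """Map a counter value to a loop-detection decision (illustrative model)."""
    if count < threshold:
        return {"detected": False, "allowed": True}
    if action == "reject":
        return {
            "detected": True,
            "allowed": False,
            "reason": "loop_detected",
            "retry_after": window_seconds,
        }
    if action == "throttle":
        delay_ms = min(count * DELAY_STEP_MS, MAX_DELAY_MS)
        return {"detected": True, "allowed": True, "delay_ms": delay_ms}
    if action == "warn":
        return {"detected": True, "allowed": True}
    raise ValueError(f"unknown action: {action}")


print(decide(4, 4, "throttle", 60))    # delay_ms is 400
print(decide(500, 4, "throttle", 60))  # delay_ms is capped at 30000
```

With a 100ms step, the 30s cap is reached once the counter hits 300.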

State

State is stored in ngx.shared.fairvisor_counters:

Key:    ld:{rule_name}:{fingerprint}
Value:  integer counter
TTL:    window_seconds (refreshed on every increment)

A key expires automatically when no matching request arrives within window_seconds.

Configuration

"loop_detection": {
  "enabled": true,
  "window_seconds": 60,
  "threshold_identical_requests": 4,
  "action": "reject",
  "similarity": "exact"
}

Field	Required when enabled	Default	Validation
`enabled`	yes	-	boolean
`window_seconds`	yes	-	positive integer
`threshold_identical_requests`	yes	-	positive integer, `>= 2`
`action`	no	`reject`	`reject`, `throttle`, `warn`
`similarity`	no	`exact`	currently must be `exact`
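
The validation rules in the table can be expressed as a small checker. This is a sketch under stated assumptions: the function name and error messages are hypothetical, and skipping the remaining checks when enabled is false is an interpretation of the "Required when enabled" column.

```python
ALLOWED_ACTIONS = ("reject", "throttle", "warn")


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_loop_detection(cfg):
    """Return cfg with defaults applied, or raise ValueError."""
    if not isinstance(cfg.get("enabled"), bool):
        raise ValueError("enabled must be a boolean")
    if not cfg["enabled"]:
        return dict(cfg)  # assumption: other fields are not checked when disabled
    window = cfg.get("window_seconds")
    if not _is_int(window) or window < 1:
        raise ValueError("window_seconds must be a positive integer")
    threshold = cfg.get("threshold_identical_requests")
    if not _is_int(threshold) or threshold < 2:
        raise ValueError("threshold_identical_requests must be an integer >= 2")
    action = cfg.get("action", "reject")
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action must be one of reject, throttle, warn")
    similarity = cfg.get("similarity", "exact")
    if similarity != "exact":
        raise ValueError("similarity must currently be exact")
    return {**cfg, "action": action, "similarity": similarity}


cfg = {"enabled": True, "window_seconds": 60, "threshold_identical_requests": 4}
print(validate_loop_detection(cfg)["action"])  # reject
```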

Response headers

On throttle (request allowed with delay):

X-Fairvisor-Loop-Delay: <delay_ms>

On rejection:

HTTP 429 Too Many Requests
Retry-After: <window_seconds>
X-Fairvisor-Reason: loop_detected

Failure behavior

If the shared dict increment fails (for example, under memory pressure or another dict error):

detector returns non-detected
request is allowed (fail-open)
error is logged

This prevents accidental global blocking when storage is unhealthy.
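
The fail-open path can be illustrated with a short Python wrapper. The incr callable, the function name, and the tuple return value are hypothetical; only the behavior (log the error, report non-detected, let the request through) is taken from this page.

```python
import logging

logger = logging.getLogger("loop_detector")


def check_loop(incr, key, threshold):
    """Return (detected, count); fail open if the counter store raises.

    incr is a hypothetical callable that returns the new count for key
    or raises when the underlying storage is unhealthy.
    """
    try:
        count = incr(key)
    except Exception as exc:
        # Storage failure: log it and treat the request as non-detected.
        logger.error("loop counter increment failed for %s: %s", key, exc)
        return (False, None)
    return (count >= threshold, count)


def failing_incr(key):
    raise MemoryError("no memory")


print(check_loop(failing_incr, "ld:base-rps:123", 4))  # (False, None)
```

Returning a non-detected result on failure means a broken counter store can never turn into a blanket rejection of traffic.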

Shadow mode

When a policy runs in shadow mode, loop state is stored under a separate namespace:

shadow:loop:{fingerprint}

As a result, shadow experiments do not affect enforce-mode counters.

Accuracy and caveats

CRC32 collisions are possible but rare; acceptable for loop-defense use
if body hash is unavailable, body contribution may be empty
detector is exact-match only; near-duplicate prompts are not grouped

Tuning

Start with window_seconds: 30-60
Start with threshold_identical_requests: 3-5
Use warn in shadow rollout first
Move to throttle if you want graceful degradation
Use reject for hard protection against runaway automation

Rule of thumb (a rough estimate that treats the window as fixed):

effective duplicate rate limit ~= threshold_identical_requests / window_seconds

Examples:

threshold	window	trigger rate
3	30s	0.10 req/s
4	60s	0.067 req/s
5	60s	0.083 req/s
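
The trigger rates in the table follow directly from the rule of thumb. A minimal Python check of the arithmetic, with a hypothetical helper name:

```python
def trigger_rate(threshold, window_seconds):
    """Approximate identical-request rate (req/s) at which detection triggers."""
    return threshold / window_seconds


for threshold, window in [(3, 30), (4, 60), (5, 60)]:
    print(f"{threshold}\t{window}s\t{trigger_rate(threshold, window):.3f} req/s")
```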

Example

{
  "id": "agent-tools",
  "spec": {
    "selector": { "pathPrefix": "/v1/tools/" },
    "loop_detection": {
      "enabled": true,
      "window_seconds": 45,
      "threshold_identical_requests": 4,
      "action": "throttle"
    },
    "rules": [
      {
        "name": "base-rps",
        "limit_keys": ["jwt:org_id", "header:x-tool-name"],
        "algorithm": "token_bucket",
        "algorithm_config": { "rps": 20, "burst": 40 }
      }
    ]
  }
}

In this setup, repeated identical tool calls are slowed progressively, while normal traffic is still subject to the base-rps token-bucket limit.
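
To make the progressive slowdown concrete, the sketch below computes the delay applied to each call in a run of identical requests under this example policy. It assumes every call arrives within 45 seconds of the previous one, so the counter never expires; the function name is hypothetical.

```python
THRESHOLD = 4          # threshold_identical_requests from the example
DELAY_STEP_MS = 100    # per-count delay of the throttle action
MAX_DELAY_MS = 30_000  # cap applied by the decision handler


def throttle_delays(request_count):
    """Delay in ms for each of the first request_count identical calls."""
    return [
        0 if n < THRESHOLD else min(n * DELAY_STEP_MS, MAX_DELAY_MS)
        for n in range(1, request_count + 1)
    ]


print(throttle_delays(7))  # [0, 0, 0, 400, 500, 600, 700]
```

The first three calls pass without delay, and from the fourth call onward each repeat waits 100ms longer than the one before it.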