Loop Detector

The loop detector blocks, slows, or flags repeated identical requests within a short time window. It fingerprints each request to detect duplicates and is intended to stop agent/tool loops in which a client keeps sending the same call without making progress.

See Loop Detection configuration for schema. This page focuses on algorithm behavior.

How it works

Detection is based on an exact fingerprint of request identity. Fingerprint inputs:

  • HTTP method
  • request path
  • sorted query parameters
  • body hash (if available)
  • sorted descriptor values used by policy evaluation

Runtime builds a canonical string and hashes it with ngx.crc32_short(...):

fingerprint = crc32(
  method + "|" + path + "|" +
  sorted(query k=v) + "|" +
  body_hash + "|" +
  sorted(descriptor k=v)
)

Implementation details:

  • key-value segments are sorted for deterministic output
  • module-level scratch buffers are reused to reduce allocations
  • similarity is currently only exact
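
The fingerprint construction above can be modelled in a few lines of Python. This is a minimal sketch, not the runtime's Lua code: zlib.crc32 stands in for ngx.crc32_short, and the function names and the "&" joiner between k=v pairs are assumptions made for illustration.

```python
import zlib

def canonical_pairs(pairs):
    """Render k=v segments in sorted order so the output is deterministic."""
    # The "&" joiner is an assumption; the source only says segments are sorted.
    return "&".join(f"{k}={v}" for k, v in sorted(pairs.items()))

def fingerprint(method, path, query, body_hash, descriptors):
    """Build the canonical string and hash it with CRC32."""
    canonical = "|".join([
        method,
        path,
        canonical_pairs(query),
        body_hash or "",  # empty contribution when no body hash is available
        canonical_pairs(descriptors),
    ])
    return zlib.crc32(canonical.encode("utf-8"))

# Same request with query parameters in a different order: same fingerprint.
a = fingerprint("POST", "/v1/tools/search", {"q": "x", "page": "1"}, "abc123", {"org_id": "acme"})
b = fingerprint("POST", "/v1/tools/search", {"page": "1", "q": "x"}, "abc123", {"org_id": "acme"})
# One query value changed: different fingerprint.
c = fingerprint("POST", "/v1/tools/search", {"page": "2", "q": "x"}, "abc123", {"org_id": "acme"})
```

Sorting before joining is what makes a and b equal even though the query parameters were supplied in a different order.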

On each request, the counter for the fingerprint is atomically incremented with a TTL of window_seconds:

dict:incr(loop_key, 1, 0, window_seconds)

This behaves like a sliding window because the TTL is refreshed on every increment: repeated identical traffic keeps the key alive, and the counter resets only after window_seconds pass with no matching request.
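
To make the refresh-on-increment behavior concrete, here is a toy in-memory model in Python. It is a sketch under stated assumptions: TtlCounter is a hypothetical stand-in for the shared dict, not the OpenResty API, and the clock is passed in explicitly so the example is deterministic.

```python
class TtlCounter:
    """Toy stand-in for a shared dict with per-key TTL (illustration only)."""

    def __init__(self):
        self._entries = {}  # key -> (count, expires_at)

    def incr(self, key, window_seconds, now):
        count, expires_at = self._entries.get(key, (0, 0.0))
        if now >= expires_at:
            count = 0  # key expired (or never existed): start over
        count += 1
        # TTL is refreshed on every increment, as described above.
        self._entries[key] = (count, now + window_seconds)
        return count

counter = TtlCounter()
# Identical requests at t = 0s, 30s, 80s, and 200s with window_seconds = 60.
counts = [counter.incr("ld:base-rps:123", 60, t) for t in (0, 30, 80, 200)]
```

The requests at 0s and 80s are more than 60 seconds apart, yet both count toward the same total because the request at 30s kept the key alive. Only the 120-second gap before 200s lets the key expire.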

Decision behavior

If count < threshold_identical_requests:

  • no detection
  • request proceeds normally

If count >= threshold_identical_requests, action is chosen by config:

  • reject: immediate reject with reason loop_detected, Retry-After = window_seconds
  • throttle: allow after a delay of count * 100ms, so the delay grows with each repeat
  • warn: allow, flagged as loop event

Runtime note:

  • the decision handler caps any throttle delay at 30s
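
The decision rules above can be summarized as a small function. This is a hedged sketch: decide and its return shape are hypothetical, while the 100ms step and the 30s cap are taken from the text.

```python
DELAY_STEP_MS = 100    # per-count delay used by the throttle action
MAX_DELAY_MS = 30_000  # cap applied by the decision handler (30s)

def decide(count, threshold, action="reject"):
    """Map a fingerprint count to a loop-detection decision."""
    if count < threshold:
        return {"detected": False, "allow": True, "delay_ms": 0}
    if action == "reject":
        return {"detected": True, "allow": False, "delay_ms": 0}
    if action == "throttle":
        delay_ms = min(count * DELAY_STEP_MS, MAX_DELAY_MS)
        return {"detected": True, "allow": True, "delay_ms": delay_ms}
    if action == "warn":
        return {"detected": True, "allow": True, "delay_ms": 0}
    raise ValueError(f"unknown action: {action!r}")
```

For example, decide(4, 4, "throttle") yields a 400ms delay, and any count of 300 or more is held at the 30s cap.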

State

State is stored in ngx.shared.fairvisor_counters:

Key:    ld:{rule_name}:{fingerprint}
Value:  integer counter
TTL:    window_seconds (refreshed on every increment)

Because TTL is refreshed on each increment, repeated identical traffic keeps the key alive. A key expires automatically when no matching request arrives within window_seconds.

Configuration

"loop_detection": {
  "enabled": true,
  "window_seconds": 60,
  "threshold_identical_requests": 4,
  "action": "reject",
  "similarity": "exact"
}

Field                          Required when enabled   Default   Validation
enabled                        yes                     -         boolean
window_seconds                 yes                     -         positive integer
threshold_identical_requests   yes                     -         positive integer, >= 2
action                         no                      reject    reject, throttle, warn
similarity                     no                      exact     currently must be exact
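
The validation rules in the table can be expressed as a short checker. This is an illustrative sketch, not the product's validator: the function name and error messages are hypothetical, and skipping the remaining checks when enabled is false is an assumption based on the "Required when enabled" column.

```python
VALID_ACTIONS = ("reject", "throttle", "warn")

def _is_int_at_least(value, minimum):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= minimum

def validate_loop_detection(cfg):
    """Return a list of validation errors for a loop_detection block."""
    errors = []
    if not isinstance(cfg.get("enabled"), bool):
        errors.append("enabled must be a boolean")
        return errors
    if not cfg["enabled"]:
        return errors  # assumption: other fields are only checked when enabled
    if not _is_int_at_least(cfg.get("window_seconds"), 1):
        errors.append("window_seconds must be a positive integer")
    if not _is_int_at_least(cfg.get("threshold_identical_requests"), 2):
        errors.append("threshold_identical_requests must be an integer >= 2")
    if cfg.get("action", "reject") not in VALID_ACTIONS:
        errors.append("action must be one of reject, throttle, warn")
    if cfg.get("similarity", "exact") != "exact":
        errors.append("similarity must be exact")
    return errors
```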

Response headers

On throttle (request allowed with delay):

X-Fairvisor-Loop-Delay: <delay_ms>

On rejection:

HTTP 429 Too Many Requests
Retry-After: <window_seconds>
X-Fairvisor-Reason: loop_detected
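
On the client side, these headers make it possible to tell a loop rejection apart from other 429 responses. The helper below is a hypothetical sketch of how a caller might read them; it assumes Retry-After carries an integer number of seconds, as shown above.

```python
def loop_backoff_seconds(status, headers):
    """Return seconds to wait after a loop rejection, or None otherwise."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 429 and lowered.get("x-fairvisor-reason") == "loop_detected":
        return int(lowered.get("retry-after", "0"))
    return None

wait = loop_backoff_seconds(
    429, {"Retry-After": "60", "X-Fairvisor-Reason": "loop_detected"}
)
```

A client that receives a non-None value is better off changing the request or stopping than resending the same call after the wait, since an identical retry increments the same counter.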

Failure behavior

If shared dict increment fails (memory pressure, dict issue):

  • detector returns non-detected
  • request is allowed (fail-open)
  • error is logged

This prevents accidental global blocking when storage is unhealthy.
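
The fail-open path can be sketched as a wrapper around the increment. This is a Python model with hypothetical names: the incr argument is any callable that returns the new count, and an exception stands in for the error a failed shared dict increment would report.

```python
import logging

log = logging.getLogger("loop_detector")

def check_loop(incr, key, threshold, window_seconds):
    """Return a detection result, failing open if the counter store errors."""
    try:
        count = incr(key, window_seconds)
    except Exception as exc:
        # Storage is unhealthy: log and report non-detected so traffic flows.
        log.error("loop counter increment failed for %s: %s", key, exc)
        return {"detected": False, "count": None}
    return {"detected": count >= threshold, "count": count}
```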

Shadow mode

When policy runs in shadow mode, loop state is namespaced separately:

shadow:loop:{fingerprint}

This keeps shadow experiments from affecting enforce-mode counters.
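
A small helper illustrates how the two namespaces stay apart. The key formats are the ones given in the State and Shadow mode sections; the function itself is hypothetical.

```python
def loop_key(fingerprint, rule_name, shadow=False):
    """Build the counter key for enforce mode or shadow mode."""
    if shadow:
        return f"shadow:loop:{fingerprint}"
    return f"ld:{rule_name}:{fingerprint}"

enforce_key = loop_key(2868421531, "base-rps")
shadow_key = loop_key(2868421531, "base-rps", shadow=True)
```

Because the prefixes differ, the same fingerprint never maps to the same counter in both modes.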

Accuracy and caveats

  • CRC32 collisions are possible but rare; acceptable for loop-defense use
  • if the body hash is unavailable, the body contribution may be empty, so requests that differ only in body can share a fingerprint
  • detector is exact-match only; near-duplicate prompts are not grouped
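
To put a rough number on "rare", the standard birthday approximation estimates the chance that any two of n distinct requests share a 32-bit fingerprint. This sketch assumes fingerprints are uniformly distributed, which CRC32 only approximates, and it only matters for keys that are alive at the same time in the same namespace.

```python
import math

def crc32_collision_probability(n):
    """Approximate P(at least one collision) among n distinct fingerprints."""
    # Birthday bound for a 32-bit space: 1 - exp(-n(n-1) / (2 * 2^32)).
    return 1.0 - math.exp(-n * (n - 1) / 2.0 ** 33)

p_small = crc32_collision_probability(1_000)   # about 1 in 10,000
p_half = crc32_collision_probability(77_163)   # about 1 in 2
```

With around a thousand live fingerprints the collision chance is on the order of 0.01%, while it approaches 50% near 77,000 live fingerprints.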

Tuning

  1. Start with window_seconds: 30-60
  2. Start with threshold_identical_requests: 3-5
  3. Use warn in shadow rollout first
  4. Move to throttle if you want graceful degradation
  5. Use reject for hard protection against runaway automation

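Rule of thumb for a burst that starts from an expired key:

effective duplicate rate limit ~= threshold_identical_requests / window_seconds

Because the TTL is refreshed on every increment, slower duplicates also accumulate as long as each one arrives within window_seconds of the previous one.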

Examples:

threshold   window   trigger rate
3           30s      0.10 req/s
4           60s      0.067 req/s
5           60s      0.083 req/s
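
The trigger rates in the table follow directly from the rule of thumb. A one-line helper (hypothetical name) reproduces them:

```python
def trigger_rate(threshold, window_seconds):
    """Approximate duplicate rate (req/s) at which detection triggers."""
    return threshold / window_seconds

rates = [round(trigger_rate(t, w), 3) for t, w in ((3, 30), (4, 60), (5, 60))]
```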

Example

{
  "id": "agent-tools",
  "spec": {
    "selector": { "pathPrefix": "/v1/tools/" },
    "loop_detection": {
      "enabled": true,
      "window_seconds": 45,
      "threshold_identical_requests": 4,
      "action": "throttle"
    },
    "rules": [
      {
        "name": "base-rps",
        "limit_keys": ["jwt:org_id", "header:x-tool-name"],
        "algorithm": "token_bucket",
        "algorithm_config": { "rps": 20, "burst": 40 }
      }
    ]
  }
}

In this setup, repeated identical tool calls are slowed progressively while normal traffic still uses standard rate-limiting.
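
As a worked illustration of that progression, the sketch below lists the delay applied to each successive identical request under this policy (threshold 4, throttle action). It assumes all requests land close enough together that the counter never expires; the function name is hypothetical.

```python
def throttle_delays_ms(n_requests, threshold=4, step_ms=100, cap_ms=30_000):
    """Delay applied to the 1st..nth identical request under throttle."""
    return [
        0 if count < threshold else min(count * step_ms, cap_ms)
        for count in range(1, n_requests + 1)
    ]

delays = throttle_delays_ms(6)
```

The first three calls pass without delay, the fourth waits 400ms, and each further repeat waits 100ms longer until the 30s cap is reached.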