Capacity Planning

Input dimensions

  • active unique limit keys per window
  • algorithms used (token_bucket, cost_based, token_bucket_llm)
  • expected concurrency and burst profile
  • retention/reset windows

Practical sizing workflow

  1. Estimate unique active keys per critical rule
  2. Start with FAIRVISOR_SHARED_DICT_SIZE=128m
  3. Load test with realistic cardinality
  4. Increase to 256m or 512m if the load test shows signs of state pressure (for example, abnormal limiter resets or unstable allow/reject decisions)

Rule-of-thumb table

Active key cardinality   Starting dict size
<= 100k                  128m
100k-250k                256m
250k-500k                512m
> 500k                   profile and shard strategy
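
The rule-of-thumb table can be expressed as a small lookup, which is handy in a provisioning script. This is a minimal sketch, not part of Fairvisor: the function name starting_dict_size is hypothetical, and treating each upper bound as inclusive is an assumption, since the table does not say which row owns the exact 100k, 250k, and 500k boundaries. Above 500k the sketch returns None to signal that profiling and a sharding strategy are needed rather than a fixed size.

```python
from typing import Optional

# (inclusive upper bound on active keys, starting dict size) per table row.
# Inclusive bounds are an assumption; the table leaves boundaries ambiguous.
_SIZE_TIERS = [
    (100_000, "128m"),
    (250_000, "256m"),
    (500_000, "512m"),
]


def starting_dict_size(active_keys: int) -> Optional[str]:
    """Return a starting FAIRVISOR_SHARED_DICT_SIZE value, or None above 500k keys."""
    if active_keys < 0:
        raise ValueError("active_keys must be non-negative")
    for upper_bound, size in _SIZE_TIERS:
        if active_keys <= upper_bound:
            return size
    # More than 500k keys: the table recommends profiling and sharding instead.
    return None


if __name__ == "__main__":
    for keys in (80_000, 250_000, 400_000, 750_000):
        size = starting_dict_size(keys)
        print(keys, size if size is not None else "profile and shard strategy")
```

The returned value is only a starting point; the load test in step 3 of the workflow decides whether it holds.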

Validate after sizing

  • stable allow/reject semantics under load
  • no abnormal limiter resets
  • latency remains within SLO at p95/p99
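
The latency check above can be automated against load-test output. The following is a rough sketch under stated assumptions: latencies are collected as a list of milliseconds, the SLO thresholds are supplied by the caller, and percentiles use the nearest-rank method (load-testing tools may interpolate differently). The names percentile and within_slo are hypothetical helpers, not Fairvisor APIs.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty sample set; pct is an integer in 1..100."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # 1-based nearest rank: smallest rank covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def within_slo(latencies_ms: Sequence[float], p95_slo_ms: float, p99_slo_ms: float) -> bool:
    """Return True if both p95 and p99 latency are at or below their SLO thresholds."""
    return (
        percentile(latencies_ms, 95) <= p95_slo_ms
        and percentile(latencies_ms, 99) <= p99_slo_ms
    )


if __name__ == "__main__":
    # Hypothetical samples: 1 ms through 100 ms, one request each.
    samples = list(range(1, 101))
    print("p95:", percentile(samples, 95))
    print("p99:", percentile(samples, 99))
    print("within SLO:", within_slo(samples, p95_slo_ms=95, p99_slo_ms=99))
```

Running the same check before and after a dict size change makes it easier to see whether resizing affected tail latency.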