Capacity Planning
Input dimensions
- active unique limit keys per window
- algorithms used (`token_bucket`, `cost_based`, `token_bucket_llm`)
- expected concurrency and burst profile
- retention/reset windows
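
The first dimension, active unique keys per window, can be estimated from a sample of request logs. The sketch below is a minimal illustration and not part of Fairvisor: the function name `peak_unique_keys_per_window` and the `(timestamp, limit_key)` event format are assumptions made for this example.

```python
from collections import defaultdict


def peak_unique_keys_per_window(events, window_seconds):
    """Return the highest number of distinct limit keys seen in any fixed window.

    `events` is an iterable of (unix_timestamp_seconds, limit_key) pairs.
    This log format is hypothetical; adapt it to your own access logs.
    """
    keys_by_window = defaultdict(set)
    for ts, key in events:
        # Bucket each event into a fixed window by integer division.
        keys_by_window[int(ts // window_seconds)].add(key)
    # Peak distinct-key count across all windows; 0 when there are no events.
    return max((len(keys) for keys in keys_by_window.values()), default=0)


# Illustrative sample: two keys in the first minute, one in the second.
sample = [(0, "tenant-a"), (10, "tenant-b"), (59, "tenant-a"), (61, "tenant-c")]
peak = peak_unique_keys_per_window(sample, window_seconds=60)
```

Using the peak rather than the average is a deliberately conservative choice, since sizing has to hold during the busiest window.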
Practical sizing workflow
- Estimate unique active keys per critical rule
- Start with `FAIRVISOR_SHARED_DICT_SIZE=128m`
- Load test with realistic cardinality
- Increase to `256m`/`512m` if signs of state pressure appear
Rule-of-thumb table
| Active key cardinality | Starting dict size |
|---|---|
| <= 100k | 128m |
| 100k-250k | 256m |
| 250k-500k | 512m |
| > 500k | profile first, then plan a sharding strategy |
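
The table can be encoded as a small lookup for use in provisioning scripts. This is a sketch only: `starting_dict_size` is a hypothetical helper, and treating each range's upper bound as inclusive is an assumption, since the table does not say which row a boundary value belongs to.

```python
def starting_dict_size(active_keys):
    """Map active key cardinality to a starting dict size from the table.

    Returns None above 500k keys, where the table recommends profiling
    and a sharding strategy instead of a fixed size.
    """
    if active_keys < 0:
        raise ValueError("active_keys must be non-negative")
    if active_keys <= 100_000:
        return "128m"
    if active_keys <= 250_000:  # upper bounds treated as inclusive (assumption)
        return "256m"
    if active_keys <= 500_000:
        return "512m"
    return None


size = starting_dict_size(180_000)
```

The returned value is only a starting point for `FAIRVISOR_SHARED_DICT_SIZE`; load testing with realistic cardinality should still confirm it.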
Validate after sizing
- stable allow/reject semantics under load
- no abnormal limiter resets
- latency remains within SLO at p95/p99
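
The latency check can be automated against load-test output. The following is a minimal sketch, assuming latency samples are already collected as a list of milliseconds; `latency_within_slo` and its SLO parameters are hypothetical names, and your load-testing tool may already report these percentiles directly.

```python
import statistics


def latency_within_slo(samples_ms, p95_slo_ms, p99_slo_ms):
    """Return True if both p95 and p99 of the samples meet their SLO targets.

    Requires at least two samples, as statistics.quantiles does.
    """
    # n=100 yields 99 cut points: index 94 is p95 and index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    p95, p99 = cuts[94], cuts[98]
    return p95 <= p95_slo_ms and p99 <= p99_slo_ms


# Illustrative data: 100 samples from 1 ms to 100 ms.
samples = list(range(1, 101))
ok = latency_within_slo(samples, p95_slo_ms=96, p99_slo_ms=100)
```

Comparing results from before and after a size change helps separate latency caused by state pressure from other causes.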