Core Concepts
Understanding the key concepts behind LightRate will help you design effective rate limiting strategies for your applications.
Overview
LightRate is built around four core concepts that work together to provide flexible, powerful rate limiting:
Applications
Applications are containers for organizing your rate limiting rules. Think of them as namespaces that group related rate limits together.
Key Features:
- Default token bucket for automatic fallback
- Multiple specific rules for fine-grained control
- Logical grouping by service or product
- Per-user or organization ownership
Example: You might create separate applications for “Email Service”, “Payment API”, and “Public REST API”, each with their own default limits and specific rules.
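To make this concrete, here is a minimal sketch of an application as data, using the “Email Service” numbers from the diagram further down this page. The shape and field names (BucketConfig, defaultBucket, rules) are illustrative assumptions, not LightRate’s actual schema:

```typescript
// Illustrative sketch only: field names are assumptions, not LightRate's schema.
interface BucketConfig {
  refillRate: number; // tokens added per second (sustained throughput)
  burstRate: number;  // maximum tokens the bucket can hold (peak capacity)
}

interface Application {
  name: string;
  defaultBucket: BucketConfig;          // automatic fallback
  rules: Record<string, BucketConfig>;  // fine-grained overrides (see Rules below)
}

// One application per service, each with its own defaults and rules.
const emailService: Application = {
  name: "Email Service",
  defaultBucket: { refillRate: 10, burstRate: 100 },
  rules: {
    sendEmail:     { refillRate: 5, burstRate: 50 },
    sendBulkEmail: { refillRate: 1, burstRate: 10 },
  },
};
```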
Learn more about Applications →
Rules
Rules define specific rate limiting behavior for operations or HTTP endpoints within an application.
Two Types:
- Operation-based: Rate limit by operation name (e.g., "sendEmail", "processPayment")
- Path-based: Rate limit by HTTP endpoint (e.g., POST /api/users, GET /api/data)
Key Features:
- Unique within an application
- Custom refill and burst rates
- Automatic fallback to application defaults
- Mix operation-based and path-based rules
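As a sketch of how matching and fallback could work (the types and the resolveBucket function are assumptions for illustration, not LightRate’s actual implementation), a request is checked against operation-based and path-based rules first, then falls back to the application default:

```typescript
// Illustrative types (assumptions, not LightRate's actual schema).
interface BucketConfig { refillRate: number; burstRate: number }

type RuleMatch =
  | { operation: string }              // operation-based rule
  | { method: string; path: string };  // path-based rule

interface Rule { match: RuleMatch; bucket: BucketConfig }

// Match a request to a rule; fall back to the application default.
function resolveBucket(
  rules: Rule[],
  defaultBucket: BucketConfig,
  req: { operation?: string; method?: string; path?: string },
): BucketConfig {
  for (const rule of rules) {
    if ("operation" in rule.match) {
      // Operation-based: match by operation name, e.g. "sendEmail".
      if (rule.match.operation === req.operation) return rule.bucket;
    } else if (rule.match.method === req.method && rule.match.path === req.path) {
      // Path-based: match by HTTP method and endpoint, e.g. POST /api/users.
      return rule.bucket;
    }
  }
  return defaultBucket; // no rule matched: automatic fallback
}
```

This sketch simply takes the first matching rule; the key behavior it shows is the automatic fallback to application defaults listed above.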
Token Bucket Algorithm
LightRate uses the token bucket algorithm, a proven rate limiting technique that balances smooth sustained throughput with tolerance for short bursts.
How it Works:
- Tokens refill continuously at a set rate
- Each request consumes tokens
- Burst capacity allows temporary spikes
- Per-user buckets ensure fairness
Key Parameters:
- Refill Rate: Tokens added per second (sustained throughput)
- Burst Rate: Maximum tokens available (peak capacity)
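The algorithm is compact enough to sketch in full. This is the general token bucket technique, not LightRate’s actual server code:

```typescript
// Minimal token bucket: continuous refill, capped at burst capacity.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private refillRate: number, // tokens added per second
    private burstRate: number,  // maximum tokens (bucket capacity)
  ) {
    this.tokens = burstRate;    // start full so bursts work immediately
    this.lastRefill = Date.now();
  }

  // Add tokens for the time elapsed since the last refill, capped at burst.
  private refill(): void {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.burstRate, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;
  }

  // Try to consume `count` tokens; returns false if the request should be limited.
  tryConsume(count = 1): boolean {
    this.refill();
    if (this.tokens < count) return false;
    this.tokens -= count;
    return true;
  }
}
```

With a refill rate of 5 and a burst of 50, a user can burst up to 50 requests at once but only sustain 5 per second over time.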
Learn more about Token Buckets →
Local Token Buckets
Local token buckets are a client-side optimization that reduces API calls and latency while still enforcing your rate limits.
How it Works:
- Clients fetch multiple tokens at once and cache them locally
- Subsequent requests consume from the local cache (no API call)
- When cache is empty, refill from server
Key Trade-off:
- More tokens = Fewer API calls, lower latency, but less accurate rate limiting
- Fewer tokens = More API calls, higher latency, but more accurate rate limiting
Key Features:
- Configurable bucket size per SDK
- Automatic refill when empty
- Per-user, per-operation buckets
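A minimal sketch of the client-side pattern (the fetchTokens callback stands in for whatever token-fetch call your SDK makes; it is an assumption, not the actual LightRate API):

```typescript
// Client-side local bucket: fetch tokens in batches, consume from the cache.
class LocalTokenBucket {
  private cached = 0;

  constructor(
    private batchSize: number, // tokens fetched per server call (the trade-off knob)
    // Stand-in for the server call; returns how many tokens were granted.
    private fetchTokens: (count: number) => Promise<number>,
  ) {}

  // Returns true if the operation may proceed, false if rate limited.
  async tryConsume(): Promise<boolean> {
    if (this.cached === 0) {
      // Cache empty: one API call refills up to batchSize tokens.
      this.cached = await this.fetchTokens(this.batchSize);
    }
    if (this.cached === 0) return false; // server granted nothing: rate limited
    this.cached -= 1; // local consumption, no API call
    return true;
  }
}
```

A larger batchSize means fewer round trips, but more tokens can sit unused in a single client, so enforcement is less exact; a batchSize of 1 degenerates into one API call per request.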
Learn more about Local Token Buckets →
How They Work Together
Here’s how these concepts combine in practice:
```
Application: Email Service
├── Default: 10 tokens/sec, burst 100
│   (Used when no specific rule matches)
│
└── Rules:
    ├── sendEmail
    │   └── Token Bucket: 5 tokens/sec, burst 50
    │       ├── User alice → Bucket (5/sec, burst 50)
    │       ├── User bob   → Bucket (5/sec, burst 50)
    │       └── User carol → Bucket (5/sec, burst 50)
    │
    └── sendBulkEmail
        └── Token Bucket: 1 token/sec, burst 10
            ├── User alice → Bucket (1/sec, burst 10)
            └── User bob   → Bucket (1/sec, burst 10)
```
Example Request Flow
1. Request arrives: User alice wants to perform operation sendEmail
2. Find application: Look up the “Email Service” application
3. Match rule: Find the sendEmail rule within the application
4. Check bucket: Look up alice’s token bucket for the sendEmail rule
5. Consume tokens: Try to consume 1 token from alice’s bucket
6. Result:
   - ✅ Tokens available → Request succeeds
   - ❌ No tokens → Request is rate limited (429 response)
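Put together, the server-side check is a direct translation of those steps. Here is a self-contained sketch, with illustrative names rather than LightRate’s actual implementation:

```typescript
// Self-contained sketch of the request flow (names are illustrative).
interface BucketConfig { refillRate: number; burstRate: number }

interface App {
  name: string;
  defaultBucket: BucketConfig;
  rules: Map<string, BucketConfig>; // operation name -> bucket config
}

// One bucket state per (application, rule, user) key: per-user fairness.
const buckets = new Map<string, { tokens: number; lastRefill: number }>();

function checkRequest(app: App, user: string, operation: string): number {
  // Steps 2-3: find the application (passed in here) and match a rule,
  // falling back to the application default.
  const config = app.rules.get(operation) ?? app.defaultBucket;

  // Step 4: look up (or create) this user's bucket for the matched rule.
  const key = `${app.name}:${operation}:${user}`;
  let b = buckets.get(key);
  if (!b) {
    b = { tokens: config.burstRate, lastRefill: Date.now() };
    buckets.set(key, b);
  }

  // Refill for elapsed time, capped at burst capacity.
  const now = Date.now();
  b.tokens = Math.min(
    config.burstRate,
    b.tokens + ((now - b.lastRefill) / 1000) * config.refillRate,
  );
  b.lastRefill = now;

  // Steps 5-6: consume one token; success, or 429 when the bucket is empty.
  if (b.tokens < 1) return 429;
  b.tokens -= 1;
  return 200;
}
```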
Next Steps
Ready to dive deeper? Explore each concept:
- Applications - Learn how to organize your rate limits
- Rules - Create specific rate limiting rules
- Token Bucket - Understand the server-side algorithm
- Local Token Buckets - Optimize client-side performance
Or jump straight to implementation:
- Quick Start - Set up your first rate limit
- API Reference - Direct API integration
- SDKs - Use our client libraries