Local Token Buckets
Local token buckets are a client-side optimization that reduces API calls and improves performance while maintaining rate limiting behavior.
How Local Buckets Work
Instead of making an HTTP API call to the LightRate server for every request, SDK clients maintain a local cache of tokens that can be consumed immediately.
SDK clients create local token buckets only after matching a request to a given rule; they do not create local buckets for application-level default token buckets. Create rules in your application to match any endpoints that would benefit from local buckets.
The Flow
1. First Request: Client has no local tokens
   - Makes an HTTP API call to the LightRate server
   - Server returns multiple tokens (e.g., 10 tokens)
   - Client stores the tokens in its local bucket
   - Consumes 1 token for the current request
2. Subsequent Requests: Client has local tokens
   - Consumes 1 token from the local bucket (no API call!)
   - Request completes immediately
   - Continues until the local bucket is empty
3. Bucket Empty: No local tokens remaining
   - Makes an HTTP API call to refill
   - Process repeats
Visual Example
```
Request 1:  [Local: 0] → API Call → [Local: 10] → Consume 1 → [Local: 9] ✓ (API call)
Request 2:  [Local: 9] → Consume 1 → [Local: 8] ✓ (no API call)
Request 3:  [Local: 8] → Consume 1 → [Local: 7] ✓ (no API call)
...
Request 10: [Local: 1] → Consume 1 → [Local: 0] ✓ (no API call)
Request 11: [Local: 0] → API Call → [Local: 10] → Consume 1 → [Local: 9] ✓ (API call)
```

In this example, only 2 API calls were made for 11 requests!
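The flow above can be sketched as a small standalone simulation (illustrative only, not the actual SDK; `fetchTokensFromServer` is a stand-in for the HTTP call):

```javascript
// Minimal simulation of the local-bucket flow (not the real SDK).
const BUCKET_SIZE = 10;
let localTokens = 0;
let apiCalls = 0;

function fetchTokensFromServer(tokensRequested) {
  apiCalls += 1;
  return tokensRequested; // server grants the full amount in this sketch
}

function handleRequest() {
  if (localTokens === 0) {
    localTokens = fetchTokensFromServer(BUCKET_SIZE); // refill on empty
  }
  localTokens -= 1; // consume one token locally, no network call
}

for (let i = 0; i < 11; i++) handleRequest();
console.log(apiCalls, localTokens); // 2 API calls for 11 requests, 9 tokens left
```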
Key Characteristics
Local Tokens Expiry
Important: Local token buckets expire if they are not consumed from or refilled within a 60-second period.
This means:
- Tokens fetched at 10:00:00 will all expire if none are used by 10:01:00
- Expiration is handled automatically by the SDK; no manual timers are required
- After expiration, the next request will trigger a refill from the server
The server-side token bucket continues to refill according to your configured refill rate, regardless of when local tokens are consumed or expired locally.
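The 60-second expiry rule can be sketched as follows (illustrative code, not the SDK's internals; the field and function names are hypothetical):

```javascript
// Sketch of the 60-second local-bucket expiry rule (illustrative only).
const TTL_MS = 60_000;

function isExpired(bucket, nowMs) {
  // A bucket expires if it was neither consumed from nor refilled within TTL.
  return nowMs - bucket.lastActivityMs > TTL_MS;
}

const bucket = { tokens: 10, lastActivityMs: 0 };
console.log(isExpired(bucket, 30_000)); // false: still fresh
console.log(isExpired(bucket, 61_000)); // true: next request triggers a refill
```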
The Trade-off
Choosing your local bucket size involves a trade-off between performance and accuracy:
Performance vs Accuracy
Larger Local Buckets (More Tokens)
Configuration:
```javascript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 100 // Fetch 100 tokens at a time
});
```

```ruby
client = LightrateClient::Client.new('your_api_key', 'your_application_id',
  default_local_bucket_size: 100 # Fetch 100 tokens at a time
)
```

Benefits:
- ✅ Fewer API Calls: 100 requests might only make 1-2 API calls
- ✅ Lower Latency: Most requests complete instantly (no network call)
- ✅ Reduced Costs: Fewer HTTP requests = lower infrastructure costs
- ✅ Better User Experience: Faster response times
Drawbacks:
- ❌ Less Accurate Rate Limiting: User could consume 100 tokens instantly, then wait
- ❌ Burst Allowance: Allows larger bursts than intended
- ❌ Delayed Enforcement: Rate limits enforced less frequently
Best For:
- High-throughput applications
- Trusted users/clients
- When performance is critical
- Internal services
Smaller Local Buckets (Fewer Tokens)
Configuration:
```javascript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 5 // Fetch 5 tokens at a time
});
```

```ruby
client = LightrateClient::Client.new('your_api_key', 'your_application_id',
  default_local_bucket_size: 5 # Fetch 5 tokens at a time
)
```

Benefits:
- ✅ More Accurate Rate Limiting: Enforced more frequently
- ✅ Smaller Bursts: Users can’t consume as many tokens instantly
- ✅ Better Control: Rate limits closer to server-side behavior
- ✅ Fairer Distribution: Less opportunity for one user to grab many tokens
Drawbacks:
- ❌ More API Calls: More frequent HTTP requests to refill
- ❌ Higher Latency: More requests hit the network
- ❌ Increased Costs: More HTTP traffic
Best For:
- Public APIs
- Strict rate limiting requirements
- Untrusted clients
- Fair usage enforcement
Choosing the Right Size
General Guidelines
| Use Case | Recommended Size | Reasoning |
|---|---|---|
| Internal microservices | 50-100 | High trust, performance critical |
| Authenticated users | 10-20 | Balance of performance and control |
| Public API (free tier) | 5-10 | Tighter control, fair usage |
| Public API (paid tier) | 20-50 | Reward paying customers with better performance |
| Rate-limited webhooks | 5-10 | Ensure even distribution |
| Background jobs | 50-100 | Performance matters, less contention |
Example Scenarios
Scenario 1: High-Volume Internal API
```javascript
// You trust your internal services and want maximum performance
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 100
});
// Result: One API call every 100 requests
// Perfect for microservice-to-microservice communication
```

Scenario 2: Public REST API
```javascript
// You need fair usage enforcement for public users
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 10
});
// Result: One API call every 10 requests
// Good balance of performance and control
```

Scenario 3: Strict Rate Limiting
```javascript
// You want rate limiting as close to real-time as possible
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 1
});
// Result: API call on every request
// Essentially disables local buckets for maximum accuracy
```

Implementation Details
Per-User, Per-Operation Buckets
Local buckets are maintained separately for each:
- User identifier
- Operation or path
```javascript
// These use different local buckets:
await client.consumeLocalBucketToken('user_A', 'send_email'); // Bucket 1
await client.consumeLocalBucketToken('user_B', 'send_email'); // Bucket 2
await client.consumeLocalBucketToken('user_A', 'send_sms');   // Bucket 3
```

Each combination gets its own local bucket, so users don’t interfere with each other.
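One simple way to key buckets per user and operation is a composite-key map. The sketch below is a hypothetical illustration of that idea, not the SDK's actual internals:

```javascript
// Sketch of per-user, per-operation bucket keying (illustrative only).
const buckets = new Map();

function getBucket(userId, operation) {
  const key = `${userId}:${operation}`; // each combination gets its own bucket
  if (!buckets.has(key)) buckets.set(key, { tokens: 0 });
  return buckets.get(key);
}

getBucket('user_A', 'send_email');
getBucket('user_B', 'send_email');
getBucket('user_A', 'send_sms');
console.log(buckets.size); // 3 distinct local buckets
```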
Bucket Refill Logic
When a local bucket is empty:
1. Client makes an HTTP API call with `tokensRequested: defaultLocalBucketSize`
2. Server returns `tokensConsumed` (how many tokens were available)
3. Client stores `tokensConsumed` tokens in the local bucket
4. Client immediately consumes 1 token for the current request
If the server has fewer tokens than requested, the local bucket gets fewer tokens. For example, if you request 10 tokens but the server only has 3 available, your local bucket gets 3 tokens.
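The partial-fill behavior amounts to taking the minimum of what was requested and what the server has. A minimal sketch (the function name is hypothetical):

```javascript
// Sketch of the refill handshake, including partial fills (illustrative only).
function refillFromServer(serverAvailable, tokensRequested) {
  // The server grants at most what it has; this is what lands
  // in the client's local bucket as tokensConsumed.
  return Math.min(serverAvailable, tokensRequested);
}

console.log(refillFromServer(100, 10)); // full fill: 10
console.log(refillFromServer(3, 10));   // partial fill: 3
```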
Memory Efficiency
Local buckets are stored in memory and automatically cleaned up:
- No database or file storage needed
- Buckets expire when process restarts
- Most SDKs limit total number of buckets
SDK Configuration
JavaScript
```javascript
const { LightRateClient } = require('lightrate-client');

const client = new LightRateClient('your_api_key', {
  applicationId: 'your_application_id',
  defaultLocalBucketSize: 20 // Default for all operations
});

// Use local bucket
const response = await client.consumeLocalBucketToken(
  'user123',
  'send_email'
);
```

Ruby
```ruby
require 'lightrate_client'

client = LightrateClient::Client.new(
  'your_api_key',
  'your_application_id',
  default_local_bucket_size: 20 # Default for all operations
)

# Use local bucket
response = client.consume_local_bucket_token(
  operation: 'send_email',
  user_identifier: 'user123'
)
```

Rails
```ruby
# config/initializers/lightrate_rails.rb
LightrateRails.configure do |config|
  config.api_key = ENV['LIGHTRATE_API_KEY']
  config.application_id = ENV['LIGHTRATE_APPLICATION_ID']
  config.default_local_bucket_size = 20 # Used automatically
end
```

Express
```javascript
const { configure, lightrateMiddleware } = require('lightrate-client-express');

configure({
  apiKey: process.env.LIGHTRATE_API_KEY,
  applicationId: process.env.LIGHTRATE_APPLICATION_ID,
  clientOptions: {
    defaultLocalBucketSize: 20 // Used automatically by middleware
  }
});
```

Best Practices
1. Start Conservative
Begin with smaller bucket sizes and increase based on monitoring:
```javascript
// Start here
defaultLocalBucketSize: 10

// Monitor your API call rate and latency
// Increase if performance is an issue

// Scale up
defaultLocalBucketSize: 50
```

2. Match to Your Refill Rate
Consider your server-side refill rate when choosing bucket size:
```
Refill Rate:  10 tokens/sec
Local Bucket: 10 tokens
Result:       Client fetches ~1 second worth of tokens
```

A good rule of thumb: local bucket size = 1-2 seconds of refill rate.
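That rule of thumb is simple arithmetic; a hypothetical helper makes it explicit:

```javascript
// Rule of thumb: local bucket size ≈ 1-2 seconds worth of the server-side
// refill rate. Hypothetical helper, not part of the SDK.
function suggestedBucketSize(refillRatePerSec, secondsOfHeadroom = 1) {
  return Math.max(1, Math.round(refillRatePerSec * secondsOfHeadroom));
}

console.log(suggestedBucketSize(10));    // 10 tokens ≈ 1 second of refill
console.log(suggestedBucketSize(10, 2)); // 20 tokens ≈ 2 seconds of refill
```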
3. Consider Your Use Case
High Trust → Large Buckets
- Internal services
- Paid customers
- Known users
Low Trust → Small Buckets
- Public APIs
- Free tier
- Anonymous users
4. Monitor and Adjust
Track these metrics:
- API call frequency
- Local bucket hit rate
- P99 latency
- Rate limit exceeded events
Adjust bucket sizes based on your findings.
Disabling Local Buckets
For maximum accuracy, set bucket size to 1 (makes an API call every request):
```javascript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 1 // Essentially disables local caching
});
```

Or use the direct HTTP API methods:

```javascript
// Bypasses local bucket completely
await client.consumeTokens('user123', 1, 'send_email');
```

Next Steps
- Applications - Organize your rate limits
- Rules - Create specific rate limiting rules
- Token Bucket Algorithm - Understand server-side rate limiting
- SDK Documentation - Implement local buckets in your language