Local Token Buckets
Local token buckets are a client-side optimization that reduces API calls and improves performance while maintaining rate limiting behavior.
How Local Buckets Work
Instead of making an HTTP API call to the LightRate server for every request, SDK clients maintain a local cache of tokens that can be consumed immediately.
SDK clients create local token buckets only after matching a request to a given rule; they do not create local buckets for application-level default token buckets. Create rules in your application to match any endpoints that would benefit from local buckets.
The Flow
1. First Request: Client has no local tokens
   - Makes an HTTP API call to the LightRate server
   - Server returns multiple tokens (e.g., 10 tokens)
   - Client stores the tokens in its local bucket
   - Consumes 1 token for the current request
2. Subsequent Requests: Client has local tokens
   - Consumes 1 token from the local bucket (no API call!)
   - Request completes immediately
   - Continues until the local bucket is empty
3. Bucket Empty: No local tokens remaining
   - Makes an HTTP API call to refill
   - Process repeats
Visual Example
```
Request 1:  [Local: 0] → API Call → [Local: 10] → Consume 1 → [Local: 9] ✓ (API call)
Request 2:  [Local: 9] → Consume 1 → [Local: 8] ✓ (no API call)
Request 3:  [Local: 8] → Consume 1 → [Local: 7] ✓ (no API call)
...
Request 10: [Local: 1] → Consume 1 → [Local: 0] ✓ (no API call)
Request 11: [Local: 0] → API Call → [Local: 10] → Consume 1 → [Local: 9] ✓ (API call)
```

In this example, only 2 API calls were made for 11 requests!
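The flow above can be sketched as a small standalone simulation (illustrative only, not the actual SDK; `fetchTokensFromServer` is a stand-in for the HTTP call):

```javascript
// Minimal simulation of the local-bucket flow (not the real SDK).
const BUCKET_SIZE = 10;
let localTokens = 0;
let apiCalls = 0;

function fetchTokensFromServer(tokensRequested) {
  apiCalls += 1;
  return tokensRequested; // server grants the full amount in this sketch
}

function handleRequest() {
  if (localTokens === 0) {
    localTokens = fetchTokensFromServer(BUCKET_SIZE); // refill on empty
  }
  localTokens -= 1; // consume one token locally, no network call
}

for (let i = 0; i < 11; i++) handleRequest();
console.log(apiCalls, localTokens); // 2 API calls for 11 requests, 9 tokens left
```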
Key Characteristics
Local Tokens Expiry
Important: Local token buckets expire if they are not consumed from or refilled within a 60-second period.
This means:
- Tokens fetched at 10:00:00 will all expire if none are used by 10:01:00
- Expiration is handled automatically by the SDK; no manual timers are required
- After expiration, the next request will trigger a refill from the server
The server-side token bucket continues to refill according to your configured refill rate, regardless of when local tokens are consumed or expired locally.
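The 60-second expiry rule can be sketched as follows (illustrative code, not the SDK's internals; the field and function names are hypothetical):

```javascript
// Sketch of the 60-second local-bucket expiry rule (illustrative only).
const TTL_MS = 60_000;

function isExpired(bucket, nowMs) {
  // A bucket expires if it was neither consumed from nor refilled within TTL.
  return nowMs - bucket.lastActivityMs > TTL_MS;
}

const bucket = { tokens: 10, lastActivityMs: 0 };
console.log(isExpired(bucket, 30_000)); // false: still fresh
console.log(isExpired(bucket, 61_000)); // true: next request triggers a refill
```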
The Trade-off
Choosing your local bucket size involves a trade-off between performance and accuracy:
Performance vs Accuracy
Larger Local Buckets (More Tokens)
Configuration:
```javascript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 100 // Fetch 100 tokens at a time
});
```

```ruby
client = LightrateClient::Client.new('your_api_key', 'your_application_id',
  default_local_bucket_size: 100 # Fetch 100 tokens at a time
)
```

Benefits:
- ✅ Fewer API Calls: 100 requests might only make 1-2 API calls
- ✅ Lower Latency: Most requests complete instantly (no network call)
- ✅ Reduced Costs: Fewer HTTP requests = lower infrastructure costs
- ✅ Better User Experience: Faster response times
Drawbacks:
- ❌ Less Accurate Rate Limiting: User could consume 100 tokens instantly, then wait
- ❌ Burst Allowance: Allows larger bursts than intended
- ❌ Delayed Enforcement: Rate limits enforced less frequently
Best For:
- High-throughput applications
- Trusted users/clients
- When performance is critical
- Internal services
Smaller Local Buckets (Fewer Tokens)
Configuration:
```javascript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 5 // Fetch 5 tokens at a time
});
```

```ruby
client = LightrateClient::Client.new('your_api_key', 'your_application_id',
  default_local_bucket_size: 5 # Fetch 5 tokens at a time
)
```

Benefits:
- ✅ More Accurate Rate Limiting: Enforced more frequently
- ✅ Smaller Bursts: Users can’t consume as many tokens instantly
- ✅ Better Control: Rate limits closer to server-side behavior
- ✅ Fairer Distribution: Less opportunity for one user to grab many tokens
Drawbacks:
- ❌ More API Calls: More frequent HTTP requests to refill
- ❌ Higher Latency: More requests hit the network
- ❌ Increased Costs: More HTTP traffic
Best For:
- Public APIs
- Strict rate limiting requirements
- Untrusted clients
- Fair usage enforcement
Choosing the Right Size
General Guidelines
| Use Case | Recommended Size | Reasoning |
|---|---|---|
| Internal microservices | 50-100 | High trust, performance critical |
| Authenticated users | 10-20 | Balance of performance and control |
| Public API (free tier) | 5-10 | Tighter control, fair usage |
| Public API (paid tier) | 20-50 | Reward paying customers with better performance |
| Rate-limited webhooks | 5-10 | Ensure even distribution |
| Background jobs | 50-100 | Performance matters, less contention |
Example Scenarios
Scenario 1: High-Volume Internal API
```javascript
// You trust your internal services and want maximum performance
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 100
});
// Result: One API call every 100 requests
// Perfect for microservice-to-microservice communication
```

Scenario 2: Public REST API
```javascript
// You need fair usage enforcement for public users
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 10
});
// Result: One API call every 10 requests
// Good balance of performance and control
```

Scenario 3: Strict Rate Limiting
```javascript
// You want rate limiting as close to real-time as possible
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 1
});
// Result: API call on every request
// Essentially disables local buckets for maximum accuracy
```

Implementation Details
Per-User, Per-Operation Buckets
Local buckets are maintained separately for each:
- User identifier
- Operation or path
```javascript
// These use different local buckets:
await client.consumeLocalBucketToken('user_A', 'send_email'); // Bucket 1
await client.consumeLocalBucketToken('user_B', 'send_email'); // Bucket 2
await client.consumeLocalBucketToken('user_A', 'send_sms');   // Bucket 3
```

Each combination gets its own local bucket, so users don’t interfere with each other.
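One simple way to key buckets per user and operation is a composite-key map. The sketch below is a hypothetical illustration of that idea, not the SDK's actual internals:

```javascript
// Sketch of per-user, per-operation bucket keying (illustrative only).
const buckets = new Map();

function getBucket(userId, operation) {
  const key = `${userId}:${operation}`; // each combination gets its own bucket
  if (!buckets.has(key)) buckets.set(key, { tokens: 0 });
  return buckets.get(key);
}

getBucket('user_A', 'send_email');
getBucket('user_B', 'send_email');
getBucket('user_A', 'send_sms');
console.log(buckets.size); // 3 distinct local buckets
```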
Bucket Refill Logic
When a local bucket is empty:
1. Client makes an HTTP API call with `tokensRequested: defaultLocalBucketSize`
2. Server returns `tokensConsumed` (how many tokens were available)
3. Client stores `tokensConsumed` tokens in the local bucket
4. Client immediately consumes 1 token for the current request
If the server has fewer tokens than requested, the local bucket gets fewer tokens. For example, if you request 10 tokens but the server only has 3 available, your local bucket gets 3 tokens.
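The partial-fill behavior amounts to taking the minimum of what was requested and what the server has. A minimal sketch (the function name is hypothetical):

```javascript
// Sketch of the refill handshake, including partial fills (illustrative only).
function refillFromServer(serverAvailable, tokensRequested) {
  // The server grants at most what it has; this is what lands
  // in the client's local bucket as tokensConsumed.
  return Math.min(serverAvailable, tokensRequested);
}

console.log(refillFromServer(100, 10)); // full fill: 10
console.log(refillFromServer(3, 10));   // partial fill: 3
```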
Memory Efficiency
Local buckets are stored in memory and automatically cleaned up:
- No database or file storage needed
- Buckets expire when process restarts
- Most SDKs limit total number of buckets
SDK Configuration
JavaScript
```javascript
const { LightRateClient } = require('lightrate-client');

const client = new LightRateClient('your_api_key', {
  applicationId: 'your_application_id',
  defaultLocalBucketSize: 20 // Default for all operations
});

// Use local bucket
const response = await client.consumeLocalBucketToken(
  'user123',
  'send_email'
);
```

Ruby
```ruby
require 'lightrate_client'

client = LightrateClient::Client.new(
  'your_api_key',
  'your_application_id',
  default_local_bucket_size: 20 # Default for all operations
)

# Use local bucket
response = client.consume_local_bucket_token(
  operation: 'send_email',
  user_identifier: 'user123'
)
```

Rails
```ruby
# config/initializers/lightrate_rails.rb
LightrateRails.configure do |config|
  config.api_key = ENV['LIGHTRATE_API_KEY']
  config.application_id = ENV['LIGHTRATE_APPLICATION_ID']
  config.default_local_bucket_size = 20 # Used automatically
end
```

Express
```javascript
const { configure, lightrateMiddleware } = require('lightrate-client-express');

configure({
  apiKey: process.env.LIGHTRATE_API_KEY,
  applicationId: process.env.LIGHTRATE_APPLICATION_ID,
  clientOptions: {
    defaultLocalBucketSize: 20 // Used automatically by middleware
  }
});
```

Best Practices
1. Start Conservative
Begin with smaller bucket sizes and increase based on monitoring:
```javascript
// Start here
defaultLocalBucketSize: 10

// Monitor your API call rate and latency
// Increase if performance is an issue

// Scale up
defaultLocalBucketSize: 50
```

2. Match to Your Refill Rate
Consider your server-side refill rate when choosing bucket size:
```
Refill Rate:  10 tokens/sec
Local Bucket: 10 tokens
Result:       Client fetches ~1 second worth of tokens
```

A good rule of thumb: local bucket size = 1-2 seconds of refill rate.
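That rule of thumb is simple arithmetic; a hypothetical helper makes it explicit:

```javascript
// Rule of thumb: local bucket size ≈ 1-2 seconds worth of the server-side
// refill rate. Hypothetical helper, not part of the SDK.
function suggestedBucketSize(refillRatePerSec, secondsOfHeadroom = 1) {
  return Math.max(1, Math.round(refillRatePerSec * secondsOfHeadroom));
}

console.log(suggestedBucketSize(10));    // 10 tokens ≈ 1 second of refill
console.log(suggestedBucketSize(10, 2)); // 20 tokens ≈ 2 seconds of refill
```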
3. Consider Your Use Case
High Trust → Large Buckets
- Internal services
- Paid customers
- Known users
Low Trust → Small Buckets
- Public APIs
- Free tier
- Anonymous users
4. Monitor and Adjust
Track these metrics:
- API call frequency
- Local bucket hit rate
- P99 latency
- Rate limit exceeded events
Adjust bucket sizes based on your findings.
Disabling Local Buckets
For maximum accuracy, set bucket size to 1 (makes an API call every request):
```javascript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 1 // Essentially disables local caching
});
```

Or use the direct HTTP API methods:

```javascript
// Bypasses local bucket completely
await client.consumeTokens('user123', 1, 'send_email');
```

Next Steps
- Applications - Organize your rate limits
- Rules - Create specific rate limiting rules
- Token Bucket Algorithm - Understand server-side rate limiting
- SDK Documentation - Implement local buckets in your language