Local Token Buckets

Local token buckets are a client-side optimization that reduces API calls and improves performance while maintaining rate limiting behavior.

How Local Buckets Work

Instead of making an HTTP API call to the LightRate server for every request, SDK clients maintain a local cache of tokens that can be consumed immediately.

SDK clients create a local token bucket only after matching a request to a rule. They do not create local buckets for application-level default token buckets. To use local buckets for an endpoint, create a rule in your application that matches it.

The Flow

  1. First Request: Client has no local tokens

    • Makes HTTP API call to LightRate server
    • Server returns multiple tokens (e.g., 10 tokens)
    • Client stores tokens in local bucket
    • Consumes 1 token for the current request
  2. Subsequent Requests: Client has local tokens

    • Consumes 1 token from local bucket (no API call!)
    • Request completes immediately
    • Continues until local bucket is empty
  3. Bucket Empty: No local tokens remaining

    • Makes HTTP API call to refill
    • Process repeats

Visual Example

Request 1:  [Local: 0] → API Call → [Local: 10] → Consume 1 → [Local: 9] ✓ (API call)
Request 2:  [Local: 9] → Consume 1 → [Local: 8] ✓ (no API call)
Request 3:  [Local: 8] → Consume 1 → [Local: 7] ✓ (no API call)
...
Request 10: [Local: 1] → Consume 1 → [Local: 0] ✓ (no API call)
Request 11: [Local: 0] → API Call → [Local: 10] → Consume 1 → [Local: 9] ✓ (API call)

In this example, only 2 API calls were made for 11 requests!
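The flow above can be simulated in a few lines. This is a minimal sketch, not the SDK's implementation: `makeFakeServer` and `LocalBucket` are hypothetical names, the fake server always grants the full batch, and the call is synchronous for brevity (a real refill is an HTTP request).

```javascript
// Hypothetical stand-in for the LightRate server: grants the full batch every time.
// A real server would grant at most the tokens currently available.
function makeFakeServer() {
  const server = { apiCalls: 0 };
  server.requestTokens = (tokensRequested) => {
    server.apiCalls += 1;
    return tokensRequested; // number of tokens granted
  };
  return server;
}

// Minimal local bucket: refill from the server only when empty.
class LocalBucket {
  constructor(server, bucketSize) {
    this.server = server;
    this.bucketSize = bucketSize;
    this.tokens = 0;
  }

  // Returns true if the request may proceed, false if no token was available.
  consume() {
    if (this.tokens === 0) {
      this.tokens = this.server.requestTokens(this.bucketSize);
    }
    if (this.tokens === 0) {
      return false; // server had nothing to give
    }
    this.tokens -= 1;
    return true;
  }
}

const server = makeFakeServer();
const bucket = new LocalBucket(server, 10);
for (let i = 0; i < 11; i += 1) {
  bucket.consume();
}
console.log(server.apiCalls); // 2
console.log(bucket.tokens);   // 9
```

Eleven requests trigger two refills, and nine tokens remain locally after the last one, matching the trace above.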

Key Characteristics

Local Tokens Expiry

Important: A local token bucket expires if it is neither consumed from nor refilled within a 60-second period.

This means:

  • Tokens fetched at 10:00:00 will all expire if none are used by 10:01:00
  • Expiration is handled automatically by the SDK; no manual timers are required
  • After expiration, the next request will trigger a refill from the server

The server-side token bucket continues to refill according to your configured refill rate, regardless of when local tokens are consumed or expired locally.
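One way to picture the expiry rule is as an idle timer that resets on every consume or refill. The sketch below is an illustration only: `ExpiringBucket` and its injectable `now` clock are hypothetical, and treating exactly 60 seconds of idleness as expired is an assumption, since the SDK's boundary behavior is not specified here.

```javascript
const EXPIRY_MS = 60 * 1000; // 60-second idle window, as described above

class ExpiringBucket {
  // `now` is an injectable clock (an assumption, for testability); defaults to Date.now.
  constructor(now = Date.now) {
    this.now = now;
    this.tokens = 0;
    this.lastActivity = now();
  }

  // A refill counts as activity and resets the idle timer.
  refill(count) {
    this.tokens = count;
    this.lastActivity = this.now();
  }

  // Drops all tokens if the bucket has been idle for the whole window.
  expireIfIdle() {
    if (this.now() - this.lastActivity >= EXPIRY_MS) {
      this.tokens = 0;
    }
  }

  // Returns true if a local token was consumed, false if a refill is needed.
  tryConsume() {
    this.expireIfIdle();
    if (this.tokens === 0) return false;
    this.tokens -= 1;
    this.lastActivity = this.now();
    return true;
  }
}

let fakeTime = 0;
const bucket = new ExpiringBucket(() => fakeTime);
bucket.refill(10);                     // fetched at t = 0s
fakeTime = 30 * 1000;
const usedAt30s = bucket.tryConsume(); // idle for only 30s, token consumed
fakeTime = 95 * 1000;
const usedAt95s = bucket.tryConsume(); // idle for 65s, remaining tokens expired
console.log(usedAt30s, usedAt95s, bucket.tokens); // true false 0
```

After the second call returns false, a real client would request a fresh batch from the server.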

The Trade-off

Choosing your local bucket size involves a trade-off between performance and accuracy:

Performance vs Accuracy

Larger Local Buckets (More Tokens)

Configuration:

// JavaScript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 100 // Fetch 100 tokens at a time
});

# Ruby
client = LightrateClient::Client.new(
  'your_api_key',
  'your_application_id',
  default_local_bucket_size: 100 # Fetch 100 tokens at a time
)

Benefits:

  • Fewer API Calls: 100 requests might only make 1-2 API calls
  • Lower Latency: Most requests complete instantly (no network call)
  • Reduced Costs: Fewer HTTP requests = lower infrastructure costs
  • Better User Experience: Faster response times

Drawbacks:

  • Less Accurate Rate Limiting: User could consume 100 tokens instantly, then wait
  • Burst Allowance: Allows larger bursts than intended
  • Delayed Enforcement: Rate limits enforced less frequently

Best For:

  • High-throughput applications
  • Trusted users/clients
  • When performance is critical
  • Internal services

Smaller Local Buckets (Fewer Tokens)

Configuration:

// JavaScript
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 5 // Fetch 5 tokens at a time
});

# Ruby
client = LightrateClient::Client.new(
  'your_api_key',
  'your_application_id',
  default_local_bucket_size: 5 # Fetch 5 tokens at a time
)

Benefits:

  • More Accurate Rate Limiting: Enforced more frequently
  • Smaller Bursts: Users can’t consume as many tokens instantly
  • Better Control: Rate limits closer to server-side behavior
  • Fairer Distribution: Less opportunity for one user to grab many tokens

Drawbacks:

  • More API Calls: More frequent HTTP requests to refill
  • Higher Latency: More requests hit the network
  • Increased Costs: More HTTP traffic

Best For:

  • Public APIs
  • Strict rate limiting requirements
  • Untrusted clients
  • Fair usage enforcement

Choosing the Right Size

General Guidelines

| Use Case               | Recommended Size | Reasoning                                      |
|------------------------|------------------|------------------------------------------------|
| Internal microservices | 50-100           | High trust, performance critical               |
| Authenticated users    | 10-20            | Balance of performance and control             |
| Public API (free tier) | 5-10             | Tighter control, fair usage                    |
| Public API (paid tier) | 20-50            | Reward paying customers with better performance |
| Rate-limited webhooks  | 5-10             | Ensure even distribution                       |
| Background jobs        | 50-100           | Performance matters, less contention           |

Example Scenarios

Scenario 1: High-Volume Internal API

// You trust your internal services and want maximum performance
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 100
});
// Result: One API call every 100 requests
// Perfect for microservice-to-microservice communication

Scenario 2: Public REST API

// You need fair usage enforcement for public users
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 10
});
// Result: One API call every 10 requests
// Good balance of performance and control

Scenario 3: Strict Rate Limiting

// You want rate limiting as close to real-time as possible
const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 1
});
// Result: API call on every request
// Essentially disables local buckets for maximum accuracy
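The "one API call every N requests" figures in these scenarios can be turned into a rough estimate. The helper below is a hypothetical back-of-the-envelope calculation, not part of any SDK. It assumes a single local bucket, that every refill is granted in full, and that no tokens expire between requests.

```javascript
// Estimated API calls for a run of requests against one local bucket,
// assuming every refill is granted in full and no tokens expire.
function estimateApiCalls(requestCount, localBucketSize) {
  if (!Number.isInteger(localBucketSize) || localBucketSize < 1) {
    throw new RangeError('localBucketSize must be a positive integer');
  }
  return Math.ceil(requestCount / localBucketSize);
}

for (const size of [100, 10, 1]) {
  console.log(`size ${size}: ${estimateApiCalls(1000, size)} API calls per 1000 requests`);
}
// size 100: 10 API calls per 1000 requests
// size 10: 100 API calls per 1000 requests
// size 1: 1000 API calls per 1000 requests
```

Real traffic is spread across many users and operations, each with its own bucket, so actual call counts will typically be higher than this estimate.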

Implementation Details

Per-User, Per-Operation Buckets

Local buckets are maintained separately for each:

  • User identifier
  • Operation or path

// These use different local buckets:
await client.consumeLocalBucketToken('user_A', 'send_email'); // Bucket 1
await client.consumeLocalBucketToken('user_B', 'send_email'); // Bucket 2
await client.consumeLocalBucketToken('user_A', 'send_sms');   // Bucket 3

Each combination gets its own local bucket, so users don’t interfere with each other.
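A simple way to get this isolation is to key an in-memory map by the (user, operation) pair. The sketch below illustrates that idea only: `BucketRegistry` and its key format are hypothetical and say nothing about how the SDKs store buckets internally.

```javascript
// Hypothetical registry: one token counter per (user, operation) pair.
class BucketRegistry {
  constructor() {
    this.buckets = new Map();
  }

  // JSON-encode the pair so that ('a:b', 'c') and ('a', 'b:c') cannot collide.
  static keyFor(userIdentifier, operation) {
    return JSON.stringify([userIdentifier, operation]);
  }

  // Returns the bucket for this pair, creating an empty one on first use.
  get(userIdentifier, operation) {
    const key = BucketRegistry.keyFor(userIdentifier, operation);
    if (!this.buckets.has(key)) {
      this.buckets.set(key, { tokens: 0 });
    }
    return this.buckets.get(key);
  }
}

const registry = new BucketRegistry();
registry.get('user_A', 'send_email').tokens = 9;
registry.get('user_B', 'send_email').tokens = 4;
registry.get('user_A', 'send_sms');

console.log(registry.buckets.size);                       // 3
console.log(registry.get('user_A', 'send_email').tokens); // 9
console.log(registry.get('user_B', 'send_email').tokens); // 4
```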

Bucket Refill Logic

When a local bucket is empty:

  1. Client makes HTTP API call with tokensRequested: defaultLocalBucketSize
  2. Server returns tokensConsumed (how many tokens it actually granted, up to the number requested)
  3. Client stores tokensConsumed in local bucket
  4. Client immediately consumes 1 token for current request

If the server has fewer tokens than requested, the local bucket gets fewer tokens. For example, if you request 10 tokens but the server only has 3 available, your local bucket gets 3 tokens.
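The partial-refill case can be traced with a small simulation. This is a sketch under stated assumptions: `makeServer` and `handleRequest` are hypothetical, the fake server never refills during the run, and rejecting the request when the server grants zero tokens is an assumption about client behavior.

```javascript
// Hypothetical server-side bucket with a fixed number of tokens available.
function makeServer(available) {
  return {
    available,
    // Grants min(requested, available), mirroring the tokensConsumed field above.
    consume(tokensRequested) {
      const tokensConsumed = Math.min(tokensRequested, this.available);
      this.available -= tokensConsumed;
      return { tokensConsumed };
    },
  };
}

// Refill-then-consume, following the four steps above.
function handleRequest(local, server, bucketSize) {
  if (local.tokens === 0) {
    const { tokensConsumed } = server.consume(bucketSize); // steps 1-2
    local.tokens = tokensConsumed;                         // step 3
  }
  if (local.tokens === 0) return false; // nothing granted (assumed: request is rejected)
  local.tokens -= 1;                    // step 4
  return true;
}

const server = makeServer(3); // server only has 3 tokens left
const local = { tokens: 0 };
const results = [];
for (let i = 0; i < 4; i += 1) {
  results.push(handleRequest(local, server, 10)); // client asks for 10
}
console.log(results); // [ true, true, true, false ]
```

The first request asks for 10 tokens but receives 3, so three requests succeed locally and the fourth finds both the local and server buckets empty.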

Memory Efficiency

Local buckets are stored in memory and automatically cleaned up:

  • No database or file storage needed
  • Buckets are discarded when the process restarts
  • Most SDKs limit the total number of buckets

SDK Configuration

JavaScript

const { LightRateClient } = require('lightrate-client');

const client = new LightRateClient('your_api_key', {
  applicationId: 'your_application_id',
  defaultLocalBucketSize: 20 // Default for all operations
});

// Use local bucket
const response = await client.consumeLocalBucketToken(
  'user123',
  'send_email'
);

Ruby

require 'lightrate_client'

client = LightrateClient::Client.new(
  'your_api_key',
  'your_application_id',
  default_local_bucket_size: 20 # Default for all operations
)

# Use local bucket
response = client.consume_local_bucket_token(
  operation: 'send_email',
  user_identifier: 'user123'
)

Rails

# config/initializers/lightrate_rails.rb
LightrateRails.configure do |config|
  config.api_key = ENV['LIGHTRATE_API_KEY']
  config.application_id = ENV['LIGHTRATE_APPLICATION_ID']
  config.default_local_bucket_size = 20 # Used automatically
end

Express

const { configure, lightrateMiddleware } = require('lightrate-client-express');

configure({
  apiKey: process.env.LIGHTRATE_API_KEY,
  applicationId: process.env.LIGHTRATE_APPLICATION_ID,
  clientOptions: {
    defaultLocalBucketSize: 20 // Used automatically by middleware
  }
});

Best Practices

1. Start Conservative

Begin with smaller bucket sizes and increase based on monitoring:

// Start here
defaultLocalBucketSize: 10

// Monitor your API call rate and latency
// Increase if performance is an issue

// Scale up
defaultLocalBucketSize: 50

2. Match to Your Refill Rate

Consider your server-side refill rate when choosing bucket size:

Refill Rate: 10 tokens/sec
Local Bucket: 10 tokens
Result: Client fetches ~1 second worth of tokens

A good rule of thumb: Local bucket size = 1-2 seconds of refill rate
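The rule of thumb is simple arithmetic: bucket size ≈ refill rate × seconds of coverage. A hypothetical helper (not an SDK function) makes that explicit, clamping at 1 because a size of 1 already means one API call per request.

```javascript
// Suggests a local bucket size covering `seconds` worth of server-side refill.
function suggestLocalBucketSize(refillRatePerSecond, seconds = 1) {
  if (refillRatePerSecond <= 0 || seconds <= 0) {
    throw new RangeError('refillRatePerSecond and seconds must be positive');
  }
  // Never go below 1: a size of 1 already means one API call per request.
  return Math.max(1, Math.round(refillRatePerSecond * seconds));
}

console.log(suggestLocalBucketSize(10));    // 10
console.log(suggestLocalBucketSize(10, 2)); // 20
console.log(suggestLocalBucketSize(0.2));   // 1
```

Treat the result as a starting point and adjust it for trust level and observed traffic.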

3. Consider Your Use Case

High Trust → Large Buckets

  • Internal services
  • Paid customers
  • Known users

Low Trust → Small Buckets

  • Public APIs
  • Free tier
  • Anonymous users

4. Monitor and Adjust

Track these metrics:

  • API call frequency
  • Local bucket hit rate
  • P99 latency
  • Rate limit exceeded events

Adjust bucket sizes based on your findings.
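If your metrics system does not already report a local bucket hit rate, it can be derived from two counters. The sketch below is purely illustrative: `BucketStats` is a hypothetical class, and how your application learns whether a given request was served locally or needed an API call is left as an assumption.

```javascript
// Hypothetical counters for requests served locally vs. requests that hit the API.
class BucketStats {
  constructor() {
    this.localHits = 0;
    this.apiCalls = 0;
  }

  recordLocalHit() {
    this.localHits += 1;
  }

  recordApiCall() {
    this.apiCalls += 1;
  }

  // Fraction of requests served without an API call (0 when nothing is recorded).
  hitRate() {
    const total = this.localHits + this.apiCalls;
    return total === 0 ? 0 : this.localHits / total;
  }
}

// Replay the 11-request example with bucket size 10: requests 1 and 11 call the API.
const stats = new BucketStats();
for (let request = 1; request <= 11; request += 1) {
  if (request === 1 || request === 11) {
    stats.recordApiCall();
  } else {
    stats.recordLocalHit();
  }
}
console.log(stats.hitRate().toFixed(2)); // 0.82
```

A hit rate well below what the configured bucket size suggests can indicate partial refills or buckets expiring between requests.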

Disabling Local Buckets

For maximum accuracy, set the bucket size to 1, which makes an API call on every request:

const client = new LightRateClient('your_api_key', {
  defaultLocalBucketSize: 1 // Essentially disables local caching
});

Or use the direct HTTP API methods:

// Bypasses local bucket completely
await client.consumeTokens('user123', 1, 'send_email');
