Core Concepts
Understanding the key concepts behind LightRate will help you design effective rate limiting strategies for your applications.
Overview
LightRate is built around four core concepts that work together to provide flexible, powerful rate limiting:
Applications
Applications are containers for organizing your rate limiting rules. Think of them as namespaces that group related rate limits together.
Key Features:
- Default token bucket for automatic fallback
- Multiple specific rules for fine-grained control
- Logical grouping by service or product
- Per-user or organization ownership
Example: You might create separate applications for “Email Service”, “Payment API”, and “Public REST API”, each with their own default limits and specific rules.
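To make this concrete, here is a minimal sketch of an application as data, using the “Email Service” numbers from the diagram further down this page. The shape and field names (BucketConfig, defaultBucket, rules) are illustrative assumptions, not LightRate’s actual schema:

```typescript
// Illustrative sketch only: field names are assumptions, not LightRate's schema.
interface BucketConfig {
  refillRate: number; // tokens added per second (sustained throughput)
  burstRate: number;  // maximum tokens the bucket can hold (peak capacity)
}

interface Application {
  name: string;
  defaultBucket: BucketConfig;          // automatic fallback
  rules: Record<string, BucketConfig>;  // fine-grained overrides (see Rules below)
}

// One application per service, each with its own defaults and rules.
const emailService: Application = {
  name: "Email Service",
  defaultBucket: { refillRate: 10, burstRate: 100 },
  rules: {
    sendEmail:     { refillRate: 5, burstRate: 50 },
    sendBulkEmail: { refillRate: 1, burstRate: 10 },
  },
};
```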
Learn more about Applications →
Rules
Rules define specific rate limiting behavior for operations or HTTP endpoints within an application.
Two Types:
- Operation-based: Rate limit by operation name (e.g., "sendEmail", "processPayment")
- Path-based: Rate limit by HTTP endpoint (e.g., POST /api/users, GET /api/data)
Key Features:
- Unique within an application
- Custom refill and burst rates
- Automatic fallback to application defaults
- Mix operation-based and path-based rules
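As a sketch of how matching and fallback could work (the types and the resolveBucket function are assumptions for illustration, not LightRate’s actual implementation), a request is checked against operation-based and path-based rules first, then falls back to the application default:

```typescript
// Illustrative types (assumptions, not LightRate's actual schema).
interface BucketConfig { refillRate: number; burstRate: number }

type RuleMatch =
  | { operation: string }              // operation-based rule
  | { method: string; path: string };  // path-based rule

interface Rule { match: RuleMatch; bucket: BucketConfig }

// Match a request to a rule; fall back to the application default.
function resolveBucket(
  rules: Rule[],
  defaultBucket: BucketConfig,
  req: { operation?: string; method?: string; path?: string },
): BucketConfig {
  for (const rule of rules) {
    if ("operation" in rule.match) {
      // Operation-based: match by operation name, e.g. "sendEmail".
      if (rule.match.operation === req.operation) return rule.bucket;
    } else if (rule.match.method === req.method && rule.match.path === req.path) {
      // Path-based: match by HTTP method and endpoint, e.g. POST /api/users.
      return rule.bucket;
    }
  }
  return defaultBucket; // no rule matched: automatic fallback
}
```

This sketch simply takes the first matching rule; the key behavior it shows is the automatic fallback to application defaults listed above.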
Token Bucket Algorithm
LightRate uses the token bucket algorithm, a proven rate limiting technique that balances smooth sustained throughput with tolerance for short bursts.
How it Works:
- Tokens refill continuously at a set rate
- Each request consumes tokens
- Burst capacity allows temporary spikes
- Per-user buckets ensure fairness
Key Parameters:
- Refill Rate: Tokens added per second (sustained throughput)
- Burst Rate: Maximum tokens available (peak capacity)
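The algorithm is compact enough to sketch in full. This is the general token bucket technique, not LightRate’s actual server code:

```typescript
// Minimal token bucket: continuous refill, capped at burst capacity.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private refillRate: number, // tokens added per second
    private burstRate: number,  // maximum tokens (bucket capacity)
  ) {
    this.tokens = burstRate;    // start full so bursts work immediately
    this.lastRefill = Date.now();
  }

  // Add tokens for the time elapsed since the last refill, capped at burst.
  private refill(): void {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.burstRate, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;
  }

  // Try to consume `count` tokens; returns false if the request should be limited.
  tryConsume(count = 1): boolean {
    this.refill();
    if (this.tokens < count) return false;
    this.tokens -= count;
    return true;
  }
}
```

With a refill rate of 5 and a burst of 50, a user can burst up to 50 requests at once but only sustain 5 per second over time.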
Learn more about Token Buckets →
Local Token Buckets
Local token buckets are a client-side optimization that reduces API calls and latency while still enforcing your rate limits.
How it Works:
- Clients fetch multiple tokens at once and cache them locally
- Subsequent requests consume from the local cache (no API call)
- When cache is empty, refill from server
Key Trade-off:
- More tokens = Fewer API calls, lower latency, but less accurate rate limiting
- Fewer tokens = More API calls, higher latency, but more accurate rate limiting
Key Features:
- Configurable bucket size per SDK
- Automatic refill when empty
- Per-user, per-operation buckets
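A minimal sketch of the client-side pattern (the fetchTokens callback stands in for whatever token-fetch call your SDK makes; it is an assumption, not the actual LightRate API):

```typescript
// Client-side local bucket: fetch tokens in batches, consume from the cache.
class LocalTokenBucket {
  private cached = 0;

  constructor(
    private batchSize: number, // tokens fetched per server call (the trade-off knob)
    // Stand-in for the server call; returns how many tokens were granted.
    private fetchTokens: (count: number) => Promise<number>,
  ) {}

  // Returns true if the operation may proceed, false if rate limited.
  async tryConsume(): Promise<boolean> {
    if (this.cached === 0) {
      // Cache empty: one API call refills up to batchSize tokens.
      this.cached = await this.fetchTokens(this.batchSize);
    }
    if (this.cached === 0) return false; // server granted nothing: rate limited
    this.cached -= 1; // local consumption, no API call
    return true;
  }
}
```

A larger batchSize means fewer round trips, but more tokens can sit unused in a single client, so enforcement is less exact; a batchSize of 1 degenerates into one API call per request.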
Learn more about Local Token Buckets →
How They Work Together
Here’s how these concepts combine in practice:
```
Application: Email Service
├── Default: 10 tokens/sec, burst 100
│   (Used when no specific rule matches)
│
└── Rules:
    ├── sendEmail
    │   └── Token Bucket: 5 tokens/sec, burst 50
    │       ├── User alice → Bucket (5/sec, burst 50)
    │       ├── User bob   → Bucket (5/sec, burst 50)
    │       └── User carol → Bucket (5/sec, burst 50)
    │
    └── sendBulkEmail
        └── Token Bucket: 1 token/sec, burst 10
            ├── User alice → Bucket (1/sec, burst 10)
            └── User bob   → Bucket (1/sec, burst 10)
```
Example Request Flow
1. Request arrives: User alice wants to perform operation sendEmail
2. Find application: Look up the “Email Service” application
3. Match rule: Find the sendEmail rule within the application
4. Check bucket: Look up alice’s token bucket for the sendEmail rule
5. Consume tokens: Try to consume 1 token from alice’s bucket
6. Result:
   - ✅ Tokens available → Request succeeds
   - ❌ No tokens → Request is rate limited (429 response)
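Put together, the server-side check is a direct translation of those steps. Here is a self-contained sketch, with illustrative names rather than LightRate’s actual implementation:

```typescript
// Self-contained sketch of the request flow (names are illustrative).
interface BucketConfig { refillRate: number; burstRate: number }

interface App {
  name: string;
  defaultBucket: BucketConfig;
  rules: Map<string, BucketConfig>; // operation name -> bucket config
}

// One bucket state per (application, rule, user) key: per-user fairness.
const buckets = new Map<string, { tokens: number; lastRefill: number }>();

function checkRequest(app: App, user: string, operation: string): number {
  // Steps 2-3: find the application (passed in here) and match a rule,
  // falling back to the application default.
  const config = app.rules.get(operation) ?? app.defaultBucket;

  // Step 4: look up (or create) this user's bucket for the matched rule.
  const key = `${app.name}:${operation}:${user}`;
  let b = buckets.get(key);
  if (!b) {
    b = { tokens: config.burstRate, lastRefill: Date.now() };
    buckets.set(key, b);
  }

  // Refill for elapsed time, capped at burst capacity.
  const now = Date.now();
  b.tokens = Math.min(
    config.burstRate,
    b.tokens + ((now - b.lastRefill) / 1000) * config.refillRate,
  );
  b.lastRefill = now;

  // Steps 5-6: consume one token; success, or 429 when the bucket is empty.
  if (b.tokens < 1) return 429;
  b.tokens -= 1;
  return 200;
}
```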
Next Steps
Ready to dive deeper? Explore each concept:
- Applications - Learn how to organize your rate limits
- Rules - Create specific rate limiting rules
- Token Bucket - Understand the server-side algorithm
- Local Token Buckets - Optimize client-side performance
Or jump straight to implementation:
- Quick Start - Set up your first rate limit
- API Reference - Direct API integration
- SDKs - Use our client libraries