Core Concepts

Understanding the key concepts behind LightRate will help you design effective rate limiting strategies for your applications.

Overview

LightRate is built around four core concepts that work together to provide flexible, powerful rate limiting:

Applications

Applications are containers for organizing your rate limiting rules. Think of them as namespaces that group related rate limits together.

Key Features:

  • Default token bucket for automatic fallback
  • Multiple specific rules for fine-grained control
  • Logical grouping by service or product
  • Per-user or organization ownership

Example: You might create separate applications for “Email Service”, “Payment API”, and “Public REST API”, each with their own default limits and specific rules.
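
As a rough sketch of that grouping (the structure and field names here are illustrative, not LightRate’s actual API):

```python
# Hypothetical layout of applications, each grouping its own limits.
# Field names are illustrative; see the Applications docs for the real schema.
applications = {
    "email-service": {
        "default_bucket": {"refill_rate": 10, "burst_rate": 100},
        "rules": {},  # specific rules get added per operation or endpoint
    },
    "payment-api": {
        "default_bucket": {"refill_rate": 5, "burst_rate": 20},
        "rules": {},
    },
    "public-rest-api": {
        "default_bucket": {"refill_rate": 50, "burst_rate": 500},
        "rules": {},
    },
}
```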

Learn more about Applications →

Rules

Rules define specific rate limiting behavior for operations or HTTP endpoints within an application.

Two Types:

  • Operation-based: Rate limit by operation name (e.g., "sendEmail", "processPayment")
  • Path-based: Rate limit by HTTP endpoint (e.g., POST /api/users, GET /api/data)

Key Features:

  • Unique within an application
  • Custom refill and burst rates
  • Automatic fallback to application defaults
  • Mix operation-based and path-based rules
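
To make the two rule types concrete, here is a hedged sketch (keys and fields are hypothetical, chosen only to mirror the concepts above):

```python
# Hypothetical rules within one application; names are unique per application.
rules = {
    # Operation-based: keyed by operation name
    "sendEmail":       {"refill_rate": 5, "burst_rate": 50},
    "sendBulkEmail":   {"refill_rate": 1, "burst_rate": 10},
    # Path-based: keyed by HTTP method and endpoint
    "POST /api/users": {"refill_rate": 2, "burst_rate": 20},
}

def match_rule(rules: dict, key: str, default_bucket: dict) -> dict:
    # No matching rule? Fall back to the application's default bucket.
    return rules.get(key, default_bucket)
```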

Learn more about Rules →

Token Bucket Algorithm

LightRate uses the token bucket algorithm, a proven rate limiting technique that balances smooth sustained throughput with room for short bursts.

How it Works:

  • Tokens refill continuously at a set rate
  • Each request consumes tokens
  • Burst capacity allows temporary spikes
  • Per-user buckets ensure fairness

Key Parameters:

  • Refill Rate: Tokens added per second (sustained throughput)
  • Burst Rate: Maximum tokens available (peak capacity)
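
A minimal Python sketch of the algorithm, assuming a continuous refill clamped at burst capacity (LightRate’s actual implementation runs server-side; this is only to illustrate the two parameters):

```python
import time

class TokenBucket:
    """Illustrative token bucket: refill_rate tokens/sec, capped at burst_rate."""

    def __init__(self, refill_rate: float, burst_rate: float):
        self.refill_rate = refill_rate       # sustained throughput (tokens/sec)
        self.burst_rate = burst_rate         # peak capacity (max stored tokens)
        self.tokens = burst_rate             # a fresh bucket starts full
        self.last_refill = time.monotonic()

    def try_consume(self, n: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill continuously based on elapsed time, never above burst capacity.
        self.tokens = min(
            self.burst_rate,
            self.tokens + (now - self.last_refill) * self.refill_rate,
        )
        self.last_refill = now
        if self.tokens >= n:
            self.tokens -= n                 # each request consumes tokens
            return True
        return False                         # empty bucket: rate limited
```

With refill_rate=5 and burst_rate=50, for example, a user can burst up to 50 requests at once, then sustain 5 requests per second.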

Learn more about Token Buckets →

Local Token Buckets

Local token buckets are a client-side optimization that reduces API calls and improves latency while still enforcing your rate limits.

How it Works:

  • Clients fetch multiple tokens at once and cache them locally
  • Subsequent requests consume from the local cache (no API call)
  • When the cache is empty, the client refills it from the server

Key Trade-off:

  • More tokens per batch = fewer API calls and lower latency, but less accurate rate limiting
  • Fewer tokens per batch = more API calls and higher latency, but more accurate rate limiting

Key Features:

  • Configurable bucket size per SDK
  • Automatic refill when empty
  • Per-user, per-operation buckets
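
A simplified sketch of the caching logic; `server.fetch_tokens` is a stand-in for whatever batch-fetch call the SDK actually exposes, and `batch_size` is the trade-off knob described above:

```python
class LocalTokenBucket:
    """Illustrative client-side cache: fetch tokens in batches, spend locally."""

    def __init__(self, server, user: str, operation: str, batch_size: int = 10):
        self.server = server          # stand-in for a LightRate API client
        self.user = user              # buckets are per-user...
        self.operation = operation    # ...and per-operation
        self.batch_size = batch_size  # larger = fewer API calls, coarser limits
        self.cached = 0

    def try_consume(self) -> bool:
        if self.cached == 0:
            # Cache empty: a single API call requests a whole batch. The server
            # may grant fewer than batch_size if the user is near their limit.
            self.cached = self.server.fetch_tokens(
                self.user, self.operation, self.batch_size
            )
        if self.cached > 0:
            self.cached -= 1          # local decrement, no network round trip
            return True
        return False                  # server granted nothing: rate limited
```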

Learn more about Local Token Buckets →

How They Work Together

Here’s how these concepts combine in practice:

```
Application: Email Service
├── Default: 10 tokens/sec, burst 100
│   (Used when no specific rule matches)
└── Rules:
    ├── sendEmail
    │   └── Token Bucket: 5 tokens/sec, burst 50
    │       ├── User alice → Bucket (5/sec, burst 50)
    │       ├── User bob   → Bucket (5/sec, burst 50)
    │       └── User carol → Bucket (5/sec, burst 50)
    └── sendBulkEmail
        └── Token Bucket: 1 token/sec, burst 10
            ├── User alice → Bucket (1/sec, burst 10)
            └── User bob   → Bucket (1/sec, burst 10)
```

Example Request Flow

  1. Request arrives: User alice wants to perform operation sendEmail
  2. Find application: Look up “Email Service” application
  3. Match rule: Find sendEmail rule within the application
  4. Check bucket: Look up alice’s token bucket for sendEmail rule
  5. Consume tokens: Try to consume 1 token from alice’s bucket
  6. Result:
    • ✅ Tokens available → Request succeeds
    • ❌ No tokens → Request is rate limited (429 response)
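
Putting these steps into code, a hedged end-to-end sketch that reuses the illustrative TokenBucket class from earlier (the lookup structures are hypothetical):

```python
buckets: dict = {}  # (application, rule, user) -> TokenBucket

def check_request(app: dict, user: str, operation: str) -> int:
    """Walk the steps above; returns an HTTP-style status code."""
    rule = app["rules"].get(operation)            # 3. match a specific rule...
    params = rule or app["default_bucket"]        #    ...or fall back to the default
    key = (app["name"], operation, user)          # 4. per-user, per-rule bucket
    bucket = buckets.setdefault(
        key, TokenBucket(params["refill_rate"], params["burst_rate"])
    )
    return 200 if bucket.try_consume() else 429   # 5./6. consume or rate limit

email_service = {
    "name": "email-service",
    "default_bucket": {"refill_rate": 10, "burst_rate": 100},
    "rules": {"sendEmail": {"refill_rate": 5, "burst_rate": 50}},
}
status = check_request(email_service, user="alice", operation="sendEmail")  # 200
```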

Next Steps

Ready to dive deeper? Explore each concept through the “Learn more” links above, or jump straight to implementation.
