Without rate limiting, a single bad actor can bring down your API. A script making 1000 requests per second can overwhelm your servers, rack up cloud bills, and ruin the experience for legitimate users.
Redis is well suited to rate limiting: it is fast enough to check every request with sub-millisecond latency, and its atomic operations prevent race conditions between concurrent requests. This guide shows you how to implement rate limiting that is ready for production use.
Why Rate Limiting Matters
- Prevent abuse: Stop scrapers, credential stuffing, spam bots
- Protect resources: Prevent server overload
- Ensure fairness: Give all users equal access
- Reduce costs: Limit expensive operations (AI APIs, external services)
- Comply with SLAs: Enforce API usage limits for paying tiers
Rate Limiting Algorithms
1. Fixed Window Counter
Simple but imperfect. Count requests in fixed time windows (e.g., per minute).
# Example: 100 requests per minute
Window: 12:00:00 - 12:00:59 → Allow 100 requests
Window: 12:01:00 - 12:01:59 → Reset, allow 100 more
# Problem: "Boundary burst"
# User can make 100 requests at 12:00:59
# Then 100 more at 12:01:00
# = 200 requests in 2 seconds!
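The boundary burst can be reproduced with a small in-memory sketch. The class name `FixedWindowLimiter` and the explicit `now` argument are assumptions made for illustration; a production limiter would keep its counters in Redis rather than in a Python dict.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """In-memory fixed window counter (single process, illustration only)."""

    def __init__(self, limit: int, window_sec: int) -> None:
        self.limit = limit
        self.window_sec = window_sec
        # (key, window index) -> request count; old windows are never evicted here
        self.counts = defaultdict(int)

    def allow(self, key: str, now: float) -> bool:
        window = int(now // self.window_sec)
        if self.counts[(key, window)] >= self.limit:
            return False
        self.counts[(key, window)] += 1
        return True


limiter = FixedWindowLimiter(limit=100, window_sec=60)
# 100 requests in the last second of one window, 100 more in the first second of the next
late = sum(limiter.allow("client", now=59.0) for _ in range(100))
early = sum(limiter.allow("client", now=60.0) for _ in range(100))
print(late + early)  # 200
```

Every request is accepted even though all 200 arrive about one second apart, which is the weakness the sliding window variants address.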
2. Sliding Window Log
Accurate but memory-intensive. Store timestamp of every request.
# Store: [12:00:01, 12:00:05, 12:00:15, 12:00:30...]
# Count requests in last 60 seconds
# Remove expired timestamps
# Problem: Stores every request timestamp = lots of memory
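As a rough sketch of the idea, the log can be kept as a queue of timestamps per client. The names `SlidingWindowLog` and `allow` are hypothetical, and the state lives in process memory purely for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLog:
    """In-memory sliding window log (illustration only)."""

    def __init__(self, limit: int, window_sec: float) -> None:
        self.limit = limit
        self.window_sec = window_sec
        # key -> timestamps of accepted requests, oldest first
        self.log = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        timestamps = self.log[key]
        # Remove timestamps that have fallen out of the window
        while timestamps and timestamps[0] <= now - self.window_sec:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


log = SlidingWindowLog(limit=3, window_sec=60)
results = [log.allow("client", now=t) for t in (0, 10, 20, 30, 60)]
print(results)  # [True, True, True, False, True]
```

Memory grows with the limit: each client can hold up to `limit` timestamps at any time, which is why this approach becomes expensive for high limits.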
3. Sliding Window Counter (Recommended)
Best balance of accuracy and efficiency. Combine fixed windows with weighted calculation.
# Two counters: current window + previous window
# Weight previous window by overlap percentage
# Example at 12:01:45 (45 seconds into the current window):
# Previous window (starts 12:00:00): 60 requests
# Current window (starts 12:01:00): 20 requests
# Weight: (60 - 45) / 60 = 0.25 (25% of the previous window still counts)
# Estimated count: (60 * 0.25) + 20 = 35 requests
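The weighted calculation is small enough to express as one function. This is a minimal sketch; the function name `sliding_window_estimate` and its parameters are assumptions for illustration.

```python
def sliding_window_estimate(
    previous_count: int,
    current_count: int,
    elapsed_sec: float,
    window_sec: float,
) -> float:
    """Estimate requests in the last window_sec seconds from two fixed-window counters."""
    # Fraction of the previous window that still overlaps the sliding window
    weight = (window_sec - elapsed_sec) / window_sec
    return previous_count * weight + current_count


estimate = sliding_window_estimate(
    previous_count=60, current_count=20, elapsed_sec=45, window_sec=60
)
print(estimate)  # 35.0
```

The estimate assumes requests in the previous window were spread evenly, so it is an approximation rather than an exact count.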
4. Token Bucket
Allows controlled bursts. Users have a "bucket" of tokens that refills over time.
# Configuration:
# - Bucket size: 10 tokens (max burst)
# - Refill rate: 1 token per second
# User makes 5 requests instantly → 5 tokens left
# Wait 3 seconds → 8 tokens
# Make 10 requests → Only 8 succeed, 2 rejected
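The scenario above can be replayed with a small in-memory sketch. The `TokenBucket` class and the explicit `now` argument are hypothetical and exist only to make the arithmetic visible.

```python
class TokenBucket:
    """In-memory token bucket (illustration only)."""

    def __init__(self, bucket_size: int, refill_rate: float, now: float = 0.0) -> None:
        self.bucket_size = bucket_size
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(bucket_size)  # the bucket starts full
        self.updated_at = now

    def allow(self, now: float) -> bool:
        # Refill based on the time elapsed since the last call, capped at the bucket size
        elapsed = now - self.updated_at
        self.tokens = min(self.bucket_size, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens < 1:
            return False
        self.tokens -= 1
        return True


bucket = TokenBucket(bucket_size=10, refill_rate=1)
first = sum(bucket.allow(now=0.0) for _ in range(5))    # 5 accepted, 5 tokens left
second = sum(bucket.allow(now=3.0) for _ in range(10))  # refilled to 8, so 8 accepted
print(first, second)  # 5 8
```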
Implementation: Sliding Window Counter
Node.js / Express
// rateLimit.js
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

/**
 * Sliding window rate limiter using Redis
 * @param {string} key - Unique identifier (e.g., IP, user ID)
 * @param {number} limit - Max requests allowed
 * @param {number} windowSec - Time window in seconds
 * @returns {Object} { allowed: boolean, remaining: number, resetAt: Date, limit: number }
 */
export async function checkRateLimit(key, limit, windowSec) {
  const now = Date.now();
  const windowMs = windowSec * 1000;
  const currentWindow = Math.floor(now / windowMs);
  const previousWindow = currentWindow - 1;
  const currentKey = `ratelimit:${key}:${currentWindow}`;
  const previousKey = `ratelimit:${key}:${previousWindow}`;

  // Atomic Lua script: the check and the increment run as one step inside Redis
  // KEYS[1] = current counter, KEYS[2] = previous counter
  // ARGV[1] = now (ms), ARGV[2] = window length (ms), ARGV[3] = limit
  const script = `
    local current = tonumber(redis.call('GET', KEYS[1]) or 0)
    local previous = tonumber(redis.call('GET', KEYS[2]) or 0)
    local weight = (tonumber(ARGV[1]) % tonumber(ARGV[2])) / tonumber(ARGV[2])
    local count = current + math.floor(previous * (1 - weight))
    if count >= tonumber(ARGV[3]) then
      return {0, count}
    end
    redis.call('INCR', KEYS[1])
    -- ARGV[2] is in milliseconds, so use PEXPIRE (EXPIRE expects seconds)
    redis.call('PEXPIRE', KEYS[1], tonumber(ARGV[2]) * 2)
    return {1, count + 1}
  `;

  const [allowed, count] = await redis.eval(
    script,
    2,
    currentKey,
    previousKey,
    now,
    windowMs,
    limit
  );

  const resetAt = new Date((currentWindow + 1) * windowMs);
  return {
    allowed: allowed === 1,
    remaining: Math.max(0, limit - count),
    resetAt,
    limit,
  };
}
// Express middleware
export function rateLimitMiddleware(options = {}) {
  const {
    limit = 100,
    windowSec = 60,
    keyGenerator = (req) => req.ip,
    handler = (req, res) => {
      res.status(429).json({
        error: 'Too Many Requests',
        retryAfter: Math.ceil(windowSec),
      });
    },
  } = options;

  return async (req, res, next) => {
    try {
      const key = keyGenerator(req);
      const result = await checkRateLimit(key, limit, windowSec);

      // Set standard rate limit headers
      res.set('X-RateLimit-Limit', result.limit);
      res.set('X-RateLimit-Remaining', result.remaining);
      res.set('X-RateLimit-Reset', Math.floor(result.resetAt.getTime() / 1000));

      if (!result.allowed) {
        res.set('Retry-After', Math.ceil((result.resetAt - Date.now()) / 1000));
        return handler(req, res);
      }
      next();
    } catch (err) {
      // Express 4 does not catch rejections from async middleware,
      // so forward Redis errors to the error handler explicitly
      next(err);
    }
  };
}
// Usage in Express app
import express from 'express';
import { rateLimitMiddleware } from './rateLimit.js';

const app = express();

// Global rate limit: 100 requests per minute per IP
app.use(rateLimitMiddleware({
  limit: 100,
  windowSec: 60,
}));

// Stricter limit for auth endpoints
app.use('/api/auth', rateLimitMiddleware({
  limit: 10,
  windowSec: 60,
  keyGenerator: (req) => `auth:${req.ip}`,
}));

// Per-user limit for authenticated routes
app.use('/api/v1', rateLimitMiddleware({
  limit: 1000,
  windowSec: 3600, // 1000 requests per hour
  keyGenerator: (req) => `user:${req.user?.id || req.ip}`,
}));
Python / FastAPI
# rate_limit.py
import time

import redis.asyncio as redis
from fastapi import Request, HTTPException

redis_client = redis.from_url("redis://localhost:6379")


async def check_rate_limit(key: str, limit: int, window_sec: int) -> dict:
    """Sliding window counter rate limiter."""
    now = time.time()
    window_ms = window_sec * 1000
    current_window = int(now * 1000 // window_ms)
    previous_window = current_window - 1
    current_key = f"ratelimit:{key}:{current_window}"
    previous_key = f"ratelimit:{key}:{previous_window}"

    # Get counts
    # Note: the read here and the increment below are separate round trips, so
    # concurrent requests can briefly exceed the limit. Use a Lua script, as in
    # the Node.js version, when strict atomicity matters.
    pipe = redis_client.pipeline()
    pipe.get(current_key)
    pipe.get(previous_key)
    current, previous = await pipe.execute()
    current = int(current or 0)
    previous = int(previous or 0)

    # Calculate weighted count
    elapsed = (now * 1000) % window_ms
    weight = elapsed / window_ms
    count = current + int(previous * (1 - weight))

    reset_at = (current_window + 1) * window_sec  # Unix timestamp in seconds
    if count >= limit:
        return {
            "allowed": False,
            "remaining": 0,
            "reset_at": reset_at,
            "limit": limit,
        }

    # Increment current window
    pipe = redis_client.pipeline()
    pipe.incr(current_key)
    pipe.expire(current_key, window_sec * 2)
    await pipe.execute()

    return {
        "allowed": True,
        "remaining": max(0, limit - count - 1),
        "reset_at": reset_at,
        "limit": limit,
    }
# FastAPI dependency
from fastapi import Depends


def rate_limit(limit: int = 100, window_sec: int = 60):
    """Build a rate limit dependency with fixed settings.

    limit and window_sec are captured in a closure. If they were parameters of
    the dependency itself, FastAPI would expose them as query parameters and a
    client could override them (e.g. ?limit=1000000).
    """
    async def dependency(request: Request) -> None:
        key = request.client.host
        result = await check_rate_limit(key, limit, window_sec)
        request.state.rate_limit = result
        if not result["allowed"]:
            raise HTTPException(
                status_code=429,
                detail="Too Many Requests",
                headers={
                    "Retry-After": str(max(1, int(result["reset_at"] - time.time()))),
                    "X-RateLimit-Limit": str(result["limit"]),
                    "X-RateLimit-Remaining": str(result["remaining"]),
                },
            )

    return dependency


# Usage
from fastapi import FastAPI

app = FastAPI()

# Create limiters with different configurations
default_limit = rate_limit(limit=100, window_sec=60)
auth_limit = rate_limit(limit=10, window_sec=60)
api_limit = rate_limit(limit=1000, window_sec=3600)


@app.get("/api/data")
async def get_data(_: None = Depends(default_limit)):
    return {"data": "example"}


@app.post("/api/auth/login")
async def login(_: None = Depends(auth_limit)):
    return {"token": "..."}
PHP / Laravel
<?php
// app/Http/Middleware/RateLimitMiddleware.php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Redis;

class RateLimitMiddleware
{
    // $limit and $windowSec come from the route definition, e.g. throttle.custom:100,60
    public function handle(Request $request, Closure $next, int $limit, int $windowSec)
    {
        $key = $this->resolveKey($request);
        $result = $this->checkRateLimit($key, $limit, $windowSec);

        if (!$result['allowed']) {
            return response()->json([
                'error' => 'Too Many Requests',
                'retry_after' => $result['retry_after'],
            ], 429)->withHeaders([
                'X-RateLimit-Limit' => $result['limit'],
                'X-RateLimit-Remaining' => $result['remaining'],
                'X-RateLimit-Reset' => $result['reset_at'],
                'Retry-After' => $result['retry_after'],
            ]);
        }

        $response = $next($request);

        return $response->withHeaders([
            'X-RateLimit-Limit' => $result['limit'],
            'X-RateLimit-Remaining' => $result['remaining'],
            'X-RateLimit-Reset' => $result['reset_at'],
        ]);
    }

    private function resolveKey(Request $request): string
    {
        if ($user = $request->user()) {
            return 'user:' . $user->id;
        }

        return 'ip:' . $request->ip();
    }

    private function checkRateLimit(string $key, int $limit, int $windowSec): array
    {
        $now = now()->timestamp * 1000;
        $windowMs = $windowSec * 1000;
        $currentWindow = intdiv((int) $now, $windowMs);
        $previousWindow = $currentWindow - 1;
        $currentKey = "ratelimit:{$key}:{$currentWindow}";
        $previousKey = "ratelimit:{$key}:{$previousWindow}";

        // A missing key returns null, which casts to 0
        $current = (int) Redis::get($currentKey);
        $previous = (int) Redis::get($previousKey);

        $elapsed = $now % $windowMs;
        $weight = $elapsed / $windowMs;
        $count = $current + (int) ($previous * (1 - $weight));

        $resetAt = ($currentWindow + 1) * $windowSec;
        $retryAfter = max(0, $resetAt - now()->timestamp);

        if ($count >= $limit) {
            return [
                'allowed' => false,
                'remaining' => 0,
                'reset_at' => $resetAt,
                'retry_after' => $retryAfter,
                'limit' => $limit,
            ];
        }

        Redis::pipeline(function ($pipe) use ($currentKey, $windowSec) {
            $pipe->incr($currentKey);
            $pipe->expire($currentKey, $windowSec * 2);
        });

        return [
            'allowed' => true,
            'remaining' => max(0, $limit - $count - 1),
            'reset_at' => $resetAt,
            'retry_after' => $retryAfter,
            'limit' => $limit,
        ];
    }
}
// Register in Kernel.php
protected $routeMiddleware = [
    'throttle.custom' => \App\Http\Middleware\RateLimitMiddleware::class,
];

// Usage in routes/api.php
Route::middleware(['throttle.custom:100,60'])->group(function () {
    Route::get('/data', [DataController::class, 'index']);
});

Route::middleware(['throttle.custom:10,60'])->group(function () {
    Route::post('/login', [AuthController::class, 'login']);
});
Token Bucket Implementation
For APIs that allow controlled bursts:
// tokenBucket.js
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

/**
 * Token bucket rate limiter
 * @param {string} key - Unique identifier
 * @param {number} bucketSize - Max tokens (burst limit)
 * @param {number} refillRate - Tokens added per second
 */
export async function tokenBucket(key, bucketSize, refillRate) {
  const script = `
    local bucket_key = KEYS[1]
    local last_update_key = KEYS[2]
    local bucket_size = tonumber(ARGV[1])
    local refill_rate = tonumber(ARGV[2])
    local now = tonumber(ARGV[3])

    -- Get current state
    local tokens = tonumber(redis.call('GET', bucket_key) or bucket_size)
    local last_update = tonumber(redis.call('GET', last_update_key) or now)

    -- Calculate tokens to add based on time elapsed
    local elapsed = (now - last_update) / 1000
    local new_tokens = math.min(bucket_size, tokens + (elapsed * refill_rate))

    -- Try to consume one token
    if new_tokens < 1 then
      return {0, new_tokens, 0}
    end
    new_tokens = new_tokens - 1

    -- Update state
    redis.call('SET', bucket_key, new_tokens)
    redis.call('SET', last_update_key, now)
    redis.call('EXPIRE', bucket_key, 3600)
    redis.call('EXPIRE', last_update_key, 3600)
    return {1, new_tokens, 1}
  `;

  const bucketKey = `tokenbucket:${key}:tokens`;
  const lastUpdateKey = `tokenbucket:${key}:updated`;

  const [allowed, remaining] = await redis.eval(
    script,
    2,
    bucketKey,
    lastUpdateKey,
    bucketSize,
    refillRate,
    Date.now()
  );

  return {
    allowed: allowed === 1,
    remaining: Math.floor(remaining),
    bucketSize,
    refillRate,
  };
}

// Usage: Allow bursts of 10, refill 1 token per second
const result = await tokenBucket(`user:${userId}`, 10, 1);
Rate Limiting Strategies
By IP Address
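// Basic - works for most cases
const key = req.ip;
// Handle proxies: do not read X-Forwarded-For directly, because clients can
// send any value in that header. Tell Express how many proxy hops to trust
// and req.ip is then derived from X-Forwarded-For for you.
app.set('trust proxy', 1); // 1 = trust the first proxy in front of the app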
By User ID
// For authenticated users - fair per-user limits
const key = req.user?.id ? `user:${req.user.id}` : `ip:${req.ip}`;
By API Key
// For API consumers - tie limits to subscription tier
const apiKey = req.headers['x-api-key'];
const tier = await getSubscriptionTier(apiKey);
const limits = {
  free: { limit: 100, window: 3600 },
  pro: { limit: 1000, window: 3600 },
  enterprise: { limit: 10000, window: 3600 },
};
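One way to turn the tier table into limiter arguments is sketched below. `API_KEY_TIERS` is a hypothetical in-memory stand-in for the subscription lookup, and treating unknown keys as the free tier is an assumption, not something the snippet above prescribes.

```python
TIER_LIMITS = {
    "free": {"limit": 100, "window": 3600},
    "pro": {"limit": 1000, "window": 3600},
    "enterprise": {"limit": 10000, "window": 3600},
}

# Hypothetical stand-in for the subscription lookup; a real service would query a database
API_KEY_TIERS = {"key-free-123": "free", "key-pro-456": "pro"}


def limits_for_api_key(api_key: str) -> tuple:
    """Return (rate limit key, limit, window in seconds) for an API key."""
    # Assumption: unknown keys fall back to the most restrictive tier
    tier = API_KEY_TIERS.get(api_key, "free")
    config = TIER_LIMITS[tier]
    return f"apikey:{api_key}", config["limit"], config["window"]


print(limits_for_api_key("key-pro-456"))  # ('apikey:key-pro-456', 1000, 3600)
```

The returned tuple maps directly onto the key, limit, and window arguments the limiter functions in this guide expect.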
By Endpoint
// Different limits for different endpoints
const endpointLimits = {
  'POST /api/auth/login': { limit: 5, window: 60 },
  'POST /api/auth/register': { limit: 3, window: 60 },
  'GET /api/search': { limit: 30, window: 60 },
  'POST /api/ai/generate': { limit: 10, window: 3600 },
  'default': { limit: 100, window: 60 },
};
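Per-endpoint limits only work if each endpoint also gets its own counter; otherwise a burst on one route would consume the budget of another. The sketch below shows one possible lookup. The function name `endpoint_limit` and the key layout are assumptions for illustration.

```python
ENDPOINT_LIMITS = {
    "POST /api/auth/login": {"limit": 5, "window": 60},
    "POST /api/auth/register": {"limit": 3, "window": 60},
    "GET /api/search": {"limit": 30, "window": 60},
    "POST /api/ai/generate": {"limit": 10, "window": 3600},
    "default": {"limit": 100, "window": 60},
}


def endpoint_limit(client_id: str, method: str, path: str) -> tuple:
    """Return (rate limit key, limit, window in seconds) for one client and route."""
    route = f"{method.upper()} {path}"
    # Routes without their own entry share a single "default" counter per client
    bucket = route if route in ENDPOINT_LIMITS else "default"
    config = ENDPOINT_LIMITS[bucket]
    return f"{client_id}:{bucket}", config["limit"], config["window"]


print(endpoint_limit("ip:203.0.113.7", "post", "/api/auth/login"))
# ('ip:203.0.113.7:POST /api/auth/login', 5, 60)
```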
Best Practices
1. Return Proper Headers
// Standard rate limit headers
X-RateLimit-Limit: 100 // Max requests allowed
X-RateLimit-Remaining: 45 // Requests remaining in window
X-RateLimit-Reset: 1704067200 // Unix timestamp when limit resets
Retry-After: 30 // Seconds until retry (on 429)
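A small helper keeps these headers consistent across handlers. This is a sketch only; the function name `rate_limit_headers` and its parameters are hypothetical.

```python
import math


def rate_limit_headers(
    limit: int, remaining: int, reset_at: float, now: float, allowed: bool
) -> dict:
    """Build rate limit response headers; reset_at and now are Unix timestamps in seconds."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_at)),
    }
    if not allowed:
        # Retry-After is only meaningful on a 429 response; never report less than 1 second
        headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
    return headers


blocked = rate_limit_headers(100, 0, 1704067200, 1704067170, allowed=False)
print(blocked["Retry-After"])  # 30
```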
2. Provide Helpful Error Messages
{
  "error": "rate_limit_exceeded",
  "message": "You have exceeded the rate limit of 100 requests per minute",
  "limit": 100,
  "remaining": 0,
  "reset_at": "2025-01-01T12:01:00Z",
  "retry_after": 30,
  "documentation": "https://api.example.com/docs/rate-limits"
}
3. Implement Graceful Degradation
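// Instead of hard blocking, consider:
// 1. Return cached responses
// 2. Queue requests for later processing
// 3. Reduce functionality (no search, basic features only)
// Note: this check assumes the limiter reports an unclamped remaining value
// (limit - count) that goes negative once over the limit. checkRateLimit above
// clamps remaining at 0, so it would need to return the raw count as well.
if (!result.allowed && result.remaining > -10) {
  // Slightly over limit - return cached data
  return getCachedResponse(req);
}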
4. Whitelist Trusted IPs/Users
const whitelist = new Set([
  '10.0.0.1', // Internal services
  '192.168.1.100', // Monitoring
]);

if (whitelist.has(req.ip)) {
  return next(); // Skip rate limiting
}
5. Log Rate Limit Events
if (!result.allowed) {
  logger.warn('Rate limit exceeded', {
    ip: req.ip,
    user: req.user?.id,
    endpoint: req.path,
    limit: result.limit,
  });
}
DDoS Considerations
Rate limiting alone won't stop DDoS attacks. For serious protection:
- Cloudflare: DDoS protection + CDN (free tier available)
- Fail2ban: Block IPs at firewall level after repeated violations
- Geographic blocking: Block traffic from unexpected regions
- WAF rules: Block known attack patterns
// Example: Block IP after 10 rate limit violations
// In your rate limiter:
if (!result.allowed) {
  const violationKey = `violations:${req.ip}`;
  const violations = await redis.incr(violationKey);
  if (violations === 1) {
    // Start the 1 hour tracking period on the first violation only,
    // so repeated violations do not keep extending it
    await redis.expire(violationKey, 3600);
  }
  if (violations > 10) {
    // Add to blocklist or trigger fail2ban.
    // Use one key per IP: members of a Redis set cannot expire individually
    await redis.set(`blocklist:${req.ip}`, '1', 'EX', 86400); // Block for 24h
  }
}
Testing Rate Limits
# Using curl
for i in {1..150}; do
  curl -s -o /dev/null -w "%{http_code}\n" https://api.example.com/endpoint
done
# Using Apache Bench
ab -n 200 -c 10 https://api.example.com/endpoint
# Using wrk
wrk -t4 -c100 -d30s https://api.example.com/endpoint
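When the raw status codes are hard to eyeball, a short script can tally them. The helper names `http_status` and `hammer` are hypothetical; the sketch uses only the standard library, and the demonstration runs against a fake sender so that no network access is needed.

```python
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request, including error statuses like 429."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def hammer(send: Callable[[], int], total: int) -> Counter:
    """Call send() total times and tally the returned status codes."""
    return Counter(send() for _ in range(total))


# Demonstration with a fake sender: 100 accepted requests, then 429s
fake_responses = iter([200] * 100 + [429] * 50)
tally = hammer(lambda: next(fake_responses), 150)
print(tally[200], tally[429])  # 100 50

# Against a real endpoint:
# print(hammer(lambda: http_status("https://api.example.com/endpoint"), 150))
```

With a limit of 100 requests per minute, a correct limiter should show roughly 100 responses with status 200 followed by 429s, as in the fake run above.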
Get Started with DanubeData Redis
For production rate limiting, you need reliable, fast Redis. DanubeData offers managed Redis with sub-millisecond latency.
👉 Create a Managed Redis Instance - Starting at €9.99/mo
DanubeData Redis Features:
- Sub-millisecond latency for rate limit checks
- Automatic failover for high availability
- TLS encryption
- Multiple providers: Redis, Valkey, Dragonfly
Need help implementing rate limiting? Contact our team for architecture advice.