I've implemented rate limiting three times. The first two were disasters. Here's what I learned about token buckets, sliding windows, and why Redis is your friend.
My first API rate limiting implementation was a disaster. I used a simple counter that reset every minute. Seemed fine in testing. Then a bot hit my API at 11:59:59, made 100 requests, waited 2 seconds, and made another 100. My 'protected' endpoint got hammered.
That's when I learned there's more to rate limiting than counting requests. Token buckets, sliding windows, distributed state - it gets complex fast. But the fundamentals aren't hard once someone explains them properly.
This is what I wish someone had told me before I deployed my first rate limiter. Use the JWT Decoder to inspect rate limit claims if you're embedding limits in tokens.
Several algorithms underlie rate limiting implementations. Each has different characteristics for burst handling, memory usage, and fairness.
Token Bucket maintains a bucket of tokens that refills at a constant rate. Each request consumes a token. When the bucket is empty, requests are rejected. This allows bursts up to the bucket size while enforcing an average rate.
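A minimal token bucket can be sketched in a few lines of JavaScript. The class and method names here are my own, not from any particular library, and passing `now` explicitly is just to keep the logic easy to test:

```javascript
// Token bucket: refills at `ratePerSec` tokens per second, up to `capacity`.
class TokenBucket {
  constructor(capacity, ratePerSec, now = Date.now()) {
    this.capacity = capacity;
    this.ratePerSec = ratePerSec;
    this.tokens = capacity; // start full, so bursts are allowed immediately
    this.lastRefill = now;
  }

  // Returns true if a token was available (request allowed).
  tryConsume(now = Date.now()) {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Note how a full bucket lets a client burn through `capacity` requests instantly, after which it is throttled to the refill rate.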
Leaky Bucket processes requests at a constant rate, queuing excess requests. It smooths traffic to a steady rate, eliminating bursts. This is ideal when downstream systems cannot handle variable load.
Fixed Window counts requests in fixed time windows (e.g., per minute). Simple to implement but allows bursts at window boundaries—a user could make double the limit by timing requests at window edges.
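The boundary problem is easy to demonstrate with a toy fixed-window counter (the function names are illustrative):

```javascript
// Fixed window: count requests per window, keyed by floor(now / windowMs).
function makeFixedWindowLimiter(limit, windowMs) {
  const counts = new Map();
  return function allow(nowMs) {
    const windowKey = Math.floor(nowMs / windowMs);
    const used = counts.get(windowKey) || 0;
    if (used >= limit) return false;
    counts.set(windowKey, used + 1);
    return true;
  };
}

// With a 100/minute limit, 100 requests at t=59s and 100 more at t=61s
// all pass: the two bursts land in different windows, so the client got
// 200 requests through in about 2 seconds.
```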
Sliding Window tracks requests over a moving window, eliminating the boundary burst problem. More accurate than fixed window but requires more memory or computation.
Understanding the tradeoffs between algorithms helps you choose the right one for your use case.
Token Bucket is ideal when you want to allow bursts while enforcing an overall rate. A user with 100 tokens per minute can make 100 requests instantly, then wait for refill. This matches real usage patterns where users work in bursts.
Fixed Window is simplest to implement—increment a counter, reset at window boundaries. The edge burst problem matters less if your limits are generous relative to typical usage. Many APIs use fixed windows successfully.
Sliding Window eliminates edge bursts by considering requests over the past N seconds continuously. Implementation options include sliding logs (store each timestamp) or sliding window counters (approximate using weighted windows).
Sliding Window Counters combine benefits: accuracy approaching sliding logs with memory efficiency closer to fixed windows. Weight the current and previous window by overlap to approximate the true sliding window.
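The weighted approximation is a one-liner. Given the counts from the previous and current fixed windows, weight the previous window by how much of it still overlaps the sliding window (function and parameter names are mine):

```javascript
// Sliding window counter: approximate requests in the last `windowMs`
// by weighting the previous fixed window's count by its remaining overlap.
function slidingCount(prevCount, currCount, nowMs, windowMs) {
  const elapsed = nowMs % windowMs;                // time into the current window
  const prevWeight = (windowMs - elapsed) / windowMs;
  return prevCount * prevWeight + currCount;
}
```

For example, 25% into the current window, a previous-window count of 80 contributes 60, so 80 previous plus 20 current approximates 80 requests in the sliding window.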
Rate limiting typically happens at the API gateway or in middleware before requests reach application logic.
Identify requests by API key, user ID, IP address, or a combination. API keys are most reliable. IP addresses are problematic when users share IPs (NAT, corporate networks). Consider your user base when choosing identifiers.
Store rate limit state in a fast, shared data store. Redis is the most common choice—it is fast, supports atomic operations, and handles expiration. For single-server deployments, in-memory storage works.
Check limits early in the request pipeline. Rejecting rate-limited requests quickly saves resources. Do not authenticate, parse bodies, or run business logic before checking limits.
Return appropriate responses. 429 Too Many Requests is the standard status. Include Retry-After header to tell clients when to retry. Provide helpful error messages.
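Those steps fit naturally into Express-style middleware. This is a sketch, assuming a `limiter` object that exposes a `tryConsume(key)` method returning `{ allowed, retryAfterSec }` (that interface is my invention, not a library API):

```javascript
// Express-style middleware: check the limit before auth or body parsing.
// `limiter.tryConsume(key)` is an assumed interface returning
// { allowed: boolean, retryAfterSec: number }.
function rateLimitMiddleware(limiter) {
  return function (req, res, next) {
    const key = req.headers['x-api-key'] || req.ip; // identify the caller
    const { allowed, retryAfterSec } = limiter.tryConsume(key);
    if (!allowed) {
      res.set('Retry-After', String(retryAfterSec));
      return res.status(429).json({ error: 'rate limit exceeded, retry later' });
    }
    next(); // under the limit: continue to auth, parsing, business logic
  };
}
```

Mounting this before authentication and body-parsing middleware means rejected requests cost almost nothing.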
Standard headers communicate rate limit status to clients. Consistent headers enable clients to adapt their behavior and avoid hitting limits.
X-RateLimit-Limit indicates the maximum requests allowed in the current window. This tells clients their quota.
X-RateLimit-Remaining shows how many requests remain in the current window. Clients can pace themselves as they approach zero.
X-RateLimit-Reset indicates when the limit resets, usually as a Unix timestamp. Use the Timestamp Converter to verify these values during debugging.
Retry-After on 429 responses tells clients how long to wait before retrying. This can be seconds or an HTTP date.
RateLimit-Policy is an emerging standard that provides structured rate limit information including quota, window, and burst capacity.
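Assembling those headers for a response might look like this (a sketch following the de-facto X-RateLimit convention; `nowUnixSec` is passed in rather than read from the clock only to keep the function deterministic):

```javascript
// Build rate limit headers for every response; add Retry-After when limited.
function rateLimitHeaders(limit, remaining, resetUnixSec, nowUnixSec) {
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, remaining)),
    'X-RateLimit-Reset': String(resetUnixSec), // Unix timestamp of the reset
  };
  if (remaining <= 0) {
    // Tell the client exactly how many seconds to wait before retrying.
    headers['Retry-After'] = String(Math.max(0, resetUnixSec - nowUnixSec));
  }
  return headers;
}
```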
JWTs can carry rate limit information as claims, enabling per-user limits without database lookups.
Include rate limit tier or plan identifier in JWT claims. The API checks this claim and applies appropriate limits. Different user plans get different limits.
Avoid including current quota usage in JWTs. Tokens are issued once and cannot track changing state. Usage must be tracked server-side.
Use the JWT Decoder to inspect tokens and verify rate limit claims are present and correct. This helps debug why certain users are getting unexpected limits.
Consider custom claims for specialized limits: per-endpoint limits, burst capacity, or grace periods. Structure claims clearly—complex claims become maintenance burdens.
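Reading a tier claim from a token is straightforward in Node. The claim name `rate_tier` and the tier table below are examples I made up, not a standard, and this snippet only decodes the payload - in production you must verify the signature first (e.g. with a library like jsonwebtoken):

```javascript
// Map an example tier claim to a requests-per-minute limit.
// NOTE: decode only - signature verification must happen before trusting claims.
const TIER_LIMITS = { free: 60, pro: 600, enterprise: 6000 }; // illustrative values

function limitForToken(jwt) {
  const payloadB64 = jwt.split('.')[1]; // header.payload.signature
  const payload = JSON.parse(Buffer.from(payloadB64, 'base64url').toString('utf8'));
  return TIER_LIMITS[payload.rate_tier] ?? TIER_LIMITS.free; // unknown tier -> free
}
```

Because the tier lives in the token, the API can pick a limit without a database lookup on every request.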
When your API runs on multiple servers, rate limiting must work across the cluster. Several strategies handle this coordination.
Centralized storage (Redis) is the most common approach. All servers read and write limits to a single Redis instance or cluster. This provides accurate, consistent limits but adds latency and a dependency.
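The classic pattern against Redis is INCR plus EXPIRE on a per-window key. This sketch assumes a `client` exposing promise-based `incr`/`expire` as node-redis does; note the two-step INCR/EXPIRE has a small race (a crash between them leaves a key with no TTL), which a Lua script or `SET ... EX NX` closes:

```javascript
// Fixed-window limit against a shared store (e.g. Redis).
// `client` is assumed to expose promise-based incr/expire like node-redis.
async function checkLimit(client, key, limit, windowSec) {
  const windowId = Math.floor(Date.now() / 1000 / windowSec);
  const windowKey = `rl:${key}:${windowId}`;
  const count = await client.incr(windowKey); // atomic across all servers
  if (count === 1) {
    await client.expire(windowKey, windowSec); // first hit in the window sets TTL
  }
  return count <= limit;
}
```

Because INCR is atomic, concurrent requests from different servers cannot both see the same count, which is exactly the race that breaks naive read-then-write implementations.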
Eventual consistency accepts slight inaccuracies for reduced latency. Each server maintains local counters that sync periodically. A user might slightly exceed limits during sync delays.
Sticky sessions route each user to the same server, enabling local rate limiting. This is simple but limits load balancing flexibility and fails when servers restart.
Token-based quotas assign quota blocks to servers. Each server has a portion of the total limit. This reduces coordination but can leave quota unused on less-loaded servers.
Rate limiting generates valuable signals about API usage and potential abuse. Monitoring these signals enables proactive management.
Track rate limit hits by user and endpoint. Frequent hits from legitimate users suggest limits are too strict or usage patterns are changing. Hits from unknown sources may indicate abuse.
Alert on unusual patterns. Sudden spikes in rate limit hits may indicate attacks or malfunctioning clients. Gradual increases may indicate growing popularity requiring limit adjustments.
Monitor quota utilization. Users consistently near their limits may need upgrades or indicate that plans are poorly sized. Users never approaching limits may be paying for unused capacity.
Log rate limited requests with context. Include user ID, endpoint, current usage, and limit. This enables debugging and abuse investigation.
How clients handle rate limit errors affects user experience. Provide guidance and tools for graceful handling.
Always include Retry-After header with 429 responses. This tells clients exactly how long to wait. Include it in seconds or as an HTTP date.
Provide remaining quota in response headers for all requests, not just rate-limited ones. Clients can proactively slow down as they approach limits.
Document rate limits clearly. Include limits, windows, headers, and recommended handling in API documentation. Clients cannot adapt to limits they do not know about.
Consider degraded responses instead of hard failures. For read endpoints, returning cached or stale data may be acceptable when limits are hit. This maintains partial functionality.
Implement exponential backoff on your side for downstream rate limits. When your API calls external services that rate limit, back off gracefully rather than failing immediately.
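A backoff helper for that case can be sketched like this, honoring Retry-After when the server provides it and otherwise using exponential backoff with full jitter (the defaults are arbitrary starting points, not recommendations):

```javascript
// Delay before retry attempt N against a rate-limited downstream API.
// If the 429 carried Retry-After, honor it exactly; otherwise use
// exponential backoff with full jitter to avoid synchronized retries.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30000, retryAfterSec = null) {
  if (retryAfterSec !== null) return retryAfterSec * 1000; // server told us when
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * ceiling); // "full jitter"
}
```

The jitter matters: without it, every client that got limited at the same moment retries at the same moment, recreating the spike.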
Following best practices and avoiding common mistakes leads to rate limiting that protects without frustrating users.
Set limits based on real usage data. Analyze how legitimate users consume your API before setting limits. Arbitrary limits often surprise users or fail to protect against real abuse.
Provide limit headroom. If typical usage is 50 requests per minute, setting a limit at 60 forces users to carefully manage their usage. Setting it at 100 or 200 protects against abuse while accommodating normal variation.
Consider burst allowances. Users often need to make many requests quickly, then pause. Token bucket algorithms naturally handle this. Fixed window limits may need explicit burst provisions.
Do not limit by IP alone in production. Multiple users share IPs through NAT, corporate proxies, and VPNs. IP-based limits are suitable for anonymous endpoints but problematic for authenticated APIs.
Test rate limiting under load. Verify your implementation handles high concurrency correctly. Race conditions in rate limit checks can allow limits to be exceeded.
Always return 429 Too Many Requests, and always include a Retry-After header. I made the mistake of returning 503 once - clients thought my server was down. Be specific.
Limit by API key whenever possible. IP limiting breaks for corporate networks (everyone shares an IP), VPNs, and mobile carriers. I use IP only as a last-resort anti-DDoS layer.
Start with fixed window - it's simpler. Upgrade to token bucket only if the edge burst problem actually hits you. Most APIs never need the complexity. I overthought this on my first implementation.
X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset (Unix timestamp). Plus Retry-After on 429 responses. That's it. Don't overcomplicate.
Redis. Just use Redis. I tried other approaches - in-memory with sticky sessions, eventual consistency - and they all had edge cases. Redis atomic operations solve this cleanly.
Founder of CodeUtil. Web developer building tools I actually use. When I'm not coding, I experiment with productivity techniques (with mixed success).