System Design Interview Roadmap

Bot Detection and Mitigation: Identifying Non-Human Traffic in Real-Time

Section 8: Production Engineering & Optimization | Article 213

Jun 05, 2026

Introduction

Your login endpoint just processed 4,000 requests in 60 seconds from a single IP. Your rate limiter fires, blocks the IP, and you declare victory. Thirty seconds later, the same credential-stuffing attack resumes—now spread across 800 IPs, using valid browser User-Agent strings, with randomized delays between requests. The attacker bypassed your layer-4 defense by simply reading your block response and adapting. This is the fundamental asymmetry of bot detection: defenders must evaluate every signal with precision; attackers only need to find one gap.


What Bot Detection Actually Is

Bot detection is a multi-signal classification problem running under strict latency budgets. Every request arriving at your edge must be evaluated—typically in under 10ms—against a fingerprint of behavioral, environmental, and network signals to produce a risk score. Requests above a threshold get challenged or blocked; those below pass through.

The naive approach—blocklisting known bad IPs or User-Agent strings—fails because these signals are trivially spoofed. Sophisticated bots rotate through residential proxies (real IP addresses belonging to ISPs, not datacenters), mimic browser TLS fingerprints, and replay valid JavaScript challenge tokens captured from real browsers.

Effective detection systems evaluate several categories of signal:

Network-layer signals: IP reputation, ASN classification (datacenter vs. residential vs. mobile), IP velocity (how many sessions from this IP in the last N seconds), and geolocation consistency (a session from US → Brazil → Germany in 10 minutes is impossible for a human).
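Two of the network-layer checks above, IP velocity and geolocation consistency, can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the class and function names, the window size, and the request limit are all assumptions chosen for the example, and a real deployment would likely keep counters in a shared store.

```python
import math
from collections import defaultdict, deque

EARTH_RADIUS_KM = 6371.0


class IpVelocityTracker:
    """Counts requests per IP inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds: float, max_requests: int):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self._hits = defaultdict(deque)  # ip -> timestamps, oldest first

    def record(self, ip: str, now: float) -> int:
        """Record one request and return how many fall inside the window."""
        hits = self._hits[ip]
        hits.append(now)
        # Evict timestamps that have aged out of the window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        return len(hits)

    def is_suspicious(self, ip: str, now: float) -> bool:
        """Records the request, then compares the count to the limit."""
        return self.record(ip, now) > self.max_requests


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kmh(prev: tuple, curr: tuple) -> float:
    """Speed needed to move between two (lat, lon, unix_seconds) sightings."""
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    if distance == 0:
        return 0.0
    hours = (curr[2] - prev[2]) / 3600.0
    if hours <= 0:
        return float("inf")
    return distance / hours
```

A session seen in New York and then in São Paulo ten minutes later implies a speed far beyond any commercial flight, which is the kind of inconsistency the geolocation check is meant to surface. Geolocation databases are imprecise, so the speed cutoff should be generous.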

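TLS fingerprinting (JA3/JA3S): The TLS ClientHello message exposes the client's TLS version, cipher suite ordering, extensions, elliptic curves, and point formats, which JA3 hashes into a compact fingerprint (JA3S does the same for the server's ServerHello). Each TLS stack has a distinctive fingerprint. A curl binary, a Python requests client, and a headless Chromium each produce different JA3 hashes, even if they all send User-Agent: Mozilla/5.0. This is one of the most reliable passive signals because the TLS handshake completes before any HTTP request is sent, and the fingerprint takes real effort to spoof. One caveat: clients that randomize extension order, as recent Chrome versions do, produce a varying raw JA3 hash, so detection systems often normalize extension order before hashing.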
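To make the fingerprint concrete, here is a minimal sketch of the classic JA3 computation: five ClientHello fields rendered as decimal values, joined with dashes and commas, then MD5-hashed. It assumes the ClientHello has already been parsed into integer lists by some other component; the function names are illustrative and parsing itself is out of scope.

```python
import hashlib

# GREASE values (RFC 8701) are placeholders that clients inject at random;
# JA3 ignores them. They follow the pattern 0x0a0a, 0x1a1a, ... 0xfafa.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the JA3 input string from already-parsed ClientHello fields."""

    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 hex digest of the JA3 string (MD5 is used as an ID, not for security)."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()
```

Because field order is preserved, two clients offering the same cipher suites in a different order get different hashes, which is what lets the fingerprint distinguish TLS libraries.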

HTTP/2 fingerprinting: H2 frames have ordering, priority weights, and SETTINGS values that differ between real browsers and bot libraries. A Chrome browser sending HTTP/2 produces a different SETTINGS frame than Go’s net/http client—even if both claim to be Chrome.
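The comparison described above can be sketched as a lookup: serialize the SETTINGS frame in the order the client sent it, then check whether that string matches a baseline for the browser family the User-Agent claims. The setting identifiers come from the HTTP/2 specification, but the baseline values below are illustrative placeholders, not measured browser fingerprints, and the function names are assumptions.

```python
# Setting identifiers defined by the HTTP/2 specification (RFC 9113).
SETTINGS_NAMES = {
    0x1: "HEADER_TABLE_SIZE",
    0x2: "ENABLE_PUSH",
    0x3: "MAX_CONCURRENT_STREAMS",
    0x4: "INITIAL_WINDOW_SIZE",
    0x5: "MAX_FRAME_SIZE",
    0x6: "MAX_HEADER_LIST_SIZE",
}


def settings_fingerprint(settings: list[tuple[int, int]]) -> str:
    """Serialize (id, value) pairs in the order they appeared on the wire."""
    return ";".join(f"{sid}:{value}" for sid, value in settings)


# Hypothetical baselines keyed by claimed client family. In practice these
# would be collected from real traffic and refreshed as browsers update.
KNOWN_BASELINES = {
    "chrome": {"1:65536;2:0;4:6291456;6:262144"},
    "firefox": {"1:65536;4:131072;5:16384"},
}


def claim_mismatch(claimed_family: str, settings: list[tuple[int, int]],
                   baselines: dict[str, set[str]] = KNOWN_BASELINES) -> bool:
    """True when the SETTINGS frame contradicts the claimed client family."""
    expected = baselines.get(claimed_family)
    if expected is None:
        return False  # no baseline means no evidence either way
    return settings_fingerprint(settings) not in expected
```

A mismatch is a signal rather than a verdict: browser updates change these values, so stale baselines produce false positives.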

Behavioral signals (client-side): JavaScript executed in the browser collects mouse movement entropy, scroll patterns, keyboard timing, touch event presence, WebGL renderer strings, canvas fingerprints, and AudioContext oscillator outputs. Real humans produce noisy, irregular interaction patterns. Bots replicate clicks and keystrokes with machine precision or not at all.

Session behavioral signals (server-side): Request rate, page visit sequences, time-on-page distributions, and form interaction timing. A real user takes 15–90 seconds to fill a checkout form. A bot fills it in 200ms or exactly 5,000ms (hardcoded delay).
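The form-timing heuristic can be sketched as two checks: a floor below which no human plausibly completes the form, and a jitter test that catches hardcoded delays. The thresholds here are assumptions for illustration and would need tuning against real traffic for each form.

```python
from statistics import pstdev


def timing_flags(durations_ms: list[float],
                 min_human_ms: float = 2000.0,
                 min_jitter_ms: float = 50.0) -> set[str]:
    """Flag form-fill durations that look automated.

    durations_ms holds fill times for recent submissions from one session
    or one client fingerprint. Both thresholds are illustrative guesses.
    """
    flags = set()
    # Faster than a person can read and type.
    if any(d < min_human_ms for d in durations_ms):
        flags.add("too_fast")
    # Near-identical durations across several submissions suggest a fixed sleep.
    if len(durations_ms) >= 3 and pstdev(durations_ms) < min_jitter_ms:
        flags.add("fixed_delay")
    return flags
```

For example, `timing_flags([200])` flags the 200ms submission as too fast, while three submissions at 5,000ms each are flagged as a fixed delay even though every one clears the speed floor.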

These signals feed a scoring pipeline. Individual signals carry low confidence; their combination—especially cross-referencing network signals with client behavioral signals—drives accuracy above 99% for most attack categories.
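One simple way to combine weak signals into a single risk score is a noisy-OR: each signal independently contributes some probability that the request is automated, and the score is the probability that at least one of them is right. This is a sketch under that independence assumption, not the scoring method of any particular product; the signal names, weights, and thresholds are all hypothetical.

```python
# Hypothetical per-signal weights: how strongly each signal, when fully
# present, suggests automation. Real systems would learn these from data.
WEIGHTS = {
    "ja3_mismatch": 0.4,
    "datacenter_asn": 0.3,
    "fixed_delay": 0.5,
    "ip_velocity": 0.6,
}


def risk_score(signals: dict[str, float],
               weights: dict[str, float] = WEIGHTS) -> float:
    """Noisy-OR combination of signal strengths, each clamped to [0, 1]."""
    p_all_benign = 1.0
    for name, strength in signals.items():
        w = weights.get(name, 0.0)  # unknown signals contribute nothing
        s = min(max(strength, 0.0), 1.0)
        p_all_benign *= 1.0 - w * s
    return 1.0 - p_all_benign


def decide(score: float, challenge_at: float = 0.5, block_at: float = 0.9) -> str:
    """Map a score to an action using two illustrative thresholds."""
    if score >= block_at:
        return "block"
    if score >= challenge_at:
        return "challenge"
    return "allow"
```

With these weights a TLS mismatch alone scores 0.4 and passes, while the same mismatch combined with a datacenter ASN and a fixed form delay scores about 0.79 and triggers a challenge. That matches the point above: no single signal decides, but agreement across layers does.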
