System Design: Flash Sale — Surviving Black Friday and Limited-Stock Drops

The Interview Question

You're sitting across from the interviewer. They lean forward:

"Design a flash sale system. One thousand limited-edition sneakers go on sale at exactly 12:00 PM. Five hundred thousand users are waiting. The sale must be fair, there should be no overselling, bots must not win, the site must stay up, and all one thousand items must sell in under thirty seconds."

This is not hypothetical. Nike SNKRS drops, Supreme launches, Taylor Swift tickets, Nvidia GPU releases: these systems fail publicly and memorably. Let's design one that does not.

1. The Three Hard Problems

Every flash sale lives or dies by three interlocked failure modes. You cannot solve them independently.

Problem 1: Overselling. When 500,000 people simultaneously try to buy the last item, naive code sells it to all of them. Each request reads stock = 1, checks 1 > 0, then decrements. You end up at stock = -499,999 and a customer service catastrophe.

Problem 2: Thundering herd. At 12:00:00.000, every one of those 500,000 users clicks "Buy" simultaneously. Even if your system handles 10,000 req/sec normally, a 500,000 req/sec spike is 50x capacity. Servers fall over. The CDN does not help because these are authenticated purchase requests, not static assets.

Problem 3: Bot fairness. Automated buyers run on cloud VMs with roughly 1ms network latency and fire their requests within milliseconds of the sale opening. A human might submit their request 200ms after the sale opens, by which time a bot cluster is already done. Without addressing bots, nearly all of the inventory goes to resellers.

The layered solution we build below addresses each problem specifically, with each level building on the last.

2. Level 1 — Naive SQL

Level 1 The first instinct

Here is what every developer writes first:

-- Check availability
SELECT stock FROM products WHERE id = ?;

-- Application logic: if stock > 0, proceed
UPDATE products SET stock = stock - 1 WHERE id = ?;

INSERT INTO orders (user_id, product_id, created_at)
  VALUES (?, ?, NOW());

This has a classic check-then-act race condition. Between the SELECT and the UPDATE, another request can read the same stock value. If two requests both read stock = 1 and both verify 1 > 0 = true, they both proceed to UPDATE and INSERT. Stock becomes -1. You have sold an item you do not have.

At 500,000 concurrent users this does not fail occasionally — it fails for nearly every transaction.
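The interleaving described above can be reproduced in a few lines. This is a minimal sketch, not database code: the module-level `stock` and `orders` are stand-ins for the table rows, and the `Barrier` is an assumption added only to force both requests to read before either writes, so the outcome is deterministic.

```python
import threading

stock = 1                                # stand-in for products.stock
orders = []                              # stand-in for the orders table
both_have_read = threading.Barrier(2)    # forces the unlucky interleaving
update_lock = threading.Lock()           # the UPDATE statement itself is atomic


def naive_purchase(user_id: str) -> None:
    global stock
    seen = stock                         # SELECT stock FROM products
    both_have_read.wait()                # both requests read before either writes
    if seen > 0:                         # check runs against a stale value
        with update_lock:
            stock -= 1                   # UPDATE products SET stock = stock - 1
        orders.append(user_id)           # INSERT INTO orders


threads = [
    threading.Thread(target=naive_purchase, args=(user,))
    for user in ("user-1", "user-2")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stock, sorted(orders))
```

Both buyers get an order and the counter ends below zero, even though each individual statement behaved correctly. The bug lives in the gap between the check and the update.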

Worked Example: The Race Condition

Two purchase requests hit the database simultaneously. Both read stock = 1, both pass the check, and both proceed, leaving stock at -1 with two orders for one item.

Initial state: products.stock = 1

Step	Request A (User #1)	Request B (User #2)
1	SELECT stock FROM products → 1	SELECT stock FROM products → 1
2	CHECK stock > 0 → true	CHECK stock > 0 → true
3	UPDATE SET stock = stock - 1 → 0	UPDATE SET stock = stock - 1 → -1
4	INSERT INTO orders	INSERT INTO orders

3. Level 2 — Pessimistic Locking

Level 2 Row-level locks

The first real fix is a database transaction with an exclusive row lock:

BEGIN;

SELECT stock FROM products
  WHERE id = ?
  FOR UPDATE; -- acquires exclusive row lock; others block here

-- application: if stock > 0:
UPDATE products SET stock = stock - 1 WHERE id = ?;
INSERT INTO orders (user_id, product_id) VALUES (?, ?);

COMMIT; -- releases lock; next waiter proceeds

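FOR UPDATE acquires an exclusive lock on the row as it is read. Any other transaction that tries to lock or modify the same row (including another SELECT ... FOR UPDATE) blocks until this transaction commits or rolls back; plain reads under MVCC are not blocked. Because every purchase uses FOR UPDATE, purchases serialize. This is correct: no more overselling.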

The problem is throughput. Every purchase attempt serializes through that single row lock. With 500,000 concurrent connections:

Connection pool exhaustion: most databases are configured for a few hundred connections at most
Lock queue — transactions pile up waiting, consuming memory and file descriptors
Lock timeout cascades — transactions waiting too long start failing, generating user-visible errors
Database CPU hits 100% managing the lock queue

Bottleneck: throughput is bounded by lock serialization at roughly 1 purchase per DB round-trip, typically 50–200 purchases/sec even on powerful hardware. Adequate for a normal sale; fatal for a flash sale.
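The 50–200 purchases/sec range follows from simple arithmetic if each purchase holds the row lock for one round trip. The sketch below assumes a lock hold time of 5–20ms per purchase; that figure is an illustrative assumption, not a measurement from the text.

```python
def purchases_per_second(lock_hold_ms: float) -> float:
    """Upper bound on throughput when purchases serialize on one row lock."""
    return 1000.0 / lock_hold_ms


def seconds_to_clear(attempts: int, lock_hold_ms: float) -> float:
    """Time needed to let every waiting transaction take its turn."""
    return attempts / purchases_per_second(lock_hold_ms)


fast = purchases_per_second(5)            # 200.0 purchases/sec
slow = purchases_per_second(20)           # 50.0 purchases/sec
backlog = seconds_to_clear(500_000, 5)    # 2500.0 seconds, about 42 minutes
```

Even at the optimistic end, working through 500,000 queued lock requests takes tens of minutes, far longer than any lock timeout or user's patience.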

4. Level 3 — Optimistic Locking

Level 3 Conflict detection, not prevention

Optimistic locking assumes conflicts are rare and detects them on write instead of preventing them on read. Add a version column:

-- Schema
ALTER TABLE products ADD COLUMN version INT NOT NULL DEFAULT 1;

-- Read without a lock
SELECT stock, version FROM products WHERE id = ?;
-- got: stock=5, version=42

-- Update only if version still matches what we read
UPDATE products
  SET stock = stock - 1, version = version + 1
  WHERE id = ?
    AND version = 42   -- must match what we read
    AND stock > 0;

-- rows affected = 0: lost the race, retry or fail
-- rows affected = 1: success, proceed to INSERT order

This allows concurrent reads (no lock held during SELECT) and detects conflicts at write time. Under normal concurrent load it performs very well. Under flash-sale load — 500,000 synchronized arrivals — the retry rate becomes catastrophic:

500,000 users attempt at T=0
Only 1 succeeds with version=42; 499,999 get 0-rows-affected
All 499,999 retry, colliding on version=43
Without exponential backoff this creates a retry storm that is worse than the original problem
Still hammers the database with ~499,999 failed write attempts per tick

Optimistic locking is excellent for typical web workloads but poorly suited for the pathology of a synchronized-start flash sale where every contender arrives simultaneously.
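To make the version check concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the production database. The table layout mirrors the schema above; the helper names `read_product` and `try_decrement` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, stock INTEGER NOT NULL, "
    "version INTEGER NOT NULL DEFAULT 1)"
)
conn.execute("INSERT INTO products (id, stock, version) VALUES (1, 5, 42)")


def read_product(product_id: int) -> tuple:
    """Read stock and version without taking a lock."""
    return conn.execute(
        "SELECT stock, version FROM products WHERE id = ?", (product_id,)
    ).fetchone()


def try_decrement(product_id: int, seen_version: int) -> bool:
    """Write only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE products SET stock = stock - 1, version = version + 1 "
        "WHERE id = ? AND version = ? AND stock > 0",
        (product_id, seen_version),
    )
    return cur.rowcount == 1   # 0 rows affected means we lost the race


# Two buyers read the same snapshot (version 42).
_, version_a = read_product(1)
_, version_b = read_product(1)

a_won = try_decrement(1, version_a)    # first writer succeeds
b_won = try_decrement(1, version_b)    # stale version, no rows updated

# Buyer B must re-read and retry.
_, fresh_version = read_product(1)
b_retry = try_decrement(1, fresh_version)
```

With two buyers, one retry settles it. With 500,000 buyers holding the same version, each round produces one winner and everyone else retries, which is the storm described above.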

5. Level 4 — Redis Atomic Decrement

Level 4 Move the hot path to Redis

Redis executes commands on a single thread, so each command runs to completion without interleaving. The DECR command reads and decrements an integer in a single indivisible operation: no locks, no transactions, no race conditions at the application level.

Pre-load inventory before the sale opens:

-- Before 12:00 PM: seed inventory counter
SET product:sneaker-001:stock 1000
SET product:sneaker-001:stock:initial 1000

-- At each purchase attempt:
remaining = DECR product:sneaker-001:stock

IF remaining >= 0:
    -- Reservation successful
    createOrderAsync(user_id, 'sneaker-001')
    RETURN "reserved"

ELSE:
    -- Compensate: undo the decrement
    INCR product:sneaker-001:stock
    RETURN "sold_out"

DECR is atomic at the command level. There is no window between reading and writing the value, because Redis runs the whole command to completion before starting the next one. This eliminates the overselling race condition entirely.
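The decrement-then-compensate logic can be exercised without a Redis server. In this sketch, `AtomicCounter` is a hypothetical in-process stand-in for the stock key; its lock mimics Redis running one command at a time. Thread count and request count are arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AtomicCounter:
    """In-process stand-in for a Redis integer key."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._lock = threading.Lock()   # mimics one-command-at-a-time execution

    def decr(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    def incr(self) -> int:
        with self._lock:
            self._value += 1
            return self._value

    def get(self) -> int:
        with self._lock:
            return self._value


def try_reserve(stock: AtomicCounter) -> bool:
    if stock.decr() >= 0:
        return True        # reservation successful
    stock.incr()           # compensate: undo the decrement
    return False           # sold out


stock = AtomicCounter(100)
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(lambda _: try_reserve(stock), range(5_000)))

reserved = sum(results)
```

However the threads interleave, a decrement can only land at zero or above while units remain, so exactly 100 of the 5,000 attempts succeed and the counter settles back at zero.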

Performance characteristics of a single Redis instance:

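Throughput: roughly 100,000–200,000 DECR operations per second
Latency: sub-millisecond round trip on the same network (the command itself executes in microseconds)
No connection pool exhaustion (Redis handles thousands of concurrent connections cheaply via epoll)
Redis Cluster scales roughly linearly with shard count across keys, but a single hot counter still lives on one shard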

The Redis DECR approach has a subtle failure mode: if DECR succeeds (reservation made) but the subsequent database write for the order fails, you have decremented stock without creating a confirmed order. The inventory count is now wrong. The fix is a compensation step — if the DB write fails, immediately run INCR to restore the count. This is the smallest possible saga pattern: a two-step distributed transaction with a defined rollback operation.
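The two-step reserve-then-persist flow with its rollback might look like the sketch below. `StockCounter` is a hypothetical single-threaded stand-in for the Redis key, and `write_order` represents whatever call persists the order; both are assumptions made for illustration.

```python
from typing import Callable


class StockCounter:
    """Single-threaded stand-in for the Redis stock key."""

    def __init__(self, value: int) -> None:
        self.value = value

    def decr(self) -> int:
        self.value -= 1
        return self.value

    def incr(self) -> int:
        self.value += 1
        return self.value


def reserve_and_persist(stock: StockCounter, write_order: Callable[[], None]) -> str:
    if stock.decr() < 0:
        stock.incr()               # nothing was available: undo the decrement
        return "sold_out"
    try:
        write_order()              # step 2: durable order record
    except Exception:
        stock.incr()               # compensation: release the reserved unit
        return "failed"
    return "reserved"


def failing_write() -> None:
    raise ConnectionError("database unavailable")


stock = StockCounter(1)
first = reserve_and_persist(stock, failing_write)    # unit is returned
second = reserve_and_persist(stock, lambda: None)    # unit is sold
third = reserve_and_persist(stock, lambda: None)     # nothing left
```

The unit released by the failed write is immediately available to the next buyer. If the process crashes between the decrement and the compensation, the unit is lost until a reconciliation job notices, so the compensation step alone is not a complete safety net.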

This solves overselling at high throughput. But it does not yet solve the thundering herd — 500,000 requests still hammer your API layer simultaneously at T=0. And it does not address fairness.

6. Level 5 — The Pre-Sale Queue

Level 5 Decouple demand from fulfillment

The key insight: you do not need to process 500,000 requests simultaneously. You only need to sell 1,000 items. Everything else is waste. The virtual queue separates accepting demand (which must be instantaneous and massively parallel) from fulfilling orders (which is controlled and serial).

Architecture

[Users] 500k burst at T=0
   |
   v  (stateless queue entry service, scales horizontally)
[Queue Entry API]
   |  -- ZADD queue:sale:001 timestamp user_id
   |  -- ZRANK queue:sale:001 user_id -> position
   |  -- Return: position, estimated_wait_sec
   v
[Redis Sorted Set]
   key:    queue:sale:001
   score:  arrival timestamp (ms)
   member: user_id
   |
   v  (single worker, or partitioned by sale_id)
[Queue Processor]
   -- ZPOPMIN 200/sec
   -- DECR stock -> write order -> notify user
   |
   +------------------+
   v                  v
[PostgreSQL]       [WebSocket / SSE]
Order creation     "You got it!" / "Sold out"
200 writes/sec     499k notifications

Queue Entry — Absorbing the T=0 Spike

When the user clicks "Buy Now":

-- NX: only add if member does not exist (one entry per user)
ZADD queue:sale:001 NX timestamp_ms() user_id

-- Their position in line (0-indexed)
position = ZRANK queue:sale:001 user_id

-- Estimated wait at current drain rate
estimated_wait_sec = position / drain_rate_per_sec

-- Respond to the user immediately
RETURN position, estimated_wait_sec, queue_token

The NX flag ensures a user can only enter the queue once (idempotent retries are safe). The sorted set scores by timestamp, so first-come-first-served ordering is enforced by Redis itself. A ZADD is O(log N) — at 500,000 entries, this is still well under 1ms.
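The entry semantics can be sketched with an in-memory structure. `VirtualQueue` below is a hypothetical stand-in for the sorted set: it orders members by (score, member), as Redis does, and its `enter` method combines the ZADD NX and ZRANK steps.

```python
import bisect


class VirtualQueue:
    """In-memory stand-in for the Redis sorted set."""

    def __init__(self) -> None:
        self._score_by_user = {}   # user_id -> arrival_ms
        self._ordered = []         # sorted list of (arrival_ms, user_id)

    def enter(self, user_id: str, arrival_ms: int) -> int:
        """ZADD ... NX, then ZRANK: returns the 0-indexed position."""
        if user_id not in self._score_by_user:       # NX: never overwrite
            self._score_by_user[user_id] = arrival_ms
            bisect.insort(self._ordered, (arrival_ms, user_id))
        key = (self._score_by_user[user_id], user_id)
        return bisect.bisect_left(self._ordered, key)


def estimated_wait_sec(position: int, drain_rate_per_sec: int) -> float:
    return position / drain_rate_per_sec


queue = VirtualQueue()
queue.enter("alice", 1_000)
queue.enter("bob", 1_005)
carol_pos = queue.enter("carol", 1_010)   # third in line, position 2
bob_retry = queue.enter("bob", 9_999)     # retry keeps the original position 1
wait = estimated_wait_sec(400, 200)       # position 400 at 200/sec
```

A client that retries after a timeout does not lose its place, and it cannot improve its place either. That property is what makes aggressive client-side retries harmless.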

Queue Processor — Controlled Drain

-- Runs once per second:
LOOP:
    -- Atomically pop up to 200 entries from the front
    entries = ZPOPMIN queue:sale:001 200

    FOR EACH entry IN entries:
        remaining = DECR product:sneaker-001:stock

        IF remaining >= 0:
            createOrder(entry.user_id, 'sneaker-001')
            notifyUser(entry.user_id, 'purchased')

        ELSE:
            INCR product:sneaker-001:stock  -- compensate
            notifyUser(entry.user_id, 'sold_out')
            drainRemainingQueueAsSoldOut()
            BREAK

    sleep(1000ms)

The processor runs at a rate you control. Set it to 200/sec and 1,000 items sell in about 5 seconds. The database sees a steady 200 writes/sec, well within capacity. Users are notified via WebSocket or server-sent events as their turn arrives.
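A small simulation shows how the drain rate determines the sale duration. This sketch models each one-second tick as one batch pop; `run_sale` and its parameters are hypothetical names, and the queue is assumed to be already in arrival order.

```python
from collections import deque


def run_sale(waiting_users: int, stock: int, drain_per_tick: int) -> dict:
    queue = deque(range(waiting_users))    # already in arrival order
    sold = turned_away = ticks = 0

    while queue and stock > 0:
        ticks += 1                         # one tick = one second
        for _ in range(min(drain_per_tick, len(queue))):   # batch pop
            queue.popleft()
            if stock > 0:
                stock -= 1                 # reservation succeeds
                sold += 1
            else:
                turned_away += 1           # popped after the last unit went

    turned_away += len(queue)              # everyone still waiting is sold out
    return {"ticks": ticks, "sold": sold, "turned_away": turned_away}


flash_sale = run_sale(waiting_users=500_000, stock=1_000, drain_per_tick=200)
small_sale = run_sale(waiting_users=500, stock=100, drain_per_tick=20)
```

Halving the drain rate doubles the sale duration and halves the database write rate, so the drain rate is the single dial that trades user wait time against backend load.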

[Interactive demo: queue simulation. 500 users rush in at T=0 for 100 items in stock. The queue drains at a configurable rate (20/sec by default) while the display tracks stock left, queue length, units sold, your position, and your estimated wait.]

7. Level 6 — Anti-Bot Measures

Level 6 Making bots pay the human tax

A fair queue means nothing if automated buyers monopolize the first positions. Anti-bot layers must be enforced at queue entry, not at checkout.

Rate Limiting with Redis Fixed-Window Counters

-- Max 1 queue entry attempt per user per 60-second window
key = "ratelimit:user:" + user_id + ":" + floor(now_sec / 60)
count = INCR key
IF count == 1: EXPIRE key 120  -- TTL just past window boundary
IF count >  1: RETURN "rate_limited"

-- IP-level: max 3 entry attempts per IP per minute (catches bot farms)
ip_key = "ratelimit:ip:" + client_ip + ":" + floor(now_sec / 60)
ip_count = INCR ip_key
IF ip_count == 1: EXPIRE ip_key 120  -- without a TTL these keys accumulate forever
IF ip_count >  3: RETURN "rate_limited"
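The counter-per-time-bucket logic above can be sketched in memory as follows. `FixedWindowLimiter` is a hypothetical class; it keeps counts in a dict and never evicts them, whereas the Redis version relies on EXPIRE for cleanup.

```python
class FixedWindowLimiter:
    """One counter per (subject, time bucket), like INCR on a bucketed key."""

    def __init__(self, limit: int, window_sec: int = 60) -> None:
        self.limit = limit
        self.window_sec = window_sec
        self._counts = {}   # (subject, bucket) -> count; Redis would EXPIRE these

    def allow(self, subject: str, now_sec: float) -> bool:
        bucket = int(now_sec // self.window_sec)
        key = (subject, bucket)
        self._counts[key] = self._counts.get(key, 0) + 1   # INCR key
        return self._counts[key] <= self.limit


per_user = FixedWindowLimiter(limit=1)
first = per_user.allow("u_abc123", now_sec=5)          # allowed
repeat = per_user.allow("u_abc123", now_sec=30)        # same bucket, rejected
next_window = per_user.allow("u_abc123", now_sec=65)   # new bucket, allowed

per_ip = FixedWindowLimiter(limit=3)
ip_results = [per_ip.allow("203.0.113.7", now_sec=10) for _ in range(4)]
```

Because the counter resets at each bucket boundary, a client can send one request at second 59 and another at second 60. If that burst matters, a true sliding window (for example, a sorted set of request timestamps) closes the gap at the cost of more memory per client.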

Multi-Layer Bot Defence

Account age gate. Bots register new accounts for each sale. Require accounts to be at least 30 days old. This forces operators to maintain aged accounts — expensive at scale and detectable by statistical clustering.

CAPTCHA before queue entry. Present the CAPTCHA challenge during the countdown, not at T=0 when every second counts. Humans solve it before the sale starts; bots that skip it are rejected at queue entry.

Behavioral fingerprinting. Bots exhibit characteristic timing signatures:

Click arrives within 5ms of sale start — human reaction time is 150–300ms minimum
Mouse path is a direct straight line from page load to the buy button with zero deviation
No scroll events, no hover delay, no micro-pauses before clicking
HTTP headers inconsistent with the declared browser version

Device-bound participation token. Issue a signed token 5–10 minutes before the sale. The token binds to a browser fingerprint (canvas hash, WebGL renderer string, installed fonts, screen resolution), so the same device cannot join the queue twice:

-- Participation token payload (signed with HMAC-SHA256)
{
  "user_id":      "u_abc123",
  "sale_id":      "sale_2026_sneaker_001",
  "device_hash":  "sha256_of_fingerprint_components",
  "issued_at":    1748984400,
  "expires_at":   1748988000,
  "bot_score":    0.02
}

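-- Token is single-use: mark consumed on first queue entry.
-- NX makes the check and the mark one atomic step; a nil reply means the token was already used.
SET token:used:sha256(token) 1 NX EX 7200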
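Signing and redeeming such a token could look like the sketch below, using only the standard library. The secret, the token encoding (base64 body plus hex signature), and the function names are all assumptions for illustration; a production system would more likely use an established JWT library and keep the used-token registry in Redis.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-secret-change-me"   # hypothetical signing key
_used_tokens = set()                # stand-in for the token:used:* keys


def sign_token(payload: dict) -> str:
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def redeem_token(token: str, device_hash: str, now: int) -> Optional[dict]:
    """Return the payload if the token is valid and unused, else None."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None                              # forged or tampered
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["device_hash"] != device_hash:
        return None                              # presented from another device
    if not (payload["issued_at"] <= now < payload["expires_at"]):
        return None                              # outside validity window
    digest = hashlib.sha256(token.encode()).hexdigest()
    if digest in _used_tokens:
        return None                              # single-use: already consumed
    _used_tokens.add(digest)
    return payload


claims = {
    "user_id": "u_abc123",
    "sale_id": "sale_2026_sneaker_001",
    "device_hash": "device-1",
    "issued_at": 1748984400,
    "expires_at": 1748988000,
}
token = sign_token(claims)
first_use = redeem_token(token, "device-1", now=1748984500)
second_use = redeem_token(token, "device-1", now=1748984600)
```

The signature check runs before anything in the payload is trusted, and `hmac.compare_digest` avoids leaking how many leading characters of a forged signature were correct.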

Per-sale purchase cap. One account, one item, enforced at queue processing time:

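purchased_key = "purchased:" + sale_id + ":" + user_id
-- Claim atomically: NX sets the key only if it does not already exist,
-- which avoids a check-then-act race between EXISTS and SET
claimed = SET purchased_key 1 NX EX 86400
IF NOT claimed: RETURN "already_purchased"
-- If the order later fails or its hold expires: DEL purchased_key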

Nike SNKRS drops are notoriously competitive, often with 100,000 people competing for 1,000 pairs. Nike moved to a randomized draw model instead of first-come-first-served, a design that removes the speed advantage bots rely on. Submitting faster gives zero advantage in a random draw, because the draw happens at a fixed cutoff time and all entries before that moment have equal probability. Bots can still try to submit many entries, which is why the account and device checks above still matter. The queue-based approach can adopt the same idea: randomize queue order among entries that arrive within the first 500ms (the human reaction window).
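One way to apply the randomized window to the queue is sketched below. The function name `fair_order` and the `seed` parameter are hypothetical; entries inside the window are shuffled, and later arrivals keep strict arrival order behind them.

```python
import random
from typing import List, Optional, Tuple


def fair_order(
    entries: List[Tuple[str, int]],
    sale_start_ms: int,
    window_ms: int = 500,
    seed: Optional[int] = None,
) -> List[str]:
    """Shuffle arrivals inside the reaction window; keep FIFO order after it."""
    rng = random.Random(seed)
    early = [user for user, t in entries if t - sale_start_ms < window_ms]
    late = sorted((t, user) for user, t in entries if t - sale_start_ms >= window_ms)
    rng.shuffle(early)                 # a 1ms bot and a 320ms human are equals
    return early + [user for _, user in late]


entries = [
    ("bot", 1),
    ("human-a", 180),
    ("human-b", 320),
    ("late-1", 700),
    ("late-2", 650),
]
order = fair_order(entries, sale_start_ms=0, seed=7)
```

To implement this on the sorted set, the entry service could replace the score for early arrivals with the sale start time plus a random offset below the window size, so Redis still does the ordering.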

8. Level 7 — The Waiting Room

Level 7 Absorb pre-sale load on the CDN

The waiting room is a completely static HTML page served from the CDN edge. It collects users before the sale opens, pre-validates them, and releases a controlled burst at T=0.

Timeline

T-60 min: Users visit the product page and are served a redirect to the waiting room. This is a static file on CloudFront or Cloudflare — zero backend load, unlimited concurrent viewers, sub-10ms global latency.

T-10 min: The waiting room begins accepting "intent registrations." The page sends the user's auth token to a lightweight validation endpoint which checks account age, purchase history, and device fingerprint, then issues a queue entry JWT valid for 15 minutes.

T=0: The waiting room JavaScript detects the countdown reaching zero, either by local clock or by a server-sent event, and fires the queue entry request with the pre-validated JWT. Client clocks can be wrong or manipulated, so the server must still reject entries that arrive before the start time. Since validation already happened, queue entry is a single Redis ZADD with no database calls and no auth work beyond verifying the JWT signature.

// Waiting room countdown fires queue entry at T=0
fetch('/api/sale/sneaker-001/start-time')
  .then(function (r) { return r.json(); })
  .then(function (cfg) {
    var saleStart = new Date(cfg.start_time).getTime();

    var tick = setInterval(function () {
      var remaining = saleStart - Date.now();

      if (remaining <= 0) {
        clearInterval(tick);
        enterQueue(queueEntryJWT);  // single Redis ZADD
        return;
      }

      var secs = Math.floor(remaining / 1000);
      var mins = Math.floor(secs / 60);
      var pad  = (secs % 60) < 10 ? '0' : '';
      countdownEl.textContent = mins + ':' + pad + (secs % 60);
    }, 100);
  });

The key benefit: without the waiting room, T=0 triggers simultaneous authentication + authorization + bot-check + inventory operation for 500,000 users. With the waiting room, authentication is distributed over 10 minutes before the sale, and T=0 is reduced to a single Redis call per user.

Cloudflare Waiting Room and Queue-it sell exactly this pattern as managed products. A waiting room is fundamentally just a static countdown page with a WebSocket or SSE connection. The infrastructure cost to serve 500,000 people a CDN-cached countdown timer is essentially zero — a few dollars in bandwidth. The value is entirely in the controlled transition: at T=0, you decide exactly how many requests per second migrate from the waiting room to your real backend.

9. Handling Payment Failures

A user reaches the front of the queue, their slot is reserved, stock decremented — then their payment fails. What happens to that inventory unit?

The Soft Hold Pattern

-- Queue processor reserves a slot: create a hold with TTL
order_id = uuid()
hold_key = "hold:sneaker-001:" + order_id
SET hold_key user_id EX 300  -- 5-minute TTL

INSERT INTO orders
  (id, user_id, product_id, status, hold_expires_at)
  VALUES
  (order_id, user_id, 'sneaker-001', 'pending_payment', NOW() + INTERVAL '300 seconds');

-- On successful payment:
-- 0 rows affected means the hold already expired: refund instead of confirming
UPDATE orders SET status = 'confirmed'
  WHERE id = order_id AND status = 'pending_payment';
DEL hold_key
SET purchased:sale_id:user_id 1 EX 86400

-- On payment failure:
UPDATE orders SET status = 'cancelled'
  WHERE id = order_id AND status = 'pending_payment';
DEL hold_key
INCR product:sneaker-001:stock  -- release unit back to inventory

A background worker sweeps for expired holds every 60 seconds:

-- Cleanup job: runs every 60 seconds
expired = SELECT id, user_id FROM orders
  WHERE status = 'pending_payment'
    AND hold_expires_at < NOW();

FOR EACH order IN expired:
    -- Conditional update: skip orders that a concurrent payment just confirmed
    updated = UPDATE orders SET status = 'expired'
      WHERE id = order.id AND status = 'pending_payment';
    IF updated == 1:
        INCR product:sneaker-001:stock     -- reclaim the unit
        notifyUser(order.user_id, 'hold_expired')
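The sweep can be sketched against an in-memory `sqlite3` database. The integer timestamps, the reduced `orders` schema, and the `release_unit` callback (standing in for the Redis INCR) are assumptions for this example. The update is conditional on the status still being `pending_payment`, so an order confirmed between the SELECT and the UPDATE is not reclaimed.

```python
import sqlite3
from typing import Callable


def sweep_expired_holds(
    conn: sqlite3.Connection, now: int, release_unit: Callable[[], None]
) -> int:
    """Expire unpaid orders whose hold has lapsed; return units reclaimed."""
    expired = conn.execute(
        "SELECT id FROM orders "
        "WHERE status = 'pending_payment' AND hold_expires_at < ?",
        (now,),
    ).fetchall()

    reclaimed = 0
    for (order_id,) in expired:
        cur = conn.execute(
            "UPDATE orders SET status = 'expired' "
            "WHERE id = ? AND status = 'pending_payment'",
            (order_id,),
        )
        if cur.rowcount == 1:      # a concurrent confirmation did not win
            release_unit()         # INCR the stock counter
            reclaimed += 1
    conn.commit()
    return reclaimed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id TEXT PRIMARY KEY, status TEXT NOT NULL, hold_expires_at INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("o1", "pending_payment", 100),   # hold lapsed
        ("o2", "pending_payment", 900),   # still inside its hold
        ("o3", "confirmed", 100),         # already paid
    ],
)

stock = {"units": 0}


def release_unit() -> None:
    stock["units"] += 1


reclaimed = sweep_expired_holds(conn, now=500, release_unit=release_unit)
```

Running the sweep twice is safe: the second pass finds no rows still in `pending_payment` past their expiry, so no unit is released twice.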

Inventory Consistency Invariant

At all times the following must hold true. Run this as a monitoring query and alert on any divergence:

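-- The invariant:
-- redis_stock = initial_stock
--            - COUNT(confirmed orders)
--            - COUNT(pending_payment orders, including expired holds not yet swept)

redis_stock == initial_stock
  - (SELECT COUNT(*) FROM orders WHERE status = 'confirmed')
  - (SELECT COUNT(*) FROM orders WHERE status = 'pending_payment')

-- An expired hold still owns its unit until the sweep reclaims it,
-- so it is counted until its status changes to 'expired'.
-- Alert threshold: abs(divergence) > 1, sustained across several checks
-- Expected divergence in normal operation: 0 (brief blips while a request is in flight)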
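As a monitoring check, the invariant reduces to one subtraction. In this sketch the four inputs are assumed to come from a Redis GET and two COUNT queries; `stock_divergence` and `should_alert` are hypothetical helper names, and `held` means units currently owned by unpaid orders.

```python
def stock_divergence(redis_stock: int, initial_stock: int, confirmed: int, held: int) -> int:
    """Positive: Redis shows more stock than the orders table implies."""
    expected = initial_stock - confirmed - held
    return redis_stock - expected


def should_alert(divergence: int, threshold: int = 1) -> bool:
    return abs(divergence) > threshold


healthy = stock_divergence(redis_stock=850, initial_stock=1000, confirmed=120, held=30)
leaked = stock_divergence(redis_stock=845, initial_stock=1000, confirmed=120, held=30)
```

A negative divergence means units were decremented without a matching order (a missed compensation); a positive one means a unit was released twice, which is the direction that leads to overselling.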

Concert ticketing platforms such as Ticketmaster appear to use a very similar pattern. The countdown timer you see while completing your purchase behaves like a soft hold with a server-side expiry: if you abandon checkout, those seats return to inventory once the hold lapses. Seen this way, the checkout window is not just UX; it is the TTL on the hold. A background sweep like the one above catches seats abandoned mid-payment.

10. Capacity Estimates

Metric	Value
Users waiting at T=0	500,000
Queue entry requests at T=0 (burst)	~500,000/sec
Redis ZADD throughput (single node, unpipelined)	~100,000–200,000/sec; shard the queue or pipeline writes to absorb the burst
API pods needed for queue entry (10k req/s each)	50 pods
Controlled queue drain rate	200/sec
Time to sell all 1,000 items	~5 seconds
Steady-state DB write rate (order creation)	200/sec
Users notified "sold out"	~499,000
Active soft hold keys at peak	Up to ~1,000 (all units sell within ~5s; TTL 300s)
Redis memory for full queue (500k entries × ~50 bytes)	~25 MB
Waiting room CDN cost (500k users, static page)	A few dollars at most

The numbers reveal an important inversion: the hard engineering problem is not the 1,000 successful transactions — it is gracefully handling the 499,000 failures. Each rejection requires a polite notification. That is 499,000 WebSocket messages or SSE events to deliver, plus queue cleanup in Redis, plus user-facing messaging. Design your notification pipeline to handle this throughput before the first sale.
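Most rows in the table are one-line calculations from the scenario's inputs. The sketch below reproduces them; the constants are taken from the table, and the variable names are illustrative.

```python
import math

USERS = 500_000            # waiting at T=0
STOCK = 1_000              # units for sale
POD_CAPACITY_RPS = 10_000  # queue-entry requests one API pod can absorb
DRAIN_PER_SEC = 200        # controlled queue drain rate
BYTES_PER_ENTRY = 50       # approximate sorted-set cost per queued user

pods_needed = math.ceil(USERS / POD_CAPACITY_RPS)       # 50 pods
sell_through_sec = STOCK / DRAIN_PER_SEC                # 5.0 seconds
queue_memory_mb = USERS * BYTES_PER_ENTRY / 1_000_000   # 25.0 MB
sold_out_notices = USERS - STOCK                        # 499,000 messages
```

Changing one input shows which resources move with it: doubling the crowd doubles pods, memory, and notifications, but leaves the database write rate untouched because the drain rate is fixed.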

11. Failure Modes and Recovery

What happens when components go down mid-sale?

Redis failure. Both the queue and the inventory counter live in Redis. If Redis goes down, you cannot accept new queue entries or decrement stock. Mitigations:

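Redis Sentinel or Redis Cluster with automatic failover. Failover typically takes seconds rather than milliseconds, so tune the failure-detection timeouts and plan for a short pause. Replication is asynchronous, so reconcile the stock counter against the orders table after a failover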
Accept that a brief Redis outage pauses the sale; surface a "Technical difficulty — please wait" banner
Never run a flash sale without Redis replication. A single Redis node is a single point of failure

Queue processor crash. If the worker processing the queue crashes mid-drain, items may be in a gap between "popped from sorted set" and "order written to DB." Use a two-set approach for safe handoff:

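-- Atomically move entries to a "processing" set.
-- MULTI/EXEC cannot do this: replies to queued commands are not available
-- until EXEC, so the ZPOPMIN result cannot feed ZADD. Use a Lua script:
EVAL "
  local popped = redis.call('ZPOPMIN', KEYS[1], ARGV[1])  -- {member, score, ...}
  for i = 1, #popped, 2 do
    redis.call('ZADD', KEYS[2], ARGV[2], popped[i])       -- score = claim time
  end
  return popped
" 2 queue:sale:001 queue:processing 200 now_ms

-- Only remove from processing set after DB write confirmed:
ZREM queue:processing user_id

-- On worker restart: re-process anything stuck in queue:processing
-- Items stuck > 30s are stale; re-inject to front of main queue
-- Order creation must be idempotent: a re-processed entry may already have an order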
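The claim, acknowledge, and recover cycle can be sketched in memory. `HandoffQueue` is a hypothetical stand-in for the two sorted sets. Unlike the Redis commands above, it also remembers each entry's original arrival time while it is in processing, an assumption made here so that a recovered entry returns to the front of the line.

```python
from typing import Dict, List, Tuple


class HandoffQueue:
    """Waiting entries plus a 'processing' set for in-flight work."""

    def __init__(self, entries: List[Tuple[int, str]]) -> None:
        self.waiting = sorted(entries)                      # (arrival_ms, user_id)
        self.processing: Dict[str, Tuple[int, int]] = {}    # user -> (arrival, claimed_at)

    def claim(self, count: int, now_ms: int) -> List[str]:
        """Pop the first `count` entries and record them as in flight."""
        batch, self.waiting = self.waiting[:count], self.waiting[count:]
        for arrival_ms, user_id in batch:
            self.processing[user_id] = (arrival_ms, now_ms)
        return [user_id for _, user_id in batch]

    def ack(self, user_id: str) -> None:
        """Call only after the order write is confirmed."""
        self.processing.pop(user_id, None)

    def requeue_stale(self, now_ms: int, stale_after_ms: int = 30_000) -> List[str]:
        """Return abandoned in-flight entries to the waiting queue."""
        stale = [
            user for user, (_, claimed_at) in self.processing.items()
            if now_ms - claimed_at > stale_after_ms
        ]
        for user_id in stale:
            arrival_ms, _ = self.processing.pop(user_id)
            self.waiting.append((arrival_ms, user_id))
        self.waiting.sort()        # original arrival time restores their place
        return stale


queue = HandoffQueue([(1, "a"), (2, "b"), (3, "c")])
claimed = queue.claim(2, now_ms=1_000)        # worker takes "a" and "b"
queue.ack("a")                                # "a" is written; worker then crashes
recovered = queue.requeue_stale(now_ms=40_000)
```

After recovery, "b" is back ahead of "c" and nothing remains in flight. Because "b" may have been partially processed before the crash, the order write that follows has to tolerate being attempted twice.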

Database overload. If the database cannot sustain 200 writes/sec (unlikely on modern hardware, possible under high I/O contention):

Switch to batched multi-row INSERT: accumulate 200 order records, insert as one statement per tick
Temporarily reduce drain rate — 100/sec still sells 1,000 items in 10 seconds
Write orders to a Kafka topic and let the DB consumer work at its own pace

Clock skew across API pods. Queue positions are sorted by arrival timestamp. If API servers disagree on the current time by ±50ms, queue ordering within that window is non-deterministic. This is acceptable — simultaneous arrival is indistinguishable from near-simultaneous arrival, and the window is far smaller than human reaction time differences. Use NTP with a local time server if tighter ordering matters.

12. Complete Architecture at a Glance

CDN Edge
  Waiting room page, countdown JS, device fingerprinting
  Handles 10M+ concurrent viewers at zero backend cost
  Issues queue-entry JWTs at T-10 min (after account validation)
   |
   |  T=0: 500k bursts simultaneously
   v
API Gateway / Load Balancer
  Per-IP and per-user rate limiting (Redis INCR)
  JWT signature verification (CPU-only, no DB)
  Bot score threshold gate
   |
   v  (autoscaled, stateless)
Queue Entry Service x50 pods
  ZADD queue:sale:X NX timestamp user_id
  ZRANK -> return position + estimated wait
   |
   v
Redis Cluster
  queue:sale:X       sorted set   ~500k entries, ~25MB
  queue:processing   sorted set   in-flight items
  product:X:stock    integer      atomic DECR/INCR
  ratelimit:*        counters     per-window counters with TTLs
  hold:*             strings      EX 300 soft holds
  token:used:*       strings      single-use token registry
   |
   v  (single leader, or range-partitioned by sale)
Queue Processor Worker
  ZPOPMIN 200/sec -> DECR stock -> write order -> notify
  Compensation: if DB write fails, INCR stock back
   |
   +-----+-----------+
   v                 v
PostgreSQL           Notification Service
orders table         WebSocket / SSE push
200 writes/sec       499k "sold out" messages
Batch INSERT OK      "You got it!" to 1k buyers

Summary: The Six-Layer Defence

Layer	Problem Solved	Mechanism
Redis atomic DECR	Overselling	Atomic read-decrement; no application-level race
Virtual queue	Thundering herd	Absorb 500k burst; drain at controlled 200/sec
Rate limiting	Bot request spam	Per-user and per-IP Redis fixed-window counters
Account requirements	Throwaway bot accounts	30-day age gate, prior purchase requirement
Device-bound token	Multi-entry bots	Fingerprint-bound JWT, single-use enforcement
Waiting room (CDN)	Pre-sale load spike	Static page absorbs crowd; pre-validates users

The answer the interviewer is looking for is not "use Redis." It is the recognition that flash sales have three distinct failure modes — overselling, thundering herd, and bot fairness — each requiring a different mechanism, and that the virtual queue is the architectural cornerstone that makes the other layers composable. Without the queue, you are applying point fixes to a fundamentally broken request flow.

The real challenge in production is not the 1,000 successful sales. It is the 499,000 graceful failures — delivered fast, politely, without crashing anything.