System Design: Live Sports Scores — Real-Time Updates for Millions of Fans
The interview question lands with deceptive simplicity: design a live sports score app. But hidden inside is one of the most demanding fan-out problems in web engineering. Fifty million people, all watching the same scoreboard, all expecting to see a goal appear within two seconds of the referee’s whistle. Let’s build it — layer by layer.
1. The Fan-Out Problem
During the 2018 FIFA World Cup, Google reported that “World Cup” was searched 3.5 billion times over the tournament. The France vs Croatia final had tens of millions of concurrent score watchers — a load spike that would take down most architectures not specifically designed for it.
Every system design problem has a core tension. For live scores it is fan-out: one event (a goal is scored) must produce fifty million side-effects (screen updates) in under two seconds. The math alone is staggering. If you treat each notification as a separate unit of work, you are attempting to do 50,000,000 operations in 2,000 ms — or 25 million operations per second just for one goal event.
No single machine does that. The architecture must distribute the fan-out across thousands of nodes, and the data model must be designed to keep each individual payload as tiny as possible.
The design decisions cascade: How do clients connect? How does the backend propagate the event? How does payload size scale with user count? We’ll walk through six levels of increasing sophistication, then build a working demo that shows all the pieces together.
2. Level 1 — Naive Polling
The first instinct: have every browser call /score?matchId=123 on a timer.
// Client-side polling: the naive approach
function startPolling(matchId) {
  setInterval(async () => {
    const res = await fetch('/score?matchId=' + matchId);
    const data = await res.json();
    updateScoreUI(data);
  }, 5000); // every 5 seconds
}
Simple. Works. Terrible at scale.
- 50 M clients × 1 req / 5 s = 10 M requests/second to your origin servers
- The vast majority of those requests return “no change” (over 95% even at ~50 updates per match): pure wasted compute and bandwidth
- Worst-case latency approaches 5 seconds: a goal scored just after a poll stays invisible until the next one
- Each HTTP request carries full headers (~700 bytes), making even “empty” responses expensive
At 10 M req/s you need roughly 3,300 application servers just to handle the requests (assuming ~3,000 RPS each). Your infrastructure bill is astronomical and your users still wait up to 5 seconds.
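The arithmetic behind these figures can be reproduced in a few lines. This is a back-of-envelope sketch: the function name `pollingLoad` and its parameters are made up for illustration, and the inputs are the article's own assumptions (50 M clients, a 5-second interval, ~3,000 RPS per server, ~50 score updates in a 90-minute match), not measurements.

```javascript
// Back-of-envelope load for fixed-interval polling.
// All inputs are assumptions taken from the article, not measured values.
function pollingLoad({ clients, intervalSec, rpsPerServer, updatesPerMatch, matchSec }) {
  const requestsPerSec = clients / intervalSec;
  const servers = Math.ceil(requestsPerSec / rpsPerServer);
  // Polls a single client makes during the match, and the share of them
  // that cannot possibly carry news because nothing changed.
  const pollsPerClient = matchSec / intervalSec;
  const wastedFraction = 1 - Math.min(updatesPerMatch, pollsPerClient) / pollsPerClient;
  return { requestsPerSec, servers, wastedFraction };
}

const load = pollingLoad({
  clients: 50_000_000,
  intervalSec: 5,
  rpsPerServer: 3_000,
  updatesPerMatch: 50,
  matchSec: 90 * 60,
});

console.log(load.requestsPerSec, load.servers, load.wastedFraction.toFixed(3));
```

With ~50 updates per match roughly 95% of polls return nothing new; counting only goals pushes that share above 99%.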
3. Level 2 — Conditional Polling (304 Not Modified)
Add a lastUpdated timestamp to every response. The client sends it back as If-Modified-Since. The server returns 304 Not Modified with an empty body if nothing changed.
// Conditional polling: saves bandwidth, not connections
let lastUpdated = null;

function startConditionalPolling(matchId) {
  setInterval(async () => {
    const headers = {};
    if (lastUpdated) {
      headers['If-Modified-Since'] = lastUpdated;
    }
    const res = await fetch('/score?matchId=' + matchId, { headers });
    if (res.status === 200) {
      lastUpdated = res.headers.get('Last-Modified');
      const data = await res.json();
      updateScoreUI(data);
    }
    // 304 response: body is empty, nothing to do
  }, 5000);
}
Better, but the improvement is only in bytes transferred, not in requests served. The origin still handles 10 million requests per second, and every client that does not reuse a keep-alive connection adds a fresh TCP (and TLS) handshake on top. Server CPU is dominated by connection and request handling, not JSON serialisation. You haven’t solved the fundamental problem.
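The server half of conditional polling is not shown above. A minimal sketch of the decision it has to make follows; `conditionalResponse` is a hypothetical helper, and it assumes the server tracks when each match last changed as an epoch-millisecond timestamp.

```javascript
// Decide between 200 and 304 for a conditional GET.
// `lastModifiedMs` is when the match state last changed (epoch milliseconds).
function conditionalResponse(ifModifiedSinceHeader, lastModifiedMs) {
  // HTTP dates have one-second resolution, so compare whole seconds.
  const lastModifiedSec = Math.floor(lastModifiedMs / 1000);
  const lastModified = new Date(lastModifiedSec * 1000).toUTCString();
  const sinceMs = ifModifiedSinceHeader ? Date.parse(ifModifiedSinceHeader) : NaN;

  if (!Number.isNaN(sinceMs) && lastModifiedSec * 1000 <= sinceMs) {
    // Nothing changed since the client's copy: headers only, no body.
    return { status: 304, headers: { 'Last-Modified': lastModified }, sendBody: false };
  }
  // First request, unparseable header, or newer state: send the full body.
  return { status: 200, headers: { 'Last-Modified': lastModified }, sendBody: true };
}
```

Because of the one-second resolution, two score changes inside the same second are indistinguishable to this check; an ETag built from an event sequence number would avoid that.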
4. Level 3 — Server-Sent Events (SSE)
Server-Sent Events (SSE) came out of the HTML5 effort (the first W3C working draft dates from 2009) but remained obscure for years. Over HTTP/1.1 each stream occupies one of the browser’s few per-host connections; with HTTP/2 multiplexing, multiple SSE streams share one TCP connection, which makes SSE much more efficient. It’s now the go-to choice for unidirectional real-time data: stock tickers, live feeds, sports scores.
The key insight: instead of the client asking repeatedly, flip the model — have the server push whenever something changes.
SSE (Server-Sent Events) is perfect here. The browser opens one long-lived HTTP connection. The server streams data: lines down it whenever it has something to say. When a goal is scored, one server-initiated push reaches the client instantly — no polling interval latency.
// Server: Node.js SSE endpoint (Express)
const express = require('express');
const { EventEmitter } = require('events');

const app = express();
const scoreEmitter = new EventEmitter(); // in-process bus for score updates
scoreEmitter.setMaxListeners(0); // one listener per connected client, so lift the default limit of 10

app.get('/stream/score', (req, res) => {
  const matchId = req.query.matchId;
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  // Subscribe to score updates for this match
  const onUpdate = (payload) => {
    res.write('event: score\n');
    res.write('data: ' + JSON.stringify(payload) + '\n\n');
  };
  scoreEmitter.on('match:' + matchId, onUpdate);

  // Clean up when client disconnects
  req.on('close', () => {
    scoreEmitter.off('match:' + matchId, onUpdate);
  });
});
// Client: native EventSource API
const es = new EventSource('/stream/score?matchId=123');

es.addEventListener('score', (e) => {
  const update = JSON.parse(e.data);
  applyDelta(update); // apply incremental update to local state
});

es.addEventListener('error', () => {
  // EventSource reconnects automatically (built into the spec)
  console.log('Reconnecting...');
});
Why SSE over WebSockets for this use case?
Sports score updates are unidirectional: the server pushes, the client only consumes. WebSockets add bidirectional complexity (upgrade handshake, framing, ping/pong, message type routing) that you simply don’t need. SSE is a plain HTTP response, so it usually passes through proxies, CDNs, and load balancers without special configuration, provided nothing on the path buffers the response. The browser’s EventSource API reconnects automatically and sends a Last-Event-ID header, so a server that keeps a short event history can replay whatever the client missed.
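Last-Event-ID only works if the server attaches an `id:` line to each event. A small sketch of a frame serialiser for the text/event-stream format follows; `formatSseFrame` is a hypothetical helper name, and the event ID 42 is an arbitrary example value.

```javascript
// Serialise one event into the text/event-stream wire format.
function formatSseFrame({ id, event, data, retryMs }) {
  const lines = [];
  if (retryMs !== undefined) lines.push('retry: ' + retryMs);
  if (id !== undefined) lines.push('id: ' + id);
  if (event !== undefined) lines.push('event: ' + event);
  // A payload containing newlines must be sent as one data: line per line.
  for (const line of String(data).split(/\r\n|\r|\n/)) {
    lines.push('data: ' + line);
  }
  // A blank line terminates the event.
  return lines.join('\n') + '\n\n';
}

const frame = formatSseFrame({
  id: 42,
  event: 'score',
  data: JSON.stringify({ type: 'GOAL', team: 'home', minute: 67 }),
});
```

Building the frame once per event and writing the same string to every connection also avoids re-serialising the payload per client.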
[Interactive chart: protocol comparison of SSE vs WebSocket vs Polling (5s) over a 30-minute match simulation]
5. Level 4 — Redis Pub/Sub Fan-Out
A single Node.js process can hold ~50,000 concurrent SSE connections comfortably (the work is almost entirely I/O, with very little CPU per connection). For 50 million users, that means you need 1,000 SSE servers. But now there is a coordination problem: when a goal is scored, how does every one of those 1,000 servers know to push to its clients?
Redis Pub/Sub solves this cleanly.
// Score Ingestion Service: publishes to Redis
const redis = require('ioredis');
const pub = new redis();

async function publishGoal(matchId, goalEvent) {
  const payload = JSON.stringify({
    type: 'GOAL',
    matchId,
    team: goalEvent.team,
    minute: goalEvent.minute,
    scorer: goalEvent.scorer,
    newScore: goalEvent.newScore,
    ts: Date.now()
  });
  // Single publish → all 1,000 subscribers receive it
  await pub.publish('match:' + matchId + ':score', payload);
}
// SSE Server: subscribes and pushes to clients
const redis = require('ioredis');
const sub = new redis();
const clients = new Map(); // matchId → Set of response objects

sub.on('message', (channel, message) => {
  // channel = "match:123:score", message = JSON delta string
  const matchId = channel.split(':')[1];
  const matchClients = clients.get(matchId);
  if (!matchClients) return;

  // Build the frame once and reuse it for every client
  const frame = 'event: score\ndata: ' + message + '\n\n';
  matchClients.forEach((res) => {
    try {
      res.write(frame);
    } catch (e) {
      matchClients.delete(res);
    }
  });
});

// Subscribe when first client joins a match
function ensureSubscribed(matchId) {
  if (!clients.has(matchId)) {
    clients.set(matchId, new Set());
    sub.subscribe('match:' + matchId + ':score');
  }
}
The fan-out math now:
- 1 Redis PUBLISH to match:123:score
- 1,000 SSE servers each receive the message via their Redis subscription (network hop: ~1 ms)
- Each server pushes to ~50,000 clients (I/O loop: ~5–20 ms)
- Total: 50 million clients notified in under 100 ms in practice
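The key is that a single PUBLISH call makes Redis deliver the message to every subscribed connection. There is no loop in your application code over 1,000 servers; Redis handles that fan-out tier. Delivery is fire-and-forget, though: a server that is disconnected at that moment never sees the message, which is why the replay path in the failure-modes section matters.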
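The `ensureSubscribed` helper above subscribes on the first client but never unsubscribes when the last one leaves. One way to keep the Redis subscription in step with the local audience is a small registry like the sketch below. `MatchRegistry` is a hypothetical class; it only assumes the subscriber object exposes `subscribe(channel)` and `unsubscribe(channel)`, which an ioredis client does.

```javascript
// Tracks which local SSE clients watch which match and keeps the Redis
// subscription in step: subscribe on first client, unsubscribe on last.
class MatchRegistry {
  constructor(subscriber) {
    this.subscriber = subscriber; // needs subscribe(channel) / unsubscribe(channel)
    this.clients = new Map();     // matchId -> Set of client handles
  }

  static channel(matchId) {
    return 'match:' + matchId + ':score';
  }

  add(matchId, client) {
    let set = this.clients.get(matchId);
    if (!set) {
      set = new Set();
      this.clients.set(matchId, set);
      this.subscriber.subscribe(MatchRegistry.channel(matchId));
    }
    set.add(client);
  }

  remove(matchId, client) {
    const set = this.clients.get(matchId);
    if (!set || !set.delete(client)) return;
    if (set.size === 0) {
      this.clients.delete(matchId);
      this.subscriber.unsubscribe(MatchRegistry.channel(matchId));
    }
  }

  // Write one pre-built frame to every local client of a match.
  broadcast(matchId, frame) {
    const set = this.clients.get(matchId);
    if (!set) return 0;
    for (const client of set) client.write(frame);
    return set.size;
  }
}
```

In the SSE endpoint, `add` would be called when a request arrives and `remove` from the request's `close` handler.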
6. Level 5 — Tiered Fan-Out for Mega Events
The latency problem for live sports scores has an amusing human dimension: TV broadcast delay is typically 5–8 seconds. This means if you have a score alert on your phone AND are watching TV, you will see the goal on your phone before you see it on screen. Broadcasters have tried to compensate by deliberately delaying digital score alerts to match TV lag — a surreal engineering requirement.
At true World Cup scale, even Redis pub/sub can become a bottleneck. A single Redis node that serves a thousand or more subscriber connections, and copies every publish for every live match to each of them, hits CPU and memory ceilings. The answer is tiered fan-out: move from a flat architecture to a tree.
The tiers:
- Tier 0 (score source): Data vendor (Sportradar, Opta) sends a single event to your ingestion service
- Tier 1 (Kafka): Score Ingestion Service writes to a Kafka topic match-scores. Kafka gives you durability, replay, and decoupling
- Tier 2 (fan-out workers): 10 consumers read from Kafka. Each manages a shard of SSE servers, e.g. Worker 0 owns SSE servers 0–99, Worker 1 owns 100–199, etc.
- Tier 3 (regional Redis): Fan-out workers publish to the regional Redis clusters (US-East, EU-West, APAC). SSE servers in each region subscribe locally
- Tier 4 (SSE servers): Each SSE server holds 50k connections; on Redis message, pushes to all local clients
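// Kafka consumer: fan-out worker
const { Kafka } = require('kafkajs');
const Redis = require('ioredis');

const kafka = new Kafka({ brokers: ['kafka:9092'] });
const consumer = kafka.consumer({ groupId: 'fanout-workers' });

// One client per regional Redis. Host names are placeholders.
const redisClients = {
  'us-east': new Redis({ host: 'redis-us-east' }),
  'eu-west': new Redis({ host: 'redis-eu-west' }),
  apac: new Redis({ host: 'redis-apac' }),
};

async function run() {
  await consumer.connect();
  await consumer.subscribe({ topic: 'match-scores' });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value.toString());
      const channel = 'match:' + event.matchId + ':score';
      const payload = JSON.stringify(event);
      // Within a consumer group each event reaches exactly one worker, and a
      // match has viewers in every region, so publish to all regional clusters.
      // Local SSE servers pick it up from their own region.
      await Promise.all(
        Object.values(redisClients).map((client) => client.publish(channel, payload))
      );
    },
  });
}

run().catch((err) => {
  console.error('fan-out worker failed', err);
  process.exit(1);
});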
This architecture keeps each component at a manageable scale:
| Component | Count | Load per unit |
|---|---|---|
| Score ingestion | 1 | Trivial |
| Kafka brokers | 3–5 | ~50 events/sec (all matches) |
| Fan-out workers | 10 | 5 events/sec each |
| Regional Redis | 3 | ~333 subscribing SSE servers |
| SSE servers | 1,000 | 50k connections, ~5 pushes/min |
[Interactive visualiser: tiered fan-out, goal event propagation]
7. Level 6 — Delta Compression and Efficient Payloads
A critical multiplier in any fan-out system is payload size. Let’s compare two approaches to the score update message.
Full State Snapshot
{
"matchId": "wc2026_final",
"homeTeam": { "name": "Brazil", "code": "BRA" },
"awayTeam": { "name": "France", "code": "FRA" },
"score": { "home": 1, "away": 0 },
"minute": 67,
"status": "LIVE",
"events": [
{ "type": "GOAL", "minute": 67,
"team": "home", "player": "Vinicius Jr" }
// ... all previous events ...
],
"lineups": { /* 22 players ... */ },
"stats": { /* shots, possession ... */ }
}
~5,000 bytes per update
Delta Update
{
"matchId": "wc2026_final",
"type": "GOAL",
"team": "home",
"minute": 67,
"scorer": "Vinicius Jr",
"score": { "home": 1, "away": 0 },
"ts": 1749549600000
}
~120 bytes per update
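The delta's size can be measured rather than estimated. The sketch below serialises the delta shown above and multiplies by the audience size; `jsonBytes` is a hypothetical helper, and the count excludes SSE framing (`event:` and `data:` prefixes) and any transport compression.

```javascript
// Measure the UTF-8 size of a JSON payload as it would go on the wire
// (before SSE framing and any transport compression).
function jsonBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

const delta = {
  matchId: 'wc2026_final',
  type: 'GOAL',
  team: 'home',
  minute: 67,
  scorer: 'Vinicius Jr',
  score: { home: 1, away: 0 },
  ts: 1749549600000,
};

const deltaBytes = jsonBytes(delta);
// Bytes pushed for one event across the whole audience of 50 million.
const totalBytes = deltaBytes * 50_000_000;
console.log(deltaBytes + ' bytes per client, ' + (totalBytes / 1e9).toFixed(1) + ' GB per event');
```

The compact serialisation lands in the same range as the ~120-byte estimate, a small fraction of the ~5,000-byte snapshot.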
The strategy: clients maintain local match state. The server only sends what changed. On initial page load, the client fetches a full snapshot (GET /api/match/wc2026_final/state), then subscribes to the SSE stream for incremental deltas.
On reconnect, the EventSource API automatically sends Last-Event-ID (you set this via SSE’s id: field). Your server can fast-replay any missed events since that ID from a short-lived Redis stream or Postgres events table.
// Client: maintain local state, apply deltas
let matchState = null;

async function initMatch(matchId) {
  // 1. Fetch full snapshot once
  const res = await fetch('/api/match/' + matchId + '/state');
  matchState = await res.json();
  renderMatch(matchState);

  // 2. Subscribe to deltas
  const es = new EventSource('/stream/' + matchId);
  es.addEventListener('score', (e) => {
    const delta = JSON.parse(e.data);
    matchState = applyDelta(matchState, delta);
    renderMatch(matchState);
  });
}

function applyDelta(state, delta) {
  switch (delta.type) {
    case 'GOAL':
      state.score[delta.team]++;
      state.events.push(delta);
      break;
    case 'YELLOW_CARD':
    case 'RED_CARD':
    case 'SUBSTITUTION':
      state.events.push(delta);
      break;
    case 'STATUS_CHANGE':
      state.status = delta.status;
      break;
  }
  return state;
}
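Incrementing the score is fragile once replay enters the picture: if the same GOAL delta is delivered twice (after a reconnect, say), the client counts it twice. A more defensive variant is sketched below. It assumes each delta carries a unique `id` (for example the SSE event ID, available as `e.lastEventId`) and that the state keeps a `seenIds` list; both are illustrative additions, not part of the payloads shown earlier.

```javascript
// Apply a delta without mutating the previous state, ignoring duplicates.
// Assumption: each delta carries a unique `id` (for example the SSE event ID).
function applyDeltaOnce(state, delta) {
  if (state.seenIds.includes(delta.id)) return state; // replayed event: no-op

  const next = {
    ...state,
    seenIds: [...state.seenIds, delta.id],
  };

  switch (delta.type) {
    case 'GOAL':
      // Trust the absolute score in the delta instead of incrementing.
      next.score = { ...delta.score };
      next.events = [...state.events, delta];
      break;
    case 'YELLOW_CARD':
    case 'RED_CARD':
    case 'SUBSTITUTION':
      next.events = [...state.events, delta];
      break;
    case 'STATUS_CHANGE':
      next.status = delta.status;
      break;
    default:
      // Unknown delta types are skipped so old clients survive new event kinds.
      break;
  }
  return next;
}
```

For a long match a `Set` or a "highest ID seen" marker would be cheaper than the array scan; the array keeps the sketch JSON-friendly.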
8. The Score Widget on Google
When you search “World Cup final score” on Google, the result appears inline — no page navigation, no separate API call, score updating in real time. How?
Google owns the whole stack, so they optimise at every layer:
- Server-side render the initial score directly into the search result HTML. The score you see when the page loads costs zero additional requests; it’s baked into the HTML by the time the CDN serves it.
- Distributed score cache (Spanner/Bigtable): Google ingests data vendor feeds (or scrapes official sources) and stores match state in a globally replicated cache. Every datacenter has a local copy that’s < 1 second stale.
- Inline SSE from the same domain: The score widget opens an SSE connection back to Google’s servers. Because Google controls its own CDN (its edge network), the SSE termination can happen at the nearest Google PoP, often < 30 ms away from the user.
- No cross-origin overhead: Same-origin SSE streams don’t need CORS preflight. The EventSource connection is immediate.
- Graceful degradation: If the SSE connection fails or the user has a slow connection, the last server-rendered score is still correct and visible. The widget doesn’t blank out.
9. Interactive Live Score Demo
[Interactive demo: live match, FIFA World Cup 2026 Final]
10. Failure Modes and Resilience
The happy path is straightforward. The interesting engineering is in the failure modes.
SSE server crash: Clients automatically reconnect (EventSource handles this) and send Last-Event-ID. Your server replays any missed events from a Redis Stream: XRANGE match:123:events (lastId + COUNT 100. The “(” prefix (Redis 6.2+) makes the start exclusive, so the event the client already has is not sent again. The client is never stale for more than the reconnect interval (~3 seconds).
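// Redis Stream for event replay (server-side)
async function replayMissedEvents(matchId, lastEventId, res) {
  // XRANGE bounds are inclusive; the "(" prefix (Redis 6.2+) makes the start
  // exclusive so the event the client already has is not sent again.
  const start = lastEventId ? '(' + lastEventId : '-';
  const missed = await redis.xrange(
    'match:' + matchId + ':events', start, '+', 'COUNT', 100
  );
  missed.forEach(([id, fields]) => {
    // fields is a flat [name, value, ...] array; this assumes the JSON delta
    // is the value of the first field.
    res.write('id: ' + id + '\n');
    res.write('event: score\n');
    res.write('data: ' + fields[1] + '\n\n');
  });
}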
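The conversion from stream entries to SSE frames can be separated from the Redis call so it is testable without a server. The sketch below assumes the reply shape ioredis returns for XRANGE, `[[id, [field, value, ...]], ...]`, and a stream field named `data` holding the JSON delta; the field name and the helper `streamEntriesToFrames` are illustrative choices.

```javascript
// Turn an XRANGE reply into SSE frames.
// Reply shape: [[id, [field, value, field, value, ...]], ...]
// Assumption: each stream entry stores the JSON delta under a field named "data".
function streamEntriesToFrames(entries, lastEventId) {
  const frames = [];
  for (const [id, fields] of entries) {
    // Skip the event the client already has, in case the range was inclusive.
    if (id === lastEventId) continue;

    // Walk the flat array as name/value pairs and pick out "data".
    let payload;
    for (let i = 0; i + 1 < fields.length; i += 2) {
      if (fields[i] === 'data') {
        payload = fields[i + 1];
        break;
      }
    }
    if (payload === undefined) continue; // entry without a delta

    frames.push('id: ' + id + '\nevent: score\ndata: ' + payload + '\n\n');
  }
  return frames;
}
```

Sending the stream entry ID as the SSE `id:` means the browser's next Last-Event-ID can be passed straight back to XRANGE.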
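Redis pub/sub node failure: Use Redis Sentinel or Redis Cluster. SSE servers reconnect to the replica that was just promoted and resubscribe. Pub/sub does not redeliver, so anything published during the failover gap (~1–5 s) is lost to those subscribers. Browser connections stay open throughout, so clients will not ask for a replay on their own: after resubscribing, each SSE server should read the Redis Stream from the last event ID it delivered and push whatever it missed.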
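Data vendor outage: The Score Ingestion Service should have a dead-letter queue in Kafka. When the vendor recovers, events replay in order. Committed consumer group offsets let each worker resume where it left off, but delivery is at-least-once by default: an event can be processed again after a crash or rebalance, so downstream consumers and clients should deduplicate by event ID.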
Thundering herd on reconnect: After a major outage, all 50 million SSE clients reconnect simultaneously. Add jitter to the EventSource retry:
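// Prevent thundering herd on reconnect
function createSSEWithJitter(matchId) {
  function connect() {
    const es = new EventSource('/stream/' + matchId);
    es.addEventListener('error', () => {
      // Close to stop the built-in retry, then reconnect after a random delay.
      // A fresh EventSource does not carry the old Last-Event-ID, so a real
      // client would pass the last seen ID along explicitly.
      es.close();
      // Jitter: 1–5 seconds random delay before reconnect
      const delay = 1000 + Math.random() * 4000;
      setTimeout(connect, delay);
    });
    return es;
  }
  return connect();
}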
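A fixed 1–5 second window spreads the first wave of reconnects, but if the outage lasts longer every client keeps retrying at the same average rate. A common refinement is exponential backoff with jitter, sketched below. The function name `backoffDelayMs` and the defaults (1 s base, 30 s cap) are illustrative assumptions, not values from the article.

```javascript
// Exponential backoff with "full jitter": pick a random delay between 0 and
// min(cap, base * 2^attempt). `random` is injectable so the function is testable.
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Example: the upper bound of the delay grows 1 s, 2 s, 4 s, ... up to the cap.
for (let attempt = 0; attempt < 6; attempt++) {
  console.log('attempt ' + attempt + ': wait ' + backoffDelayMs(attempt) + ' ms');
}
```

The attempt counter would be incremented in the `error` handler and reset to zero once the connection's `open` event fires.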
11. Capacity Estimate
| Metric | Number |
|---|---|
| Concurrent users (World Cup final) | 50,000,000 |
| SSE servers needed | 1,000 (50k connections each) |
| Connections per SSE server | 50,000 |
| Score updates per 90-min match | ~50 (goals, cards, substitutions) |
| Fan-out latency target | < 2 seconds end-to-end |
| Typical achieved latency | 50–200 ms |
| Payload size (delta) | ~120 bytes |
| Bandwidth per goal event (50M × 120B) | ~6 GB |
| Redis pub/sub messages per event | 1 → 1,000 servers |
| Kafka partitions (match-scores topic) | 20 (2 per fan-out worker) |
| Regional Redis clusters | 3 (US-East, EU-West, APAC) |
| Memory per SSE connection (Node.js) | ~10 KB |
| Total SSE server memory (50k × 10 KB) | ~500 MB per server |
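The derived rows of the table follow from four inputs. The sketch below recomputes them so the assumptions are easy to vary; `capacityEstimate` is a hypothetical helper, and 10 KB is treated as 10,000 bytes to match the table's rounding.

```javascript
// Recompute the derived capacity numbers from the table's inputs.
function capacityEstimate({ users, connsPerServer, bytesPerConn, deltaBytes }) {
  const sseServers = Math.ceil(users / connsPerServer);
  return {
    sseServers,
    // Connection bookkeeping only; excludes the runtime's own baseline memory.
    memoryPerServerBytes: connsPerServer * bytesPerConn,
    // Payload bytes for one event, across all clients and per SSE server.
    bytesPerEvent: users * deltaBytes,
    bytesPerServerPerEvent: connsPerServer * deltaBytes,
  };
}

const estimate = capacityEstimate({
  users: 50_000_000,
  connsPerServer: 50_000,
  bytesPerConn: 10_000,
  deltaBytes: 120,
});

console.log(estimate);
```

Per server, one goal costs about 6 MB of outbound payload, which is why the per-event burst rather than the steady state drives network sizing.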
12. Summary Architecture Diagram
┌───────────────────────────────────────────────────────┐
│                    Score Data Flow                    │
└───────────────────────────────────────────────────────┘

[Sportradar/Opta]
       │  HTTP webhook / TCP feed
       ▼
[Score Ingestion Service]
       │  Kafka PRODUCE → topic: match-scores
       ▼
[Kafka Cluster]  ←─── durable, replayable, ordered
       │  10 consumer group instances
       ▼
[Fan-out Workers x10]  ─── one per shard of SSE servers
       │  Redis PUBLISH → match:{id}:score
       ├──────────────────┬──────────────────────┐
       ▼                  ▼                      ▼
[Redis US-East]    [Redis EU-West]         [Redis APAC]
       │                  │                      │
 ─────────────      ─────────────         ──────────────
   SSE x333           SSE x333               SSE x334
50k clients ea.    50k clients ea.        50k clients ea.
 ─────────────      ─────────────         ──────────────
       │                  │                      │
       └──────────────────┴──────────────────────┘
                   50,000,000 clients
                receive update in < 200ms
The elegance of this architecture is its layered fan-out: each tier multiplies the reach while keeping each individual node at a manageable load level. Redis doesn’t talk to clients. SSE servers don’t talk to Kafka. Each component does one job well.
An interesting edge case: what happens when the official data vendor feed is wrong? Sportradar occasionally sends erroneous events. In 2014, a data error caused a goal to briefly appear in a match score on multiple platforms simultaneously before being retracted. Designing for event correction (a RETRACT or AMEND delta type) is part of a production score system — the client’s local state machine must handle out-of-order corrections gracefully.
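One way to make corrections tractable is to treat the event log as the source of truth and derive the score from it, so a retraction is just a log edit. The sketch below assumes every event has a unique `id` and that RETRACT and AMEND deltas name their target through a hypothetical `targetId` field (with `changes` holding the amended fields). It handles a RETRACT that arrives before its target; an AMEND that arrives early is not handled.

```javascript
// Fold one delta into the log. log = { events: [...], retracted: [...ids] }
function reduceEvents(log, delta) {
  if (delta.type === 'RETRACT') {
    return {
      events: log.events.filter((e) => e.id !== delta.targetId),
      // Remember the ID so a late-arriving copy of the event is ignored too.
      retracted: [...log.retracted, delta.targetId],
    };
  }
  if (delta.type === 'AMEND') {
    return {
      ...log,
      events: log.events.map((e) =>
        e.id === delta.targetId ? { ...e, ...delta.changes } : e
      ),
    };
  }
  if (log.retracted.includes(delta.id)) return log; // retraction arrived first
  return { ...log, events: [...log.events, delta] };
}

// Derive the score from whatever goals remain in the log.
function scoreFrom(events) {
  const score = { home: 0, away: 0 };
  for (const e of events) {
    if (e.type === 'GOAL') score[e.team] += 1;
  }
  return score;
}
```

Recomputing from the log costs a pass over a few dozen events per match, which is negligible next to rendering.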
The design principles generalise far beyond sports scores. Any “one event → millions of notifications” problem — auction bids, stock prices, breaking news, collaborative document cursors — benefits from the same tiered fan-out pattern: durable queue (Kafka) → regional brokers (Redis) → persistent connections (SSE/WebSocket) → client-side state with delta application.
What makes the sports score case particularly instructive is the bursty nature of the load: the system must handle 50 million concurrent connections during a World Cup final but then serve a fraction of that for a mid-season league game the next afternoon. Auto-scaling the SSE tier (Kubernetes HPA on connection count) and keeping the fan-out workers stateless makes the scale-down as important as the scale-up.
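The scaling rule the Horizontal Pod Autoscaler applies is simple enough to state directly: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies it to connections per pod. It assumes connection count is exposed to the autoscaler as a custom per-pod metric, uses an illustrative target of 40,000 connections per pod, and ignores the tolerance band and stabilisation windows the real controller adds.

```javascript
// Replica count the HPA formula asks for:
// desired = ceil(currentReplicas * currentMetric / targetMetric)
function desiredReplicas(currentReplicas, currentConnsPerPod, targetConnsPerPod) {
  // Multiply before dividing to keep the arithmetic exact for whole numbers.
  return Math.ceil((currentReplicas * currentConnsPerPod) / targetConnsPerPod);
}

// Match day: 100 pods each at 50,000 connections, target 40,000 per pod.
const scaleUp = desiredReplicas(100, 50_000, 40_000);
// Next afternoon: 1,000 pods each holding only 2,000 connections.
const scaleDown = desiredReplicas(1_000, 2_000, 40_000);
console.log(scaleUp, scaleDown);
```

Scaling down SSE pods drops their open connections, so draining them gradually keeps the resulting reconnects from arriving all at once.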
Build the state machine on the client. Keep payloads tiny. Fan out through tiers. The rest is operations.