System Design: Live Sports Scores — Real-Time Updates for Millions of Fans

The interview question lands with deceptive simplicity: design a live sports score app. But hidden inside is one of the most demanding fan-out problems in web engineering. Fifty million people, all watching the same scoreboard, all expecting to see a goal appear within two seconds of the referee’s whistle. Let’s build it — layer by layer.


1. The Fan-Out Problem

During the 2018 FIFA World Cup, Google reported that “World Cup” was searched 3.5 billion times over the tournament. The France vs Croatia final had tens of millions of concurrent score watchers — a load spike that would take down most architectures not specifically designed for it.

Every system design problem has a core tension. For live scores it is fan-out: one event (a goal is scored) must produce fifty million side-effects (screen updates) in under two seconds. The math alone is staggering. If you treat each notification as a separate unit of work, you are attempting to do 50,000,000 operations in 2,000 ms — or 25 million operations per second just for one goal event.

No single machine does that. The architecture must distribute the fan-out across thousands of nodes, and the data model must be designed to keep each individual payload as tiny as possible.

The design decisions cascade: How do clients connect? How does the backend propagate the event? How does payload size scale with user count? We’ll walk through six levels of increasing sophistication, then build a working demo that shows all the pieces together.


2. Level 1 — Naive Polling

The first instinct: have every browser call /score?matchId=123 on a timer.

// Client-side polling — the naive approach
function startPolling(matchId) {
  setInterval(async () => {
    const res = await fetch('/score?matchId=' + matchId);
    const data = await res.json();
    updateScoreUI(data);
  }, 5000); // every 5 seconds
}

Simple. Works. Terrible at scale.

  • 50 M clients × 1 req / 5 s = 10 M requests/second to your origin servers
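  • The overwhelming majority of those requests return “no change” (a 90-minute match has ~50 updates spread over ~1,080 polls per client): pure wasted compute and bandwidth
  • Maximum latency: 5 seconds (a goal scored just after a poll is not seen until the next one)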
  • Each HTTP request carries full headers (~700 bytes), making even “empty” responses expensive

At 10 M req/s you need roughly 3,300 application servers just to handle the requests (assuming ~3,000 RPS each). Your infrastructure bill is astronomical and your users still wait up to 5 seconds.
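
The arithmetic behind these figures is easy to check. The sketch below is a back-of-the-envelope calculation only: the function name is invented for illustration, and the inputs are the estimates quoted above, not measurements.

```javascript
// Back-of-the-envelope cost of fixed-interval polling.
// All inputs are estimates, not measurements.
function pollingCost({ clients, intervalSec, rpsPerServer }) {
  const requestsPerSec = clients / intervalSec;
  const serversNeeded = Math.ceil(requestsPerSec / rpsPerServer);
  // An update can land anywhere inside the interval:
  // worst case it waits a full interval, on average half of one.
  const worstCaseDelayMs = intervalSec * 1000;
  const averageDelayMs = worstCaseDelayMs / 2;
  return { requestsPerSec, serversNeeded, worstCaseDelayMs, averageDelayMs };
}

const cost = pollingCost({
  clients: 50_000_000,
  intervalSec: 5,
  rpsPerServer: 3_000, // assumed per-server throughput
});
console.log(cost);
```

Shortening the interval improves latency only by multiplying the request rate, which is why the later levels abandon polling rather than tune it.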


3. Level 2 — Conditional Polling (304 Not Modified)

Add a lastUpdated timestamp to every response. The client sends it back as If-Modified-Since. The server returns 304 Not Modified with an empty body if nothing changed.

// Conditional polling — saves bandwidth, not connections
let lastUpdated = null;

function startConditionalPolling(matchId) {
  setInterval(async () => {
    const headers = {};
    if (lastUpdated) {
      headers['If-Modified-Since'] = lastUpdated;
    }

    const res = await fetch('/score?matchId=' + matchId, { headers });

    if (res.status === 200) {
      lastUpdated = res.headers.get('Last-Modified');
      const data = await res.json();
      updateScoreUI(data);
    }
    // 304 response: body is empty, nothing to do
  }, 5000);
}
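
The server half of this exchange is not shown above. A minimal sketch of the decision it has to make could look like the following; the function name and the shape of the `match` object are assumptions made for illustration, and the function returns a plain object rather than writing to a real HTTP response.

```javascript
// Decide between 200 and 304 for a conditional GET.
// HTTP dates have one-second resolution, so compare whole seconds.
function conditionalScoreResponse(match, ifModifiedSince) {
  const lastModifiedSec = Math.floor(match.updatedAtMs / 1000);
  const headers = {
    'Last-Modified': new Date(lastModifiedSec * 1000).toUTCString(),
  };

  const sinceMs = ifModifiedSince ? Date.parse(ifModifiedSince) : NaN;
  if (!Number.isNaN(sinceMs) && lastModifiedSec <= Math.floor(sinceMs / 1000)) {
    // Nothing newer than the client's copy: empty body
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body: JSON.stringify(match.score) };
}

// Usage: the first request gets the score, the repeat gets a 304
const match = {
  updatedAtMs: Date.UTC(2026, 6, 19, 20, 15, 30, 250),
  score: { home: 1, away: 0 },
};
const first = conditionalScoreResponse(match, undefined);
const second = conditionalScoreResponse(match, first.headers['Last-Modified']);
console.log(first.status, second.status);
```

An unparseable or missing header falls through to a full 200 response, which is the safe default.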

Better, but the improvement is only in bytes transferred, not in requests made. The origin still fields 10 million requests per second, and server CPU is dominated by connection handling and request parsing rather than JSON serialisation. You haven’t solved the fundamental problem.


4. Level 3 — Server-Sent Events (SSE)

Server-Sent Events (SSE) was standardised in HTML5 (2009) but remained obscure for years. With HTTP/2 multiplexing, SSE became much more efficient — multiple SSE streams share one TCP connection. It’s now the go-to choice for unidirectional real-time data: stock tickers, live feeds, sports scores.

The key insight: instead of the client asking repeatedly, flip the model — have the server push whenever something changes.

SSE fits this model well. The browser opens one long-lived HTTP connection, and the server streams data: lines down it whenever it has something to say. When a goal is scored, one server-initiated push reaches the client within a network round trip, with no polling-interval latency.

// Server: Node.js SSE endpoint (Express)
const express = require('express');
const { EventEmitter } = require('events');

const app = express();
const scoreEmitter = new EventEmitter(); // emits 'match:<id>' when a score changes
scoreEmitter.setMaxListeners(0);         // one listener per connected client (0 = no limit)

app.get('/stream/score', (req, res) => {
  const matchId = req.query.matchId;

  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  // Subscribe to score updates for this match
  const onUpdate = (payload) => {
    res.write('event: score\n');
    res.write('data: ' + JSON.stringify(payload) + '\n\n');
  };

  scoreEmitter.on('match:' + matchId, onUpdate);

  // Clean up when client disconnects
  req.on('close', () => {
    scoreEmitter.off('match:' + matchId, onUpdate);
  });
});

// Client: native EventSource API
const es = new EventSource('/stream/score?matchId=123');

es.addEventListener('score', (e) => {
  const update = JSON.parse(e.data);
  applyDelta(update);        // apply incremental update to local state
});

es.addEventListener('error', () => {
  // EventSource reconnects automatically — built into the spec
  console.log('Reconnecting...');
});

Why SSE over WebSockets for this use case?

Sports score updates are unidirectional: the server pushes, the client only consumes. WebSockets add bidirectional machinery (an upgrade handshake, framing, ping/pong, message type routing) that you simply don’t need. SSE is a plain HTTP response, so it usually passes through proxies, CDNs, and load balancers with little more than response buffering disabled. The browser’s EventSource API reconnects automatically and sends a Last-Event-ID header, so a server that tags events with id: and can replay from that ID lets clients catch up on anything they missed across reconnects.
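
The wire format itself is small enough to write by hand. The helper below is a sketch (the name `formatSSE` is invented here) that serialises one event, including the id: field that populates Last-Event-ID on reconnect, and that splits multi-line payloads into one data: line each, as the event-stream format requires.

```javascript
// Serialise one Server-Sent Event frame.
function formatSSE({ id, event, data }) {
  const lines = [];
  if (id !== undefined) lines.push('id: ' + id);
  if (event) lines.push('event: ' + event);
  // Each line of the payload needs its own "data:" prefix
  for (const line of String(data).split(/\r\n|\r|\n/)) {
    lines.push('data: ' + line);
  }
  return lines.join('\n') + '\n\n'; // a blank line terminates the frame
}

// Usage
const frame = formatSSE({
  id: '42',
  event: 'score',
  data: JSON.stringify({ type: 'GOAL', team: 'home' }),
});
console.log(frame);
```

A server would pass the returned string straight to res.write().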

Protocol Comparison

SSE vs WebSocket vs Polling (30-minute match simulation)

Protocol             Direction          Requests per client   Max goal latency
Polling (5 s)        client → server    one every 5 s         5,000 ms
Server-Sent Events   server → client    1                     ~50 ms
WebSocket            bidirectional      1                     ~50 ms

5. Level 4 — Redis Pub/Sub Fan-Out

A single Node.js process can hold ~50,000 concurrent SSE connections comfortably (the work is almost all I/O, with very little CPU per idle connection). For 50 million users, that means you need 1,000 SSE servers. But now there is a coordination problem: when a goal is scored, how does every one of those 1,000 servers know to push to its clients?

Redis Pub/Sub solves this cleanly.

// Score Ingestion Service — publishes to Redis
const Redis = require('ioredis');
const pub = new Redis();

async function publishGoal(matchId, goalEvent) {
  const payload = JSON.stringify({
    type: 'GOAL',
    matchId,
    team: goalEvent.team,
    minute: goalEvent.minute,
    scorer: goalEvent.scorer,
    newScore: goalEvent.newScore,
    ts: Date.now()
  });

  // Single publish → all 1,000 subscribers receive it
  await pub.publish('match:' + matchId + ':score', payload);
}

// SSE Server — subscribes and pushes to clients
// (a separate process, so it loads ioredis itself)
const Redis = require('ioredis');
const sub = new Redis();
const clients = new Map(); // matchId → Set of response objects

sub.on('message', (channel, message) => {
  // channel = "match:123:score", message = JSON delta string
  const matchId = channel.split(':')[1];
  const matchClients = clients.get(matchId);
  if (!matchClients) return;

  const frame = 'event: score\ndata: ' + message + '\n\n';

  matchClients.forEach((res) => {
    try { res.write(frame); }
    catch (e) { matchClients.delete(res); }
  });
});

// Subscribe when first client joins a match
function ensureSubscribed(matchId) {
  if (!clients.has(matchId)) {
    clients.set(matchId, new Set());
    sub.subscribe('match:' + matchId + ':score');
  }
}

The fan-out math now:

  1. 1 Redis PUBLISH to match:123:score
  2. 1,000 SSE servers each receive the message via their Redis subscription (network hop: ~1 ms)
  3. Each server pushes to ~50,000 clients (I/O loop: ~5–20 ms)
  4. Total: 50 million clients notified in roughly 100 ms, provided each hop stays within the estimates above

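The key is that your application issues a single PUBLISH and Redis itself delivers the message to every subscribed connection. There is no loop in your application code over 1,000 servers, because Redis handles that fan-out tier. Delivery is fire-and-forget, though: a subscriber that is disconnected at that moment never sees the message.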
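
The two-tier shape can be tried without any infrastructure. In the sketch below a Node.js EventEmitter stands in for Redis (an assumption made purely so the example runs in one process), and createSseServer is an invented helper that mimics one SSE server with its local client set.

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis: one emit reaches every subscribed "server"
const bus = new EventEmitter();

function createSseServer(name) {
  const clients = new Map(); // matchId -> Set of client objects with write()
  return {
    name,
    addClient(matchId, client) {
      if (!clients.has(matchId)) {
        clients.set(matchId, new Set());
        // Subscribe once per match, when the first local client arrives
        bus.on('match:' + matchId + ':score', (message) => {
          const frame = 'event: score\ndata: ' + message + '\n\n';
          for (const c of clients.get(matchId)) c.write(frame);
        });
      }
      clients.get(matchId).add(client);
      // Call the returned function when the connection closes
      return () => clients.get(matchId).delete(client);
    },
  };
}

// Usage: 3 servers x 4 clients each, then a single publish
let writes = 0;
const servers = [1, 2, 3].map((n) => createSseServer('sse-' + n));
const removers = [];
for (const s of servers) {
  for (let i = 0; i < 4; i++) {
    removers.push(s.addClient('123', { write: () => { writes++; } }));
  }
}
bus.emit('match:123:score', JSON.stringify({ type: 'GOAL', team: 'home' }));
console.log(writes); // one write per connected client
```

The publisher never learns how many servers or clients exist; each tier only knows about the tier directly below it.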


6. Level 5 — Tiered Fan-Out for Mega Events

The latency problem for live sports scores has an amusing human dimension: TV broadcast delay is typically 5–8 seconds. This means if you have a score alert on your phone AND are watching TV, you will see the goal on your phone before you see it on screen. Broadcasters have tried to compensate by deliberately delaying digital score alerts to match TV lag — a surreal engineering requirement.

At true World Cup scale, even Redis pub/sub can become a bottleneck. A single Redis node that holds subscriptions from a thousand SSE servers across every live match, and must copy each publish to all of them, hits CPU and network ceilings. The answer is tiered fan-out: move from a flat architecture to a tree.

The tiers:

  • Tier 0 — Score source: Data vendor (Sportradar, Opta) sends a single event to your ingestion service
  • Tier 1 — Kafka: Score Ingestion Service writes to a Kafka topic match-scores. Kafka gives you durability, replay, and decoupling
  • Tier 2 — Fan-out workers: 10 consumers in one Kafka consumer group share the topic’s partitions, so each event is handled by exactly one worker
  • Tier 3 — Regional Redis: The worker that picks up an event publishes it to every regional Redis cluster (US-East, EU-West, APAC). SSE servers in each region subscribe locally
  • Tier 4 — SSE servers: Each SSE server holds 50k connections; on Redis message, pushes to all local clients
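
// Kafka consumer: fan-out worker
const { Kafka } = require('kafkajs');
const Redis = require('ioredis');

const kafka = new Kafka({ brokers: ['kafka:9092'] });
const consumer = kafka.consumer({ groupId: 'fanout-workers' });

// One client per regional Redis cluster (hostnames are placeholders)
const redisClients = {
  'us-east': new Redis({ host: 'redis.us-east.internal' }),
  'eu-west': new Redis({ host: 'redis.eu-west.internal' }),
  'apac': new Redis({ host: 'redis.apac.internal' }),
};

async function main() {
  await consumer.connect();
  await consumer.subscribe({ topic: 'match-scores' });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value.toString());
      const channel = 'match:' + event.matchId + ':score';
      const payload = JSON.stringify(event);

      // Fan out to every regional Redis; local SSE servers pick it up.
      // A goal matters to viewers in all regions, not just one.
      await Promise.all(
        Object.values(redisClients).map((r) => r.publish(channel, payload))
      );
    }
  });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});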
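
Kafka only guarantees ordering within a partition, so per-match ordering depends on every event for a match carrying the same message key. The sketch below illustrates the idea with a deliberately simple hash; it is not Kafka's real partitioner (Kafka clients typically use a murmur2-based hash), and both function names are invented for the example.

```javascript
// Illustrative partitioner. NOT Kafka's actual hash, but it shares the
// property that matters: equal keys always map to the same partition.
function partitionFor(key, numPartitions) {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // keep an unsigned 32-bit int
  }
  return hash % numPartitions;
}

// Shape of the message a producer would send: the match ID is the key
function toKafkaMessage(event) {
  return { key: String(event.matchId), value: JSON.stringify(event) };
}

// Usage: two events from the same match land on the same partition
const goal = toKafkaMessage({ matchId: 'wc2026_final', type: 'GOAL', minute: 67 });
const card = toKafkaMessage({ matchId: 'wc2026_final', type: 'YELLOW_CARD', minute: 70 });
console.log(partitionFor(goal.key, 20) === partitionFor(card.key, 20));
```

With the key set this way, a goal and the card that follows it are always consumed in the order they were produced.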

This architecture keeps each component at a manageable scale:

Component          Count   Load per unit
Score ingestion    1       Trivial
Kafka brokers      3–5     ~50 events/sec (all matches)
Fan-out workers    10      ~5 events/sec each
Regional Redis     3       ~333 subscribing SSE servers
SSE servers        1,000   50k connections, ~5 pushes/min


7. Level 6 — Delta Compression and Efficient Payloads

A critical multiplier in any fan-out system is payload size. Let’s compare two approaches to the score update message.

Full State Snapshot

{
  "matchId": "wc2026_final",
  "homeTeam": { "name": "Brazil", "code": "BRA" },
  "awayTeam": { "name": "France", "code": "FRA" },
  "score": { "home": 1, "away": 0 },
  "minute": 67,
  "status": "LIVE",
  "events": [
    { "type": "GOAL", "minute": 67,
      "team": "home", "player": "Vinicius Jr" }
    // ... all previous events ...
  ],
  "lineups": { /* 22 players ... */ },
  "stats": { /* shots, possession ... */ }
}

~5,000 bytes per update
50M clients × 5 KB = 250 GB per goal

Delta Update

{
  "matchId": "wc2026_final",
  "type": "GOAL",
  "team": "home",
  "minute": 67,
  "scorer": "Vinicius Jr",
  "score": { "home": 1, "away": 0 },
  "ts": 1749549600000
}

~120 bytes per update
50M clients × 120 B = 6 GB per goal
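
These sizes can be measured rather than guessed. The sketch below (helper names invented for illustration) counts the UTF-8 bytes of the minified delta and multiplies out the broadcast cost. It ignores SSE framing and TCP/TLS overhead, so real egress will be somewhat higher.

```javascript
// Size of a payload on the wire: UTF-8 bytes of the minified JSON
function payloadBytes(obj) {
  return new TextEncoder().encode(JSON.stringify(obj)).length;
}

// Total egress for one broadcast to every connected client
function broadcastBytes(clients, bytesPerClient) {
  return clients * bytesPerClient;
}

const delta = {
  matchId: 'wc2026_final',
  type: 'GOAL',
  team: 'home',
  minute: 67,
  scorer: 'Vinicius Jr',
  score: { home: 1, away: 0 },
  ts: 1749549600000,
};

const size = payloadBytes(delta);
console.log(size, 'bytes per delta');
console.log(broadcastBytes(50_000_000, size) / 1e9, 'GB per goal');
```

The measured size lands in the same ballpark as the ~120-byte estimate; shortening field names or dropping matchId (the stream is already per-match) would shrink it further.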

The strategy: clients maintain local match state. The server only sends what changed. On initial page load, the client fetches a full snapshot (GET /api/match/wc2026_final/state), then subscribes to the SSE stream for incremental deltas.

On reconnect, the EventSource API automatically sends Last-Event-ID (you set this via SSE’s id: field). Your server can fast-replay any missed events since that ID from a short-lived Redis stream or Postgres events table.
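
The replay side needs a bounded log of recent events and a way to tell a client when it has fallen too far behind. The sketch below keeps that log in memory as a stand-in for the Redis stream or events table; the class name, the numeric IDs, and the `complete` flag are all assumptions made for illustration.

```javascript
// Bounded log of recent events with monotonically increasing numeric IDs.
// An in-memory stand-in for a Redis Stream or an events table.
class ReplayBuffer {
  constructor(capacity = 100) {
    this.capacity = capacity;
    this.events = []; // oldest first
    this.nextId = 1;
  }

  // Store an event and return it together with its assigned ID
  append(payload) {
    const entry = { id: this.nextId++, payload };
    this.events.push(entry);
    if (this.events.length > this.capacity) this.events.shift();
    return entry;
  }

  // Events strictly newer than lastEventId. `complete` is false when the
  // client's position is unknown or already evicted, in which case the
  // client should refetch a full snapshot instead of trusting the replay.
  since(lastEventId) {
    const last = Number(lastEventId);
    if (!Number.isInteger(last) || last < 0) {
      return { complete: false, events: [] };
    }
    const oldest = this.events.length ? this.events[0].id : this.nextId;
    return {
      complete: last >= oldest - 1,
      events: this.events.filter((e) => e.id > last),
    };
  }
}

// Usage: capacity 3, five events appended, so IDs 1 and 2 are evicted
const buffer = new ReplayBuffer(3);
for (let i = 1; i <= 5; i++) buffer.append({ type: 'GOAL', n: i });
console.log(buffer.since('4')); // only event 5, complete
console.log(buffer.since('1')); // gap: event 2 is gone, so not complete
```

Falling back to the snapshot endpoint when `complete` is false keeps the buffer small without ever leaving a client silently out of date.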

// Client: maintain local state, apply deltas
let matchState = null;

async function initMatch(matchId) {
  // 1. Fetch full snapshot once
  const res = await fetch('/api/match/' + matchId + '/state');
  matchState = await res.json();
  renderMatch(matchState);

  // 2. Subscribe to deltas
  const es = new EventSource('/stream/score?matchId=' + encodeURIComponent(matchId));
  es.addEventListener('score', (e) => {
    const delta = JSON.parse(e.data);
    matchState = applyDelta(matchState, delta);
    renderMatch(matchState);
  });
}

function applyDelta(state, delta) {
  switch (delta.type) {
    case 'GOAL':
      // The delta carries the absolute score, so a replayed GOAL cannot double-count
      state.score = delta.score;
      state.events.push(delta);
      break;
    case 'YELLOW_CARD':
    case 'RED_CARD':
    case 'SUBSTITUTION':
      state.events.push(delta);
      break;
    case 'STATUS_CHANGE':
      state.status = delta.status;
      break;
  }
  return state;
}

8. The Score Widget on Google

When you search “World Cup final score” on Google, the result appears inline — no page navigation, no separate API call, score updating in real time. How?

Google has not published how the widget works, but because it owns the whole stack it can optimise at every layer. A plausible design:

  1. Server-side render the initial score directly into the search result HTML. The score you see when the page loads costs zero additional requests, because it is already in the HTML when the results page arrives.

  2. Distributed score cache (for example on Spanner or Bigtable): Google ingests data vendor feeds (or scrapes official sources) and stores match state in a globally replicated store, so every datacenter can hold a local copy that’s < 1 second stale.

  3. Inline streaming from the same domain: The score widget can hold a streaming connection such as SSE back to Google’s servers. Because Google runs its own edge network, that connection can terminate at the nearest Google PoP, often < 30 ms away from the user.

  4. No cross-origin overhead: A same-origin stream needs no CORS configuration, so the EventSource connection can start immediately.

  5. Graceful degradation: If the streaming connection fails or the user has a slow connection, the last server-rendered score stays visible (if possibly stale). The widget doesn’t blank out.


9. Interactive Live Score Demo

[Interactive demo: live match widget for the FIFA World Cup 2026 Final, Brazil vs France, showing the match clock, the score, the number of connected fans, and a feed of match events.]

10. Failure Modes and Resilience

The happy path is straightforward. The interesting engineering is in the failure modes.

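SSE server crash: Clients automatically reconnect (EventSource handles this). They send Last-Event-ID. Your server replays any missed events from a Redis Stream (XRANGE match:123:events (lastId + COUNT 100, where the ( prefix, available since Redis 6.2, excludes the event the client already has). The client is stale for roughly the reconnect interval (~3 seconds) at most.

// Redis Stream for event replay (server-side).
// `redis` is a connected ioredis client; each stream entry is assumed
// to hold a single field whose value is the JSON delta.
async function replayMissedEvents(matchId, lastEventId, res) {
  const missed = await redis.xrange(
    'match:' + matchId + ':events',
    // "(" makes the start exclusive, so the last event the client
    // saw is not sent again
    lastEventId ? '(' + lastEventId : '-',
    '+',
    'COUNT', 100
  );
  missed.forEach(([id, fields]) => {
    res.write('id: ' + id + '\n');
    res.write('event: score\n');
    res.write('data: ' + fields[1] + '\n\n');
  });
}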

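Redis pub/sub node failure: Use Redis Sentinel or Redis Cluster. SSE servers reconnect to the replica that was just promoted and resubscribe. Pub/sub messages published during the failover (~1–5 s) are not redelivered, and the clients’ own connections never dropped, so each SSE server should backfill from the Redis Stream after resubscribing and push anything its clients missed.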

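Data vendor outage: The Score Ingestion Service should have a dead-letter queue in Kafka. When the vendor recovers, events replay in order. Committed consumer group offsets let workers resume where they left off; delivery is at-least-once, so give every event an ID and make handling idempotent so that a redelivered event is harmless.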
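
Redelivery can still happen if a worker crashes after handling an event but before its offset is committed, so it helps to drop events that have already been seen. The sketch below is one simple way to do that with a bounded set of recent IDs; the class name is invented, and it assumes every event carries a unique ID.

```javascript
// Remember the most recent event IDs so redelivered events can be skipped.
class RecentIdFilter {
  constructor(maxSize = 1000) {
    this.maxSize = maxSize;
    this.seen = new Set(); // a Set iterates in insertion order, oldest first
  }

  // Returns true the first time an ID is seen, false for a duplicate
  firstTime(id) {
    if (this.seen.has(id)) return false;
    this.seen.add(id);
    if (this.seen.size > this.maxSize) {
      // Evict the oldest remembered ID to keep memory bounded
      const oldest = this.seen.values().next().value;
      this.seen.delete(oldest);
    }
    return true;
  }
}

// Usage
const filter = new RecentIdFilter(2);
console.log(filter.firstTime('evt-1')); // first delivery
console.log(filter.firstTime('evt-1')); // redelivery of the same event
```

The window only needs to cover the longest plausible redelivery gap, since anything older has long since been applied.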

Thundering herd on reconnect: After a major outage, all 50 million SSE clients reconnect simultaneously. Add jitter to the EventSource retry:

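// Prevent thundering herd on reconnect
function createSSEWithJitter(matchId, onScore) {
  // A brand-new EventSource does not resend Last-Event-ID, so remember it
  // here and pass it along (the lastEventId query parameter is an assumed
  // server convention)
  let lastId = '';

  function connect() {
    let url = '/stream/score?matchId=' + encodeURIComponent(matchId);
    if (lastId) url += '&lastEventId=' + encodeURIComponent(lastId);
    const es = new EventSource(url);

    es.addEventListener('score', (e) => {
      lastId = e.lastEventId;
      onScore(JSON.parse(e.data));
    });

    es.addEventListener('error', () => {
      es.close(); // take over from the browser's built-in retry
      // Jitter: 1–5 seconds random delay before reconnect
      const delay = 1000 + Math.random() * 4000;
      setTimeout(connect, delay);
    });

    return es;
  }
  return connect();
}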
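
A fixed 1–5 second window spreads out the first wave, but if the servers are still struggling the retries keep arriving at the same rate. A common refinement is exponential backoff with full jitter, sketched below. The function name and default values are illustrative assumptions, and `random` is injectable only so the behaviour can be tested deterministically.

```javascript
// Delay before reconnect attempt number `attempt` (0-based).
// "Full jitter": pick uniformly between 0 and an exponentially growing cap.
function reconnectDelayMs(attempt, { baseMs = 1000, maxMs = 30000, random = Math.random } = {}) {
  const cap = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * cap);
}

// Usage: the window doubles with each failed attempt, up to maxMs
for (let attempt = 0; attempt < 6; attempt++) {
  console.log('attempt', attempt, 'wait', reconnectDelayMs(attempt), 'ms');
}
```

In the reconnect handler this would replace the fixed delay, with the attempt counter reset to zero once a connection succeeds.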

11. Capacity Estimate

Metric                                    Number
Concurrent users (World Cup final)        50,000,000
SSE servers needed                        1,000 (50k connections each)
Connections per SSE server                50,000
Score updates per 90-min match            ~50 (goals, cards, substitutions)
Fan-out latency target                    < 2 seconds end-to-end
Typical achieved latency                  50–200 ms
Payload size (delta)                      ~120 bytes
Bandwidth per goal event (50M × 120B)     ~6 GB
Redis pub/sub messages per event          1 → 1,000 servers
Kafka partitions (match-scores topic)     20 (2 per fan-out worker)
Regional Redis clusters                   3 (US-East, EU-West, APAC)
Memory per SSE connection (Node.js)       ~10 KB
Total SSE server memory (50k × 10 KB)     ~500 MB per server
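
The derived rows in this table follow from four inputs. The sketch below (function and field names are invented for illustration) recomputes them so that a different assumption, say 20k connections per server, can be plugged in quickly.

```javascript
// Recompute the derived capacity numbers from the base assumptions.
function capacityPlan({ users, connsPerServer, bytesPerConn, payloadBytes }) {
  return {
    sseServers: Math.ceil(users / connsPerServer),
    memoryPerServerBytes: connsPerServer * bytesPerConn,
    egressPerEventBytes: users * payloadBytes,
  };
}

const plan = capacityPlan({
  users: 50_000_000,
  connsPerServer: 50_000,
  bytesPerConn: 10_000, // ~10 KB per connection (estimate)
  payloadBytes: 120,    // delta size (estimate)
});
console.log(plan);
```

Per-connection memory is the least certain input here, so it is worth measuring under load before sizing the fleet.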

12. Summary Architecture Diagram


  ┌─────────────────────────────────────────────────────────────────┐
  │                    Score Data Flow                              │
  └─────────────────────────────────────────────────────────────────┘

  [Sportradar/Opta]
         │  HTTP webhook / TCP feed
         ▼
  [Score Ingestion Service]
         │  Kafka PRODUCE → topic: match-scores
         ▼
  [Kafka Cluster]  ←─── durable, replayable, ordered
         │  10 consumer group instances
         ▼
  [Fan-out Workers x10]  ─── one consumer group; each event handled once
         │  Redis PUBLISH → match:{id}:score
         ├──────────────────┬──────────────────────┐
         ▼                  ▼                      ▼
  [Redis US-East]   [Redis EU-West]       [Redis APAC]
         │                  │                      │
    ─────────────      ─────────────         ──────────────
    SSE x333           SSE x333              SSE x334
    50k clients ea.    50k clients ea.       50k clients ea.
    ─────────────      ─────────────         ──────────────
         │                  │                      │
         └──────────────────┴──────────────────────┘
                    50,000,000 clients
                    receive update in < 200ms

The elegance of this architecture is its layered fan-out: each tier multiplies the reach while keeping each individual node at a manageable load level. Redis doesn’t talk to clients. SSE servers don’t talk to Kafka. Each component does one job well.

An interesting edge case: what happens when the official data vendor feed is wrong? Sportradar occasionally sends erroneous events. In 2014, a data error caused a goal to briefly appear in a match score on multiple platforms simultaneously before being retracted. Designing for event correction (a RETRACT or AMEND delta type) is part of a production score system — the client’s local state machine must handle out-of-order corrections gracefully.
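
One way to make corrections tractable is to treat the event log as the source of truth and derive the score from it, so that a retraction is just a deletion. The sketch below assumes every delta carries a unique `id` and that corrections reference it through a `targetId` field; those field names, and the exact RETRACT and AMEND shapes, are hypothetical.

```javascript
// Keep events in a Map keyed by ID and derive the score from them,
// so a correction is just an edit to the event log.
function createMatchState() {
  return { events: new Map(), retracted: new Set() };
}

function applyDelta(state, delta) {
  switch (delta.type) {
    case 'RETRACT':
      // Remember the ID so this works even if the retraction
      // arrives before the event it cancels
      state.retracted.add(delta.targetId);
      state.events.delete(delta.targetId);
      break;
    case 'AMEND':
      // An AMEND for an unknown event is ignored in this sketch
      if (state.events.has(delta.targetId)) {
        const original = state.events.get(delta.targetId);
        state.events.set(delta.targetId, { ...original, ...delta.changes });
      }
      break;
    default:
      // Skip events already retracted; Map.set makes replays idempotent
      if (!state.retracted.has(delta.id)) state.events.set(delta.id, delta);
  }
  return state;
}

function currentScore(state) {
  const totals = { home: 0, away: 0 };
  for (const e of state.events.values()) {
    if (e.type === 'GOAL') totals[e.team]++;
  }
  return totals;
}

// Usage: a goal is awarded, then retracted
const state = createMatchState();
applyDelta(state, { id: 'e1', type: 'GOAL', team: 'home', minute: 12 });
applyDelta(state, { id: 'e2', type: 'GOAL', team: 'away', minute: 30 });
applyDelta(state, { id: 'r1', type: 'RETRACT', targetId: 'e2' });
console.log(currentScore(state));
```

Recomputing the score on every render costs a pass over a few dozen events, which is negligible next to the bugs that incremental counters invite.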

The design principles generalise far beyond sports scores. Any “one event → millions of notifications” problem — auction bids, stock prices, breaking news, collaborative document cursors — benefits from the same tiered fan-out pattern: durable queue (Kafka) → regional brokers (Redis) → persistent connections (SSE/WebSocket) → client-side state with delta application.

What makes the sports score case particularly instructive is the bursty nature of the load: the system must handle 50 million concurrent connections during a World Cup final but then serve a fraction of that for a mid-season league game the next afternoon. Auto-scaling the SSE tier (Kubernetes HPA on connection count) and keeping the fan-out workers stateless makes the scale-down as important as the scale-up.
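
The scaling rule the Kubernetes HPA applies is simple enough to reason about by hand: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies it to connections per pod. It omits the HPA's tolerance band and stabilisation windows, and the function name and the 40k target are assumptions chosen to leave headroom below the 50k ceiling.

```javascript
// Core HPA rule: desired = ceil(current * currentMetric / targetMetric),
// clamped to the configured min and max replica counts.
function desiredReplicas(currentReplicas, avgConnsPerPod, targetConnsPerPod, { min = 1, max = Infinity } = {}) {
  const raw = Math.ceil((currentReplicas * avgConnsPerPod) / targetConnsPerPod);
  return Math.min(max, Math.max(min, raw));
}

// Scale up: 100 pods averaging 60k connections against a 40k target
console.log(desiredReplicas(100, 60000, 40000));

// Scale down after the final: 1,000 pods averaging 2k connections each
console.log(desiredReplicas(1000, 2000, 40000, { min: 20 }));
```

Scaling down an SSE tier drops live connections, so in practice pods should be drained gradually and clients left to reconnect with jitter.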

Build the state machine on the client. Keep payloads tiny. Fan out through tiers. The rest is operations.