System Design: Location Services — How Uber Matches Drivers in Real-Time

Series: System Design Interview Prep — #11 of 15

This question appears in almost every senior eng interview at Uber, Lyft, DoorDash, and Google Maps. Geohash + Redis GEO is the canonical answer — know it cold.

The question: Design Uber’s ride-matching system. Find the nearest available driver within 5 km, match in under 500 ms, handle 10 million drivers updating location every 5 seconds, serve globally.

The deceptively hard part isn’t finding nearby drivers — it’s doing it at scale: 2 million location writes per second, global distribution, and sub-500 ms end-to-end latency under bursty demand. This post walks through every level of the solution, from a naive SQL query to the geohash + Redis architecture that underpins real ride-matching systems.


1. The Problem

Scale the numbers first:

| Metric | Value | Implication |
| --- | --- | --- |
| Active drivers | 10 million | Each holds a position record in memory |
| Location update interval | Every 5 seconds | 10M ÷ 5 s = 2 million writes/sec |
| Ride requests | ~1 million/hour | ~278 read queries/sec — actually modest |
| Search radius | 5 km | In NYC that's ~100+ drivers in the polygon |
| End-to-end latency SLA | < 500 ms | Location lookup + ETA calc + offer delivery |
| Coverage | Global | Multi-region with data locality |

The write storm — 2 million location writes per second — is the core challenge. Reads are comparatively light. Any solution that can’t absorb those writes at millisecond latency fails immediately.

2. Level 1 — Naive SQL Radius Query

The first instinct is a SQL query using the Haversine formula:

SQL
SELECT id, lat, lng,
  (6371 * acos(
    cos(radians(?)) * cos(radians(lat))
    * cos(radians(lng) - radians(?))
    + sin(radians(?)) * sin(radians(lat))
  )) AS distance
FROM drivers
WHERE available = 1
HAVING distance < 5
ORDER BY distance
LIMIT 10;

This works perfectly at hundreds of drivers. At 10 million active drivers with 2 million writes per second, it’s a disaster: full table scan on every request, HAVING filters applied after reading every row, trigonometric functions on every row, and the write rate alone saturates any relational database.

3. Level 2 — Bounding Box Optimization

Instead of computing Haversine distance on every row, first narrow candidates with a cheap bounding box filter:

SQL
-- Pre-compute bounding box on application side
-- lat ± delta_lat, lng ± delta_lng (delta ≈ 0.045° per 5 km)
SELECT id, lat, lng
FROM drivers
WHERE available = 1
  AND lat BETWEEN ? AND ?
  AND lng BETWEEN ? AND ?
-- Then apply Haversine only on this small result set

Add a composite spatial index on (lat, lng) — B-tree or R-tree. PostgreSQL’s PostGIS uses exactly this approach. Better, but two problems remain: in dense areas the bounding box still returns thousands of candidates, and 2 million writes per second will saturate any relational write path via WAL overhead and index maintenance. We need something fundamentally different for the location data.
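
For reference, the application-side bounding box mentioned in the comment above can be pre-computed directly from the radius; a rough sketch (1° of latitude ≈ 111 km, and a degree of longitude shrinks by cos(latitude)):

Python
import math

def bounding_box(lat, lng, radius_km=5.0):
  """Return (lat_min, lat_max, lng_min, lng_max) for a radius around a point."""
  delta_lat = radius_km / 111.0                                   # ~111 km per degree of latitude
  delta_lng = radius_km / (111.0 * math.cos(math.radians(lat)))   # degrees of longitude shrink toward the poles
  return lat - delta_lat, lat + delta_lat, lng - delta_lng, lng + delta_lng

# 5 km box around central London → delta_lat ≈ 0.045°, matching the comment above
print(bounding_box(51.5074, -0.1276))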

4. Level 3 — Geohashing

Geohash is elegant because it turns a 2D spatial problem into a 1D string prefix problem — and databases are very good at prefix queries.

Geohash encodes any (lat, lng) coordinate as a short string by recursively bisecting the world into a grid. Each additional character doubles precision in both dimensions. The magic: nearby locations share the same prefix (most of the time), so a spatial query becomes a string prefix query: WHERE geohash LIKE 'u4pruy%'.

How encoding works (a code sketch follows the steps):

  1. Start with lat range [−90, 90] and lng range [−180, 180]
  2. Interleave bits: even bits bisect longitude, odd bits bisect latitude
  3. Group every 5 bits into one base32 character (alphabet: 0123456789bcdefghjkmnpqrstuvwxyz)
  4. Longer hash = finer precision
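
A minimal sketch of that encoder, following the steps above (bit interleaving plus the base32 alphabet; not production-hardened):

Python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"   # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lng, precision=6):
  """Encode (lat, lng) into a geohash string of `precision` characters."""
  lat_lo, lat_hi = -90.0, 90.0
  lng_lo, lng_hi = -180.0, 180.0
  bits = []
  even = True                                  # even bit positions bisect longitude
  while len(bits) < precision * 5:             # 5 bits per base32 character
    if even:
      mid = (lng_lo + lng_hi) / 2
      if lng >= mid:
        bits.append(1)
        lng_lo = mid
      else:
        bits.append(0)
        lng_hi = mid
    else:
      mid = (lat_lo + lat_hi) / 2
      if lat >= mid:
        bits.append(1)
        lat_lo = mid
      else:
        bits.append(0)
        lat_hi = mid
    even = not even

  chars = []
  for i in range(0, len(bits), 5):             # group every 5 bits into one character
    value = int("".join(map(str, bits[i:i + 5])), 2)
    chars.append(BASE32[value])
  return "".join(chars)

# London (51.5074, -0.1276) encodes to a hash starting with "gcpv" at precision 6
print(geohash_encode(51.5074, -0.1276))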

Precision table:

| Precision | Cell size | Example | Use case |
| --- | --- | --- | --- |
| 1 | ≈ 5,000 × 5,000 km | u | Continent |
| 2 | ≈ 1,250 × 625 km | u4 | Country |
| 3 | ≈ 156 × 156 km | u4p | State |
| 4 | ≈ 39 × 20 km | u4pr | City |
| 5 | ≈ 4.9 × 4.9 km | u4pru | Ride pickup zone |
| 6 | ≈ 1.2 × 0.6 km | u4pruy | Street block |
| 7 | ≈ 153 × 153 m | u4pruyd | Building |
| 8 | ≈ 38 × 19 m | u4pruydp | Parking spot |

[Interactive geohash grid explorer: click a cell to zoom into its sub-cells, drop a driver, and watch its geohash change with the precision slider; neighbouring cells are highlighted in teal.]

The boundary problem: A driver near a cell border will fall in a different cell than a nearby rider. Solution: always query the 8 neighbouring cells in addition to the rider’s own cell. The grid above highlights these neighbours automatically.
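
A sketch of building that 9-cell query set, assuming the geohash_encode helper sketched above; each neighbour is derived by shifting the rider's coordinate by one cell width/height and re-encoding (wrap-around at the poles and antimeridian is ignored):

Python
def neighbour_cells(lat, lng, precision=5):
  """Return the rider's own cell plus its 8 neighbours at the given precision."""
  lng_bits = (5 * precision + 1) // 2          # longitude gets the extra bit at odd precisions
  lat_bits = (5 * precision) // 2
  cell_h = 180.0 / (2 ** lat_bits)             # cell height in degrees of latitude
  cell_w = 360.0 / (2 ** lng_bits)             # cell width in degrees of longitude
  return {
    geohash_encode(lat + dy * cell_h, lng + dx * cell_w, precision)
    for dx in (-1, 0, 1)
    for dy in (-1, 0, 1)
  }

# 9 precision-5 cells (~4.9 km each) to scan around a rider in central London
print(neighbour_cells(51.5074, -0.1276))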

5. Level 4 — Geohash Storage in Redis

Uber's production systems use purpose-built cell indexes (Google's S2 and, later, Uber's own H3 hexagonal grid), which handle cell boundaries better than geohash — but geohash is perfect for interviews and gets you full marks.

Redis is the natural home for location data: in-memory, sub-millisecond reads/writes, built-in GEO data structure. Each city (or geohash-prefix shard) gets its own Redis GEO key holding driver IDs; under the hood, the GEO commands encode (lat, lng) as a geohash-like score and store the members in a sorted set.

Redis CLI
# Driver updates location: GEOADD with the same member simply overwrites
# the old position, so no separate removal is needed
GEOADD drivers:city:london -0.1276 51.5074 driver:42

# Rider requests ride — query own cell + 8 neighbours
GEORADIUS drivers:city:london -0.1276 51.5074 5 km
  WITHCOORD WITHDIST ASC COUNT 20

# Check distance between two points
GEODIST drivers:city:london driver:42 driver:99 km

# Get geohash string for a driver
GEOHASH drivers:city:london driver:42
# → returns an 11-character geohash string; truncate client-side for coarser precision

# For the write storm: batch multiple updates in one round-trip.
# (Pipelining is a client-side feature, not a server command; see the Python example in Level 7.)
GEOADD drivers:city:london -0.128 51.507 driver:42
GEOADD drivers:city:london -0.135 51.512 driver:99
GEOADD drivers:city:london -0.121 51.503 driver:7

Driver update flow:

  1. Driver app detects GPS change > 50 m → sends update to Location Service via WebSocket
  2. Location Service calls GEOADD city:geohash_key lng lat driver_id in Redis
  3. Old geohash entry is automatically overwritten (sorted set member is unique by name)

Rider lookup flow (sketched in code after the steps):

  1. Rider taps “Request” → Matching Service receives (lat, lng, radius=5km)
  2. GEORADIUS returns up to 20 nearest available drivers with distances, sorted by proximity
  3. Matching Service ranks by ETA (calls Maps API for road distance on top 5 candidates)
  4. Sends offer to best driver via WebSocket push
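
A minimal sketch of steps 2–4, using redis-py's GEORADIUS wrapper; maps_road_eta and send_offer are hypothetical helpers standing in for the Maps API call and the WebSocket push:

Python
import redis

r = redis.Redis()

def match_driver(city, lat, lng):
  # Step 2: nearest 20 drivers by straight-line distance, closest first
  candidates = r.georadius(
    "drivers:city:" + city, lng, lat, 5, unit="km",
    withdist=True, sort="ASC", count=20,
  )
  if not candidates:
    return None                                # no drivers within 5 km

  # Step 3: re-rank the top 5 candidates by road ETA (hypothetical Maps API helper)
  top5 = candidates[:5]
  ranked = sorted(top5, key=lambda c: maps_road_eta(c[0].decode(), lat, lng))

  # Step 4: offer the ride to the best driver via the WebSocket gateway (hypothetical helper)
  best_driver = ranked[0][0].decode()
  send_offer(best_driver, lat, lng)
  return best_driver
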
6. Level 5 — Quad Tree

Geohash divides the world into uniform rectangular cells. Quad trees divide space adaptively: each node splits into 4 quadrants recursively until each leaf contains ≤ K points (e.g. K = 100). This gives dense coverage in cities (deep tree) and sparse coverage in rural areas (shallow tree), with lookup time proportional to tree depth (roughly O(log n)) rather than to the area searched.
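
A compact illustration of that adaptive subdivision (insert-only, no rebalancing; capacity K as described above):

Python
K = 4   # max points per leaf before it splits

class QuadNode:
  def __init__(self, x, y, half_w, half_h):
    self.x, self.y = x, y                      # centre of this cell
    self.half_w, self.half_h = half_w, half_h  # half-extents
    self.points = []                           # (px, py, driver_id) tuples
    self.children = []                         # 4 QuadNodes once split

  def insert(self, px, py, driver_id):
    if self.children:                          # internal node: route to the right quadrant
      self._child_for(px, py).insert(px, py, driver_id)
      return
    self.points.append((px, py, driver_id))
    if len(self.points) > K:                   # leaf overflowed: split into 4 quadrants
      hw, hh = self.half_w / 2, self.half_h / 2
      self.children = [
        QuadNode(self.x - hw, self.y - hh, hw, hh),
        QuadNode(self.x + hw, self.y - hh, hw, hh),
        QuadNode(self.x - hw, self.y + hh, hw, hh),
        QuadNode(self.x + hw, self.y + hh, hw, hh),
      ]
      for p in self.points:                    # push existing points down into the children
        self._child_for(p[0], p[1]).insert(*p)
      self.points = []

  def _child_for(self, px, py):
    return self.children[(1 if px >= self.x else 0) + (2 if py >= self.y else 0)]

# Root covers the whole lng/lat plane; dense cities end up with deeper subtrees
root = QuadNode(0.0, 0.0, 180.0, 90.0)
root.insert(-0.1276, 51.5074, "driver:42")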

[Interactive quad tree builder: click the canvas to place driver dots, then build the tree and watch it subdivide; capacity K controls when a cell splits.]

Quad tree vs geohash:

| Property | Geohash | Quad Tree |
| --- | --- | --- |
| Cell size | Uniform per level | Adaptive to density |
| Storage | String key in Redis | In-memory tree structure |
| Update cost | O(1) — just change key | O(log n) — may need rebalancing |
| Range query | Prefix scan + 9 cells | Tree traversal |
| Boundary problem | Yes — check neighbours | Yes — check sibling nodes |
| Used by | Redis GEO, DynamoDB | PostGIS, game engines |

7. Level 6 — Real-Time Matching Architecture

The write path runs Driver App → WebSocket Gateway → Location Service → Redis GEO Cluster; the read path runs Rider App → Ride Request Service → Matching Service, which queries the Redis GEO Cluster and the Maps API (ETA). Each component's role:

Driver App: Polls GPS every 1 s, applies 50 m filter client-side. Sends diffs only when threshold exceeded. Keeps persistent WebSocket connection to gateway. Handles offer delivery and accept/reject response.
WebSocket Gateway: Maintains 10 M+ concurrent connections. Horizontally scaled, stateless (connection state in Redis). Routes location updates to Location Service, delivers match offers back to drivers. Uses sticky sessions for connection affinity.
Location Service: Receives location updates, validates, writes to Redis GEO with pipelining. Batches updates within 100 ms windows. Shards by city/geohash prefix. Updates driver availability index. Target: <10 ms per update end-to-end.
Redis GEO Cluster: One Redis node per major city (sharded by geohash prefix). GEOADD for writes (~sub-ms). GEORADIUS for reads (~1–3 ms for 5 km radius). Total RAM: ~500 MB for 10 M drivers. Replicated with Redis Sentinel for HA.
Rider App: Sends ride request with pickup coordinates. Receives real-time driver position updates on map. Streams ETA updates every 10 s during trip. Handles offer acceptance, cancellation, and surge pricing display.
Ride Request Service: Validates rider request, checks for surge pricing in zone, creates ride record in DB (Cassandra), forwards to Matching Service. Handles idempotency (no double bookings).
Matching Service: Queries Redis GEO for nearest 20 drivers. Filters by availability and vehicle type. Calls Maps API for road ETA on top 5. Ranks by ETA + rating + acceptance rate. Sends offer via WebSocket. If rejected: offer next driver (up to 3 attempts in 30 s).
Maps API (ETA): Returns road-distance ETA for driver-to-pickup. Results cached aggressively (5 min TTL per route segment). Falls back to a straight-line × 1.3 estimate when the routing call fails or times out. Target: <50 ms for cached, <200 ms uncached.
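
One plausible reading of that caching-and-fallback behaviour, as a sketch; call_maps_api, haversine_km, and the 30 km/h average speed are assumptions, not the real service:

Python
import redis

r = redis.Redis()
ROUTE_TTL_SECONDS = 300           # 5 min cache per route segment
AVERAGE_SPEED_KMH = 30.0          # assumed urban driving speed for the fallback

def eta_minutes(origin, dest):
  key = "eta:%s:%s" % (origin, dest)
  cached = r.get(key)
  if cached is not None:
    return float(cached)                       # cache hit: target <50 ms

  try:
    eta = call_maps_api(origin, dest)          # hypothetical road-routing call, target <200 ms
  except Exception:
    # Fallback: straight-line distance × 1.3, converted to minutes at the assumed speed
    eta = haversine_km(origin, dest) * 1.3 / AVERAGE_SPEED_KMH * 60
  r.setex(key, ROUTE_TTL_SECONDS, eta)
  return eta
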
8. Level 7 — Handling the Write Storm

2 million location writes per second is non-trivial. Four optimizations, ordered by impact:

Optimization 1 — Client-side movement filter (saves ~60%)

Swift / Kotlin (pseudo)
val MIN_DISTANCE_METERS = 50.0
var lastSentLocation: Location? = null

fun onGpsUpdate(newLoc: Location) {
  if (lastSentLocation == null ||
      newLoc.distanceTo(lastSentLocation!!) > MIN_DISTANCE_METERS) {
    sendLocationUpdate(newLoc)   // only send if moved > 50 m
    lastSentLocation = newLoc
  }
  // otherwise discard silently — cheapest server call is none at all
}

The “50 metres moved” filter on the driver app saves ~60% of location updates — the cheapest optimisation is the one that prevents the write from happening at all.

Optimization 2 — Redis pipelining (batches multiple writes in one round-trip)

Python
def flush_location_batch(updates):
  pipe = redis.pipeline(transaction=False)
  for driver_id, lat, lng in updates:
    city_key = "drivers:" + get_city(lat, lng)
    pipe.geoadd(city_key, [lng, lat, driver_id])
  pipe.execute()  # one network round-trip for all updates

Optimization 3 — Shard Redis by city / geohash prefix

Config
# Redis Cluster: consistent hashing across shards
shard-0: drivers:london, drivers:paris, drivers:berlin
shard-1: drivers:nyc,    drivers:boston,  drivers:dc
shard-2: drivers:sf,     drivers:la,      drivers:seattle
shard-3: drivers:tokyo,  drivers:osaka,   drivers:seoul
# → each shard handles ~500k writes/sec — well within Redis limits

Optimization 4 — Dead zone suppression

Drivers in slow traffic or at stop lights barely move. Track the last-sent geohash and skip writes entirely if the driver is still in the same precision-6 cell (~1.2 km × 0.6 km). For Uber this suppresses an additional 20–30% of writes during rush-hour gridlock.
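
A sketch of that suppression check, assuming the geohash_encode helper from the geohash section; last_sent would live in the driver app (or the gateway) alongside the 50 m filter:

Python
last_sent = {}   # driver_id → last precision-6 geohash cell we wrote

def should_write_location(driver_id, lat, lng):
  cell = geohash_encode(lat, lng, precision=6)   # ~1.2 km × 0.6 km cell
  if last_sent.get(driver_id) == cell:
    return False                                 # still in the same cell: skip the Redis write
  last_sent[driver_id] = cell
  return True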

9. Level 8 — Surge Pricing Zones

Surge pricing is a direct consequence of geospatial aggregation: divide the city into geohash zones, count active riders vs available drivers per zone in real time, and compute a multiplier when supply is short.

Python
def compute_surge(geohash_prefix: str) -> float:
  zone_key = "zone:" + geohash_prefix
  riders   = int(redis.get(zone_key + ":riders") or 0)   # active ride requests
  drivers  = redis.zcard("drivers:" + geohash_prefix)  # available drivers

  if drivers == 0: return 3.0     # no drivers → max surge
  ratio = riders / drivers

  if   ratio < 1.0: return 1.0    # normal
  elif ratio < 2.0: return 1.5    # mild surge
  elif ratio < 3.5: return 2.0    # surge
  else:             return 3.0    # high surge

[Interactive surge map: zones are geohash precision-5 cells (~5 km); green = normal, amber = 1.5×, orange = 2×, red = 3× surge; simulating rush hour spikes demand in the city centre.]

10. Full Ride Matching Demo

[Interactive ride-matching simulator: five drivers move around the city in real time; dropping a ride request highlights the relevant geohash cells, finds the nearest driver, and animates the match.]

11. Capacity Estimation
| Component | Calculation | Result |
| --- | --- | --- |
| Location write throughput | 10M drivers × 200 B per update ÷ 5 s | 400 MB/s sustained |
| Redis GEO memory | 10M drivers × ~50 B per GEO entry | ~500 MB — fits in one instance |
| Ride request read load | 1M requests/hr = 278/s, each queries 9 cells | ~2,500 Redis ops/s — trivial |
| WebSocket connections | 10M drivers + 5M active riders | 15M connections, ~50 gateway servers at 300k each |
| Geohash index size | 10M entries × 12 B (hash + ID) | 120 MB per precision level |
| Redis shards needed | 500k writes/s per shard, 2M total | 4–8 Redis nodes for writes |
| Maps API calls | 278 ride requests/s × 5 ETA checks | ~1,390 Maps API calls/s — cache aggressively |

The counter-intuitive result: reads are easy (only 278 ride requests/sec), writes are hard (2M location updates/sec). Design around the write path. Redis with pipelining handles ~1M SET ops/sec per node — you need 2–4 nodes for the write load alone, then add replicas for HA.
12. Interview Cheat Sheet

Hit these points in order when answering in an interview:

  1. Quantify the write storm first — 10M drivers × 1/5s = 2M writes/sec. This disqualifies SQL immediately.
  2. Explain why geohash — turns 2D spatial into 1D string prefix; databases love prefix queries.
  3. Name the boundary problem — and explain the 9-cell (own + 8 neighbours) solution.
  4. Redis GEO — GEOADD, GEORADIUS, GEOHASH. In-memory, sub-ms, ~500 MB for 10M drivers.
  5. Write optimisations in order — client-side 50 m filter → pipelining → shard by city.
  6. ETA vs distance — sort by straight-line first, call Maps API only on top 5 candidates.
  7. Mention S2 / H3 — Uber's production stack uses purpose-built cell indexes (Google's S2 and Uber's own H3 hexagonal grid); geohash is interview-correct, but those libraries handle cell boundaries and polar distortion better.
Bonus points in interviews: explain the difference between geohash precision 5 (~5 km) for driver search and precision 6 (~600 m) for surge zone aggregation — they're deliberately different because the business problems are different. Surge needs fine-grained zones; driver search needs coarser ones to ensure enough candidates per cell.

Next in the series: #12 — Design a Distributed Message Queue — Kafka internals, partitioning, consumer groups, and at-least-once vs exactly-once delivery.