System Design: SSO and Session Management — Authentication at Scale
Google’s auth system is called GAIA (Google Accounts and ID Administration). It handles every login across every Google product for billions of users, every day.
Logging into Gmail and instantly being logged into YouTube, Drive, and Maps feels like magic. It isn’t. Behind that seamless experience sits one of the most carefully engineered systems in software: a distributed Single Sign-On (SSO) infrastructure that manages billions of active sessions, issues and rotates cryptographic tokens, and must never go down — because when it does, half the internet notices.
The interview question: Design the authentication system for a company like Google, where logging into one service (Gmail) also logs you into all other services (Drive, YouTube, Maps). Handle millions of sessions, token refresh, logout-everywhere, and support third-party apps via OAuth.
1. Session vs Token: The Fundamental Choice
Every authentication system faces the same foundational question first: where does the server keep track of who is logged in?
Server-Side Sessions
The traditional model: a user logs in, the server generates a random sessionId, stores the session data in a database (or Redis), and sends only the sessionId to the browser as a cookie. On every subsequent request, the server looks up the sessionId to find the user.
```python
import json
import secrets
import time

SESSION_TTL = 86400 * 30  # TTL: 30 days in seconds

# db, redis, request, verify_password and AuthError are app-level objects.

# Login: server creates a session in Redis
def login(username, password):
    user = db.find_user(username)
    if user is None or not verify_password(password, user.password_hash):
        raise AuthError("invalid credentials")
    session_id = secrets.token_hex(16)  # e.g. "a3f9c..." (128-bit random)
    now = int(time.time())
    session_data = {
        "userId": user.id,
        "createdAt": now,
        "expiresAt": now + SESSION_TTL,
        "ip": request.remote_addr,
        "userAgent": request.headers["User-Agent"],
    }
    redis.setex("session:" + session_id, SESSION_TTL, json.dumps(session_data))
    return session_id  # stored in browser cookie

# Every request: server validates the session
def authenticate_request(request):
    session_id = request.cookies.get("session_id")
    if not session_id:
        raise AuthError("session expired or invalid")
    raw = redis.get("session:" + session_id)
    if raw is None:
        raise AuthError("session expired or invalid")
    session = json.loads(raw)  # Redis returns the stored JSON string
    if session["expiresAt"] < time.time():
        raise AuthError("session expired or invalid")
    return session["userId"]
```
Pros: Instant revocation — delete the key from Redis and the user is immediately logged out on their next request. Small cookie (just the ID). Full control over session lifecycle.
Cons: Stateful — every application server must reach the same session store, adding a network round-trip to every authenticated request. The session store becomes a critical single point of failure.
JWT (JSON Web Tokens)
A different model: the server signs a token containing the user’s identity and hands it back to the client. The client sends that token on every request. The server verifies the signature locally — no database lookup required.
A JWT has three base64url-encoded parts separated by dots: header.payload.signature.
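To make the three-part structure concrete, here is a minimal sketch that builds and checks an HS256 token using only the Python standard library. The helper names (`b64url`, `sign_hs256`, `verify_hs256`) and the demo key are illustrative assumptions, and the sketch deliberately skips `exp` and `alg` validation, which a real JWT library performs.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """base64url without '=' padding, the encoding JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_hs256(payload: dict, key: bytes) -> str:
    """Return header.payload.signature for the given payload."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

def verify_hs256(token: str, key: bytes) -> dict:
    """Recompute the signature locally and return the payload if it matches."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))

DEMO_KEY = b"demo-secret-not-for-production"  # hypothetical key for illustration
token = sign_hs256({"userId": "user_123", "exp": 1700000900}, DEMO_KEY)
```

Changing any byte of the payload changes the expected signature, which is why the server can trust the claims without a lookup.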
```python
import time
import jwt  # PyJWT

# Login: server issues a signed JWT
def login_jwt(username, password):
    user = db.find_user(username)
    if user is None or not verify_password(password, user.password_hash):
        raise AuthError("invalid credentials")
    now = int(time.time())
    payload = {
        "userId": user.id,
        "email": user.email,
        "tokenVersion": user.token_version,  # for revocation (section 2)
        "iat": now,
        "exp": now + 900,  # expires in 15 minutes
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")

# Every request: server verifies locally, with no Redis lookup
def authenticate_request_jwt(request):
    auth_header = request.headers.get("Authorization", "")
    scheme, _, token = auth_header.partition(" ")
    if scheme != "Bearer" or not token:
        raise AuthError("missing bearer token")
    try:
        claims = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    except jwt.ExpiredSignatureError:
        raise AuthError("token expired")
    except jwt.InvalidTokenError:
        raise AuthError("invalid token")
    # Optionally verify tokenVersion against DB (section 2)
    return claims["userId"]
```
Pros: Stateless — any server can verify a token without shared storage. Scales horizontally with zero coordination. Works naturally across domains.
Cons: The logout problem — a signed token is valid until it expires. You can’t “un-sign” it. If a token is stolen, you’re stuck until expiry (up to 15 minutes for a short-lived token, or days if misconfigured).
| Property | Server Sessions | JWT |
|---|---|---|
| Revocation speed | Instant | On expiry only |
| Horizontal scaling | Needs shared store | Zero coordination |
| Cross-domain | Cookie limitations | Header-based, works anywhere |
| Token size | ~50 bytes (ID only) | ~200–500 bytes |
| DB lookup per request | Always | Never (or optional) |
| Payload tampering | Not possible | Detected by signature |
2. The JWT Revocation Problem
The JWT spec (RFC 7519) defines no revocation mechanism at all. This was a deliberate trade-off for statelessness, and the source of countless security bugs.
A JWT cannot be “un-issued.” Once signed, it is valid until its exp claim passes. This creates a fundamental tension: short expiry improves security but creates constant re-authentication friction. Long expiry improves UX but leaves stolen tokens valid for hours or days.
Three real solutions exist, each with different trade-offs:
Solution A: Short-Lived Access Tokens + Refresh Tokens
This is the de facto industry standard, built into OAuth 2.0 and used by Google and most major platforms.
- Access token: Short-lived (15 minutes). Stateless JWT. Used for every API call.
- Refresh token: Long-lived (30 days). Opaque random string stored in the DB. Used only to get a new access token.
Revocation is now possible: delete the refresh token from the database. The access token lives at most 15 more minutes — an acceptable window for most threat models.
```python
import hashlib
import secrets
import time
import jwt  # PyJWT

def sha256_hex(value):
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

# Shared by login and refresh: mint an access/refresh pair for a user
def issue_token_pair(user_id, device_id):
    now = int(time.time())
    access_token = jwt.encode(
        {"userId": user_id, "exp": now + 900},  # 15 minutes
        SECRET_KEY,
        algorithm="HS256",
    )
    refresh_token = secrets.token_urlsafe(64)
    db.store_refresh_token({
        "token": sha256_hex(refresh_token),  # store hash, not plaintext
        "userId": user_id,
        "expiresAt": now + 30 * 86400,  # 30 days
        "deviceId": device_id,
    })
    return {"access_token": access_token, "refresh_token": refresh_token}

# Issuing tokens at login
def login_with_refresh(username, password):
    user = db.authenticate(username, password)
    return issue_token_pair(user.id, request.get_device_id())

# Client calls this when access_token expires (HTTP 401)
def refresh_access_token(refresh_token):
    token_hash = sha256_hex(refresh_token)
    record = db.find_refresh_token(token_hash)
    if not record or record["expiresAt"] < time.time():
        raise AuthError("refresh token invalid or expired")
    # Rotate: old token out, new token in (prevents replay)
    db.delete_refresh_token(token_hash)
    return issue_token_pair(record["userId"], record["deviceId"])
```
Solution B: Token Blacklist in Redis
When a token is revoked, store its jti (JWT ID claim) in Redis with TTL equal to the token’s remaining lifetime. Each request checks the blacklist.
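A minimal sketch of the blacklist check, using an in-memory dictionary as a stand-in for Redis so it runs on its own. The class name `TokenBlacklist` and the injectable `clock` are assumptions for illustration; in production the same idea maps to a Redis key per `jti` whose TTL is the token's remaining lifetime.

```python
import time

class TokenBlacklist:
    """In-memory stand-in for a Redis blacklist: maps jti -> token expiry time."""

    def __init__(self, clock=time.time):
        self._expiry_by_jti = {}
        self._clock = clock

    def revoke(self, jti: str, token_exp: int) -> None:
        # An already-expired token needs no entry; it fails the exp check anyway.
        if token_exp > self._clock():
            self._expiry_by_jti[jti] = token_exp

    def is_revoked(self, jti: str) -> bool:
        expiry = self._expiry_by_jti.get(jti)
        if expiry is None:
            return False
        if expiry <= self._clock():
            # Mimic TTL expiry: the entry disappears once the token is dead.
            del self._expiry_by_jti[jti]
            return False
        return True

fake_now = [1000]  # mutable fake clock so the example is deterministic
blacklist = TokenBlacklist(clock=lambda: fake_now[0])
blacklist.revoke("jti-abc", token_exp=1900)
```

Because entries expire together with their tokens, the blacklist only ever holds tokens revoked within the last access-token lifetime. The cost is that the per-request lookup gives up part of the statelessness that motivated JWTs.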
Solution C: Token Versioning
Store a tokenVersion integer on the user record in the database. Include it in the JWT payload. On every request, verify the JWT’s tokenVersion matches the current value in the DB.
Revoking all sessions for a user is a single UPDATE users SET token_version = token_version + 1 WHERE id = ?. All existing tokens fail their version check on the next request.
```sql
-- Revoke all sessions for a user
UPDATE users SET token_version = token_version + 1 WHERE id = 'user_123';

-- Application check (pseudo-code in SQL style):
-- jwt.tokenVersion must equal users.token_version
SELECT token_version
FROM users
WHERE id = jwt_claim_user_id
  AND token_version = jwt_claim_token_version;
```
This approach re-introduces one DB read per request, but only a single integer column — fast with a primary key lookup and easily cached.
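The caching trade-off can be sketched as follows. `TokenVersionChecker`, its `load_version` callback (standing in for the primary-key lookup), and the 5-second TTL are hypothetical choices for illustration, not part of any library.

```python
import time

class TokenVersionChecker:
    """Compare a JWT's tokenVersion with the stored value, with a short cache."""

    def __init__(self, load_version, cache_ttl=5.0, clock=time.monotonic):
        self._load_version = load_version  # hypothetical: user_id -> version in DB
        self._cache_ttl = cache_ttl
        self._clock = clock
        self._cache = {}  # user_id -> (version, fetched_at)

    def current_version(self, user_id):
        now = self._clock()
        hit = self._cache.get(user_id)
        if hit is not None and now - hit[1] < self._cache_ttl:
            return hit[0]  # served from cache, no DB read
        version = self._load_version(user_id)
        self._cache[user_id] = (version, now)
        return version

    def is_current(self, claims) -> bool:
        return claims["tokenVersion"] == self.current_version(claims["userId"])

fake_db = {"user_123": 1}  # stand-in for the users table
fake_now = [0.0]           # fake clock so the example is deterministic
checker = TokenVersionChecker(fake_db.__getitem__, cache_ttl=5.0,
                              clock=lambda: fake_now[0])
claims = {"userId": "user_123", "tokenVersion": 1}
```

Note the consequence of caching: after the version is bumped, an old token can still pass for up to `cache_ttl` seconds, so the TTL is effectively the revocation delay.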
3. SSO Architecture
The protocol underlying most SSO systems is SAML 2.0 (enterprises) or OpenID Connect (modern web). OIDC is OAuth 2.0 + an identity layer (the id_token). Google uses OIDC.
Single Sign-On answers the question: how does logging into one service automatically authenticate you to all others? The answer is a centralized Identity Provider (IdP) — accounts.google.com — that all services (called Service Providers or Relying Parties) delegate authentication to.
The canonical flow:
1. User visits mail.google.com. Gmail checks for a local session and finds none. Gmail redirects to accounts.google.com/login?service=gmail&return_to=https://mail.google.com
2. User logs in at accounts.google.com. Auth Server verifies password (and MFA if enrolled). Creates a long-lived SSO session in Redis, sets an accounts.google.com cookie (httpOnly, Secure).
3. Auth Server redirects the browser back to mail.google.com?sso_token=XYZ.
4. Gmail sends the sso_token to Auth Server for validation (server-to-server). Auth Server validates the token, marks it as used (prevents replay), returns user identity.
5. Gmail creates a local session and sets a mail.google.com cookie. User is now authenticated to Gmail.
6. User visits youtube.com. YouTube has no local session. Redirects to accounts.google.com. Auth Server detects the existing SSO session cookie, so no credentials re-entry is needed. Issues a new SSO token for YouTube. Steps 3–5 repeat silently.

The key architectural insight: each service maintains its own local session (for performance, so they don't hit the Auth Server on every request), but all of them were bootstrapped via the same central SSO session.

Note the separation of the SSO cookie (accounts.google.com) from the service cookies (mail.google.com, youtube.com). Browsers scope cookies to domains, so the SSO cookie travels with every request to the Auth Server but is invisible to the individual services. This is not a bug; it's the design.
The SSO Token Exchange (Server-to-Server Validation)
```python
import json
import secrets
import time

# Auth Server: issue SSO token after successful authentication
def issue_sso_token(user_id, service, return_to):
    token_id = secrets.token_urlsafe(32)
    now = int(time.time())
    token_data = {
        "userId": user_id,
        "service": service,
        "return_to": return_to,
        "createdAt": now,
        "expiresAt": now + 60,  # 60-second window
    }
    redis.setex("sso_token:" + token_id, 120, json.dumps(token_data))
    return token_id

# Auth Server: validate SSO token on the service's server-to-server call
def validate_sso_token(token_id):
    key = "sso_token:" + token_id
    raw = redis.get(key)
    if raw is None:
        raise AuthError("token not found, expired, or already used")
    # Single use: DEL is atomic, so only one caller gets a deleted count of 1.
    # (The original hset on a string key would fail and was not atomic.)
    if redis.delete(key) != 1:
        raise AuthError("token already used (possible replay attack)")
    data = json.loads(raw)
    if data["expiresAt"] < time.time():
        raise AuthError("token expired")
    return data["userId"]
```
4. OAuth 2.0 + PKCE for Third-Party Apps
OAuth 2.0 is not an authentication protocol. It is an authorization framework. OpenID Connect (OIDC) adds the identity layer on top. “Login with Google” is OIDC, not raw OAuth.
OAuth solves a different problem: how does a third-party application (say, a calendar app) get limited access to your Google data, without you giving it your Google password?
The Authorization Code Flow with PKCE (Proof Key for Code Exchange) is the current standard for all OAuth clients, especially mobile and single-page apps that cannot safely store a client secret.
Why PKCE?
Without PKCE, the authorization code returned in the redirect URL could be intercepted by a malicious app on the same device (common on mobile — any app can register a URL scheme). PKCE makes the authorization code useless without the original code_verifier known only to the legitimate app.
1. App generates code_verifier = 64 random bytes (base64url-encoded) and code_challenge = BASE64URL(SHA256(code_verifier)). App stores code_verifier in memory (it is not sent with the authorization request).
2. App redirects the user to the Auth Server:
   GET /authorize?response_type=code&client_id=APP_ID&redirect_uri=https://app.example.com/callback&scope=email+calendar&code_challenge=CHALLENGE&code_challenge_method=S256&state=RANDOM_STATE
   The state parameter prevents CSRF attacks.
3. User approves. Auth Server stores the code_challenge alongside the generated authorization code. Redirects to:
   https://app.example.com/callback?code=AUTH_CODE&state=RANDOM_STATE
4. App exchanges the code for tokens:
   POST /token { grant_type=authorization_code, code=AUTH_CODE, redirect_uri=..., code_verifier=VERIFIER }
   Auth Server recomputes SHA256(code_verifier) and verifies it matches the stored code_challenge. If it does, issues tokens.
5. Auth Server responds:
   { "access_token": "...", "token_type": "Bearer", "expires_in": 3600, "refresh_token": "...", "id_token": "..." }
   The id_token is an OIDC JWT containing the user's identity (sub, email, name).

```javascript
// PKCE: generating the code_verifier and code_challenge
async function generatePKCE() {
  // 1. Generate a cryptographically random verifier
  const array = new Uint8Array(64);
  crypto.getRandomValues(array);
  const verifier = base64URLEncode(array);

  // 2. Hash it: challenge = BASE64URL(SHA256(verifier))
  const data = new TextEncoder().encode(verifier);
  const hashBuffer = await crypto.subtle.digest("SHA-256", data);
  const challenge = base64URLEncode(new Uint8Array(hashBuffer));

  return { verifier, challenge };
}

function base64URLEncode(buffer) {
  return btoa(String.fromCharCode(...buffer))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=/g, "");
}
```
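The server side of the same exchange can be sketched in Python. The function names (`make_verifier`, `s256_challenge`, `verify_pkce`) are illustrative assumptions; the S256 transform itself is the one described above, BASE64URL(SHA256(code_verifier)) without padding.

```python
import base64
import hashlib
import hmac
import secrets

def make_verifier() -> str:
    """Client side: 64 random bytes, base64url-encoded without padding."""
    return base64.urlsafe_b64encode(secrets.token_bytes(64)).rstrip(b"=").decode("ascii")

def s256_challenge(code_verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(code_verifier)), padding stripped."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

def verify_pkce(code_verifier: str, stored_challenge: str) -> bool:
    """Auth Server side: recompute the challenge and compare in constant time."""
    recomputed = s256_challenge(code_verifier).encode("ascii")
    return hmac.compare_digest(recomputed, stored_challenge.encode("utf-8"))

verifier = make_verifier()
challenge = s256_challenge(verifier)  # sent in the /authorize request
```

An attacker who intercepts only the authorization code holds the challenge at best, and cannot derive a verifier that hashes to it, so the token request fails.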
7. Session Storage at Scale
Redis is not a database. It is an in-memory store with optional persistence. For session data you can afford to lose (user just logs in again), this is fine. For refresh tokens, you need durability: use Redis AOF or a proper DB.
Assume roughly 5 billion active sessions at Google scale. Keeping all of them in a single Redis instance is impossible (memory limit) and unwise (single point of failure). The solution is tiered storage based on session activity.
Hot tier — Redis Cluster:
- Sessions active in the last 7 days
- Sharded by sessionId across 50+ nodes (~50 GB each)
- O(1) reads, sub-millisecond latency
- LRU eviction pushes cold sessions to warm tier
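To illustrate what sharding by sessionId means in practice: Redis Cluster maps every key to one of 16384 hash slots using a CRC16 of the key, and each node owns a range of slots. The sketch below assumes Python's `binascii.crc_hqx` with an initial value of 0 as that CRC16, ignores Redis hash tags (`{...}`), and uses a hypothetical even split of slots across nodes; real clusters track slot ownership explicitly.

```python
import binascii

REDIS_CLUSTER_SLOTS = 16384

def key_slot(key: str) -> int:
    """Hash slot for a key: CRC16(key) mod 16384 (hash tags not handled)."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % REDIS_CLUSTER_SLOTS

def node_for_slot(slot: int, node_count: int) -> int:
    """Hypothetical layout: contiguous, evenly sized slot ranges per node."""
    return slot * node_count // REDIS_CLUSTER_SLOTS

session_key = "sess:a3f9c"  # illustrative session key
node_index = node_for_slot(key_slot(session_key), node_count=50)
```

Because session IDs are random, slots fill evenly, and any application server can compute the owning node without a central directory.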
Warm tier — Redis with disk persistence:
- Sessions 7–30 days inactive
- Slower access acceptable — user is returning after a gap
- When accessed, session is promoted back to hot tier
Cold tier — Cassandra:
- Sessions 30+ days inactive (keep for “remember me” scenarios)
- Wide-column model: partition key is userId, clustering key is sessionId
- Batch deletion of expired sessions via TTL
```python
HOT_TTL = 604800  # 7 days in seconds

class TieredSessionStore:
    def get(self, session_id):
        key = "sess:" + session_id
        # 1. Check hot tier (Redis) first
        session = self.redis_hot.get(key)
        if session:
            self.redis_hot.expire(key, HOT_TTL)  # refresh TTL
            return decode(session)
        # 2. Check warm tier
        session = self.redis_warm.get(key)
        if session:
            self._promote_to_hot(session_id, session)
            return decode(session)
        # 3. Check cold tier (Cassandra). Querying by session_id alone assumes
        # a table or index keyed by session_id, since the main table is
        # partitioned by userId.
        row = self.cassandra.execute(
            "SELECT * FROM sessions WHERE session_id = %s", [session_id]
        ).one()
        if row:
            encoded = encode(row)
            self._promote_to_hot(session_id, encoded)
            return decode(encoded)  # same shape as the hot and warm paths
        return None  # session not found anywhere

    def _promote_to_hot(self, session_id, data):
        self.redis_hot.setex("sess:" + session_id, HOT_TTL, data)
        # Optionally delete from warm/cold to avoid duplication
```
8. Logout Everywhere
When a user clicks “Sign out of all devices,” the system must invalidate every active session across every device, every browser, every service. This is the logout problem at its hardest.
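One way to sketch the revocation itself is to combine the token versioning and refresh-token storage from section 2 in a single transaction. The example uses an in-memory SQLite database so it runs on its own; the table layouts and the `logout_everywhere` helper are assumptions for illustration.

```python
import sqlite3

def logout_everywhere(conn: sqlite3.Connection, user_id: str) -> int:
    """Invalidate all access tokens and delete all refresh tokens for a user.

    Returns the number of refresh tokens removed.
    """
    with conn:  # one transaction: both statements commit together or not at all
        conn.execute(
            "UPDATE users SET token_version = token_version + 1 WHERE id = ?",
            (user_id,),
        )
        cursor = conn.execute(
            "DELETE FROM refresh_tokens WHERE user_id = ?", (user_id,)
        )
    return cursor.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id TEXT PRIMARY KEY, token_version INTEGER NOT NULL);
CREATE TABLE refresh_tokens (token_hash TEXT PRIMARY KEY, user_id TEXT NOT NULL);
INSERT INTO users VALUES ('user_123', 1), ('user_456', 1);
INSERT INTO refresh_tokens VALUES
    ('h1', 'user_123'), ('h2', 'user_123'), ('h3', 'user_456');
""")
revoked = logout_everywhere(conn, "user_123")
```

Running both statements in one transaction avoids a window in which access tokens are already stale but a refresh token can still mint a fresh one.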
- Bump the token version: UPDATE users SET token_version = token_version + 1 WHERE id = 'user_123'. All access tokens now carry a stale version, so they will fail on next use.
- Delete the refresh tokens: DELETE FROM refresh_tokens WHERE user_id = 'user_123'. Clients can no longer silently renew their access tokens.

9. Multi-Factor Authentication (MFA)
TOTP (RFC 6238) uses HMAC-SHA1 over the current Unix time divided by 30. The same algorithm runs on your phone and the server: if clocks are in sync, the codes match. No network needed.
MFA adds a second verification step after password authentication. The most common mechanism is TOTP (Time-based One-Time Password), used by Google Authenticator, Authy, and 1Password.
TOTP algorithm:
```python
import base64
import hashlib
import hmac
import struct
import time

def generate_totp(secret_base32, digits=6, period=30, for_time=None):
    # 1. Decode the shared secret (stored in user DB, displayed as QR code)
    secret = base64.b32decode(secret_base32.upper())
    # 2. Compute time counter: 30-second windows since Unix epoch
    if for_time is None:
        for_time = time.time()
    counter = int(for_time) // period
    # 3. HMAC-SHA1 of the 8-byte big-endian counter
    msg = struct.pack(">Q", counter)
    h = hmac.new(secret, msg, hashlib.sha1).digest()
    # 4. Dynamic truncation: take 4 bytes at offset indicated by last nibble
    offset = h[-1] & 0x0F
    code = struct.unpack(">I", h[offset:offset + 4])[0] & 0x7FFFFFFF
    # 5. Modulo to get N-digit code
    return str(code % (10 ** digits)).zfill(digits)

def verify_totp(secret_base32, provided_code, window=1, period=30):
    # Accept current window and ±1 (clock skew tolerance)
    now = time.time()
    for drift in range(-window, window + 1):
        # Shift the timestamp by whole periods; the original ignored drift
        # and compared against the current code three times.
        expected = generate_totp(
            secret_base32, period=period, for_time=now + drift * period
        )
        if hmac.compare_digest(expected, provided_code):
            return True
    return False
```
The MFA challenge flow:
```python
def login_step1(username, password):
    user = db.authenticate(username, password)
    if not user.mfa_enabled:
        return issue_full_session(user)  # no MFA, done
    # Issue a short-lived challenge token (not a full session!)
    challenge = {
        "userId": user.id,
        "mfaNeeded": True,
        "exp": now_unix() + 60,  # 60-second window to enter MFA code
    }
    challenge_token = jwt.encode(challenge, MFA_KEY, algorithm="HS256")
    return {"mfa_required": True, "challenge_token": challenge_token}

def login_step2(challenge_token, totp_code):
    try:
        claims = jwt.decode(challenge_token, MFA_KEY, algorithms=["HS256"])
    except jwt.InvalidTokenError:  # also covers an expired challenge
        raise AuthError("invalid or expired MFA challenge")
    if not claims.get("mfaNeeded"):
        raise AuthError("not an MFA challenge token")
    user = db.find_user(claims["userId"])
    if not verify_totp(user.mfa_secret, totp_code):
        raise AuthError("invalid TOTP code")
    return issue_full_session(user)  # MFA passed, issue real session
```
10. Capacity Estimate
| Metric | Assumption | Result |
|---|---|---|
| Active sessions (Google-scale) | ~5 billion logged-in users | ~5,000,000,000 |
| Session size in Redis | userId + metadata + expiry | ~500 bytes |
| Total session storage | 5B × 500 bytes | ~2.5 TB |
| Redis nodes required | 50 GB usable per node | ~50 nodes |
| Auth requests / second | 5B sessions / ~10,000s avg request interval (most sessions idle) | ~500,000 req/s |
| Token refresh requests / day | Every access token refreshed every 15 min | ~5B × 96 = ~480B/day |
| Refresh token DB size | 1 row per device × avg 3 devices/user | ~15B rows |
| Auth Server replication | 500K req/s at 5ms/req per core | ~2,500 cores |
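The storage and refresh rows can be reproduced with a few lines of arithmetic. This is a back-of-envelope sketch using the table's own assumptions; the constant names are illustrative, and like the table it treats the 5 billion figure as both the session count and the user count.

```python
SESSIONS = 5_000_000_000           # active sessions (also used as user count)
SESSION_BYTES = 500                # userId + metadata + expiry
NODE_USABLE_BYTES = 50 * 10**9     # 50 GB usable per Redis node
REFRESH_INTERVAL_S = 15 * 60       # access token lifetime
DEVICES_PER_USER = 3
SECONDS_PER_DAY = 86_400

total_bytes = SESSIONS * SESSION_BYTES
redis_nodes = -(-total_bytes // NODE_USABLE_BYTES)  # ceiling division
refreshes_per_day = SESSIONS * (SECONDS_PER_DAY // REFRESH_INTERVAL_S)
refreshes_per_second = refreshes_per_day / SECONDS_PER_DAY
refresh_token_rows = SESSIONS * DEVICES_PER_USER

print(f"session storage : {total_bytes / 1e12:.1f} TB")
print(f"redis nodes     : {redis_nodes}")
print(f"refreshes / day : {refreshes_per_day / 1e9:.0f}B")
print(f"refreshes / sec : {refreshes_per_second / 1e6:.2f}M")
print(f"refresh rows    : {refresh_token_rows / 1e9:.0f}B")
```

The refresh figure is an upper bound: it assumes every session refreshes around the clock, whereas idle sessions do not refresh at all.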
11. Security Hardening Checklist
httpOnly cookies vs localStorage for JWTs: httpOnly prevents XSS reads but is vulnerable to CSRF. localStorage blocks CSRF but is readable by JS (XSS). Neither is perfect. The security community debates this endlessly; the right answer is “it depends on your threat model.”
Beyond the core architecture, production-grade auth systems require these mitigations:
CSRF protection:
Every state-changing request must include either a CSRF token (double-submit cookie pattern) or use the SameSite=Strict cookie attribute to prevent cross-site form submissions.
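A minimal sketch of the double-submit check: the server sets a random token in a cookie, the page's JavaScript copies it into a request header, and the server requires the two to match. The function names and the idea of a dedicated header are assumptions for illustration.

```python
import hmac
import secrets
from typing import Optional

def new_csrf_token() -> str:
    """Random token, set as a cookie readable by the page's own JavaScript."""
    return secrets.token_urlsafe(32)

def csrf_ok(cookie_token: Optional[str], header_token: Optional[str]) -> bool:
    """Accept a state-changing request only if cookie and header tokens match."""
    if not cookie_token or not header_token:
        return False
    # Constant-time comparison on bytes avoids leaking how many characters match.
    return hmac.compare_digest(
        cookie_token.encode("utf-8"), header_token.encode("utf-8")
    )

csrf_token = new_csrf_token()
```

A cross-site form can make the browser attach the cookie automatically, but it cannot read the cookie's value to set the matching header, so the check fails.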
Token storage:
Store access tokens in httpOnly cookies (inaccessible to JavaScript — prevents XSS token theft). Store refresh tokens the same way. Never put tokens in localStorage if XSS is a realistic threat vector.
Rate limiting on auth endpoints: The login endpoint is the most-attacked endpoint in any system. Apply per-IP rate limiting (e.g., 10 attempts per 15 minutes), account lockout after N failures, and CAPTCHA after repeated failures.
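The per-IP limit can be sketched as a sliding-window counter. The class below keeps state in process memory for clarity, which is an assumption made for illustration; a multi-server deployment would need the counters in a shared store such as Redis.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most max_attempts per key within the last window_seconds."""

    def __init__(self, max_attempts=10, window_seconds=900, clock=time.monotonic):
        self._max_attempts = max_attempts
        self._window = window_seconds
        self._clock = clock
        self._attempts = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow(self, key: str) -> bool:
        now = self._clock()
        attempts = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self._window:
            attempts.popleft()
        if len(attempts) >= self._max_attempts:
            return False  # over the limit; rejected attempts are not recorded
        attempts.append(now)
        return True

fake_now = [0.0]  # fake clock so the example is deterministic
limiter = SlidingWindowLimiter(max_attempts=10, window_seconds=900,
                               clock=lambda: fake_now[0])
results = [limiter.allow("203.0.113.7") for _ in range(11)]
```

Keying the same limiter by username as well as by IP covers attackers who spread attempts against one account across many addresses.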
Refresh token rotation: On every use of a refresh token, immediately issue a new one and invalidate the old. If a refresh token is used twice, it likely means the original was stolen — revoke all tokens for that user.
```python
def use_refresh_token(token):
    token_hash = sha256(token)
    record = db.find_refresh_token(token_hash)
    if not record:
        # Token not found: either expired or already used.
        # Check if this token was recently rotated (possible replay attack)
        rotated = db.find_rotated_token(token_hash)
        if rotated:
            # Replay detected: revoke the entire token family
            db.revoke_token_family(rotated.family_id)
            raise SecurityAlert("refresh token reuse detected")
        raise AuthError("token invalid")
    # Valid: rotate (issue new, invalidate old)
    new_token = generate_secure_random(64)
    db.rotate_refresh_token(
        old_token_hash=token_hash,
        new_token_hash=sha256(new_token),
        family_id=record.family_id,
    )
    new_access = issue_access_token(record.user_id)
    return {"access_token": new_access, "refresh_token": new_token}

# Secure cookie settings
# Set-Cookie: refresh_token=XYZ; HttpOnly; Secure; SameSite=Strict; Path=/auth/refresh
# Path=/auth/refresh means the cookie is ONLY sent to the refresh endpoint
```
12. System Diagram: Full Architecture
┌──────────────────────────────────────────────────────────────┐
│ Browser │
│ Cookies: accounts.google.com (SSO) + mail.google.com │
└──────────┬──────────────────────────────────────┬────────────┘
│ HTTPS │ HTTPS
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ Auth Server Cluster │ │ Service Cluster │
│ accounts.google.com │ │ (Gmail, YouTube...) │
│ │◄── s2s ──────►│ │
│ - GAIA auth logic │ token valid? │ - Local session │
│ - MFA verification │ │ - API calls │
│ - OAuth 2.0 + OIDC │ │ - Verifies JWT │
└──────────┬───────────┘ └──────────┬───────────┘
│ │
┌──────▼──────┐ ┌──────▼──────┐
│ Redis Cluster│ │ Redis Hot │
│ SSO sessions│ │ (sessions) │
│ MFA tokens │ └──────┬──────┘
│ Blacklist │ │
└──────┬───────┘ ┌──────▼──────┐
│ │ Redis Warm │
┌──────▼───────┐ └──────┬──────┘
│ Primary DB │ │
│ - users │ ┌──────▼──────┐
│ - tokenVer │ │ Cassandra │
│ - refresh │ │ (cold) │
│ tokens │ └─────────────┘
└──────────────┘
Summary: Interview Cheat Sheet
| Topic | Key Decision | Production Recommendation |
|---|---|---|
| Token type | Sessions vs JWT | JWT access tokens (15 min) + opaque refresh tokens (30 days) |
| Revocation | Instant vs eventual | Token versioning in DB for logout-everywhere; blacklist for single-token revocation |
| SSO mechanism | Central IdP | Single auth domain issues short-lived SSO tokens; services create local sessions |
| Third-party auth | OAuth flow | Authorization Code + PKCE; mandatory for mobile/SPA; never use Implicit Flow |
| Session storage | Hot/warm/cold | Redis Cluster (hot) → Redis+disk (warm) → Cassandra (cold) |
| Token storage | Cookie vs localStorage | httpOnly cookies; SameSite=Strict; Path-scoped refresh endpoint |
| MFA | TOTP vs push | TOTP (RFC 6238) + recovery codes; push notifications for enterprise |
| Scale | Auth bottleneck | Stateless JWT verification removes auth from critical path; 50+ Redis shards for sessions |