Prepare for senior-level system design interviews with 10 questions on scalability, distributed systems, and architecture.
10 Questions
~30 min read
Key components: (1) Generate short URL using base62 encoding or hash, (2) Store mapping in database (NoSQL for scale), (3) Redirect service with caching (Redis), (4) Analytics service for click tracking. Considerations: handle collisions, expire old URLs, rate limiting, custom aliases. Scale: horizontal scaling, CDN for redirects, sharding by hash prefix.
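The base62 step above can be sketched in a few lines. This is a minimal, hypothetical example that encodes an auto-increment database ID into a short code and decodes it back; the alphabet ordering is an arbitrary choice.

```python
import string

# 62-character alphabet: 0-9, a-z, A-Z (ordering is a design choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode(n: int) -> str:
    """Convert a numeric database ID to a base62 short code."""
    if n == 0:
        return ALPHABET[0]
    code = []
    while n:
        n, rem = divmod(n, 62)
        code.append(ALPHABET[rem])
    return "".join(reversed(code))

def decode(code: str) -> int:
    """Convert a base62 short code back to the numeric ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Encoding a sequence ID sidesteps hash collisions entirely, at the cost of making codes guessable; hashing (with collision retries) avoids exposing a predictable counter.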
Two approaches: (1) Pull (fan-out on read) - fetch recent posts from all followed users at read time; cheap writes, works even when an author has millions of followers, (2) Push (fan-out on write) - pre-compute follower feeds when a tweet is posted; faster reads but expensive writes for popular authors. Hybrid: push for normal users, pull for celebrities. Use Redis for feed cache, Kafka for async processing, CDN for media.
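The hybrid approach can be sketched with in-memory structures. This is a hypothetical toy (the `CELEBRITY_THRESHOLD` cutoff and all function names are invented for the demo); a real system would use Redis lists and async workers.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 2  # hypothetical cutoff, tiny for demo purposes

followers = defaultdict(set)    # author -> follower ids
timelines = defaultdict(deque)  # user -> precomputed feed (push path)
posts = defaultdict(list)       # author -> own posts (pull path)

def follow(user, author):
    followers[author].add(user)

def post(author, text):
    posts[author].append(text)
    # Fan-out on write only for authors below the threshold;
    # celebrity posts are merged in at read time instead.
    if len(followers[author]) < CELEBRITY_THRESHOLD:
        for f in followers[author]:
            timelines[f].appendleft((author, text))

def read_feed(user, following):
    feed = list(timelines[user])
    # Fan-out on read for celebrity authors only.
    for author in following:
        if len(followers[author]) >= CELEBRITY_THRESHOLD:
            feed.extend((author, t) for t in posts[author])
    return feed
```

The key trade-off: one celebrity post avoids millions of timeline writes, at the cost of a merge step on every follower's read.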
Components: (1) WebSocket gateway for real-time, (2) Message service with queue (Kafka), (3) Channel/DM storage (partition by channel), (4) Presence service (Redis pub/sub), (5) Search (Elasticsearch), (6) File storage (S3 + CDN). Considerations: message ordering, read receipts, typing indicators, offline sync, push notifications. Scale channels independently.
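Message ordering and offline sync from the list above both fall out of per-channel sequence numbers. A minimal in-memory sketch (all names hypothetical; a real message service would persist the log and assign sequence numbers in the storage layer):

```python
import itertools
from collections import defaultdict

# Each channel assigns a monotonically increasing sequence number,
# so clients can order messages, de-duplicate, and detect gaps.
_counters = defaultdict(itertools.count)
_log = defaultdict(list)  # channel -> [(seq, sender, text)]

def send(channel: str, sender: str, text: str) -> int:
    seq = next(_counters[channel])
    _log[channel].append((seq, sender, text))
    return seq

def sync(channel: str, last_seen: int):
    """Return messages a reconnecting client missed (seq > last_seen)."""
    return [m for m in _log[channel] if m[0] > last_seen]
```

Because sequence numbers are per-channel, channels can live on different partitions and still give each client a total order within every conversation.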
CAP: Consistency (all nodes see the same data), Availability (every request gets a response), Partition Tolerance (the system keeps working despite network failures). Since network failures are unavoidable in distributed systems, partition tolerance is not optional; during a partition you must choose between consistency and availability. CP systems sacrifice availability (MongoDB, HBase). AP systems sacrifice consistency (Cassandra, DynamoDB). CA is impossible in practice because partitions happen.
Strategies: (1) Strong consistency - 2PC, Paxos, Raft (slower), (2) Eventual consistency - async replication (faster), (3) Saga pattern for distributed transactions. Techniques: version vectors, CRDTs for conflict-free merging, idempotent operations. Choose based on requirements: banking needs strong consistency, social feeds can be eventual.
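The CRDT technique mentioned above can be illustrated with the simplest CRDT, a grow-only counter (G-counter): each replica increments only its own slot, and merge takes the element-wise max, so concurrent updates converge without coordination. A minimal sketch:

```python
# G-counter CRDT: state is {replica_id: count}.

def increment(state: dict, replica: str, by: int = 1) -> dict:
    """A replica increments only its own slot."""
    new = dict(state)
    new[replica] = new.get(replica, 0) + by
    return new

def merge(a: dict, b: dict) -> dict:
    """Element-wise max: commutative, associative, idempotent,
    so replicas converge regardless of merge order."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

def value(state: dict) -> int:
    return sum(state.values())
```

The merge properties (commutative, associative, idempotent) are exactly what makes the structure conflict-free: replicas can exchange state in any order, any number of times, and still agree.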
Algorithms: (1) Token Bucket - tokens added at fixed rate, requests consume tokens, (2) Sliding Window - count requests in a rolling time window, (3) Fixed Window - simpler but allows bursts at window boundaries, (4) Leaky Bucket - queue requests, process at fixed rate. Implementation: Redis for distributed limiting, use user ID or IP as key. Return 429 with Retry-After header when limited.
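A token bucket can be sketched in a few lines. This is a single-process illustration; a distributed limiter would keep the token count in Redis keyed by user ID or IP, as noted above.

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller responds 429 with a Retry-After header
```

Capacity controls burst tolerance while rate controls sustained throughput, which is why token bucket is usually preferred over fixed windows for user-facing APIs.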
Key decisions: (1) Cache invalidation strategy (TTL, write-through, write-back), (2) Eviction policy (LRU, LFU, FIFO), (3) Consistent hashing for distribution, (4) Replication for availability. Technologies: Redis Cluster, Memcached. Patterns: cache-aside, read-through, write-through. Handle: thundering herd (locking), cache stampede (staggered TTL), hot keys (local cache + distributed).
Sharding splits data across multiple databases. Strategies: (1) Hash-based - consistent hashing on key, (2) Range-based - by date/ID range, (3) Directory-based - lookup service. Challenges: cross-shard queries, rebalancing, joins. Use when: single database can't handle load, data exceeds single machine capacity. Alternatives: read replicas, caching, vertical scaling first.
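The consistent-hashing strategy above can be sketched as a sorted ring with virtual nodes. A hypothetical minimal version (the vnode count and hash choice are illustrative; production systems often use murmur or xxhash instead of MD5):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Nodes are hashed onto a ring (many virtual nodes each); a key
    is owned by the first node clockwise from its hash, so adding or
    removing one node remaps only neighboring keys."""
    def __init__(self, nodes, vnodes: int = 100):
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}#{i}")
                bisect.insort(self.ring, (h, node))

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Virtual nodes smooth out the load: with only one point per physical node, hash positions cluster and some shards end up with far more than their fair share of keys.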
Strategies: (1) Redundancy - multiple instances, replicas, (2) Load balancing - distribute traffic, health checks, (3) Failover - automatic switching to backup, (4) Geographic distribution - multi-region deployment, (5) Graceful degradation - reduce functionality under load. Measure with: SLO/SLA, uptime percentage. Implement: health checks, circuit breakers, chaos engineering.
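The circuit breaker mentioned above can be sketched as a small wrapper. A hypothetical minimal version (thresholds and names are invented; libraries like resilience4j add half-open probe limits, metrics, and thread safety):

```python
import time

class CircuitBreaker:
    """After max_failures consecutive errors the circuit opens and
    calls fail fast; after reset_after seconds one trial call is
    allowed through (half-open) to probe recovery."""
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one probe call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```

Failing fast protects the caller's thread pool and gives the struggling dependency breathing room, which is what makes breakers central to graceful degradation.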
Monolith: simpler deployment, easier debugging, no network latency between components, good for small teams. Microservices: independent scaling/deployment, technology flexibility, team autonomy, fault isolation. Challenges with microservices: distributed tracing, data consistency, service discovery, operational complexity. Start monolith, extract services when needed. Don't microservice prematurely.