System Architecture

Designing for Scale & Reliability

System design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements.

It is the art of trade-offs. Speed vs. Consistency. Complexity vs. Maintainability. Monolith vs. Microservices. There is no "perfect" architecture, only the right set of choices for the specific constraints.

01 / Fundamentals

Core Principles

SOLID

Five principles for OO design to make software more understandable and flexible.

  • Single Responsibility: Do one thing well.
  • Open/Closed: Open for extension, closed for modification.
  • Liskov Substitution: Subtypes must be substitutable.
  • Interface Segregation: Many client-specific interfaces.
  • Dependency Inversion: Depend on abstractions.
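
As an illustration of Dependency Inversion, here is a minimal sketch in Python. The names Notifier, InMemoryNotifier, and OrderService are hypothetical and chosen only for this example. The high-level OrderService depends on an abstraction, so the concrete notifier can be swapped without changing it.

```python
from typing import Protocol


class Notifier(Protocol):
    """Abstraction that high-level code depends on (hypothetical name)."""

    def send(self, recipient: str, message: str) -> None: ...


class InMemoryNotifier:
    """Concrete detail: records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))


class OrderService:
    """High-level policy: knows only the Notifier abstraction."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def place_order(self, customer: str, item: str) -> None:
        self._notifier.send(customer, f"Order confirmed: {item}")


notifier = InMemoryNotifier()
OrderService(notifier).place_order("alice@example.com", "keyboard")
```

A real email or SMS sender could replace InMemoryNotifier as long as it offers the same send method; OrderService would not need to change.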

CAP Theorem

In a distributed data store, you can only guarantee two of the following three:

  • Consistency: Every read receives the most recent write.
  • Availability: Every request receives a response (no error).
  • Partition Tolerance: System continues despite message loss.

Note: Network partitions cannot be ruled out in a distributed system, so P is effectively mandatory. During a partition, you choose between CP (Consistency) and AP (Availability).

KISS & DRY

Philosophy for maintainable code.

  • Keep It Simple, Stupid: Complexity is the enemy of reliability.
  • Don't Repeat Yourself: Duplication leads to inconsistent updates.

Twelve-Factor App

A methodology for building SaaS apps. Key points:

  • Store config in environment variables.
  • Treat backing services (DBs) as attached resources.
  • Execute the app as one or more stateless processes.
  • Keep dev, staging, and prod as similar as possible.
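
The first point, config in environment variables, might look like the sketch below. The variable names DATABASE_URL and DEBUG and the Settings class are assumptions made for this example, not part of the Twelve-Factor text.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables (names are illustrative)."""
    return Settings(
        # Required: raises KeyError early if the variable is missing.
        database_url=env["DATABASE_URL"],
        # Optional: defaults to False when unset.
        debug=env.get("DEBUG", "false").lower() == "true",
    )


# A plain dict stands in for the real environment here.
settings = load_settings({"DATABASE_URL": "postgres://localhost/app", "DEBUG": "true"})
```

Because the same code reads different values in dev, staging, and prod, the build artifact stays identical across environments.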
02 / Structural Patterns

Monolithic vs. Microservices vs. Modular

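A monolith is built and deployed as a single unit. Microservices are independently deployable services that communicate over the network. A modular monolith is a hybrid: a single deployable unit with enforced internal module boundaries.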

Monolith
  • Pros: Simple to deploy, easy E2E testing, zero network latency between calls.
  • Cons: High coupling, single point of failure, technology lock-in, scales as a whole block.
Modular Monolith
  • Pros: Code boundaries enforced, easier refactoring, simplified ops (single deployment).
  • Cons: Still shares runtime resources (CPU/Memory) across modules.
Microservices
  • Pros: Independent scaling, fault isolation, tech freedom per service.
  • Cons: Distributed system complexity, network latency, difficult debugging/tracing.
02.1 / Structural Patterns

Layered & Tiered Architecture

N-Tier Architecture

The most common pattern. Separates concerns horizontally. Strict layering means Layer A calls Layer B, but Layer B never calls Layer A.

Presentation Layer (UI)
Business Logic Layer
Data Access Layer

Common Pitfall: "Architecture Sinkhole" — requests passing through layers without logic just to get to DB.

SOA (Service Oriented)

Precursor to microservices. Focuses on reusing components via an Enterprise Service Bus (ESB).

Diagram: Billing, Inventory, and CRM services connected through the Enterprise Service Bus (ESB).

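Note: The ESB often becomes a bottleneck because routing and business logic accumulate in the bus ("smart pipes, dumb endpoints"). Microservices favor the opposite: smart endpoints and dumb pipes.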

02.2 / Asynchronous

Event-Driven Architecture (EDA)

Components communicate by emitting Events (facts about what happened). Producers don't know consumers. This decouples services in time and space.

Diagram: a Producer emits events that are consumed by Email Svc and Inventory Svc.
Key Concepts
  • Decoupling: The checkout service doesn't need to know if the Email service is online. It just drops a message and moves on.
  • Eventual Consistency: Data isn't consistent immediately across all services (e.g., Inventory updates 200ms after Order).
  • Load Leveling: If a traffic spike occurs, the queue buffers requests so consumers can process at their own pace without crashing.
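
The three concepts above can be sketched with a small in-process event bus. This is a toy model under stated assumptions: EventBus, the "order_placed" event name, and the handlers are hypothetical, and a real system would use a message broker instead of an in-memory queue.

```python
from collections import defaultdict, deque
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        self._queue: deque[tuple[str, dict]] = deque()

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The producer only enqueues; it never calls consumers directly.
        self._queue.append((event_type, payload))

    def drain(self) -> int:
        """Deliver queued events to subscribers; return how many were processed."""
        processed = 0
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._subscribers[event_type]:
                handler(payload)
            processed += 1
        return processed


bus = EventBus()
emails: list[int] = []
stock = {"sku-1": 5}


def reserve_stock(event: dict) -> None:
    stock[event["sku"]] -= 1


bus.subscribe("order_placed", lambda event: emails.append(event["order_id"]))
bus.subscribe("order_placed", reserve_stock)

bus.publish("order_placed", {"order_id": 42, "sku": "sku-1"})
stock_before_drain = stock["sku-1"]  # inventory has not caught up yet
bus.drain()                          # consumers process at their own pace
```

Between publish and drain the inventory count is stale, which is the eventual-consistency window described above. The queue is also what absorbs a burst of publish calls.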
03 / Traffic Control

Load Balancing

Distributing incoming network traffic across a group of backend servers. This ensures no single server bears too much load.

Diagram: Traffic Source → Load Balancer → S1, S2, S3.
Algorithms
  • Round Robin: Requests are distributed sequentially. Simple, effective for equal servers.
  • Least Connections: Sends new requests to the server with the fewest active connections.
  • IP Hash: Uses client IP to determine server. Useful for "Sticky Sessions" (user stays on same server).
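
The three algorithms above can be sketched as follows. The class and function names are hypothetical, and the server labels S1 to S3 are plain strings. A production balancer would also handle health checks and server weights, which are omitted here.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out servers in a fixed repeating order."""

    def __init__(self, servers: list[str]) -> None:
        self._servers = cycle(servers)

    def pick(self) -> str:
        return next(self._servers)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers: list[str]) -> None:
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to the same server every time (sticky sessions)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["S1", "S2", "S3"]
rr = RoundRobin(servers)
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnections(servers)
first, second = lc.pick(), lc.pick()
lc.release(first)      # first server finishes its request
third = lc.pick()      # so it is the least loaded again
```

Note that the simple modulo in ip_hash remaps most clients when the number of servers changes.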
Types
  • L4 (Transport): Balances based on IP/Port (TCP/UDP). Very fast, packet level.
  • L7 (Application): Balances based on HTTP headers, URLs, Cookies. Smarter, CPU intensive.
03.1 / Speed

Caching Strategy

Reading from memory (RAM) is orders of magnitude faster than reading from disk. Caching reduces database load and latency.

Diagram: App (GET /user/1) → Cache (Redis) → DB (Postgres).
Strategies
  • Cache-Aside (Lazy Loading): App checks cache. If miss, app fetches DB and updates cache. Best for read-heavy.
  • Write-Through: App writes to Cache and DB simultaneously. Consistency is high, writes are slower.
  • Write-Back: App writes only to Cache (fast). Cache async writes to DB. Risk of data loss.
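
A minimal sketch of Cache-Aside, assuming plain dictionaries stand in for Redis and Postgres. The UserRepository class and its db_reads counter are hypothetical and exist only to make the cache hit visible.

```python
class UserRepository:
    """Cache-aside: the application manages the cache itself."""

    def __init__(self, db: dict) -> None:
        self._db = db          # stand-in for the database
        self._cache: dict = {}  # stand-in for the cache
        self.db_reads = 0

    def get_user(self, user_id: int):
        # 1. Check the cache first.
        if user_id in self._cache:
            return self._cache[user_id]
        # 2. On a miss, read from the database...
        self.db_reads += 1
        user = self._db.get(user_id)
        # 3. ...and populate the cache for the next reader.
        if user is not None:
            self._cache[user_id] = user
        return user

    def update_user(self, user_id: int, user: dict) -> None:
        # Write to the database, then invalidate the cached copy.
        self._db[user_id] = user
        self._cache.pop(user_id, None)


repo = UserRepository({1: {"name": "John"}})
repo.get_user(1)   # miss: goes to the database
repo.get_user(1)   # hit: served from the cache
reads_after_two_gets = repo.db_reads
repo.update_user(1, {"name": "Jane"})
refreshed = repo.get_user(1)
```

Invalidating on write, rather than updating the cache in place, is one common choice; it keeps the write path simple at the cost of one extra miss.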
Eviction
  • LRU (Least Recently Used): Discard items not used recently.
  • TTL (Time To Live): Data expires after a fixed time (e.g., 5 mins). Bounds how long stale data can be served.
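
Both eviction policies can be combined in one small cache. The sketch below is an assumption-laden toy: LruTtlCache is a hypothetical name, and the injectable clock exists only so the expiry can be demonstrated without waiting.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    def __init__(self, capacity: int, ttl_seconds: float, clock=time.monotonic) -> None:
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]        # TTL: entry is too old
            return default
        self._items.move_to_end(key)    # LRU: mark as most recently used
        return value

    def put(self, key, value) -> None:
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # LRU: drop the oldest entry


now = [0.0]  # fake clock, in seconds
cache = LruTtlCache(capacity=2, ttl_seconds=300, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" becomes most recently used
cache.put("c", 3)      # over capacity: "b" is evicted
evicted = cache.get("b")
kept = cache.get("a")
now[0] = 301.0         # five minutes pass
expired = cache.get("a")
```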
03.2 / Storage

Databases & Scaling

Relational (SQL)

ACID
id | name
1  |  John

NoSQL

BASE
{ "id": 1, "name": "John" }
Scaling Techniques
  • Vertical Scaling: Upgrade the server hardware (CPU/RAM). Simple but has a hard limit (ceiling).
  • Read Replicas: Master DB handles writes. Slave DBs handle reads. Great for read-heavy apps.
  • Horizontal Scaling: Adding more machines to the pool (scaling out). Includes sharding (splitting data across multiple servers).
  • CDN (Content Delivery Network): Cached static content (images, CSS) stored in servers globally close to the user to reduce latency.
  • Reverse Proxy: A server (like Nginx) sitting in front of web servers, handling security, SSL termination, and routing.
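
Sharding, mentioned under Horizontal Scaling, needs a rule that maps each key to a server. A minimal sketch using hash-based routing follows; shard_for and ShardedStore are hypothetical names, and dictionaries stand in for the database servers.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Return a stable shard index for a key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    def __init__(self, shard_count: int) -> None:
        # Each dict stands in for one database server.
        self.shards: list[dict] = [{} for _ in range(shard_count)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=4)
for user_id in range(100):
    store.put(f"user:{user_id}", {"id": user_id})

total_rows = sum(len(shard) for shard in store.shards)
```

A stable hash such as SHA-256 is used here instead of Python's built-in hash, which is randomized per process for strings. Simple modulo routing moves most keys when the shard count changes, which is why resharding needs planning.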
04 / Interfaces

API Paradigms

REST

GET /users/1

Standard, resource-based. Uses HTTP verbs. Stateless.

GraphQL

query { user(id: 1) { name } }

Flexible. Client asks for exactly what it needs. Avoids Over-fetching.

gRPC

service.GetUser(id)

High performance. Uses Protocol Buffers (Binary). Great for internal microservices.

Important Concepts
  • Idempotency: Making the same request multiple times has the same effect as making it once (e.g., retrying a Payment).
  • Statelessness: Server does not store client context between requests. Scalable.
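
One common way to make a payment retry idempotent is a client-supplied idempotency key. The sketch below assumes a hypothetical PaymentService with in-memory storage; it is not the API of any real payment provider.

```python
class PaymentService:
    def __init__(self) -> None:
        self._results: dict[str, dict] = {}  # idempotency key -> stored result
        self.charges: list[tuple[str, int]] = []

    def charge(self, idempotency_key: str, account: str, amount_cents: int) -> dict:
        # A repeated key returns the stored result without charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((account, amount_cents))
        result = {"status": "charged", "charge_id": len(self.charges)}
        self._results[idempotency_key] = result
        return result


payments = PaymentService()
first = payments.charge("order-42", "acct-1", 1999)
retry = payments.charge("order-42", "acct-1", 1999)  # e.g., after a timeout
```

The client generates the key once per logical operation and reuses it on every retry, so a timeout followed by a retry cannot charge twice.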
04.1 / Deployment

Containers vs. Virtual Machines

Orchestration

Managing hundreds of containers by hand is impractical, so teams use orchestrators such as Kubernetes (K8s).

  • Auto-scaling: Spin up more containers when CPU is high.
  • Self-healing: Restart containers if they crash.
  • Immutable Infrastructure: Never patch a running server. Replace it with a new image.
04.2 / Operations

Monitoring & Logging

You can't fix what you can't see. Observability is composed of three pillars.

01. Metrics

"CPU is at 99%" (Aggregatable data)

02. Logs
[INFO] Req started
[INFO] DB conn ok
[ERR] Timeout

"Why it failed" (Discrete events)

03. Traces

"Where it slowed down" (Request Lifecycle)

The 4 Golden Signals (Google SRE)
  • Latency: Time taken to serve a request.
  • Traffic: Demand on the system, often measured as throughput (requests per second).
  • Errors: Rate of failing requests (e.g., HTTP 5xx responses).
  • Saturation: How "full" the service is (CPU/Memory usage).
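
As a rough sketch, latency, traffic, errors, and saturation can be derived from a window of request records. The Request record, the nearest-rank percentile, and the choice of CPU utilization as the saturation measure are assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    status: int


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile for pct in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def golden_signals(requests: list[Request], window_seconds: float, cpu_utilization: float) -> dict:
    return {
        "latency_p95_ms": percentile([r.latency_ms for r in requests], 95),
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": sum(r.status >= 500 for r in requests) / len(requests),
        "saturation": cpu_utilization,
    }


# Ten requests observed over a five-second window (made-up sample data).
sample = [Request(20, 200)] * 8 + [Request(30, 200), Request(900, 500)]
signals = golden_signals(sample, window_seconds=5, cpu_utilization=0.72)
```

A percentile is used for latency rather than the mean because a single slow request, like the 900 ms one here, is easy to hide in an average.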
04.3 / Quality

Testing Pyramid

As systems scale, manual testing becomes impossible. The Testing Pyramid is a framework that dictates how many tests of each type you should write to maintain high release velocity and reliability.

UI / E2E
End-to-End Tests. Simulates real user scenarios. Slowest and most expensive to run.
Integration
Tests interactions between components (e.g., DB + API). Balances speed and coverage.
Unit Tests
Tests individual functions in isolation. Fastest, cheapest, and most numerous.
The "Ice Cream Cone" Anti-Pattern

An inverted pyramid where teams rely heavily on manual or E2E tests, and have very few unit tests. This leads to:

  • Slow deployment cycles (tests take hours).
  • Flaky tests (E2E tests fail due to network/UI changes).
  • High cost of maintenance.
Key Rules
  • Push tests down: If a high-level test fails, write a lower-level test to catch it next time.
  • Mock external services: In Unit/Integration tests, mock Stripe or AWS to avoid network latency and costs.
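
Mocking an external service might look like the sketch below, using Python's standard unittest.mock. The checkout function and the payment_client.charge method are hypothetical stand-ins; they do not mirror the real Stripe or AWS client APIs.

```python
from unittest.mock import Mock


def checkout(payment_client, amount_cents: int) -> bool:
    """Code under test: charges the customer through an injected client."""
    response = payment_client.charge(amount_cents=amount_cents, currency="usd")
    return response["status"] == "succeeded"


# The mock replaces the network call, so the test is fast and free.
fake_client = Mock()
fake_client.charge.return_value = {"status": "succeeded"}
succeeded = checkout(fake_client, 1999)

# A second mock simulates a decline from the provider.
declining_client = Mock()
declining_client.charge.return_value = {"status": "declined"}
declined = checkout(declining_client, 1999)
```

Passing the client in as a parameter is what makes the swap possible; code that constructs its own client internally is much harder to test this way.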
05 / Summary

Design Checklist

  1. Define Functional Requirements (What does it do?).
  2. Define Non-Functional Requirements (Latency, Availability, Consistency).
  3. Estimate Scale (Traffic, Storage, Bandwidth).
  4. Define Data Model (SQL vs NoSQL).
  5. Design High-Level API (REST/GraphQL).
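
Step 3 is usually back-of-the-envelope arithmetic. The sketch below shows one way to do it; every input (one million daily users, 20 requests each, and so on) is an invented assumption for illustration, not a figure from this guide.

```python
SECONDS_PER_DAY = 86_400


def estimate_scale(
    daily_active_users: int,
    requests_per_user: int,
    writes_per_user: int,
    bytes_per_write: int,
    bytes_per_response: int,
    peak_factor: float = 2.0,
) -> dict:
    avg_rps = daily_active_users * requests_per_user / SECONDS_PER_DAY
    return {
        # Traffic: average and assumed peak requests per second.
        "avg_rps": avg_rps,
        "peak_rps": avg_rps * peak_factor,
        # Storage: new data written per year, in gigabytes.
        "storage_gb_per_year": daily_active_users * writes_per_user * bytes_per_write * 365 / 1e9,
        # Bandwidth: average outbound megabytes per second.
        "egress_mb_per_s": avg_rps * bytes_per_response / 1e6,
    }


estimate = estimate_scale(
    daily_active_users=1_000_000,
    requests_per_user=20,
    writes_per_user=2,
    bytes_per_write=1_000,
    bytes_per_response=5_000,
)
```

With these inputs the service averages roughly 231 requests per second and stores 730 GB of new data per year, which is enough to decide whether a single database server is plausible.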
Glossary

Key Definitions

CAP Theorem

States that in a distributed data store, you can only guarantee two of Consistency, Availability, and Partition Tolerance.

Microservices

An architectural style where an application is structured as a collection of loosely coupled, independently deployable services.

Monolith

An architectural style where the application is built as a single, unified unit.

Load Balancer

A device or software that distributes network traffic across a cluster of servers.

Caching

The process of storing copies of data in a temporary storage location (cache) for faster access.

Sharding

A method of splitting and storing a single logical dataset in multiple databases.

SOLID

Five design principles intended to make software designs more understandable, flexible, and maintainable.

Event-Driven Architecture

A software architecture pattern promoting the production, detection, consumption of, and reaction to events.

Vertical Scaling

Adding more processing power (CPU, RAM) to an existing server (Scaling Up).

Horizontal Scaling

Adding more servers to the resource pool (Scaling Out).
