Where Caches Live: The Full Stack

Caching is not one layer; it is a stack. At the top, the browser caches static assets (CSS, JS, images) and sometimes API responses, as directed by HTTP headers such as Cache-Control and ETag. Below that, a CDN (Cloudflare, CloudFront, Fastly) caches content at edge locations close to users worldwide. The load balancer may cache health-check results or session affinity data.

Inside your application, an in-process cache (a Map or LRU structure in Node.js, Python, or .NET) holds hot data in the app's own memory, with access times in the microseconds. A distributed cache like Redis or Memcached sits between the app and the database and is shared across all app instances. Finally, the database itself has internal caches: PostgreSQL's shared buffers (its buffer pool) keep frequently accessed pages in RAM, and query caches (where they exist) store result sets.

The memory hierarchy mirrors this pattern at the hardware level: CPU L1/L2/L3 cache → RAM → SSD → spinning disk → network. Each level is slower but larger than the one before it. Your job as an engineer is to push data as far up the stack as possible without serving stale or inconsistent results.
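To make the in-process layer concrete, here is a minimal sketch of an LRU cache with a time-to-live, written in Python. The class name LRUCache and the parameters max_entries, ttl_seconds, and clock are illustrative choices, not taken from any particular library. The TTL is one simple way to bound staleness, and other strategies such as explicit invalidation may fit better depending on the data.

```python
from collections import OrderedDict
import time


class LRUCache:
    """Small in-process LRU cache with a per-entry time-to-live (illustrative)."""

    def __init__(self, max_entries=1024, ttl_seconds=60.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._entries = OrderedDict()    # key -> (expires_at, value)

    def get(self, key, default=None):
        item = self._entries.get(key)
        if item is None:
            return default               # miss: caller falls through to the next layer
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]       # expired: never serve it again
            return default
        self._entries.move_to_end(key)   # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self.ttl_seconds, value)
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
```

A caller would typically try cache.get(key) first and call cache.put(key, value) only after loading the value from a slower layer. Because this cache lives in one process, each app instance holds its own copy, which is why a shared cache such as Redis usually sits behind it.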

Before

No cache layers: every request hits the database

User → App Server → Database → Disk
// Every request travels the full depth

After

Multi-layer cache stack (approximate latencies, each as seen by the caller of that layer)

User Request
    ↓
Browser Cache (ms)
    ↓ miss
CDN Edge Cache (10–50ms)
    ↓ miss
App In-Process Cache (μs)
    ↓ miss
Redis / Memcached (1–5ms)
    ↓ miss
Database Buffer Pool (5–20ms)
    ↓ miss
Disk (10–100ms+)
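The server-side part of this fall-through (in-process cache, then shared cache, then database) can be sketched as a loop over tiers that back-fills on a miss. This is a simplified illustration in Python: DictTier, tiered_get, and load_from_source are hypothetical names, and the dict-backed tiers stand in for a real in-process cache and a real Redis or Memcached client. The browser and CDN layers are not modeled here, since they are driven by HTTP response headers rather than application code.

```python
from typing import Callable, Dict, List, Optional, Tuple


class DictTier:
    """Stand-in for one cache tier (hypothetical; a real tier would wrap a client)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def set(self, key: str, value: str) -> None:
        self.data[key] = value


def tiered_get(
    key: str,
    tiers: List[DictTier],
    load_from_source: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Check tiers fastest-first; if all miss, load from the source of truth.

    Returns (value, name of the layer that answered). Tiers that missed are
    back-filled so the next request is served higher in the stack.
    """
    missed: List[DictTier] = []
    for tier in tiers:
        value = tier.get(key)
        if value is not None:
            for upper in missed:
                upper.set(key, value)    # promote the value to faster tiers
            return value, tier.name
        missed.append(tier)

    value = load_from_source(key)        # e.g. a database query
    if value is not None:
        for tier in missed:
            tier.set(key, value)
    return value, "source"
```

With tiers ordered as [in_process, shared], the first request for a key reaches the source, and later requests are answered by the in-process tier until that entry is evicted or invalidated. The sketch leaves out expiry and invalidation, which is where the consistency trade-off shows up in practice.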

Key Takeaway

Caches exist at every layer from browser to disk — push data as high up the stack as your consistency requirements allow.