Where Caches Live: The Full Stack
Caching is not one layer; it is a stack. At the top, the browser caches static assets (CSS, JS, images) and sometimes API responses, as directed by HTTP headers such as Cache-Control and ETag. Below that, a CDN (Cloudflare, CloudFront, Fastly) caches content at edge locations close to users worldwide. The load balancer may cache health-check results or session affinity data. Inside your application, an in-process cache (a Map or LRU structure in Node.js, Python, or .NET) holds hot data in the app's own memory, where a lookup takes microseconds or less. A distributed cache like Redis or Memcached sits between the app and the database and is shared across all app instances. Finally, the database itself has internal caches: PostgreSQL's shared buffers keep frequently accessed pages in RAM, and query caches (where they exist) store result sets. The memory hierarchy mirrors this pattern at the hardware level: CPU L1/L2/L3 cache → RAM → SSD → spinning disk → network. Each level down is slower but larger. Your job as an engineer is to push data as far up the stack as possible without serving stale or inconsistent results.
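
The application-side part of this stack (in-process cache, then distributed cache, then database) can be sketched as a tiered read path. This is a minimal illustration under stated assumptions, not a production design: LRUCache, TieredReader, and DATABASE are hypothetical names, a plain dict stands in for Redis or Memcached, and another dict stands in for the database. It also assumes cached values are never None.

```python
from collections import OrderedDict
from typing import Callable, Optional


class LRUCache:
    """Tiny in-process LRU cache (stands in for the app-memory layer)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


class TieredReader:
    """Read path: in-process cache -> shared cache -> database."""

    def __init__(self, local: LRUCache, shared: dict,
                 load_from_db: Callable[[str], str]) -> None:
        self.local = local
        self.shared = shared  # assumption: a dict standing in for Redis/Memcached
        self.load_from_db = load_from_db
        self.hits = {"local": 0, "shared": 0, "db": 0}

    def get(self, key: str) -> str:
        # 1. Fastest layer: the app's own memory.
        value = self.local.get(key)
        if value is not None:
            self.hits["local"] += 1
            return value
        # 2. Shared layer: visible to every app instance.
        value = self.shared.get(key)
        if value is not None:
            self.hits["shared"] += 1
            self.local.put(key, value)  # promote to the faster layer
            return value
        # 3. Slowest layer: the database; fill both caches on the way back.
        value = self.load_from_db(key)
        self.hits["db"] += 1
        self.shared[key] = value
        self.local.put(key, value)
        return value


# Hypothetical data source standing in for the database.
DATABASE = {"user:1": "Ada", "user:2": "Grace", "user:3": "Edsger"}

reader = TieredReader(LRUCache(capacity=2), shared={},
                      load_from_db=DATABASE.__getitem__)

reader.get("user:1")  # miss in both caches, loaded from the database
reader.get("user:1")  # served from the in-process cache
reader.get("user:2")  # loaded from the database
reader.get("user:3")  # loaded from the database; evicts user:1 locally
reader.get("user:1")  # local miss, shared-cache hit
print(reader.hits)    # {'local': 1, 'shared': 1, 'db': 3}
```

The sketch deliberately omits expiry and invalidation, so a value that changes in the database stays stale in both caches. A real deployment would add TTLs or explicit invalidation at each layer, which is exactly the staleness trade-off described above.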