Caching in Microservices & Monitoring

In a microservices architecture, each service may keep its own cache, but shared data makes invalidation hard. When the Product Service updates a price, the Cart Service, Search Service, and Recommendation Service may all be holding stale copies of that product. Event-driven invalidation over a message bus (Redis Pub/Sub, Kafka) is the standard solution: the Product Service publishes a product.updated event, and every subscribing service drops its local and distributed cache entries for that product.

Monitoring separates a healthy cache from a decorative one. Track hit ratio (target: >90% for read-heavy workloads), cache latency (p50, p99), memory usage, eviction rate, and connection count. Alert when the hit ratio drops suddenly: it usually means a deployment invalidated keys, TTLs are too short, or access patterns shifted. Tools: the Redis INFO command, Prometheus + Grafana dashboards, Datadog cache analytics.
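The event-driven invalidation flow described above can be sketched in a few lines. This is a minimal illustration, not a production setup: a Node.js EventEmitter stands in for the message bus (Redis Pub/Sub or Kafka), a plain Map stands in for the distributed cache, and the names createProductCache and publishProductUpdated are hypothetical helpers invented for this example.

```javascript
const { EventEmitter } = require('node:events');

// Assumption: an in-process emitter stands in for the real message bus.
const bus = new EventEmitter();

// Assumption: a Map stands in for the shared distributed cache (e.g. Redis).
const distributedCache = new Map();

// Hypothetical per-service cache: a local Map in front of the distributed cache.
function createProductCache(serviceName) {
  const local = new Map();
  const localKey = (productId) => `product:${productId}`;
  const sharedKey = (productId) => `${serviceName}:product:${productId}`;

  // Every service subscribes and drops both cache layers for the product.
  bus.on('product.updated', ({ productId }) => {
    local.delete(localKey(productId));
    distributedCache.delete(sharedKey(productId));
  });

  return {
    set(productId, value) {
      local.set(localKey(productId), value);
      distributedCache.set(sharedKey(productId), value);
    },
    get(productId) {
      const hit = local.get(localKey(productId));
      return hit !== undefined ? hit : distributedCache.get(sharedKey(productId));
    },
  };
}

// Product Service side: publish once the price change has been written.
function publishProductUpdated(productId) {
  bus.emit('product.updated', { productId });
}

const cartCache = createProductCache('cart');
const searchCache = createProductCache('search');

cartCache.set('42', { price: 10 });
searchCache.set('42', { price: 10 });
searchCache.set('7', { price: 5 });

publishProductUpdated('42'); // both services drop product 42; product 7 is untouched
```

Deleting the entry, rather than pushing the new price into every cache, keeps the event payload small and lets each service reload the product on its next read.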

Before

No monitoring — cache might be useless

// No idea if caching is working
const cached = await redis.get(key);
if (cached) return JSON.parse(cached);
// ... fetch from DB

After

Instrumented cache with metrics

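// Assumes ioredis (its setex() matches the call below) and prom-client for metrics
const Redis = require('ioredis');
const { Counter, Histogram } = require('prom-client');

const redis = new Redis();

// prom-client metrics take a config object; both name and help are required
const cacheHits = new Counter({ name: 'cache_hits_total', help: 'Cache lookups served from Redis' });
const cacheMisses = new Counter({ name: 'cache_misses_total', help: 'Cache lookups that fell through to the DB' });
const cacheLatency = new Histogram({
  name: 'cache_op_duration_seconds',
  help: 'Latency of Redis GET calls in seconds',
  buckets: [0.0005, 0.001, 0.005, 0.01, 0.05, 0.1],
});

async function get(key) {
  // startTimer() measures in seconds with finer resolution than Date.now()
  const stopTimer = cacheLatency.startTimer();
  const cached = await redis.get(key);
  stopTimer();

  if (cached !== null) { // ioredis resolves to null on a miss
    cacheHits.inc();
    return JSON.parse(cached);
  }
  cacheMisses.inc();
  const data = await fetchFromDB(key);
  await redis.setex(key, 300, JSON.stringify(data)); // 5-minute TTL
  return data;
}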

// Alert: hit ratio = hits / (hits + misses) stays below 0.8 for 5 minutes
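The alert condition above can be made concrete with a small calculation. The sketch below assumes the two counters are sampled periodically into an array of { ts, hits, misses } readings (cumulative values, timestamps in milliseconds, oldest first). The helpers hitRatio and shouldAlert are hypothetical names for this example. In a Prometheus setup the same rule would more likely live in an alerting rule than in application code.

```javascript
// Hit ratio for the interval between two cumulative counter readings.
function hitRatio(prev, curr) {
  const hits = curr.hits - prev.hits;
  const misses = curr.misses - prev.misses;
  const total = hits + misses;
  return total === 0 ? null : hits / total; // null: no traffic in this interval
}

// True only if every sampling interval in the trailing window is below the
// threshold. Defaults mirror the rule above: below 0.8 for 5 minutes.
function shouldAlert(samples, threshold = 0.8, windowMs = 5 * 60 * 1000) {
  if (samples.length < 2) return false;
  const latest = samples[samples.length - 1].ts;

  // Not enough history to cover the whole window yet.
  if (latest - samples[0].ts < windowMs) return false;

  for (let i = samples.length - 1; i > 0; i--) {
    // Stop once the interval starts before the window does.
    if (latest - samples[i - 1].ts > windowMs) break;
    const ratio = hitRatio(samples[i - 1], samples[i]);
    // A healthy or idle interval inside the window cancels the alert.
    if (ratio === null || ratio >= threshold) return false;
  }
  return true;
}

// Usage: one reading per minute, 70 hits and 30 misses in each interval.
const samples = Array.from({ length: 6 }, (_, i) => ({
  ts: i * 60 * 1000,
  hits: i * 70,
  misses: i * 30,
}));
console.log(shouldAlert(samples));
```

Requiring every interval in the window to be below the threshold avoids paging on a single bad minute, such as the brief dip right after a deployment.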

Key Takeaway

Publish invalidation events across microservices and monitor hit ratio — a cache you do not measure is a cache you cannot trust.