2026 • 2 min read
Right-sized production MongoDB down a full tier and cut thousands of redundant ops/min through slow-query forensics, index surgery, pipeline rewrites, and Redis caching.
A sustained campaign to make the production MongoDB layer faster, cheaper, and more reliable — from query-level forensics up to runtime caching. The headline: I took the Atlas cluster down a full tier (M50 → M40) while latency improved.
I parsed tens of millions of slow-query log lines across the shards, ranked ~2,800 query shapes by cumulative cost, and went after the worst offenders: added the right compound indexes (verified in use via $indexStats), dropped ~244 redundant indexes across 16 collections, and rewrote the heaviest aggregations (the worst went from ~23.5s to under 100ms). That recovered ~43% of slow-query time and ~2 TB/month of disk reads, enough headroom to safely downsize the cluster.
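To make the "rank query shapes by cumulative cost" step concrete, here is a minimal sketch of that kind of aggregation. It is not the tooling used on this project. It assumes the logs are in MongoDB's structured JSON format (4.4+), where slow operations appear with `msg` set to `"Slow query"` and carry `ns`, `queryHash`, and `durationMillis` under `attr`. The function name `rank_query_shapes` and the use of `queryHash` as the shape key are illustrative choices.

```python
import json
from collections import defaultdict
from typing import Iterable


def rank_query_shapes(lines: Iterable[str], top: int = 10) -> list[dict]:
    """Group slow-query log lines by (namespace, queryHash), rank by total time.

    Assumes MongoDB structured JSON log lines; anything else is skipped.
    """
    totals = defaultdict(lambda: {"count": 0, "total_ms": 0, "max_ms": 0})

    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line (e.g. a truncated write)
        if not isinstance(entry, dict) or entry.get("msg") != "Slow query":
            continue

        attr = entry.get("attr", {})
        duration = attr.get("durationMillis")
        if duration is None:
            continue

        # Assumption: queryHash identifies the query shape; fall back to a
        # placeholder when the server did not record one.
        key = (attr.get("ns", "?"), attr.get("queryHash", "unhashed"))
        agg = totals[key]
        agg["count"] += 1
        agg["total_ms"] += duration
        agg["max_ms"] = max(agg["max_ms"], duration)

    # Cumulative cost, not worst single run, decides the ranking.
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["total_ms"], reverse=True)
    return [{"ns": ns, "shape": shape, **agg} for (ns, shape), agg in ranked[:top]]
```

Because the function takes any iterable of lines, a log file can be streamed straight through it (for example, by passing an open file handle) without loading tens of millions of lines into memory. Ranking by total time rather than maximum duration surfaces the cheap-but-constant queries that a "slowest single query" view would miss.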
Beyond the database itself, I attacked how the app used it:
findOneAndUpdate and prefetch tuning.
Last updated on 2026-06-12.