2026•2 min read

Database Reliability & Performance

Downsized production MongoDB by a full tier and cut thousands of redundant ops/min through slow-query forensics, index surgery, pipeline rewrites, and Redis caching.

Full write-up: M50 → M40

MongoDB
Performance
Redis
Reliability
Node.js


Overview

A sustained campaign to make the production MongoDB layer faster, cheaper, and more reliable — from query-level forensics up to runtime caching. The headline: I took the Atlas cluster down a full tier (M50 → M40) while latency improved.

Right-sizing the database

I parsed tens of millions of slow-query log lines across the shards, ranked ~2,800 query shapes by cumulative cost, and went after the worst offenders. I added the compound indexes those queries needed (verified in use via $indexStats), dropped ~244 redundant indexes across 16 collections, and rewrote the heaviest aggregations (the worst went from ~23.5s to under 100ms). That recovered ~43% of slow-query time and ~2 TB/month of disk reads, enough headroom to safely downsize the cluster.
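To make the ranking step concrete, here is a minimal sketch of grouping slow-query log lines by query shape and sorting by cumulative time. It assumes the logs are in MongoDB's structured JSON format (4.4+), where slow operations carry `msg: "Slow query"` and an `attr` object with `ns`, `queryHash`, `planSummary`, and `durationMillis`. The function name `rankQueryShapes` and the sample lines are hypothetical, not the tooling actually used.

```javascript
// Sketch: rank slow-query shapes by cumulative time.
// Assumption: input lines are MongoDB 4.4+ structured (JSON) log entries.
function rankQueryShapes(logLines) {
  const shapes = new Map();
  for (const line of logLines) {
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip truncated or non-JSON lines
    }
    if (!entry || entry.msg !== "Slow query" || !entry.attr) continue;
    const { ns, queryHash, planSummary, durationMillis } = entry.attr;
    if (typeof durationMillis !== "number") continue;
    // Group by namespace + queryHash; fall back to the plan summary
    // for operations that do not report a queryHash.
    const key = `${ns}|${queryHash ?? planSummary ?? "unknown"}`;
    const shape =
      shapes.get(key) ?? { key, ns, count: 0, totalMillis: 0, maxMillis: 0 };
    shape.count += 1;
    shape.totalMillis += durationMillis;
    shape.maxMillis = Math.max(shape.maxMillis, durationMillis);
    shapes.set(key, shape);
  }
  // Highest cumulative cost first: a frequent 1s query can outrank a rare 20s one.
  return [...shapes.values()].sort((a, b) => b.totalMillis - a.totalMillis);
}

// Hypothetical sample data for illustration only.
const sampleLines = [
  JSON.stringify({ msg: "Slow query", attr: { ns: "app.orders", queryHash: "A1", planSummary: "COLLSCAN", durationMillis: 1200 } }),
  JSON.stringify({ msg: "Slow query", attr: { ns: "app.users", queryHash: "B2", planSummary: "IXSCAN { email: 1 }", durationMillis: 1500 } }),
  JSON.stringify({ msg: "Slow query", attr: { ns: "app.orders", queryHash: "A1", planSummary: "COLLSCAN", durationMillis: 800 } }),
  "not a json line",
];

const ranked = rankQueryShapes(sampleLines);
console.log(ranked);
```

Sorting by total time rather than worst single duration is what surfaces the shapes worth indexing first. At real log volumes the lines would be streamed from disk rather than held in an array.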

Cutting runtime load

Beyond the database itself, I attacked how the app used it:

Redis-cached the hot paths — auth went from 6–7 DB ops per request to ~0; notification, dashboard, and permission lookups moved off the request path.
Cut 8,000+ redundant Mongo ops/min (peak ~17K fleet-wide) by moving per-request writes to cron and eliminating duplicate reads.
Unblocked the event loop: freezes under load dropped from ~46s to under 0.2s.
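The caching of hot paths described above can be sketched as a cache-aside lookup: read from the cache first, fall back to the database on a miss, then populate the cache with a TTL. This is an illustration under assumptions, not the project's code. `makeCachedLookup`, `makeMemoryCache`, the key prefix, and the 60-second TTL are all hypothetical, and the in-memory cache stands in for Redis so the example runs without a server.

```javascript
// Sketch of cache-aside for a hot lookup such as auth.
// `cache` is any object with async get(key) and set(key, value, ttlSeconds);
// a thin wrapper around a Redis client would satisfy the same interface.
function makeCachedLookup({ cache, loadFromDb, ttlSeconds, prefix }) {
  return async function lookup(id) {
    const key = `${prefix}:${id}`;
    try {
      const hit = await cache.get(key);
      if (hit != null) return JSON.parse(hit); // served without touching the DB
    } catch {
      // Cache unavailable or corrupt entry: fall through to the database.
    }
    const value = await loadFromDb(id);
    if (value != null) {
      try {
        await cache.set(key, JSON.stringify(value), ttlSeconds);
      } catch {
        // A failed cache write must not fail the request.
      }
    }
    return value;
  };
}

// In-memory stand-in for Redis, for demonstration only.
function makeMemoryCache(now = () => Date.now()) {
  const store = new Map();
  return {
    async get(key) {
      const item = store.get(key);
      if (!item) return null;
      if (item.expiresAt <= now()) {
        store.delete(key); // expired entry behaves like a miss
        return null;
      }
      return item.value;
    },
    async set(key, value, ttlSeconds) {
      store.set(key, { value, expiresAt: now() + ttlSeconds * 1000 });
    },
  };
}

// Usage: the second call for the same user is answered from the cache.
let dbCalls = 0;
const loadUser = async (id) => {
  dbCalls += 1; // stands in for the per-request DB operations
  return { id, role: "admin" };
};
const getUser = makeCachedLookup({
  cache: makeMemoryCache(),
  loadFromDb: loadUser,
  ttlSeconds: 60,
  prefix: "auth:user",
});

getUser("u1")
  .then(() => getUser("u1"))
  .then((user) => console.log(user, "db calls:", dbCalls));
```

The TTL bounds how stale a cached permission or session can be. A real deployment would likely also delete the key whenever the underlying record changes.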

Reliability fixes

Fixed a long-standing webhook race condition under concurrent queue ingestion with an atomic findOneAndUpdate and prefetch tuning.
Added a Redis distributed lock for idempotent deduplication that degrades gracefully when Redis is unavailable.
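One way a deduplication lock can degrade gracefully is sketched below. The assumption, not stated in the original, is that when the lock store is unreachable the handler still runs and correctness falls back to the atomic database update. `processOnce`, `makeMemoryLock`, and the `dedup:` key prefix are hypothetical names. With Redis, `tryAcquire` would typically map to `SET key token NX PX ttlMs`; an in-memory stand-in is used here so the example runs on its own.

```javascript
// Sketch: process an event at most once while a short-lived lock is held.
// `tryAcquire(key, ttlMs)` resolves to true when the lock was taken and
// false when someone else holds it. If it throws (lock store unavailable),
// we proceed without the lock instead of dropping the event.
async function processOnce({ tryAcquire, handler, eventId, ttlMs = 30000 }) {
  let acquired;
  try {
    acquired = await tryAcquire(`dedup:${eventId}`, ttlMs);
  } catch {
    // Degraded mode: no lock available, so rely on the handler's own
    // atomic write (for example a conditional findOneAndUpdate).
    return { status: "processed-without-lock", result: await handler(eventId) };
  }
  if (!acquired) return { status: "duplicate-skipped" };
  return { status: "processed", result: await handler(eventId) };
}

// In-memory stand-in for the Redis lock, for demonstration only.
function makeMemoryLock() {
  const held = new Map(); // key -> expiry timestamp in ms
  return async (key, ttlMs) => {
    const now = Date.now();
    const expiry = held.get(key);
    if (expiry !== undefined && expiry > now) return false; // still locked
    held.set(key, now + ttlMs);
    return true;
  };
}

// Usage: two concurrent deliveries of the same event.
const tryAcquire = makeMemoryLock();
const handler = async (id) => `handled ${id}`;
Promise.all([
  processOnce({ tryAcquire, handler, eventId: "evt_1" }),
  processOnce({ tryAcquire, handler, eventId: "evt_1" }),
]).then((results) => console.log(results.map((r) => r.status)));
```

The lock is only an optimization that filters duplicates early. The atomic database write remains the guard that keeps concurrent consumers correct, which is why skipping the lock during an outage is safe in this sketch.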

Impact

A full Atlas tier saved, with better latency — not worse.
Thousands of redundant database operations per minute eliminated.
Correctness under concurrency, and an app that no longer freezes under peak load.

Last updated on 2026-06-12.
