Groww's engineering team shares how they reduced trading platform latency through a multi-phase approach spanning observability, architectural simplification, and infrastructure upgrades. Starting with distributed tracing via Honeycomb and custom dashboards, they found that latency was spread across multiple layers rather than concentrated in a single bottleneck. Key improvements included consolidating legacy services, reducing order flow hops from 7+ to ~5, migrating message queues (cutting consumer latency from 300 ms to 100 ms), upgrading databases (cutting commit latency from 11-12 ms to 1-2 ms), and applying Redis pipelining with Lua scripting. Results include a 90% improvement in cash order p99 latency at market open, a 97% reduction in producer peak latency, a 35% reduction in infrastructure cost, and near real-time socket updates. The core lesson: simplification (removing redundant hops and consolidating services) delivered more impact than complex optimization techniques.
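To make the headline figures easier to compare, here is a minimal sketch of the arithmetic behind a "percent reduction" and a p99 value. It is not Groww's tooling: the function names, the nearest-rank percentile method, the use of range midpoints, and the sample data are all assumptions made for illustration. Only the before/after latencies (300 ms to 100 ms, 11-12 ms to 1-2 ms) come from the summary.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


def p99(samples: list[float]) -> float:
    """Nearest-rank 99th percentile (one common definition, assumed here)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # ceil(0.99 * n) in integer arithmetic, 1-based
    return ordered[rank - 1]


# Consumer latency quoted in the summary: 300 ms before, 100 ms after.
consumer_cut = percent_reduction(300, 100)

# Commit latency is quoted as ranges (11-12 ms, 1-2 ms); using the
# midpoints 11.5 and 1.5 is an assumption, not a figure from the article.
commit_cut = percent_reduction(11.5, 1.5)

# Hypothetical samples: 98 fast requests and 2 slow ones. The mean stays
# low while p99 exposes the slow tail, which is why tail metrics matter
# for bursts such as market open.
samples = [10.0] * 98 + [400.0, 500.0]
tail = p99(samples)
mean = sum(samples) / len(samples)

print(f"consumer latency reduction: {consumer_cut:.1f}%")
print(f"commit latency reduction:   {commit_cut:.1f}%")
print(f"mean={mean:.1f} ms, p99={tail:.1f} ms")
```

Under these assumptions the message queue migration works out to roughly a two-thirds reduction in consumer latency, and the database upgrade to a reduction in the high 80s percent. The hypothetical sample set shows how a small number of slow requests can dominate p99 while barely moving the mean.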
Table of contents
The Journey Begins: Going Underneath
The Technical Approach: Methodology Meets Persistence
Phase 1: Deep Observability & Analysis
Phase 2: Building Visibility
Phase 3: Architectural Simplification
Phase 4: Peak Time Analysis
Phase 5: Controlled Experimentation
Phase 6: Infrastructure That Scales
The Results: Numbers That Tell the Story
🔍 The Discovery
The Key Insight
Believing in the Team
The Simplification: Less Is More
⚡ The Breakthrough
Competitive Edge: How We Stack Up
Organization-Wide Impact
What’s Next
Lessons for Leaders
Final Thought