Best of Performance · January 2026

  1. Article
    Lobsters·13w

    I Replaced Redis with PostgreSQL (And It's Faster)

    PostgreSQL can replace Redis for caching, pub/sub, job queues, and sessions using UNLOGGED tables, LISTEN/NOTIFY, SKIP LOCKED, and JSONB. While PostgreSQL is 50-158% slower per operation (0.1-1ms difference), it eliminates network hops between databases, reduces infrastructure costs by ~$100/month, simplifies operations, and guarantees transactional consistency. The approach works best for small-to-medium apps with simple caching needs but isn't suitable for high-throughput scenarios (100k+ ops/sec) or applications requiring Redis-specific data structures like sorted sets or HyperLogLog.
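The caching half of the approach boils down to a table with a primary key, a value, and an expiry timestamp, written with an upsert. A minimal sketch of that pattern, using Python's stdlib sqlite3 as a stand-in for PostgreSQL (on Postgres the table would be created UNLOGGED to skip WAL overhead, and the SQL is otherwise the same); the `cache` table and helper names are illustrative, not from the article:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT, expires_at REAL)")

def cache_set(key, value, ttl=60):
    # Upsert: insert, or overwrite value and expiry if the key already exists.
    conn.execute(
        "INSERT INTO cache (key, value, expires_at) VALUES (?, ?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value, "
        "expires_at = excluded.expires_at",
        (key, value, time.time() + ttl),
    )

def cache_get(key):
    # Expired rows are simply filtered out; a periodic DELETE can reap them.
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ? AND expires_at > ?",
        (key, time.time()),
    ).fetchone()
    return row[0] if row else None

cache_set("user:1", "alice")
cache_set("user:1", "bob")       # upsert overwrites in place
print(cache_get("user:1"))       # bob
```

The same table participates in ordinary transactions, which is the consistency guarantee the article highlights: a cache write and the business write it shadows can commit or roll back together.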

  2. Article
    Developer's Journey·13w

    It Worked in Dev. It Worked in QA. Then Production Happened.

    A backend engineer shares a production incident where an appointment-fetching endpoint worked fine in dev and QA but caused 4-second response times in production. The issue was an N+1 query problem: the code made 6,000+ individual database calls to fetch patient details. The solution involved batching patient data retrieval into a single query using in-memory maps and adding proper projections, reducing latency to 500-600ms. The incident highlights the importance of testing with realistic data volumes, thorough code reviews, and anticipating edge cases during development.
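The batching fix described above has a simple shape: collect the distinct foreign keys, fetch them in one query, and join in memory via a map. A hedged sketch with hypothetical in-memory data standing in for the database (the `PATIENTS`/`APPOINTMENTS` names and the call counter are illustrative, not the author's code):

```python
# Stand-ins for database tables; fetch_patients_bulk() represents one round trip.
PATIENTS = {i: {"id": i, "name": f"patient-{i}"} for i in range(1, 7)}
APPOINTMENTS = [{"id": n, "patient_id": (n % 6) + 1} for n in range(12)]

calls = {"count": 0}

def fetch_patients_bulk(ids):
    calls["count"] += 1  # one query for the whole batch, however many ids
    return [PATIENTS[i] for i in ids]

def load_appointments():
    # N+1 fix: gather distinct patient ids, fetch once, join via dict lookup
    # instead of issuing one query per appointment row.
    ids = {a["patient_id"] for a in APPOINTMENTS}
    by_id = {p["id"]: p for p in fetch_patients_bulk(sorted(ids))}
    return [{**a, "patient": by_id[a["patient_id"]]} for a in APPOINTMENTS]

rows = load_appointments()
assert calls["count"] == 1  # one round trip instead of len(APPOINTMENTS)
```

With 6,000+ rows the same shape turns 6,000+ round trips into one, which is where the 4s-to-500ms drop comes from.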

  3. Article
    DEV·13w

    Microservices Are Killing Your Performance (And Here's the Math)

    Microservices introduce significant performance overhead compared to monolithic architectures. Network calls add 1,000-5,000x latency versus in-process function calls, resulting in 50-150% higher latency, 300% more resource usage, and 2-3x infrastructure costs. Benchmarks show microservices suffer from cascading failures (5x more downtime), database connection exhaustion, and serialization overhead. The article argues microservices solve organizational problems (team autonomy, independent deployment) rather than technical ones, and recommends modular monoliths for most applications. Microservices only make sense for organizations with 50+ engineers, independent scaling requirements, technology diversity needs, or compliance isolation.

  4. Article
    stitcher.io·12w

    Processing 11 million rows in minutes instead of hours

    A developer optimized their blog's analytics system to process 11 million events, improving performance from 30 to 14,000 events per second through systematic changes: removing unnecessary database sorting, reversing nested loops, ditching the ORM for raw queries, replacing closures with while loops, fixing framework serialization overhead, and using indexed ID-based pagination instead of offsets. The rebuild time dropped from 50 hours to 10 minutes per projector.
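One of those changes, indexed ID-based (keyset) pagination, is worth seeing next to the OFFSET approach it replaced: OFFSET forces the engine to scan and discard all skipped rows on every page, while seeking on an indexed id starts each page directly. A minimal sketch using stdlib sqlite3 (the `events` table is illustrative; the article's stack is PHP/MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"e{i}",) for i in range(10_000)],
)

def pages_by_keyset(page_size=1000):
    # Instead of OFFSET n (which scans and throws away n rows per page),
    # remember the last id seen and seek past it via the primary key index.
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]

total = sum(len(page) for page in pages_by_keyset())
print(total)  # 10000
```

The cost of OFFSET pagination grows quadratically over a full table scan (each later page pays for all earlier rows); keyset pagination keeps every page equally cheap.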

  5. Article
    Anton Zhiyanov·14w

    Go 1.26 interactive tour

    Go 1.26 introduces significant language and runtime improvements. Key features include `new(expr)` for creating pointers from expressions, type-safe error checking with `errors.AsType`, the Green Tea garbage collector for better memory efficiency on multi-core systems, faster cgo/syscalls and memory allocation, experimental SIMD operations, secret mode for cryptographic data protection, goroutine leak profiling, and numerous standard library enhancements including reflective iterators, buffer peeking, process handles, and improved metrics.

  6. Article
    Amir·11w

    Run Across - My first complete Godot 4.5 mobile game (3 months solo dev)

    A solo developer shares their experience building their first mobile game using Godot 4.5 over 3 months. The post covers technical implementation details including mobile optimization techniques (object pooling, shader warmup, quality presets), Firebase integration for authentication and leaderboards with anti-cheat measures, responsive UI scaling across different screen resolutions, and challenges faced with Android plugins and performance issues. The game features dynamic biome transitions, combo systems, and runs on low-end devices with 1GB RAM through aggressive optimization strategies.

  7. Article
    Hacker News·14w

    You probably don't need Oh My Zsh

    Oh My Zsh adds unnecessary bloat that slows shell startup time significantly (0.38s vs 0.07s). A minimal Zsh configuration with history settings, autocd, and completions provides a solid foundation. Starship offers fast prompt customization, while fzf provides better history search than zsh-autosuggestions. This lightweight approach dramatically improves terminal responsiveness for developers who frequently open new tabs.

  8. Article
    stitcher.io·11w

    Once again processing 11 million rows, now in seconds

    A PHP developer optimizes a script processing 11 million database events, improving performance from 50k to 1.7M events per second through incremental changes. Key optimizations include combining SQL inserts, moving calculations from MySQL to PHP, eliminating object instantiation in favor of raw arrays, and removing JSON deserialization by using direct database columns. The journey demonstrates the trade-offs between code convenience and raw performance, with the final implementation processing data in seconds rather than days.
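Two of those optimizations, combining SQL inserts and moving aggregation out of the database, compose naturally: pre-aggregate in the host language, then write the result in one batched statement instead of one upsert per event. A hedged sketch in Python with stdlib sqlite3 standing in for the article's PHP/MySQL stack (the `counts` table and event shape are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counts (day TEXT PRIMARY KEY, hits INTEGER)")
events = [("2026-01-%02d" % (i % 28 + 1),) for i in range(50_000)]

# Naive: one upsert per event -- 50k trips through the SQL layer.
for (day,) in events:
    conn.execute(
        "INSERT INTO counts (day, hits) VALUES (?, 1) "
        "ON CONFLICT(day) DO UPDATE SET hits = hits + 1",
        (day,),
    )
conn.execute("DELETE FROM counts")

# Combined: aggregate in the host language, then one batched write.
totals = {}
for (day,) in events:
    totals[day] = totals.get(day, 0) + 1
conn.executemany("INSERT INTO counts (day, hits) VALUES (?, ?)", totals.items())
```

The batched version issues 28 inserts instead of 50,000 statements; the same per-statement overhead argument is what drives the article's jump from 50k to 1.7M events per second.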

  9. Article
    Bun·13w

    Bun v1.3.6

    Bun v1.3.6 introduces new APIs for tarball creation/extraction (Bun.Archive) and JSONC parsing, adds metafile and virtual files support to Bun.build, and delivers significant performance improvements: Response.json() is 3.5x faster, async/await 15% faster, Promise.race 30% faster, and Bun.hash.crc32 20x faster. The release includes SIMD-optimized Buffer.indexOf, HTTP/HTTPS proxy support for WebSocket, S3 Requester Pays support, --grep flag for bun test, fake timers compatibility with @testing-library/react, SQLite 3.51.2 update, and 45 bug fixes improving Node.js compatibility, bundler stability, and security.

  10. Article
    Grab Tech Blog·12w

    Docker lazy loading at Grab: Accelerating container startup times

    Grab implemented Docker image lazy loading using SOCI (Seekable OCI) technology to solve slow container startup times caused by large images. The solution achieved 4x faster image pull times on fresh nodes, 30-40% faster P95 startup times in production, and 60% improvement in download times after configuration tuning. Unlike traditional image pulls that download all layers before starting, lazy loading uses remote snapshotters to fetch data on-demand via FUSE filesystems. Grab chose SOCI over eStargz because it's natively supported on Bottlerocket OS, doesn't require image conversion, and maintains the same application startup time as standard images while dramatically reducing image pull time.

  11. Article
    Hacker News·12w

    Replacing Protobuf with Rust to go 5 times faster

    PgDog replaced Protobuf serialization with direct C-to-Rust bindings in their PostgreSQL proxy, achieving 5x faster query parsing and 10x faster deparsing. The team forked pg_query.rs and used bindgen with AI-assisted code generation to create 6,000 lines of recursive conversion code that maps C structs directly to Rust. Profiling revealed Protobuf deserialization as the bottleneck, not the Postgres parser itself. The new implementation uses unsafe Rust with recursive algorithms for better CPU cache locality and zero additional memory allocation, resulting in 25% overall performance improvement in pgbench benchmarks.

  12. Article
    LogRocket·14w

    6 fast (native) alternatives for VSCode

VSCode's Electron-based architecture creates performance bottlenecks through high resource consumption, especially on low- to mid-range hardware. Six native alternatives offer significantly better performance: ecode (C++, 50MB disk, 40MB RAM), CudaText (Free Pascal, 20MB disk), Lite (C/Lua, 1MB disk, ultra-minimal), Lite XL (improved Lite with better features), Lapce (Rust, VSCode-like workflow, 60MB disk), and Zed (Rust, full VSCode replacement with AI features, 400MB disk). These editors use native rendering through OpenGL, SDL, wgpu, or platform-specific graphics APIs instead of Chromium, achieving 10-30x lower resource usage while providing comparable features through growing plugin ecosystems.

  13. Article
    Stack Overflow Blog·12w

    Don’t let your backend write checks your frontend can’t cache

    Frontend and backend architecture must work in harmony to deliver performant user experiences. As AI-generated interfaces and ephemeral UIs emerge, backend systems face new challenges around API orchestration, caching, and performance at scale. Frontend engineers need to understand backend constraints like processing load, API response times, and scalability to avoid creating interfaces that promise more than the backend can deliver. The shift toward component-based, modular interfaces requires spec-driven design and careful consideration of how APIs are structured and consumed. While AI can accelerate development, production-grade systems still require human understanding and ownership of both frontend and backend code.

  14. Video
    Fireship·12w

    Bun in 100 Seconds

    Bun is an all-in-one JavaScript runtime that replaces Node.js, npm, bundlers, testing frameworks, and transpilers with a single fast tool. Built with Zig and JavaScriptCore instead of C++ and V8, it offers built-in TypeScript support, database drivers for SQLite and Redis, HTTP server capabilities, and package management—all while maintaining Node.js ecosystem compatibility. Installation is simple, and it eliminates configuration complexity by bundling everything developers need into one binary.

  15. Article
    InfoQ·12w

    Prisma 7: Rust-Free Architecture and Performance Gains

    Prisma ORM 7.0 replaces its Rust-based query engine with a TypeScript implementation, achieving 90% smaller bundle sizes, 3x faster query execution, and lower resource utilization. The release moves generated code out of node_modules for better developer experience, introduces a dynamic configuration file, and delivers 98% fewer types with 70% faster type checking. Migration guides and AI-assisted tooling help developers upgrade from previous versions.

  16. Article
    Reinier·11w

    WebSockets Crash Course: Build a Real-Time Sports Engine (10ms Updates)

    A comprehensive WebSocket tutorial that teaches how to build a high-frequency broadcast engine capable of delivering real-time sports updates to over 100,000 concurrent users with sub-second latency. The course covers WebSocket fundamentals, data ingestion from multiple sources, architecture for handling massive concurrent connections, and deployment strategies for production-scale real-time systems.
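The core of such an engine is a fan-out hub: one producer pushes an update, every connected client gets a copy, and slow clients must not stall the hub. A minimal sketch of that pattern with stdlib asyncio, using bounded queues as stand-ins for per-connection WebSocket send buffers (the `Broadcaster` class and payload shape are illustrative, not the course's code):

```python
import asyncio

class Broadcaster:
    """Fan-out hub: one publish delivers a copy to every subscriber queue."""

    def __init__(self):
        self.subscribers = set()

    def subscribe(self):
        # Bounded queue: a slow consumer drops messages rather than
        # backpressuring the hub and delaying everyone else's updates.
        q = asyncio.Queue(maxsize=100)
        self.subscribers.add(q)
        return q

    def publish(self, update):
        for q in self.subscribers:
            try:
                q.put_nowait(update)
            except asyncio.QueueFull:
                pass  # shed load for this client instead of stalling the hub

async def main():
    hub = Broadcaster()
    clients = [hub.subscribe() for _ in range(3)]
    hub.publish({"match": 7, "score": "2-1"})
    return [await q.get() for q in clients]

received = asyncio.run(main())
```

At 100k connections the same structure applies per process, with a message bus fanning out across processes; the load-shedding decision in `publish` is the key latency trade-off.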

  17. Article
    Lobsters·14w

    What Happened To WebAssembly

    WebAssembly is actively used in production by companies like Figma, Cloudflare, and Godot, primarily as a compilation target that bridges language ecosystems. Its strength lies in security guarantees enabling sub-millisecond spinup times and safe execution of untrusted code, plus portability allowing C++/Rust libraries to run in browsers. Performance is comparable to JavaScript in browsers, with trade-offs in binary size and boundary crossing costs. Most developers encounter it transparently through library dependencies rather than directly, which contributes to the perception that "nothing happened" despite significant real-world adoption and ongoing standardization efforts.

  18. Video
    The Coding Gopher·15w

    Bun just killed Node.js

    Bun is an all-in-one JavaScript runtime that consolidates execution, bundling, package management, and testing into a single binary. Built with Zig and JavaScriptCore instead of V8, it eliminates the overhead of multiple tools parsing the same code. Key features include native TypeScript support without compilation, integrated bundler with tree-shaking and code-splitting, content-addressable package caching, and built-in test runner. By sharing internal representations and optimizing at the systems level, Bun achieves faster cold starts, quicker dependency installs, and improved development workflows compared to traditional Node.js toolchains.

  19. Video
    Joshua Morony·12w

    JavaScript optimisation with LLMs is too good to ignore now

    Using LLMs to optimize JavaScript performance has proven highly effective across 20+ real-world scenarios, reducing turnaround time from hours or days to minutes. The approach involves profiling with DevTools to identify bottlenecks, then having AI analyze and fix the code. Results include transforming a game from 20-30 FPS to near 60 FPS even with 4x CPU slowdown. LLMs successfully handled everything from simple closure optimizations to advanced techniques like baking tile maps into static textures and implementing complex bit-shifting algorithms. While some knowledge transfer is sacrificed, the speed and reliability of AI-driven solutions make them practical for shipping projects without accumulating technical debt.

  20. Article
    Frontend Masters·14w

React has changed, your Hooks should too

    React Hooks are often misused in modern codebases, with developers overrelying on useEffect and copy-pasting patterns without understanding them. Before reaching for useEffect, consider whether the logic is driven by external factors (network, DOM, subscriptions) or can be computed during render. For the latter case, tools like useMemo, useCallback, or framework-provided primitives create more robust components. The key is understanding when to use each hook appropriately rather than defaulting to familiar patterns.

  21. Article
    JetBrains·11w

    Rust vs JavaScript & TypeScript: Performance and WebAssembly

    Rust and JavaScript/TypeScript are complementary languages increasingly used together in hybrid architectures. JavaScript/TypeScript excels at rapid iteration, UI development, and full-stack flexibility with its massive ecosystem, while Rust delivers superior performance, memory safety, and reliability for system-level tasks. WebAssembly bridges the two, enabling Rust to handle performance-critical logic within JS/TS applications. Modern teams commonly use JS/TS for the product layer and Rust for the underlying engine, combining speed with flexibility. Real-world examples include Figma using Rust/Wasm for graphics rendering with a TypeScript/React interface, Biome replacing JS tooling with Rust implementations, and Cloudflare powering edge computing with Rust while developers write in JS/TS.

  22. Article
    Nuxt·12w

Nuxt 4.3

    Nuxt 4.3 introduces route rule layouts for centralized layout management, ISR payload extraction for better caching, and a draggable error overlay. The release includes performance improvements like faster SSR styles plugin and optimized router matching. New features include the #server alias for cleaner imports, layout props support in setPageLayout, route groups in page meta, and the ability to disable layer modules. Nuxt v3 end-of-life has been extended to July 31, 2026, while Nuxt v5 development begins on the main branch.

  23. Article
    Phoronix·13w

    Burn 0.20 Released: Rust-Based Deep Learning With Speedy Perf Across CPUs & GPUs

    Burn 0.20 introduces CubeK, a high-performance multi-platform kernel system built on CubeCL that enables unified CPU and GPU execution across NVIDIA CUDA, AMD ROCm, Apple Metal, WebGPU, and Vulkan. The release aims to deliver peak performance on diverse hardware without maintaining fragmented codebases, with benchmarks showing significantly lower execution times compared to LibTorch and ndarray. The update also includes a complete overhaul of the ONNX import system and various stability improvements.

  24. Article
    daniel.haxx.se·12w

    libcurl memory use some years later

    libcurl's memory usage has improved over five years despite adding features. Comparing current development to version 7.75.0, key structs show mixed changes: the multi handle grew from 416 to 816 bytes, the easy handle from 5272 to 5352 bytes, while connectdata shrank from 1472 to 912 bytes. For 10 parallel transfers with 20 connections, total memory decreased by 10,000 bytes. A single 512MB HTTP download now uses 133,856 bytes across 107 allocations (1.6% more memory, 11% more allocations than five years ago). The project added test case 3214 to prevent accidental struct size growth by setting upper limits on fifteen important structs.

  25. Article
    C# Corner·13w

    Why ASP.NET Core Feels Fast Locally but Slow in Production

    ASP.NET Core applications often run fast locally but slow in production due to environment differences, resource constraints, database query inefficiencies, async/await misuse, excessive logging, cold starts, improper caching, and network latency. The guide covers common bottlenecks like EF Core N+1 queries, thread pool starvation from blocking calls, logging overhead, in-memory cache limitations in scaled environments, and HttpClient socket exhaustion. It provides code examples showing problematic patterns versus optimized alternatives, emphasizing measurement-driven optimization: timing requests and database queries, checking thread pool health, reducing allocations, and implementing distributed caching before scaling infrastructure.