Best of LLM · January 2026

  1. Article
    Xavier Womack · 13w

    Claude: the #1 AI for programmers?

    Claude outperformed ChatGPT and other AI models in a coding task involving Tauri and glassmorphic windows. While ChatGPT provided outdated code and hallucinations, Claude delivered precise and accurate solutions within minutes. The author suggests Anthropic prioritizes coding capabilities more than competitors, making Claude a top choice for programming assistance despite other models ranking higher on synthetic benchmarks.

  2. Article
    Addy Osmani · 17w

    My LLM coding workflow going into 2026

    A comprehensive workflow for using LLM coding assistants effectively in 2026. Start with detailed planning and specs before coding, break work into small iterative chunks, provide extensive context to the AI, choose appropriate models for each task, and maintain human oversight through rigorous testing and code review. Use version control aggressively with frequent commits, customize AI behavior with rules and examples, leverage automation as quality gates, and treat the AI as a powerful but fallible pair programmer requiring clear direction. The approach emphasizes that AI amplifies engineering skills rather than replacing them, with the developer remaining accountable for all code produced.

  3. Article
    SHAPeS · 13w

    We. Are. Screwed

    A concerned reaction to Moltbook, described as 'Reddit for LLMs,' expressing alarm about AI systems developing unusual communication patterns and autonomous behaviors. The author worries about the implications of AI agents operating independently on the internet, potentially engaging in malicious activities like crypto scams and malware distribution, suggesting we may need defensive AI systems in response.

  4. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 13w

    [New] Generative UI for Agents

    Generative UI is an emerging pattern where AI agents render actual UI components instead of just returning text responses. Unlike traditional chat interfaces, agents can now display weather cards, confirmation dialogs, data tables, and other interactive elements by selecting pre-built components and filling them with data at runtime. Three approaches exist: static (predefined components), declarative (component registry), and open-ended (raw HTML/iframes). Protocols like A2UI, AG-UI, and MCP Apps enable real-time bidirectional communication between agents and frontends. CopilotKit has open-sourced a complete implementation for React with integrations for LangGraph, CrewAI, and other agent frameworks. MiniMax also launched Agent Desktop, a desktop environment where AI agents can browse the web, manage files, and automate developer tasks.
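    The "declarative" registry approach can be sketched in a few lines: the agent emits a component name plus props as JSON, and the frontend validates that selection against a registry before rendering. Component names and required fields below are hypothetical, not taken from A2UI, AG-UI, or MCP Apps.

```python
# Toy sketch of a declarative generative-UI registry: the agent picks a
# pre-built component and supplies props; the renderer validates before
# mounting. All component names and prop sets here are invented.
import json

# Registry of pre-built components and the props each one requires.
REGISTRY = {
    "weather_card": {"city", "temperature_c", "condition"},
    "confirm_dialog": {"message"},
}

def render(agent_output: str) -> str:
    """Validate an agent's component selection against the registry."""
    payload = json.loads(agent_output)
    name, props = payload["component"], payload["props"]
    if name not in REGISTRY:
        raise ValueError(f"unknown component: {name}")
    missing = REGISTRY[name] - props.keys()
    if missing:
        raise ValueError(f"missing props: {missing}")
    # A real frontend would mount the React component; we just describe it.
    return f"<{name} {json.dumps(props, sort_keys=True)}>"

print(render('{"component": "weather_card", '
             '"props": {"city": "Oslo", "temperature_c": -3, "condition": "snow"}}'))
```

    The validation step is what keeps the "open-ended" failure modes (arbitrary HTML) out of the declarative approach: anything outside the registry is rejected before it reaches the UI.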

  5. Article
    vLLM · 17w

    Introducing vLLM Playground: A Modern Web Interface for Managing and Interacting with vLLM Servers

    vLLM Playground is a new open-source web interface that simplifies managing and interacting with vLLM servers across platforms. It eliminates command-line complexity through container orchestration, offering one-click operations for starting servers, switching models, and configuring settings. Key features include structured outputs (JSON Schema, regex, grammar), tool/function calling, GuideLLM benchmarking integration, and access to 17+ pre-configured model recipes. The tool supports local development on macOS/Linux and enterprise deployment on Kubernetes/OpenShift with the same unified UI. Installation is straightforward via pip, with automatic container management handling the vLLM lifecycle.
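    The structured-output features mentioned above are driven through extra sampling fields on vLLM's OpenAI-compatible API. The sketch below only builds such a request body; the `guided_json` field name and the model name are assumptions, so verify both against your vLLM version's documentation.

```python
# Sketch of a request body for schema-guided decoding against a vLLM
# OpenAI-compatible server (POST /v1/chat/completions). The "guided_json"
# field name is an assumption; check your vLLM version's docs.
import json

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["name", "year"],
}

body = {
    "model": "Qwen/Qwen2.5-7B-Instruct",  # placeholder: any loaded model
    "messages": [{"role": "user", "content": "When was Python released?"}],
    "guided_json": schema,  # constrains output to match the JSON Schema
}

print(json.dumps(body, indent=2))
```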

  6. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 15w

    6 Components of Context Engineering

    Context engineering is the practice of optimizing how information flows to AI models, comprising six core components: prompting techniques (few-shot, chain-of-thought), query augmentation (rewriting, expansion, decomposition), long-term memory (vector/graph databases for episodic, semantic, and procedural memory), short-term memory (conversation history management), knowledge base retrieval (RAG pipelines with pre-retrieval, retrieval, and augmentation layers), and tools/agents (single and multi-agent architectures, MCPs). While model selection and prompts contribute only 25% to output quality, the remaining 75% comes from properly engineering these context components to deliver the right information at the right time in the right format.
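    Several of those components meet in a single assembly step: few-shot examples, retrieved knowledge, and bounded short-term history are concatenated into the final prompt. The sketch below is illustrative; the layout and function names are not from the article.

```python
# Minimal illustration of assembling context-engineering components
# (few-shot examples, RAG-retrieved chunks, short-term history) into
# one prompt. Layout and names are illustrative, not a standard.
def build_context(question, examples, retrieved_chunks, history, max_history=4):
    parts = ["You are a helpful assistant. Answer using the context below."]
    parts += [f"Example:\nQ: {q}\nA: {a}" for q, a in examples]   # few-shot prompting
    parts += [f"Context: {c}" for c in retrieved_chunks]          # knowledge retrieval
    parts += history[-max_history:]                               # short-term memory cap
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_context(
    question="What port does the API use?",
    examples=[("What language is the API in?", "Go")],
    retrieved_chunks=["The API listens on port 8080 behind nginx."],
    history=["User: hi", "Assistant: hello"],
)
print(prompt)
```

    The `max_history` cap is one concrete way the short-term-memory component delivers "the right information at the right time" without letting the context window fill with stale turns.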

  7. Article
    DigitalOcean Community · 17w

    Olmo 3: Fully Open-Source LLM from AI2 (Models, Data, & Code)

    Olmo 3 is Allen AI's fully open-source large language model available in 7B and 32B parameter versions. The release includes complete access to models, training datasets (Dolma 3 with 9.3 trillion tokens), code, and training logs. The model uses a three-stage training pipeline: pretraining on Dolma 3 Mix, mid-training on Dolma 3 Dolmino for skill enhancement, and long-context extension on Dolma 3 Longmino. Post-training uses the Dolci suite with SFT, DPO, and RLVR techniques. The 32B model employs grouped query attention while the 7B uses multi-head attention. OlmoTrace enables tracing text back to training sources for auditing and contamination detection.

  8. Article
    selfh.st · 13w

    Self-Host Weekly (30 January 2026)

    This week's self-hosting highlights include OpenClaw (formerly Clawdbot, then Moltbot), a viral open-source AI chatbot that can book travel and make reservations from chat platforms. MOS, a new Devuan-based NAS operating system, offers a web interface and plugin support. Immich v2.5 adds non-destructive photo editing, moving closer to being a Google Photos alternative. The newsletter also features Vanilla Cookbook, a minimalist self-hosted recipe platform with LLM assistance and Docker deployment.

  9. Video
    The Coding Gopher · 15w

    Docker just got some massive upgrades

    Docker released the Docker MCP toolkit, a production-grade implementation of Anthropic's Model Context Protocol that containerizes AI agent capabilities. The system uses three core components: a curated catalog of versioned MCP server images, a gateway that acts as a dynamic proxy managing container lifecycle and routing, and a toolkit for credential management and permissions. This architecture isolates agent tools in containers, providing reproducibility, security through policy enforcement, and composability by allowing multiple MCP servers to run side-by-side without dependency conflicts.

  10. Article
    Where's Your Ed At · 15w

    Premium: This Is Worse Than The Dot Com Bubble

    The current AI investment bubble is worse than the dot-com era, with venture capital pouring $168 billion into AI in 2025 alone—nearly half of all VC funding. Unlike the dot-com bubble where companies at least had viable business models, AI startups have negative margins that worsen with growth, unprofitable unit economics, and no path to profitability. CES 2026 showcased this dysfunction: companies demoing vaporware robots and repackaging basic chatbot features as revolutionary "AI agents." The venture capital industry has devolved into late-stage momentum investing rather than early-stage risk-taking, rewarding grifting over fundamentals. When this bubble bursts, the consequences will be catastrophic because the investments are larger, the contagion wider, and unlike dark fiber from the dot-com era, GPUs and AI infrastructure have fundamentally broken economics with no salvageable residual value.

  11. Article
    Google Open Source Blog · 14w

    A JSON schema package for Go

    Google released jsonschema-go, a comprehensive JSON Schema package for Go that provides schema creation, serialization, validation, and inference from Go types. The package addresses the growing need for JSON Schema in LLM infrastructure, where it serves as the standard for defining structured interactions with language models. It features a straightforward Schema struct, validation with resolution, and the ability to generate schemas from Go types using struct tags. The package is already used in Google's MCP Go SDK and aims to become the canonical JSON Schema solution for Google's Go SDKs working with LLMs.

  12. Video
    Web Dev Cody · 17w

    Most Developers Aren’t Ready for 2026

    AI-powered coding tools like Claude Code, Cursor, and GPT models are fundamentally changing software development. LLMs can now generate entire features across dozens of files, write tests, and even build complete applications with minimal manual coding. The shift moves developers from writing code by hand to prompt engineering and context engineering—providing documentation and requirements that guide AI agents. Front-end development and manual UI coding are becoming commoditized as AI handles component generation and styling. Developers need to focus on higher-level skills like software architecture, system design, and diversify into project management and communication. Autonomous coding systems can now run overnight, implementing features and fixing bugs with minimal human intervention.

  13. Video
    Awesome · 13w

    AI will replace developers in 6 months. Again...

    AI predictions from tech CEOs at Davos claim developers will be replaced within 6-12 months, repeating similar claims from a year ago. Current transformer-based AI architectures have fundamental limitations—they rearrange existing knowledge probabilistically but cannot generate truly novel ideas. While AI will likely handle more boilerplate code generation, the technology has plateaued and requires unknown breakthroughs before achieving major industry disruption. Meanwhile, companies like OpenAI struggle with profitability, now introducing ads despite previously calling it a last resort.

  14. Article
    Netflix TechBlog · 13w

    The AI Evolution of Graph Search

    Netflix evolved their Graph Search platform to support natural language queries by integrating LLMs. The system converts user questions into structured DSL filter statements through a multi-stage process: RAG-based context engineering to identify relevant fields and controlled vocabulary values, LLM-based generation with carefully crafted instructions, and deterministic validation for syntactic and semantic correctness. Key innovations include field and vocabulary RAG to manage context size, UI visualization of generated filters as chips and facets, and @mention functionality for explicit entity selection. This approach bridges the gap between complex federated graph queries and intuitive user intent while maintaining trust through transparency.
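    The deterministic-validation stage can be sketched as a check of each LLM-proposed clause against the fields and controlled vocabulary surfaced by RAG, before anything reaches the graph backend. The `field:value` DSL shape below is invented for illustration; Netflix's actual DSL is not reproduced here.

```python
# Hypothetical sketch of deterministic validation: an LLM-proposed filter
# is checked against known fields and controlled vocabulary. The DSL shape
# ("field:value AND ...") is invented for illustration.
FIELDS = {"genre": {"drama", "comedy"}, "country": {"US", "KR"}}  # via field/vocab RAG

def validate_filter(expr: str):
    """Split 'field:value' clauses joined by AND; return parsed filters and errors."""
    filters, errors = [], []
    for clause in expr.split(" AND "):
        field, _, value = clause.partition(":")
        if field not in FIELDS:
            errors.append(f"unknown field {field!r}")
        elif value not in FIELDS[field]:
            errors.append(f"{value!r} not in vocabulary for {field!r}")
        else:
            filters.append((field, value))
    return filters, errors

print(validate_filter("genre:drama AND country:KR"))  # both clauses valid
print(validate_filter("genre:noir"))                  # vocabulary miss is reported
```

    Surfacing the parsed clauses (rather than silently dropping bad ones) is what makes the chip/facet UI visualization possible: the user sees exactly which filters the system understood.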

  15. Article
    GitHub Blog · 14w

    Build an agent into any app with the GitHub Copilot SDK

    GitHub announced the Copilot SDK in technical preview, enabling developers to embed the same agentic core that powers GitHub Copilot CLI into any application. The SDK provides a programmable execution layer with built-in planning, tool invocation, file editing, command execution, multi-model support, MCP server integration, and GitHub authentication. This eliminates the need to build custom orchestration logic for context management, tool routing, and model coordination. Developers can use it to create custom GUIs, productivity tools, enterprise workflows, and various applications while GitHub handles the underlying infrastructure.

  16. Video
    Joshua Morony · 14w

    JavaScript optimisation with LLMs is too good to ignore now

    Using LLMs to optimize JavaScript performance has proven highly effective across 20+ real-world scenarios, reducing turnaround time from hours or days to minutes. The approach involves profiling with DevTools to identify bottlenecks, then having AI analyze and fix the code. Results include transforming a game from 20-30 FPS to near 60 FPS even with 4x CPU slowdown. LLMs successfully handled everything from simple closure optimizations to advanced techniques like baking tile maps into static textures and implementing complex bit-shifting algorithms. While some knowledge transfer is sacrificed, the speed and reliability of AI-driven solutions make them practical for shipping projects without accumulating technical debt.

  17. Article
    The Next Web · 15w

    AI Skills

    AI Skills represent a new conceptual layer above models and agents, functioning as reusable, procedural units that transform user intent into concrete execution. While models provide raw intelligence and agents coordinate tasks, Skills encode domain-specific expertise and workflows to deliver actual business outcomes. This modular, product-oriented approach scales better than building custom agents for every task, positioning Skills as the competitive differentiator as AI infrastructure commoditizes.

  18. Article
    Faun · 14w

    20 Free & Open-Source AI Tools to Run Production-Grade Agents Without Paying LLM APIs in 2026

    A curated list of 20 open-source tools for running AI agents locally without relying on paid LLM APIs. Covers inference engines (Ollama, vLLM, LiteLLM), agent orchestrators (LangGraph, CrewAI, AutoGen), RAG and vector databases (LlamaIndex, ChromaDB, Qdrant), development tools (Continue.dev, Promptfoo), and multimodal processing (Whisper.cpp, Diffusers). Includes a quickstart stack using Docker and pip for deploying production-grade agents with zero marginal cost after hardware investment.

  19. Article
    Red Hat Developer · 16w

    The state of open source AI models in 2025

    2025 saw significant growth in open source AI models, particularly from Chinese labs like DeepSeek, Qwen, and Moonshot AI's Kimi K2. These models now rival proprietary options like ChatGPT while offering cost control and on-premises deployment. The landscape includes model families of various sizes (from 0.5B to 1T parameters) for different use cases: Qwen for versatility, Kimi K2 for agentic workflows and coding, OpenAI's gpt-oss for tool calling, and small language models for edge devices. Enterprise adoption is growing in regulated sectors requiring data sovereignty. Tools like Ollama, RamaLama, and vLLM make deployment accessible, from local hardware to production Kubernetes environments.

  20. Article
    The Pragmatic Engineer · 13w

    I replaced a $120/year micro-SaaS in 20 minutes with LLM-generated code

    A developer replaced a $120/year testimonials SaaS (Shoutout.io) in 20 minutes using an LLM (Codex) after the service's billing system broke. The replacement involved storing testimonials in JSON and generating HTML at build time. While rebuilding the entire SaaS would be much harder, replacing a specific use case proved surprisingly easy. This suggests that SaaS products providing no ongoing value or maintenance are vulnerable to LLM-powered replacements, especially when customers encounter broken features. However, developers have a significant advantage in this process compared to non-technical users who may struggle with command-line workflows and code verification.
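    The replacement pattern described (testimonials in JSON, HTML generated at build time) can be sketched in a few lines. File layout and markup below are illustrative, not the author's actual code.

```python
# Sketch of the micro-SaaS replacement: testimonials live in JSON and are
# rendered to static HTML during the site build. Markup is illustrative.
import html, json

testimonials = json.loads("""[
  {"author": "Ada", "quote": "Saved us hours every week."},
  {"author": "Grace", "quote": "Setup took five minutes."}
]""")

def render_testimonials(items) -> str:
    """Escape user-supplied text and emit one <blockquote> per testimonial."""
    blocks = [
        f'<blockquote>{html.escape(t["quote"])}'
        f'<cite>{html.escape(t["author"])}</cite></blockquote>'
        for t in items
    ]
    return '<section class="testimonials">\n' + "\n".join(blocks) + "\n</section>"

# In a real build step you would write this string into your site's output.
print(render_testimonials(testimonials))
```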

  21. Article
    Read the Tea Leaves · 13w

    Building a browser API in one shot

    An experiment demonstrates building a complete IndexedDB implementation from scratch using Claude AI with a single prompt and automated iteration loop. The implementation passes 95% of targeted Web Platform Tests and 77.4% of a more rigorous test suite, achieving results comparable to fake-indexeddb (82.8%) in just a few hours. The project cost approximately $7 and produced 4,395 lines of TypeScript code backed by SQLite. While the code quality is reasonable and the approach leverages Web Platform Tests as acceptance criteria, the author reflects on how AI tools are devaluing traditional software development efforts while acknowledging their inevitable dominance in the field.
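    The automated iteration loop at the heart of the experiment has a simple shape: run the acceptance tests, feed failures back to the model, repeat until green or the budget runs out. `run_tests` and `propose_fix` below are stand-ins you would wire to a real test runner (here, Web Platform Tests) and a real LLM call; this is a schematic, not the author's harness.

```python
# Schematic of a test-driven AI iteration loop: acceptance tests define
# "done", and failures are fed back to the model each round. The callables
# are injected stand-ins for a real runner and a real LLM.
def iterate(code, run_tests, propose_fix, max_rounds=10):
    for round_no in range(max_rounds):
        failures = run_tests(code)          # e.g. failing Web Platform Tests
        if not failures:
            return code, round_no           # acceptance criteria met
        code = propose_fix(code, failures)  # model rewrites against failures
    return code, max_rounds

# Toy demo: the "tests" fail until the code mentions transactions.
demo_tests = lambda code: [] if "transaction" in code else ["missing transactions"]
demo_fix = lambda code, fails: code + "\n# add transaction support"
final, rounds = iterate("// v0", demo_tests, demo_fix)
print(rounds)
```

    Using an existing conformance suite as the loop's oracle is the key design choice: the human writes the single prompt and the budget, and the tests supply all subsequent feedback.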

  22. Article
    Faun · 14w

    Beyond the Monolith: The Rise of the AI Microservices Architecture

    LLM applications are evolving from monolithic architectures to microservices-based systems using agentic orchestration. This architectural pattern uses LangGraph as a central state machine to orchestrate independent, remote agents via HTTP calls, with semantic routing replacing brittle keyword matching. The hub-and-spoke model separates concerns: LangGraph maintains conversation state and makes decisions, semantic routing understands user intent, and specialized agents operate as independent HTTP services. This approach enables tech-agnostic development, independent scaling of components, fault tolerance, and better context management compared to traditional linear chains.
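    The semantic-routing idea can be sketched as: score the user's utterance against a description of each spoke agent and dispatch to the best match. Real systems compare embedding vectors; the word-overlap (Jaccard) score below is a stand-in so the sketch runs without a model, and the agent names are invented.

```python
# Toy semantic router for the hub-and-spoke pattern: match user intent to
# the closest route by similarity instead of keyword rules. Jaccard overlap
# stands in for real embedding similarity; agent names are invented.
ROUTES = {
    "billing_agent": "invoice payment refund charge subscription",
    "support_agent": "error crash bug broken not working help",
}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def route(query: str) -> str:
    words = set(query.lower().split())
    scores = {name: jaccard(words, set(desc.split())) for name, desc in ROUTES.items()}
    return max(scores, key=scores.get)  # hub dispatches to this agent's HTTP service

print(route("I need a refund for last month's invoice"))
```

    Swapping Jaccard for cosine similarity over sentence embeddings gives the production version: intent matching degrades gracefully with paraphrases, where keyword matching breaks.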

  23. Article
    Hugging Face · 14w

    Differential Transformer V2

    Differential Transformer V2 introduces a redesigned attention mechanism that doubles query heads while maintaining key-value heads, eliminating the need for custom kernels and achieving faster decoding speeds. The architecture removes per-head RMSNorm to improve training stability, introduces token-level and head-level lambda projections to overcome softmax constraints, and eliminates attention sinks. Production-scale experiments on trillion-token datasets show 0.02-0.03 lower language modeling loss, reduced gradient spikes under large learning rates, and decreased activation outliers compared to standard Transformers, while saving approximately 25% of attention module parameters.
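    For orientation, the original Differential Transformer (V1) computes attention as the difference of two softmax maps with a learnable scalar $\lambda$; per the summary, V2 promotes $\lambda$ to token-level and head-level projections (the exact V2 equations are in the paper and not reproduced here):

```latex
\mathrm{DiffAttn}(X) =
\left(
  \operatorname{softmax}\!\left(\frac{Q_1 K_1^{\top}}{\sqrt{d}}\right)
  - \lambda \,
  \operatorname{softmax}\!\left(\frac{Q_2 K_2^{\top}}{\sqrt{d}}\right)
\right) V
```

    The subtraction cancels common-mode attention noise between the two maps, which is why the architecture can suppress attention sinks and activation outliers.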

  24. Article
    Hacker News · 15w

    The Insecure Evangelism of LLM Maximalists

    A critical perspective on agentic LLM coding tools and "vibe coding" from a senior developer who finds them slow and requiring excessive oversight. The author argues that vocal LLM advocates may be projecting their own insecurities about programming ability, attacking skeptics as resistant to change when experienced developers claim higher productivity without AI assistance. While acknowledging LLMs are useful for documentation lookup and enabling non-developers, the author questions whether prompt-driven development delivers on its promises for experienced programmers.

  25. Article
    theacademe · 17w

    Thoughts on "Nine open source projects to watch in 2026"

    A developer shares their experience with AI coding assistants, recommending Continue plugin with Ollama for VSCode, RooCode for scaffolding, and specific Qwen models. They suggest using MCP servers to keep LLMs current, advocate for self-hosted Gitea as a GitHub alternative with package management support, and compare Turso with rqlite as SQLite-based distributed databases.