Best of Claude — March 2026

1
Article
John Reilly·12w
npmx.dev: with a little help from my friends
A personal account of contributing to npmx.dev, a community-built reimagining of the npmjs.com website. The author discovered a UX bug where npm API rate limiting (HTTP 429) caused the site to incorrectly show packages as missing. Using Claude Code to help write Vue/Nuxt code despite limited framework experience, they submitted a PR that displayed a proper rate-limit message to users. The post highlights npmx.dev's welcoming contributor culture, its thoughtful AI usage guidelines in CONTRIBUTING.md, and encourages others to get involved.
77
2
Article
Hacker News·9w
Is anybody else bored of talking about AI?
A developer reflects on AI fatigue — the sense that online tech spaces like Hacker News have become saturated with near-identical posts about AI workflows and tooling, crowding out discussion of actual products and problems being solved. The author argues that management's obsession with AI metrics (like tokens per developer) mirrors the old 'lines of code' fallacy, and calls for a return to focusing on the value being created rather than the tools used to create it.
98
32
3
Article
Claude·12w
Improving skill-creator: Test, measure, and refine Agent Skills
Anthropic has enhanced skill-creator, a tool for building Agent Skills in Claude, with testing and evaluation capabilities. Authors can now write evals to verify skill behavior, run benchmarks tracking pass rate, time, and token usage, and use multi-agent support to run evals in parallel without context bleed. A comparator agent enables A/B testing between skill versions. The update also adds description tuning to improve skill triggering accuracy, reducing false positives and negatives. Two skill types are distinguished: capability uplift skills (teaching Claude new behaviors) and encoded preference skills (sequencing existing capabilities per team workflows), each benefiting from evals differently. The framework is available on Claude.ai, Cowork, and as a Claude Code plugin.
73
3
4
Video
ThePrimeTime·12w
Measuring LLM Lies
A benchmark called 'BS Bench' tests LLMs by asking them nonsense questions where the premise is logically incoherent (e.g., relating fire safety codes to curry recipes). Claude models generally refuse to answer such questions, while OpenAI and Google models tend to confidently fabricate detailed answers. Gemini 2.5 (nicknamed 'Kimmy K') surprisingly outperforms OpenAI and Google on pushback. The deeper concern raised is that LLMs act as skill multipliers, meaning engineers with poor judgment who use AI confidently will make bad decisions faster and at greater scale. The real danger is not obviously nonsensical questions but subtly flawed ones that AI answers without pushback.
20
3
5
Video
AICodeKing·12w
Claude Code Computer: Anthropic just launched Computer PTC Feature & IT'S INSANE!
Anthropic introduced Programmatic Tool Calling (PTC), a new capability in Claude Opus and Sonnet 4.6 that addresses a core inefficiency in agentic tool use. Traditional tool calling forces every intermediate result back into Claude's context window, creating latency and token bloat. PTC lets Claude write code that orchestrates multiple tool calls inside a sandboxed container, keeping intermediate results out of context and only returning the final processed output. This preserves the control surface of tool handlers (for logging, inspection, approval) while gaining the composability of code. Benchmarks show PTC improved accuracy by 11% and reduced input tokens by 24% on search tasks, helping Opus 4.6 reach #1 on LM Arena's search benchmark. PTC is now enabled by default when using the web search tool via the API.
20
1
6
Article
ByteByteGo·9w
How Anthropic’s Claude Thinks
Anthropic's interpretability team built tools to trace Claude's actual internal computations, revealing a significant gap between what Claude says it does and what actually happens. Key findings include: Claude operates in a language-agnostic conceptual space; it plans ahead when writing poetry rather than generating word-by-word; it computes arithmetic using parallel approximation strategies rather than the standard algorithm it describes; its chain-of-thought reasoning can be fabricated post-hoc rather than reflecting genuine computation; hallucinations occur when a 'known entity' recognition circuit incorrectly suppresses a default refusal mechanism; and grammatical coherence features can temporarily override safety features during jailbreak attempts. The research uses a replacement model and feature attribution graphs, and currently works on only about a quarter of tested prompts.
25
7
Article
Agentic Digest·10w
Cursor's silent pricing change drives enterprise churn, Claude Opus 4.6 gets 1M context
A roundup of major developments in AI coding tools: Cursor quietly moved most models behind Max mode, causing enterprise credits to drain in days rather than a month, eroding user trust. Claude Opus 4.6 launched with a 1M token context window, a Compaction API for long-running agents, and significant memory/startup improvements. MiniMax M2.7 arrived in OpenCode with self-evolution capabilities and is free via NVIDIA's developer API. Windsurf's pricing restructure is driving away cost-sensitive users. Additional notes cover DoorDash's AI-assisted interview format, Spotify's internal coding agent merging 1,000 PRs every 10 days, GitHub Copilot's first LTS model, a critical Snowflake Cortex prompt injection vulnerability, and stats showing 80% developer AI adoption alongside only 29% trust in AI accuracy.
15
2
8
Article
Hacking with Swift·12w
SwiftUI Agent Skill - Write better code with Claude, Codex, and other AI tools
Paul Hudson has released an open-source SwiftUI agent skill that helps AI coding tools like Claude Code, Codex, and Gemini write better SwiftUI code. Installable via a single npx command, the skill covers common anti-patterns including deprecated API usage, accessibility issues (e.g., invisible VoiceOver buttons), performance pitfalls, and modern Swift idioms. It builds on Hudson's prior AGENTS.md work and includes checks for concurrency, design, and project hygiene, making AI-generated SwiftUI code more idiomatic and correct.
13
9
Article
DEV·10w
I Built a Browser UI for Claude Code — Here's Why
Claudeck is a browser-based UI for Claude Code built in two weeks by a solo developer frustrated with terminal-only limitations. It connects to the Claude Code SDK in-process via WebSocket, offering 50+ features including a parallel 2x2 chat grid, cost analytics dashboard, AI workflows with multi-step pipelines, autonomous agent DAGs with a visual SVG editor, Telegram-based remote tool approval, prompt templates, file explorer, git panel, and a plugin system. Built with vanilla JS and only six npm dependencies, it runs entirely locally with no cloud or telemetry. Key gaps include no authentication, no multi-CLI support, and no live file editing.
15
1
10
Article
Pulumi·11w
Treating Prompts Like Code: A Content Engineer's AI Workflow
A solo technical content engineer at Pulumi describes building a modular AI workflow system by treating prompts like code. Running a one-person docs practice, the author created reusable Claude Code 'skills' (e.g., /docs-review, /pr-review, /shipit, /slack-to-issue) that share a central context file (REVIEW-CRITERIA.md), following DRY principles. The system was wired into CI/CD to automate PR reviews, dramatically improving contribution quality. Key lessons include modularizing prompts, version-controlling them, managing token costs, knowing when to use scripts vs. AI generation, and treating the AI as a conversational collaborator rather than a command executor. The approach turned a personal survival tool into a shared team platform.
11
1
11
Article
Justin Searls·10w
Claude's electron app for macOS is such…
A developer criticizes Anthropic's Claude desktop app for macOS as a buggy Electron app and reports uninstalling it in favor of running Claude in a Safari tab. The post contrasts this with ChatGPT's native macOS app, which works more reliably, and links to a blog post lamenting the decline of native app development.
11
6
12
Article
InfoWorld·12w
What I learned using Claude Sonnet to migrate Python to Rust
A hands-on account of using Claude Sonnet (via an AI coding IDE) to migrate a Python blogging system to Rust. The author shares three key lessons: you must know both source and target languages well to catch subtle issues the AI misses; expect significant iteration through prompt-generate-test cycles; and take full responsibility for every line of generated code. Notable failures included Claude omitting authentication checks on nearly all admin routes, producing garbled output mid-session, and inconsistently applying shell syntax. The conclusion is that AI coding tools can accelerate migration work but cannot substitute for developer expertise in both languages.
10
1
13
Video
Laravel Daily·12w
My AI Guidelines for Laravel/Filament: March 2026 Update
A developer shares their experience revisiting custom AI guidelines for Laravel and Filament projects in 2026. After vibe-coding two projects without custom guidelines and having Claude Code evaluate the results against those guidelines, they found that some rules (like folder preferences or helper vs facade) are now minor or irrelevant as models have improved, but others remain critical — particularly enforcing Filament smoke tests, custom Tailwind themes, and enum interface implementations. The conclusion is to trim guidelines down to only the high-impact rules that AI models still consistently miss, avoiding unnecessary noise in prompts.
10
