Best of local-ai: April 2026

  1. Article
    Hacker News · 5w

    How I run multiple $10K MRR companies on a $20/month tech stack

    A bootstrapper explains how they run multiple profitable SaaS products on a $20/month tech stack. The playbook covers using a cheap VPS (Linode/DigitalOcean) instead of AWS, Go for lean, statically compiled backends, SQLite in WAL mode instead of Postgres, local GPU inference via Ollama/vLLM for batch AI tasks, OpenRouter for frontier model access with automatic fallback, and GitHub Copilot over pricier AI IDEs. The author argues that costs this low give unlimited runway and eliminate the need for VC funding.
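
The SQLite-in-WAL-mode choice from this summary can be illustrated with a short sketch. The article's stack is Go; Python is used here only for brevity. The file name `app.db`, the `signups` table, and the `synchronous=NORMAL` setting are illustrative assumptions, not details from the article.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path: str) -> sqlite3.Connection:
    """Open a SQLite database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL mode not enabled, got {mode!r}")
    # Assumption: a common companion setting with WAL, not from the article.
    conn.execute("PRAGMA synchronous=NORMAL;")
    return conn


def demo() -> str:
    """Create a throwaway database, write one row, and report the journal mode."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_wal_database(os.path.join(tmp, "app.db"))  # hypothetical file name
        conn.execute("CREATE TABLE signups (email TEXT NOT NULL)")  # hypothetical table
        conn.execute("INSERT INTO signups VALUES (?)", ("a@example.com",))
        conn.commit()
        mode = conn.execute("PRAGMA journal_mode;").fetchone()[0]
        conn.close()
        return mode
```

WAL mode is a property of the database file, so it only needs to be set once, and it lets readers proceed while a single writer appends to the log. That is a large part of why one SQLite file can stand in for a database server on a small VPS.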

  2. Video
    Fireship · 5w

    Google just casually disrupted the open-source AI narrative…

    Google released Gemma 4 under the Apache 2.0 license, making it truly free and open source, a rarity among major tech companies. What makes it stand out is its small size: the largest variant runs on a consumer RTX 4090 with a 20 GB download, and edge variants run on phones or a Raspberry Pi, yet it benchmarks comparably to much larger models that require data center hardware. The video attributes the efficiency to two techniques: per-layer embeddings, which give each transformer layer its own token representation so information is introduced only when needed, and TurboQuant, a new quantization approach that converts weights to polar coordinates and uses the Johnson-Lindenstrauss transform to compress high-dimensional data to single sign bits while preserving distances. The result is a small, capable, locally runnable model suitable for fine-tuning with tools like Unsloth.
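
The idea of reducing high-dimensional vectors to sign bits while roughly preserving their geometry can be sketched with classic sign random projections. This is a generic illustration of the Johnson-Lindenstrauss-style idea mentioned in the summary, not TurboQuant itself and not Gemma's implementation. The function names are hypothetical.

```python
import math
import random


def make_projection(dim: int, bits: int, seed: int = 0) -> list[list[float]]:
    """Draw `bits` random Gaussian directions in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def sign_bits(vec: list[float], projection: list[list[float]]) -> list[int]:
    """Keep only the sign of each random projection: one bit per direction."""
    return [
        1 if sum(p * v for p, v in zip(row, vec)) >= 0 else 0
        for row in projection
    ]


def estimated_angle(bits_a: list[int], bits_b: list[int]) -> float:
    """Estimate the angle between two vectors from their sign bits.

    A random hyperplane separates two vectors with probability angle / pi,
    so the fraction of differing bits, times pi, estimates the angle.
    """
    differing = sum(a != b for a, b in zip(bits_a, bits_b))
    return math.pi * differing / len(bits_a)
```

For example, with `projection = make_projection(dim=8, bits=1024)`, two orthogonal vectors should produce an estimate near `math.pi / 2`, and the estimate tightens as the number of bits grows. Only angles survive this compression; vector lengths have to be stored separately if they matter.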

  3. Article
    XDA Developers · 4w

    I’d do these 5 things differently if I started self-hosting LLMs today

    The author distills months of self-hosting LLMs into five practical changes: adopting Docker-only deployment for stability, documenting every configuration detail from the start, building agent-first infrastructure with tools like AgenticSeek and n8n instead of just chat interfaces, avoiding model hoarding by keeping only a few reliable models, and focusing on workflow integration so the LLM is embedded in daily work rather than treated as a separate destination.