Ollama has introduced Turbo, a $20/month cloud service that runs large language models on datacenter-grade hardware for faster inference. The service lets users run larger models that don't fit on consumer GPUs, while preserving privacy by not retaining user data. Turbo works with the existing Ollama CLI, API, and JavaScript/Python libraries, and currently offers the gpt-oss-20b and gpt-oss-120b models in preview, with usage limits and US-based infrastructure.
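Because Turbo reuses the standard Ollama chat API, a request could look like the sketch below. The `/api/chat` endpoint and request shape follow the regular Ollama API; the `https://ollama.com` host URL, the `Authorization` header, and the `OLLAMA_API_KEY` environment variable are assumptions for illustration, so check ollama.com for the exact connection details.

```python
import json
import os
import urllib.request

# Request body in the standard Ollama chat format; the model name
# comes from the Turbo preview lineup described in the article.
payload = {
    "model": "gpt-oss:120b",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "stream": False,
}

def ask_turbo(api_key: str, host: str = "https://ollama.com") -> str:
    """Send one chat request to the (assumed) Turbo endpoint."""
    req = urllib.request.Request(
        f"{host}/api/chat",  # assumed host; endpoint matches the Ollama API
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]

# Only hit the network when a key is actually configured.
if os.environ.get("OLLAMA_API_KEY"):
    print(ask_turbo(os.environ["OLLAMA_API_KEY"]))
```

The same request shape works with the official JavaScript/Python client libraries, which the article says support Turbo directly.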

2 min read · From ollama.com
Table of contents

- What is Turbo?
- Which models are available in Turbo?
- Does Turbo work with Ollama's CLI?
- Does Turbo work with Ollama's API and JavaScript/Python libraries?
- What data do you retain in Turbo mode?
- Where is the hardware that powers Turbo located?
- What are the usage limits for Turbo?
