Ollama is widely recommended as the easiest entry point for running LLMs locally, but it comes with significant drawbacks for long-term use. It produces fewer tokens per second than running llama.cpp directly, ships with a very low default context window (2048 tokens) that isn't obvious to beginners, and uses a proprietary model storage format that creates vendor lock-in. There have also been trust issues, from the handling of MIT license attribution for llama.cpp to the unclear open-source status of Ollama's GUI at launch. Alternatives such as running llama.cpp directly, LM Studio, and koboldcpp are argued to be nearly as easy to set up while offering better performance, full control over settings, and no proprietary format lock-in.
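For readers who keep using Ollama anyway, the 2048-token default is easy to override per request through its HTTP API. Below is a minimal sketch, assuming a local Ollama server on its default port (11434) and a model you have already pulled (the model name "llama3" here is just a placeholder):

```python
import requests

# Minimal sketch: request a completion from a local Ollama server while
# raising the context window above the 2048-token default via options.num_ctx.
# Assumes Ollama is running on its default port and "llama3" (placeholder)
# has already been pulled locally.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",             # replace with a model you have pulled
        "prompt": "Summarize the GGUF file format in two sentences.",
        "stream": False,               # return one JSON object instead of a stream
        "options": {"num_ctx": 8192},  # override the low default context window
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["response"])
```

This only patches one symptom, of course; the speed, storage-format, and trust concerns above still apply.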
Table of contents
- Ollama is slower than the tools it's built on
- The trust problem is harder to ignore
- The alternatives are easier than you think