Apple's Foundation Models framework offers a practical local-first option for mobile AI features, best suited for short, privacy-sensitive, low-latency tasks like text rewriting, title generation, and lightweight classification. It is not a replacement for cloud LLMs, which remain better for large context windows, multi-document reasoning, and high-quality outputs. A hybrid routing pattern — local for cheap, frequent, offline-friendly tasks; cloud for complex or backend-dependent ones — is the recommended approach. A React Native code example shows how to gate on device availability and route between local and cloud models. A real-world case study (ChatXOS) illustrates using on-device models for chat titles and summaries to keep UX fast without unnecessary cloud round trips.
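As a preview of the routing pattern the article describes, here is a minimal sketch of availability-gated routing between an on-device model and a cloud API. The native module name (`AppleLLM`), its methods, the task list, and the cloud endpoint are illustrative assumptions, not the article's actual code or any published API.

```typescript
import { NativeModules, Platform } from 'react-native';

// Hypothetical native module bridging Apple's Foundation Models framework.
// The module name and method signatures are assumptions for illustration.
type LocalLLM = {
  isAvailable(): Promise<boolean>;           // device + OS support check
  generate(prompt: string): Promise<string>; // short on-device completion
};

const { AppleLLM } = NativeModules as { AppleLLM?: LocalLLM };

// Cheap, frequent, offline-friendly tasks stay local; everything else goes to the cloud.
const LOCAL_TASKS = new Set(['title', 'summary', 'rewrite']);

async function routePrompt(task: string, prompt: string): Promise<string> {
  const canRunLocally =
    Platform.OS === 'ios' &&
    LOCAL_TASKS.has(task) &&
    ((await AppleLLM?.isAvailable()) ?? false);

  if (canRunLocally && AppleLLM) {
    return AppleLLM.generate(prompt);
  }

  // Cloud fallback; the endpoint is a placeholder.
  const res = await fetch('https://example.com/api/llm', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ task, prompt }),
  });
  const { text } = await res.json();
  return text;
}
```

The key design choice is that routing is decided per task type and checked against device availability at call time, so unsupported devices fall through to the cloud path without any user-facing error.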
Table of contents
Where Apple’s local model fits
Where it does not fit
The practical React Native integration pattern
Case study: on-device titles and summaries for a multi-model chat app
A better way to think about fallback