TensorFlow Lite and MediaPipe have released the experimental MediaPipe LLM Inference API, which allows large language models (LLMs) to run fully on-device across platforms. The new capability streamlines on-device LLM integration for web developers and supports Web, Android, and iOS. Using the LLM Inference API involves converting the model weights to a TensorFlow Lite Flatbuffer, including the SDK in your application, hosting the Flatbuffer, and calling the API to generate text responses from the model.
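As a concrete illustration of that flow on the web, here is a minimal sketch using the `@mediapipe/tasks-genai` package; the model file name, hosting path, and CDN URL are placeholders for wherever you serve the converted Flatbuffer and WASM assets, and the sampling parameters are illustrative values, not recommendations.

```typescript
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

// Load the WASM runtime for GenAI tasks (a CDN is one common hosting choice).
const genAiFileset = await FilesetResolver.forGenAiTasks(
  'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
);

// Create the inference task from a hosted TensorFlow Lite Flatbuffer.
// The model path below is a placeholder for your own hosted weights.
const llmInference = await LlmInference.createFromOptions(genAiFileset, {
  baseOptions: { modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin' },
  maxTokens: 1000,  // maximum combined input and output tokens
  topK: 40,         // sample from the 40 most likely tokens
  temperature: 0.8, // higher values produce more varied output
});

// Generate a complete text response from the model, fully on-device.
const response = await llmInference.generateResponse(
  'Explain on-device inference in one sentence.'
);
console.log(response);
```

The API can also stream partial results as they are generated by passing a progress callback to `generateResponse`, which is useful for showing tokens as they arrive rather than waiting for the full response.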