XNNPack's Fully Connected and Convolution 2D operators now support dynamic range quantization, improving CPU inference performance. Dynamic range quantization enables more AI-powered features on older and lower-tier devices. The combination of half-precision inference and dynamic range quantization yields the best CPU inference performance.
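To make the idea concrete, here is a minimal pure-Python sketch of per-tensor symmetric int8 weight quantization, the core operation behind dynamic range quantization. This is an illustrative toy, not XNNPack's or TensorFlow Lite's actual implementation; the function names are hypothetical.

```python
def quantize_dynamic(weights):
    """Per-tensor symmetric quantization: map the largest |weight| to 127.

    Returns the int8 values and the float scale needed to recover them.
    (Illustrative sketch only, not XNNPack's implementation.)
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_dynamic(weights)
recovered = dequantize(q, scale)
```

Storing weights as int8 plus one float scale cuts weight memory roughly 4x versus float32, and integer kernels can compute the matrix products; activations stay in floating point, which is what makes the quantization "dynamic range" rather than fully integer.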

From blog.tensorflow.org
Table of contents
- Dynamic Range Quantization
- How can you use it?
- Mixed Precision Inference
- Performance Improvements
- Conclusions
