ai-inference
- Effort Engine
- NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization
- NVIDIA TensorRT-LLM Revs Up Inference for Google Gemma
- Top Inference for Large Language Models Sessions at NVIDIA GTC 2024
- NVIDIA and Supermicro on the gen AI tech stack critical for success
- Emulating the Attention Mechanism in Transformer Models with a Fully Convolutional Network
- Gen AI on RTX PCs Developer Contest
- MK1 Flywheel Unlocks the Full Potential of AMD Instinct for LLM Inference — MK 1
- Accelerating Inference on End-to-End Workflows with H2O.ai and NVIDIA
- Welcome to State of AI Pulse — Single Most Noteworthy Paper Summary in Each Issue!