ai-inference

All posts about ai-inference:

- NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization
- NVIDIA TensorRT-LLM Revs Up Inference for Google Gemma
- Top Inference for Large Language Models Sessions at NVIDIA GTC 2024
- NVIDIA and Supermicro on the Gen AI Tech Stack Critical for Success
- Emulating the Attention Mechanism in Transformer Models with a Fully Convolutional Network
- Gen AI on RTX PCs Developer Contest
- Accelerating Inference on End-to-End Workflows with H2O.ai and NVIDIA
- Generative AI Research Spotlight: Demystifying Diffusion-Based Models
- Achieving Top Inference Performance with the NVIDIA H100 Tensor Core GPU and NVIDIA TensorRT-LLM
- NVIDIA TensorRT-LLM Enhancements Deliver Massive Large Language Model Speedups on NVIDIA H200