An open-source, state-of-the-art accelerator for the Llama 2 LLM, developed using high-level synthesis (HLS) on FPGAs, achieves significant energy reduction and increased inference speeds compared to CPUs and GPUs.