An open-source accelerator for Llama 2, a state-of-the-art LLM, developed using high-level synthesis (HLS) on FPGAs, achieves significant energy reduction and increased inference speeds compared to CPUs and GPUs.

1m read time. From arxiv.org